From 836e7cb6af8b704fc91cbec0126c02e2e6593d48 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Tue, 26 Mar 2024 16:30:41 +0100 Subject: [PATCH 001/187] Document ROW_GROUPS_PER_FILE option --- docs/data/parquet/tips.md | 16 ++++++++++++++++ docs/sql/statements/copy.md | 1 + 2 files changed, 17 insertions(+) diff --git a/docs/data/parquet/tips.md b/docs/data/parquet/tips.md index 415f2afb51e..b0245b305ee 100644 --- a/docs/data/parquet/tips.md +++ b/docs/data/parquet/tips.md @@ -50,3 +50,19 @@ COPY ``` See the [Performance Guide on file formats](../../guides/performance/file_formats#parquet-file-sizes) for more tips. + +### The `ROW_GROUPS_PER_FILE` Option + +The `ROW_GROUPS_PER_FILE` parameter creates a new Parquet file if the current one has a specified number of row groups. + +```sql +COPY + (FROM generate_series(100_000)) + TO 'output-directory' + (FORMAT PARQUET, ROW_GROUP_SIZE 20_000, ROW_GROUPS_PER_FILE 2); +``` + +> If multiple threads are active, the number of row groups in a file may slightly exceed the specified number of row groups to limit the amount of locking – similarly to the behaviour of [`FILE_SIZE_BYTES`](../../sql/statements/copy#copy--to-options). +> However, if `PER_THREAD_OUTPUT` is set, only one thread writes to each file, and it becomes accurate again. + +See the [Performance Guide on file formats](../../guides/performance/file_formats#parquet-file-sizes) for more tips. diff --git a/docs/sql/statements/copy.md b/docs/sql/statements/copy.md index bbc0d400811..7eb57f1f9de 100644 --- a/docs/sql/statements/copy.md +++ b/docs/sql/statements/copy.md @@ -175,6 +175,7 @@ The below options are applicable when writing `Parquet` files. | `field_ids` | The `field_id` for each column. Pass `auto` to attempt to infer automatically. | `STRUCT` | (empty) | | `row_group_size_bytes` | The target size of each row group. You can pass either a human-readable string, e.g., '2MB', or an integer, i.e., the number of bytes. This option is only used when you have issued `SET preserve_insertion_order = false;`, otherwise it is ignored. | `BIGINT` | `row_group_size * 1024` | | `row_group_size` | The target size, i.e., number of rows, of each row group. | `BIGINT` | 122880 | +| `row_groups_per_file` | Create a new Parquet file if the current one has a specified number of row groups. If multiple threads are active, the number of row groups in a file may slightly exceed the specified number of row groups to limit the amount of locking – similarly to the behaviour of `file_size_bytes`. However, if `per_thread_output` is set, only one thread writes to each file, and it becomes accurate again. 
| `BIGINT` | (empty) | Some examples of `FIELD_IDS` are: From d23f4846b2017f5233cd33ae6f306550fd35f032 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Mon, 17 Jun 2024 16:33:29 +0200 Subject: [PATCH 002/187] Introduce ARM64 support on Windows --- docs/dev/building/supported_platforms.md | 3 ++- docs/extensions/overview.md | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/dev/building/supported_platforms.md b/docs/dev/building/supported_platforms.md index b6272d82782..46832a0b6c8 100644 --- a/docs/dev/building/supported_platforms.md +++ b/docs/dev/building/supported_platforms.md @@ -12,7 +12,8 @@ DuckDB officially supports the following platforms: | `osx_amd64` | macOS 12+ (Intel CPUs) | | `osx_arm64` | macOS 12+ (Apple Silicon: M1, M2, M3 CPUs) | | `windows_amd64` | Windows 10+ on Intel and AMD CPUs (x86_64) | +| `windows_arm64` | Windows 10+ on ARM CPUs (AArch64) | -DuckDB can be [built from source]({% link docs/dev/building/build_instructions.md %}) for several other platforms such as Windows with ARM64 CPUs (Qualcomm, Snapdragon, etc.) and macOS 11. +DuckDB can be [built from source]({% link docs/dev/building/build_instructions.md %}) for several other platforms such as FreeBSD and macOS 11. For details on free and commercial support, see the [support policy blog post](https://duckdblabs.com/news/2023/10/02/support-policy#platforms). diff --git a/docs/extensions/overview.md b/docs/extensions/overview.md index a7f0c5615dd..305c325ac09 100644 --- a/docs/extensions/overview.md +++ b/docs/extensions/overview.md @@ -12,7 +12,7 @@ DuckDB has a flexible extension mechanism that allows for dynamically loading ex These may extend DuckDB's functionality by providing support for additional file formats, introducing new types, and domain-specific functionality. > Extensions are loadable on all clients (e.g., Python and R). -> Extensions distributed via the official repository are built and tested on macOS (AMD64 and ARM64), Windows (AMD64) and Linux (AMD64 and ARM64). +> Extensions distributed via the official repository are built and tested on macOS, Windows and Linux. All operating systems are supported for both the AMD64 and the ARM64 architectures. 
## Listing Extensions From 16bd8a7e012dc329740113410f1a3cafec39074c Mon Sep 17 00:00:00 2001 From: taniabogatsch <44262898+taniabogatsch@users.noreply.github.com> Date: Wed, 10 Jul 2024 13:01:29 +0200 Subject: [PATCH 003/187] add the default_block_size option, and expand ATTACH page --- docs/configuration/overview.md | 5 +++-- docs/configuration/pragmas.md | 18 ++++++++++++++++++ docs/sql/statements/attach.md | 16 ++++++++++++++++ 3 files changed, 37 insertions(+), 2 deletions(-) diff --git a/docs/configuration/overview.md b/docs/configuration/overview.md index 8579d38d017..2279aa42a8b 100644 --- a/docs/configuration/overview.md +++ b/docs/configuration/overview.md @@ -85,8 +85,8 @@ Configuration options come with different default [scopes]({% link docs/sql/stat ### Global Configuration Options -| Name | Description | Type | Default value | -|----|--------|--|---| +| Name | Description | Type | Default value | +|----------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|---| | `Calendar` | The current calendar | `VARCHAR` | System (locale) calendar | | `TimeZone` | The current time zone | `VARCHAR` | System (locale) timezone | | `access_mode` | Access mode of the database (**AUTOMATIC**, **READ_ONLY** or **READ_WRITE**) | `VARCHAR` | `automatic` | @@ -155,6 +155,7 @@ Configuration options come with different default [scopes]({% link docs/sql/stat | `temp_directory` | Set the directory to which to write temp files | `VARCHAR` | `⟨database_name⟩.tmp` or `.tmp` (in in-memory mode) | | `threads`, `worker_threads` | The number of total threads used by the system. | `BIGINT` | # CPU cores | | `username`, `user` | The username to use. Ignored for legacy compatibility. | `VARCHAR` | `NULL` | +| `default_block_size` | The default block size when creating new database files via `ATTACH`. | `UBIGINT` | `16384` | ### Local Configuration Options diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index 8e8292e65bc..c2f6bab6b62 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -530,3 +530,21 @@ Disable force parallel query processing: ```sql PRAGMA disable_verify_parallelism; ``` + +## Block Sizes + +When persisting a database to disk, DuckDB writes to a dedicated file containing a list of blocks holding the data. +In the case of a file that only holds very little data, e.g., a small table, the standard block size of 256KB might not be ideal. +Therefore, DuckDB's storage format supports different block sizes. + +There are a few constraints on possible block size values. +- Must be a power of two. +- Must be greater or equal to 16384 (16KB). +- Must be lesser or equal to 262144 (256KB). + +You can set the default block size for all new DuckDB files created by an instance like so: +```sql +SET default_block_size = '16384'; +``` + +It is also possible to set the block size on a per-file basis; see [`ATTACH`]({% link docs/sql/statements/attach.md %}). 
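+
+As a quick way to verify the result (a sketch: the file name `small.db` is illustrative, and it relies on the `database_size` pragma, which reports one row per attached database):
+
+```sql
+SET default_block_size = '16384';
+ATTACH 'small.db';      -- new database file created with the 16384-byte default
+PRAGMA database_size;   -- the block_size column shows the value in effect
+```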
diff --git a/docs/sql/statements/attach.md b/docs/sql/statements/attach.md
index c598a04e20d..19903e3ef12 100644
--- a/docs/sql/statements/attach.md
+++ b/docs/sql/statements/attach.md
@@ -26,6 +26,12 @@ Attach the database `file.db` in read only mode:
 ATTACH 'file.db' (READ_ONLY);
 ```
 
+Attach the database `file.db` with a block size of 16KB:
+
+```sql
+ATTACH 'file.db' (BLOCK_SIZE 16384);
+```
+
 Attach a SQLite database for reading and writing (see the [`sqlite` extension]({% link docs/extensions/sqlite.md %}) for more information):
 
 ```sql
@@ -93,6 +99,16 @@ USE memory_db;
+## Options + +
+ +| Name | Description | Type | Default value | +|-----------------------------|-----------------------------------------------------------------------------------------------------------------------------|-----------|---------------| +| `access_mode` | Access mode of the database (**AUTOMATIC**, **READ_ONLY**, or **READ_WRITE**) | `VARCHAR` | `automatic` | +| `type` | The file type (**DUCKDB** or **SQLITE**), or deduced from the input string literal (MySQL, PostgreSQL). | `VARCHAR` | `DUCKDB` | +| `block_size` | The block size of a new database file. Must be a power of two and within [16384, 262144]. Cannot be set for existing files. | `UBIGINT` | `262144` | + ## Name Qualification The fully qualified name of catalog objects contains the *catalog*, the *schema* and the *name* of the object. For example: From 5703e22985e78dc3a4c04ac336a6b1ad0d514e76 Mon Sep 17 00:00:00 2001 From: taniabogatsch <44262898+taniabogatsch@users.noreply.github.com> Date: Wed, 10 Jul 2024 13:06:51 +0200 Subject: [PATCH 004/187] small nits --- docs/configuration/overview.md | 2 +- docs/configuration/pragmas.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/configuration/overview.md b/docs/configuration/overview.md index 2279aa42a8b..285d13ab03f 100644 --- a/docs/configuration/overview.md +++ b/docs/configuration/overview.md @@ -155,7 +155,7 @@ Configuration options come with different default [scopes]({% link docs/sql/stat | `temp_directory` | Set the directory to which to write temp files | `VARCHAR` | `⟨database_name⟩.tmp` or `.tmp` (in in-memory mode) | | `threads`, `worker_threads` | The number of total threads used by the system. | `BIGINT` | # CPU cores | | `username`, `user` | The username to use. Ignored for legacy compatibility. | `VARCHAR` | `NULL` | -| `default_block_size` | The default block size when creating new database files via `ATTACH`. | `UBIGINT` | `16384` | +| `default_block_size` | The default block size when creating new database files via `ATTACH`. | `UBIGINT` | `262144` | ### Local Configuration Options diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index c2f6bab6b62..dedac8534c6 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -534,7 +534,7 @@ PRAGMA disable_verify_parallelism; ## Block Sizes When persisting a database to disk, DuckDB writes to a dedicated file containing a list of blocks holding the data. -In the case of a file that only holds very little data, e.g., a small table, the standard block size of 256KB might not be ideal. +In the case of a file that only holds very little data, e.g., a small table, the default block size of 256KB might not be ideal. Therefore, DuckDB's storage format supports different block sizes. There are a few constraints on possible block size values. From 42bcfaf50fe73e3a01ffa44007127bcaec40ec6d Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Wed, 10 Jul 2024 13:17:50 +0200 Subject: [PATCH 005/187] Nits --- docs/configuration/pragmas.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index dedac8534c6..ae157885ea3 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -538,13 +538,15 @@ In the case of a file that only holds very little data, e.g., a small table, the Therefore, DuckDB's storage format supports different block sizes. There are a few constraints on possible block size values. -- Must be a power of two. 
-- Must be greater or equal to 16384 (16KB). -- Must be lesser or equal to 262144 (256KB). + +* Must be a power of two. +* Must be greater or equal to 16384 (16 KB). +* Must be lesser or equal to 262144 (256 KB). You can set the default block size for all new DuckDB files created by an instance like so: + ```sql SET default_block_size = '16384'; ``` -It is also possible to set the block size on a per-file basis; see [`ATTACH`]({% link docs/sql/statements/attach.md %}). +It is also possible to set the block size on a per-file basis, see [`ATTACH`]({% link docs/sql/statements/attach.md %}) for details. From f2c17fd1f83f0602f43b4c8a419a83c9449b544d Mon Sep 17 00:00:00 2001 From: ykskb Date: Thu, 8 Aug 2024 23:44:11 +0800 Subject: [PATCH 006/187] Add write_partition_columns option into statement copy page. --- docs/sql/statements/copy.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/sql/statements/copy.md b/docs/sql/statements/copy.md index dc7c66b9dca..2500236d56e 100644 --- a/docs/sql/statements/copy.md +++ b/docs/sql/statements/copy.md @@ -210,6 +210,7 @@ The below options are applicable to all formats written with `COPY`. | `partition_by` | The columns to partition by using a Hive partitioning scheme, see the [partitioned writes section]({% link docs/data/partitioning/partitioned_writes.md %}). | `VARCHAR[]` | (empty) | | `per_thread_output` | Generate one file per thread, rather than one file in total. This allows for faster parallel writing. | `BOOL` | `false` | | `use_tmp_file` | Whether or not to write to a temporary file first if the original file exists (`target.csv.tmp`). This prevents overwriting an existing file with a broken file in case the writing is cancelled. | `BOOL` | `auto` | +| `write_partition_columns` | Whether or not to write partition columns into files. Only has an effect when used with `partition_by`. | `BOOL` | `false` | ### Syntax From 0c532dd116ef515b9d2308db3e119ed76e5a4c83 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Tue, 20 Aug 2024 06:24:25 +0200 Subject: [PATCH 007/187] v1.1.0: Add draft release blog post --- _data/past_releases.csv | 1 + _posts/2024-09-02-announcing-duckdb-110.md | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+) create mode 100644 _posts/2024-09-02-announcing-duckdb-110.md diff --git a/_data/past_releases.csv b/_data/past_releases.csv index a6333156cb0..ef33501fc96 100644 --- a/_data/past_releases.csv +++ b/_data/past_releases.csv @@ -1,4 +1,5 @@ release_date,version_number,codename,duck_species_primary,duck_species_secondary,duck_wikipage,blog_post +2024-09-02,1.1.0,Eatoni,Anas eatoni,Eaton's_pintail,,https://duckdb.org/2024/09/02/announcing-duckdb-110 2024-06-03,1.0.0,Nivis,Anas nivis,Snow duck,,https://duckdb.org/2024/06/03/announcing-duckdb-100 2024-05-22,0.10.3,,,,, 2024-04-17,0.10.2,,,,, diff --git a/_posts/2024-09-02-announcing-duckdb-110.md b/_posts/2024-09-02-announcing-duckdb-110.md new file mode 100644 index 00000000000..6081b1d0a73 --- /dev/null +++ b/_posts/2024-09-02-announcing-duckdb-110.md @@ -0,0 +1,19 @@ +--- +layout: post +title: "Announcing DuckDB 1.1.0" +author: Mark Raasveldt and Hannes Mühleisen +thumb: "/images/blog/thumbs/240902.svg" +excerpt: "The DuckDB team is happy to announce that today we’re releasing DuckDB version 1.1.0, codename “Eatoni”." +--- + +To install the new version, please visit the [installation guide]({% link docs/installation/index.html %}). +For the release notes, see the [release page](https://github.com/duckdb/duckdb/releases/tag/v1.1.0). 
+ +Logos of DuckDB releases + + + +_For press inquiries, please reach out to [Gabor Szarnyas](mailto:gabor@duckdblabs.com)._ From a9df67714f11efae9287a139a5b4c3baddbfcd7f Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Tue, 20 Aug 2024 07:05:52 +0200 Subject: [PATCH 008/187] Bump version --- _config.yml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_config.yml b/_config.yml index 1c4e353f59a..5147e840898 100644 --- a/_config.yml +++ b/_config.yml @@ -20,9 +20,9 @@ description: >- # this means to ignore newlines until "baseurl:" baseurl: "" # the subpath of your site, e.g. /blog url: "https://duckdb.org" # the base hostname & protocol for your site, e.g. http://example.com # Set current version of DuckDB -currentshortduckdbversion: "1.0" -currentduckdbversion: 1.0.0 -currentsnapshotversion: 1.0.1-dev +currentshortduckdbversion: "1.1" +currentduckdbversion: 1.1.0 +currentsnapshotversion: 1.1.1-dev # Java currentjavaversion: 1.0.0 nextjavaversion: 1.1.0 # for java snapshots, should always be the next minor version with a patch version of 0 From 5e791825be03264c309771bb1c3ec7c421a3a6a5 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Wed, 21 Aug 2024 11:16:58 +0200 Subject: [PATCH 009/187] Add storage version --- docs/internals/storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/internals/storage.md b/docs/internals/storage.md index faa923270c4..0d0db22c08a 100644 --- a/docs/internals/storage.md +++ b/docs/internals/storage.md @@ -71,7 +71,7 @@ To see the commits that changed each storage version, see the [commit log](https | Storage version | DuckDB version(s) | |----------------:|---------------------------------| -| 64 | v0.9.x, v0.10.x, v1.0.x | +| 64 | v0.9.x, v0.10.x, v1.0.x, v1.1.x | | 51 | v0.8.0, v0.8.1 | | 43 | v0.7.0, v0.7.1 | | 39 | v0.6.0, v0.6.1 | From 760b6a6b0d0fa36995034492632a53fe0d5d86db Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Wed, 21 Aug 2024 11:22:17 +0200 Subject: [PATCH 010/187] Remove deprecation warning --- docs/data/csv/overview.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/docs/data/csv/overview.md b/docs/data/csv/overview.md index e7fde9ca850..ac6f29727e7 100644 --- a/docs/data/csv/overview.md +++ b/docs/data/csv/overview.md @@ -174,12 +174,6 @@ SELECT * FROM read_csv('flights.csv', header = true); Multiple files can be read at once by providing a glob or a list of files. Refer to the [multiple files section]({% link docs/data/multiple_files/overview.md %}) for more information. -## API Changes - -> Deprecated DuckDB v0.10.0 introduced breaking changes to the `read_csv` function. -> Namely, The `read_csv` function now attempts auto-detecting the CSV parameters, making its behavior identical to the old `read_csv_auto` function. -> If you would like to use `read_csv` with its old behavior, turn off the auto-detection manually by using `read_csv(..., auto_detect = false)`. - ## Writing Using the `COPY` Statement The [`COPY` statement]({% link docs/sql/statements/copy.md %}#copy-to) can be used to load data from a CSV file into a table. This statement has the same syntax as the one used in PostgreSQL. To load the data using the `COPY` statement, we must first create a table with the correct schema (which matches the order of the columns in the CSV file and uses types that fit the values in the CSV file). `COPY` detects the CSV's configuration options automatically. 
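+
+A minimal sketch of that workflow (the `ontime` table name and its column list are illustrative; `flights.csv` follows the example file used earlier on the page):
+
+```sql
+CREATE TABLE ontime (
+    flightdate DATE,
+    uniquecarrier VARCHAR,
+    origincityname VARCHAR,
+    destcityname VARCHAR
+);
+COPY ontime FROM 'flights.csv';
+```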
From ff3c42ef14e4dcd54e35d56dd0f082cb89127a30 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 05:38:23 +0200 Subject: [PATCH 011/187] Use variables to store/render the version's hash --- _config.yml | 1 + docs/api/cli/overview.md | 2 +- docs/guides/meta/duckdb_environment.md | 12 ++++++++---- 3 files changed, 10 insertions(+), 5 deletions(-) diff --git a/_config.yml b/_config.yml index 5147e840898..2e34cdc3e92 100644 --- a/_config.yml +++ b/_config.yml @@ -22,6 +22,7 @@ url: "https://duckdb.org" # the base hostname & protocol for your site, e.g. htt # Set current version of DuckDB currentshortduckdbversion: "1.1" currentduckdbversion: 1.1.0 +currentduckdbhash: "ab123fg" currentsnapshotversion: 1.1.1-dev # Java currentjavaversion: 1.0.0 diff --git a/docs/api/cli/overview.md b/docs/api/cli/overview.md index f5bf685c45a..f50c8a4bff9 100644 --- a/docs/api/cli/overview.md +++ b/docs/api/cli/overview.md @@ -49,7 +49,7 @@ duckdb ``` ```text -v1.0.0 1f98600c2c +v{{ site.currentduckdbversion }} {{ site.currentduckdbhash }} Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. diff --git a/docs/guides/meta/duckdb_environment.md b/docs/guides/meta/duckdb_environment.md index 740b7a03454..ca53c9b708d 100644 --- a/docs/guides/meta/duckdb_environment.md +++ b/docs/guides/meta/duckdb_environment.md @@ -10,12 +10,14 @@ DuckDB provides a number of functions and `PRAGMA` options to retrieve informati The `version()` function returns the version number of DuckDB. ```sql -SELECT version(); +SELECT version() AS version; ``` -| version() | +
+ +| version | |-----------| -| v1.0.0 | +| v{{ site.currentduckdbversion }} | Using a `PRAGMA`: @@ -23,9 +25,11 @@ Using a `PRAGMA`: PRAGMA version; ``` +
+ | library_version | source_id | |-----------------|------------| -| v1.0.0 | 1f98600c2c | +| v{{ site.currentduckdbversion }} | {{ site.currentduckdbhash }} | ## Platform From 1107f1d7e06b0f78c901f3fdf2f70e58cd5c88d7 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 05:45:47 +0200 Subject: [PATCH 012/187] Revert "CI: Disable update docs job" This reverts commit 7673db4f8d889ce893542344ee8ddd3c58e7adfd. --- .github/workflows/update-docs.yml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.github/workflows/update-docs.yml b/.github/workflows/update-docs.yml index ee7757f701b..b996fbbf309 100644 --- a/.github/workflows/update-docs.yml +++ b/.github/workflows/update-docs.yml @@ -1,4 +1,6 @@ on: + schedule: + - cron: "0 5 * * 1" # run every Monday day at 5AM workflow_dispatch: {} # allow running manually from the github ui env: From 741d3f1fccab88ac4ac57b4180ff3cb03a9d2ebe Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 06:47:32 +0200 Subject: [PATCH 013/187] Document html_escape and html_unescape functions. Fixes #2511 --- docs/extensions/inet.md | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/docs/extensions/inet.md b/docs/extensions/inet.md index 3deb480172b..8ddfa2cb9fd 100644 --- a/docs/extensions/inet.md +++ b/docs/extensions/inet.md @@ -84,3 +84,31 @@ SELECT cidr, host(cidr) FROM tbl; | 192.168.0.0/16 | 192.168.0.0 | | 127.0.0.1 | 127.0.0.1 | | 2001:db8:3c4d:15::1a2f:1a2b/96 | 2001:db8:3c4d:15::1a2f:1a2b | + +## HTML Escape and Unescape Functions + +```sql +SELECT html_escape('&'); +``` + +```text +┌──────────────────┐ +│ html_escape('&') │ +│ varchar │ +├──────────────────┤ +│ & │ +└──────────────────┘ +``` + +```sql +SELECT html_unescape('&'); +``` + +```text +┌────────────────────────┐ +│ html_unescape('&') │ +│ varchar │ +├────────────────────────┤ +│ & │ +└────────────────────────┘ +``` From 339f6d9548a43234448f36d99c097afa6d4c617c Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 10:10:22 +0200 Subject: [PATCH 014/187] Update landing page --- index.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.html b/index.html index 801c319dfcd..99ee0e74657 100644 --- a/index.html +++ b/index.html @@ -12,7 +12,7 @@

DuckDB is a fast in-process analytical database

DuckDB supports a feature-rich SQL dialect complemented with deep integrations into client APIs.
- DuckDB v1.0.0 was released in June 2024. + DuckDB v1.1.0 was released in September 2024.

Installation Documentation From ccb3278c07e5c9c8753af304790235c815d458ab Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 11:04:42 +0200 Subject: [PATCH 015/187] Adjust storage info --- docs/internals/storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/internals/storage.md b/docs/internals/storage.md index 0d0db22c08a..4cbd60b9d8b 100644 --- a/docs/internals/storage.md +++ b/docs/internals/storage.md @@ -71,7 +71,7 @@ To see the commits that changed each storage version, see the [commit log](https | Storage version | DuckDB version(s) | |----------------:|---------------------------------| -| 64 | v0.9.x, v0.10.x, v1.0.x, v1.1.x | +| 64 | v0.9.x, v0.10.x, v1.0.0, v1.1.x | | 51 | v0.8.0, v0.8.1 | | 43 | v0.7.0, v0.7.1 | | 39 | v0.6.0, v0.6.1 | From 85a0e7f0ecd74694efada05add53977d6bc794e6 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 11:05:22 +0200 Subject: [PATCH 016/187] Adjust storage info --- docs/internals/storage.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/internals/storage.md b/docs/internals/storage.md index 4cbd60b9d8b..9e0196e8184 100644 --- a/docs/internals/storage.md +++ b/docs/internals/storage.md @@ -72,10 +72,10 @@ To see the commits that changed each storage version, see the [commit log](https | Storage version | DuckDB version(s) | |----------------:|---------------------------------| | 64 | v0.9.x, v0.10.x, v1.0.0, v1.1.x | -| 51 | v0.8.0, v0.8.1 | -| 43 | v0.7.0, v0.7.1 | -| 39 | v0.6.0, v0.6.1 | -| 38 | v0.5.0, v0.5.1 | +| 51 | v0.8.x | +| 43 | v0.7.x | +| 39 | v0.6.x | +| 38 | v0.5.x | | 33 | v0.3.3, v0.3.4, v0.4.0 | | 31 | v0.3.2 | | 27 | v0.3.1 | From 9cf1a42055d1c8f5a8244b5282052eda896c630a Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 12:48:51 +0200 Subject: [PATCH 017/187] Document IEEE floating-point operation semantics Fixes #3413 --- docs/configuration/pragmas.md | 11 +++++++++++ docs/sql/dialect/postgresql_compatibility.md | 18 +++++++++--------- 2 files changed, 20 insertions(+), 9 deletions(-) diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index d98a185a83d..c94707416b2 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -503,6 +503,17 @@ SELECT * FROM nonexistent_tbl; } ``` +## IEEE Floating-Point Operation Semantics + +DuckDB follows IEEE floating-point operation semantics. If you would like to turn this off, run: + +```sql +SET ieee_floating_point_ops = false; +``` + +In this case, floating point division by zero (e.g., `1.0 / 0.0`, `0.0 / 0.0` and `-1.0 / 0.0`) will all return `NULL`. + + ## Query Verification (for Development) The following `PRAGMA`s are mostly used for development and internal testing. diff --git a/docs/sql/dialect/postgresql_compatibility.md b/docs/sql/dialect/postgresql_compatibility.md index 1be8bfc99c4..9619e2caf19 100644 --- a/docs/sql/dialect/postgresql_compatibility.md +++ b/docs/sql/dialect/postgresql_compatibility.md @@ -26,15 +26,15 @@ SELECT 'Infinity'::FLOAT - 1.0 AS x;
-| Expression | DuckDB | PostgreSQL | IEEE 754 | -| :---------------------- | -------: | ---------: | --------: | -| 1.0 / 0.0 | NULL | error | Infinity | -| 0.0 / 0.0 | NULL | error | Nan | -| -1.0 / 0.0 | NULL | error | -Infinity | -| 'Infinity' / 'Infinity' | NaN | NaN | NaN | -| 1.0 / 'Infinity' | 0.0 | 0.0 | 0.0 | -| 'Infinity' - 'Infinity' | NaN | NaN | NaN | -| 'Infinity' - 1.0 | Infinity | Infinity | Infinity | +| Expression | DuckDB | PostgreSQL | IEEE 754 | +| :---------------------- | --------: | ---------: | --------: | +| 1.0 / 0.0 | Infinity | error | Infinity | +| 0.0 / 0.0 | NaN | error | NaN | +| -1.0 / 0.0 | -Infinity | error | -Infinity | +| 'Infinity' / 'Infinity' | NaN | NaN | NaN | +| 1.0 / 'Infinity' | 0.0 | 0.0 | 0.0 | +| 'Infinity' - 'Infinity' | NaN | NaN | NaN | +| 'Infinity' - 1.0 | Infinity | Infinity | Infinity | ## Division on Integers From f104fad41406fd5fed89ccf414c7426ebbdf19fb Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 15:23:29 +0200 Subject: [PATCH 018/187] Remove Postgres-incompatibility that was fixed in v1.1 Fixes #3418 --- docs/sql/dialect/postgresql_compatibility.md | 16 ---------------- 1 file changed, 16 deletions(-) diff --git a/docs/sql/dialect/postgresql_compatibility.md b/docs/sql/dialect/postgresql_compatibility.md index 925e16475e8..7214adf1d45 100644 --- a/docs/sql/dialect/postgresql_compatibility.md +++ b/docs/sql/dialect/postgresql_compatibility.md @@ -153,19 +153,3 @@ SELECT table_name FROM duckdb_tables(); | mytable | However, the case insensitive matching in the system for identifiers cannot be turned off. - -## Scalar Subqueries - -Subqueries in DuckDB are not required to return a single row. Take the following query for example: - -```sql -SELECT (SELECT 1 UNION SELECT 2) AS b; -``` - -PostgreSQL returns an error: - -```console -ERROR: more than one row returned by a subquery used as an expression -``` - -DuckDB non-deterministically returns either `1` or `2`. From 5aebbfdceffa64c28e68bcb2485f7d7e86233a2e Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 23 Aug 2024 15:27:58 +0200 Subject: [PATCH 019/187] Add Windows Arm64 stable builds --- data/installation-data.yml | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/data/installation-data.yml b/data/installation-data.yml index 6f91055a2f7..6080a0717c2 100644 --- a/data/installation-data.yml +++ b/data/installation-data.yml @@ -31,6 +31,22 @@ Therefore, they need to be updated separately. usage_example: ./duckdb - variant: stable + environment: Command line + platform: Windows + download_method: Direct download + architecture: arm64 + has_sha_512_hash: 'yes' + installation_code: '' + link: https://github.com/duckdb/duckdb/releases/download/v1.0.0/duckdb_cli-windows-arm64.zip + note: >- + Each DuckDB client is installed without relying on any other DuckDB clients. + + For example, the Python library can use a different version than the CLI + client. + + Therefore, they need to be updated separately. 
+ usage_example: ./duckdb +- variant: stable environment: Command line platform: macOS download_method: Package manager @@ -165,6 +181,16 @@ link: https://github.com/duckdb/duckdb/releases/download/v1.0.0/libduckdb-windows-amd64.zip note: '' usage_example: '' +- variant: stable + environment: C/C++ + platform: Windows + download_method: Direct download + architecture: arm64 + has_sha_512_hash: '' + installation_code: '' + link: https://github.com/duckdb/duckdb/releases/download/v1.0.0/libduckdb-windows-arm64.zip + note: '' + usage_example: '' - variant: stable environment: C/C++ platform: macOS @@ -205,6 +231,16 @@ link: https://github.com/duckdb/duckdb/releases/download/v1.0.0/duckdb_odbc-windows-amd64.zip note: '' usage_example: '' +- variant: stable + environment: ODBC + platform: Windows + download_method: Direct download + architecture: arm64 + has_sha_512_hash: '' + installation_code: '' + link: https://github.com/duckdb/duckdb/releases/download/v1.0.0/duckdb_odbc-windows-arm64.zip + note: '' + usage_example: '' - variant: stable environment: ODBC platform: macOS From 49265ac4539f070d08feebeea52b391755cc82e2 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 25 Aug 2024 19:46:43 +0200 Subject: [PATCH 020/187] Document duckdb_variables() metadata function Fixes #3448 --- docs/sql/meta/duckdb_table_functions.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/docs/sql/meta/duckdb_table_functions.md b/docs/sql/meta/duckdb_table_functions.md index 9f6160e7e7f..b2ead8655cc 100644 --- a/docs/sql/meta/duckdb_table_functions.md +++ b/docs/sql/meta/duckdb_table_functions.md @@ -318,6 +318,16 @@ The `duckdb_types()` function provides metadata about the data types available i | `type_category` | The category to which this type belongs. Data types within the same category generally expose similar behavior when values of this type are used in expression. For example, the `NUMERIC` type_category includes integers, decimals, and floating point numbers. | `VARCHAR` | | `internal` | Whether this is an internal (built-in) or a user object. | `BOOLEAN` | +## `duckdb_variables` + +The `duckdb_variables()` function provides metadata about the variables available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `name` | The name of the variable, e.g., `x`. | `VARCHAR` | +| `value` | The value of the variable, e.g. `12`. | `VARCHAR` | +| `type` | The type of the variable, e.g., `INTEGER`. | `VARCHAR` | + ## `duckdb_views` The `duckdb_views()` function provides metadata about the views available in the DuckDB instance. 
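+
+As a quick illustration of the `duckdb_variables()` function documented above (the variable name and value are arbitrary):
+
+```sql
+SET VARIABLE my_var = 30;
+SELECT * FROM duckdb_variables();
+```
+
+This returns a single row with `name` = `my_var`, `value` = `30`, and `type` = `INTEGER`.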
From c8665cbd67a00ba86bad9a16d9e3f40c9001576e Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 25 Aug 2024 20:09:53 +0200 Subject: [PATCH 021/187] Track inet extension being moved out of the tree Fixes #13085 --- docs/extensions/core_extensions.md | 2 +- docs/extensions/inet.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/extensions/core_extensions.md b/docs/extensions/core_extensions.md index 5094f134b1b..4a26d5604b0 100644 --- a/docs/extensions/core_extensions.md +++ b/docs/extensions/core_extensions.md @@ -19,7 +19,7 @@ redirect_from: | [httpfs]({% link docs/extensions/httpfs/overview.md %}) | | Adds support for reading and writing files over an HTTP(S) or S3 connection | yes | http, https, s3 | | [iceberg]({% link docs/extensions/iceberg.md %}) | [GitHub](https://github.com/duckdb/duckdb_iceberg) | Adds support for Apache Iceberg | no | | | [icu]({% link docs/extensions/icu.md %}) | | Adds support for time zones and collations using the ICU library | yes | | -| [inet]({% link docs/extensions/inet.md %}) | | Adds support for IP-related data types and functions | yes | | +| [inet]({% link docs/extensions/inet.md %}) | [GitHub](https://github.com/duckdb/duckdb_inet) | Adds support for IP-related data types and functions | yes | | | [jemalloc]({% link docs/extensions/jemalloc.md %}) | | Overwrites system allocator with jemalloc | no | | | [json]({% link docs/extensions/json.md %}) | | Adds support for JSON operations | yes | | | [mysql]({% link docs/extensions/mysql.md %}) | [GitHub](https://github.com/duckdb/duckdb_mysql) | Adds support for reading from and writing to a MySQL database | no | | diff --git a/docs/extensions/inet.md b/docs/extensions/inet.md index 3deb480172b..ba239575824 100644 --- a/docs/extensions/inet.md +++ b/docs/extensions/inet.md @@ -1,7 +1,7 @@ --- layout: docu title: inet Extension -github_directory: https://github.com/duckdb/duckdb/tree/main/extension/inet +github_repository: https://github.com/duckdb/duckdb_inet --- The `inet` extension defines the `INET` data type for storing [IPv4](https://en.wikipedia.org/wiki/Internet_Protocol_version_4) and [IPv6](https://en.wikipedia.org/wiki/IPv6) Internet addresses. It supports the [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) for subnet masks (e.g., `198.51.100.0/22`, `2001:db8:3c4d::/48`). From c2d02f1c019a0fb0bf38585f6965c3c34e49f486 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 25 Aug 2024 21:14:57 +0200 Subject: [PATCH 022/187] Update TPC-H documentation. Fixes #2942 --- docs/extensions/tpch.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/docs/extensions/tpch.md b/docs/extensions/tpch.md index d23f6446bd9..18c1baa97ea 100644 --- a/docs/extensions/tpch.md +++ b/docs/extensions/tpch.md @@ -102,7 +102,6 @@ CALL dbgen(sf = 300, children = 10, step = 1); CALL dbgen(sf = 300, children = 10, step = 9); ``` -## Limitations +## Limitation -* The data generator function `dbgen` is single-threaded and does not support concurrency. Running multiple steps to parallelize over different partitions is also not supported at the moment. -* The `tpch(⟨query_id⟩)` function runs a fixed TPC-H query with pre-defined bind parameters (a.k.a. substitution parameters). It is not possible to change the query parameters using the `tpch` extension. +The `tpch(⟨query_id⟩)` function runs a fixed TPC-H query with pre-defined bind parameters (a.k.a. substitution parameters). 
It is not possible to change the query parameters using the `tpch` extension. To run the queries with the parameters prescribed by the TPC-H benchmark, use a TPC-H framework implementation. From 00a03b2e641da454ba30f5966c6b76bef9073a4c Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 25 Aug 2024 21:47:27 +0200 Subject: [PATCH 023/187] Remove build flag for SQLSmith extension as it's out-of-tree --- docs/dev/building/building_extensions.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/docs/dev/building/building_extensions.md b/docs/dev/building/building_extensions.md index 0c9bc33cba0..4bcfeada717 100644 --- a/docs/dev/building/building_extensions.md +++ b/docs/dev/building/building_extensions.md @@ -71,10 +71,6 @@ When this flag is set, the [`json` extension]({% link docs/extensions/json.md %} When this flag is set, the [`inet` extension]({% link docs/extensions/inet.md %}) is built. -#### `BUILD_SQLSMITH` - -When this flag is set, the [SQLSmith extension](https://github.com/duckdb/duckdb/pull/3410) is built. - ### Debug Flags #### `CRASH_ON_ASSERT` From 973cdf9bc82e57d96e4ee3b2cba1cc2cc3ca9909 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 25 Aug 2024 21:48:52 +0200 Subject: [PATCH 024/187] Add SQLSmith extension page --- docs/extensions/sqlsmith.md | 7 +++++++ 1 file changed, 7 insertions(+) create mode 100644 docs/extensions/sqlsmith.md diff --git a/docs/extensions/sqlsmith.md b/docs/extensions/sqlsmith.md new file mode 100644 index 00000000000..1ecc4e22630 --- /dev/null +++ b/docs/extensions/sqlsmith.md @@ -0,0 +1,7 @@ +--- +layout: docu +title: SQLSmith Extension +github_repository: https://github.com/duckdb/duckdb_sqlsmith +--- + +The `sqlsmith` extension is used for testing. From ec95ef8affe0faaa0f3197fa69b8dcf5b6e00c4d Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Mon, 26 Aug 2024 10:25:15 +0200 Subject: [PATCH 025/187] Update list_position semantics --- docs/sql/functions/list.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql/functions/list.md b/docs/sql/functions/list.md index 72d00665908..6d990eee476 100644 --- a/docs/sql/functions/list.md +++ b/docs/sql/functions/list.md @@ -29,7 +29,7 @@ title: List Functions | [`list_has_all(list, sub-list)`](#list_has_alllist-sub-list) | Returns true if all elements of sub-list exist in list. | | [`list_has_any(list1, list2)`](#list_has_anylist1-list2) | Returns true if any elements exist is both lists. | | [`list_intersect(list1, list2)`](#list_intersectlist1-list2) | Returns a list of all the elements that exist in both `l1` and `l2`, without duplicates. | -| [`list_position(list, element)`](#list_positionlist-element) | Returns the index of the element if the list contains the element. | +| [`list_position(list, element)`](#list_positionlist-element) | Returns the index of the element if the list contains the element. If the element is not found, it returns `NULL`. | | [`list_prepend(element, list)`](#list_prependelement-list) | Prepends `element` to `list`. | | [`list_reduce(list, lambda)`](#list_reducelist-lambda) | Returns a single value that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/sql/functions/lambda.md %}#reduce) page for more details. | | [`list_resize(list, size[, value])`](#list_resizelist-size-value) | Resizes the list to contain `size` elements. Initializes new elements with `value` or `NULL` if `value` is not set. | @@ -242,7 +242,7 @@ title: List Functions
-| **Description** | Returns the index of the element if the list contains the element. | +| **Description** | Returns the index of the element if the list contains the element. If the element is not found, it returns `NULL`. | | **Example** | `list_position([1, 2, NULL], 2)` | | **Result** | `2` | | **Aliases** | `list_indexof`, `array_position`, `array_indexof` | From 3e55fa0e7e540e42a491f8fee2d631f4608c2c26 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Mon, 26 Aug 2024 10:36:49 +0200 Subject: [PATCH 026/187] Document map_contains* functions Fixes #3469 --- docs/sql/functions/map.md | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/docs/sql/functions/map.md b/docs/sql/functions/map.md index 7a746ee8b7f..742a735fcdb 100644 --- a/docs/sql/functions/map.md +++ b/docs/sql/functions/map.md @@ -9,6 +9,9 @@ title: Map Functions |:--|:-------| | [`cardinality(map)`](#cardinalitymap) | Return the size of the map (or the number of entries in the map). | | [`element_at(map, key)`](#element_atmap-key) | Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map's keys else an error is returned. | +| [`map_contains(map, key)`](#map_containsmap-key) | Checks if a map contains a given key. | +| [`map_contains_entry(map, key, value)`](#map_contains_entrymap-key-value) | Check if a map contains a given key-value pair. | +| [`map_contains_value(map, value)`](#map_contains_valuemap-value) | Checks if a map contains a given value. | | [`map_entries(map)`](#map_entriesmap) | Return a list of struct(k, v) for each key-value pair in the map. | | [`map_extract(map, key)`](#map_extractmap-key) | Alias of `element_at`. Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map's keys else an error is returned. | | [`map_from_entries(STRUCT(k, v)[])`](#map_from_entriesstructk-v) | Returns a map created from the entries of the array. | @@ -33,6 +36,30 @@ title: Map Functions | **Example** | `element_at(map([100, 5], [42, 43]), 100)` | | **Result** | `[42]` | +#### `map_contains(map, key)` + +
+ +| **Description** | Checks if a map contains a given key. | +| **Example** | `map_contains(MAP {'key1': 10, 'key2': 20, 'key3': 30}, 'key2')` | +| **Result** | `true` | + +#### `map_contains_entry(map, key, value)` + +
+ +| **Description** | Check if a map contains a given key-value pair. | +| **Example** | `map_contains_entry(MAP {'key1': 10, 'key2': 20, 'key3': 30}, 'key2', 20)` | +| **Result** | `true` | + +#### `map_contains_value(map, value)` + +
+ +| **Description** | Checks if a map contains a given value. | +| **Example** | `map_contains_value(MAP {'key1': 10, 'key2': 20, 'key3': 30}, 20)` | +| **Result** | `true` | + #### `map_entries(map)`
From 2776edd0148afd30036143cde43a26779a6a3efd Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Tue, 27 Aug 2024 13:31:01 +0200 Subject: [PATCH 027/187] Make '/' separator more airy --- _data/menu_docs_dev.json | 8 ++++---- docs/sql/statements/attach.md | 2 +- docs/sql/statements/export.md | 2 +- docs/sql/statements/set.md | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/_data/menu_docs_dev.json b/_data/menu_docs_dev.json index f8b489d9574..fae0063f9ff 100644 --- a/_data/menu_docs_dev.json +++ b/_data/menu_docs_dev.json @@ -406,7 +406,7 @@ "url": "alter_view" }, { - "page": "ATTACH/DETACH", + "page": "ATTACH / DETACH", "url": "attach" }, { @@ -470,7 +470,7 @@ "url": "drop" }, { - "page": "EXPORT/IMPORT DATABASE", + "page": "EXPORT / IMPORT DATABASE", "url": "export" }, { @@ -490,7 +490,7 @@ "url": "select" }, { - "page": "SET/RESET", + "page": "SET / RESET", "url": "set" }, { @@ -1162,7 +1162,7 @@ "url": "cloudflare_r2_import" }, { - "page": "DuckDB over HTTPS/S3", + "page": "DuckDB over HTTPS / S3", "url": "duckdb_over_https_or_s3" } ] diff --git a/docs/sql/statements/attach.md b/docs/sql/statements/attach.md index 1d4bdec0086..5ccbad7dbd7 100644 --- a/docs/sql/statements/attach.md +++ b/docs/sql/statements/attach.md @@ -1,6 +1,6 @@ --- layout: docu -title: ATTACH/DETACH Statement +title: ATTACH and DETACH Statement railroad: statements/attach.js --- diff --git a/docs/sql/statements/export.md b/docs/sql/statements/export.md index 56c3c84062e..a7526f153af 100644 --- a/docs/sql/statements/export.md +++ b/docs/sql/statements/export.md @@ -1,6 +1,6 @@ --- layout: docu -title: EXPORT/IMPORT DATABASE Statements +title: EXPORT and IMPORT DATABASE Statements railroad: statements/export.js --- diff --git a/docs/sql/statements/set.md b/docs/sql/statements/set.md index b9ace45a04d..1edc465368e 100644 --- a/docs/sql/statements/set.md +++ b/docs/sql/statements/set.md @@ -1,6 +1,6 @@ --- layout: docu -title: SET/RESET Statements +title: SET and RESET Statements railroad: statements/set.js --- From a344baedb2c23ae3c11afdf4e04c984a26ab33ee Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Tue, 27 Aug 2024 15:41:47 +0200 Subject: [PATCH 028/187] Document SQL-level variables --- _data/menu_docs_dev.json | 4 ++ docs/sql/dialect/friendly_sql.md | 3 ++ docs/sql/statements/set_variable.md | 84 +++++++++++++++++++++++++++++ js/statements/setvariable.js | 34 ++++++++++++ 4 files changed, 125 insertions(+) create mode 100644 docs/sql/statements/set_variable.md create mode 100644 js/statements/setvariable.js diff --git a/_data/menu_docs_dev.json b/_data/menu_docs_dev.json index fae0063f9ff..5e7c7fadf85 100644 --- a/_data/menu_docs_dev.json +++ b/_data/menu_docs_dev.json @@ -493,6 +493,10 @@ "page": "SET / RESET", "url": "set" }, + { + "page": "SET VARIABLE", + "url": "set_variable" + }, { "page": "SUMMARIZE", "url": "summarize" diff --git a/docs/sql/dialect/friendly_sql.md b/docs/sql/dialect/friendly_sql.md index debaa859c43..8c3339d42d1 100644 --- a/docs/sql/dialect/friendly_sql.md +++ b/docs/sql/dialect/friendly_sql.md @@ -28,6 +28,9 @@ DuckDB offers several advanced SQL features as well syntactic sugar to make SQL * Transforming tables: * [`PIVOT`]({% link docs/sql/statements/pivot.md %}) to turn long tables to wide tables. * [`UNPIVOT`]({% link docs/sql/statements/unpivot.md %}) to turn wide tables to long tables. 
+* Defining SQL-level variables: + * [`SET VARIABLE`]({% link docs/sql/statements/set.md %}#set-variable) + * [`RESET VARIABLE`]({% link docs/sql/statements/set.md %}#reset-variable) ## Query Features diff --git a/docs/sql/statements/set_variable.md b/docs/sql/statements/set_variable.md new file mode 100644 index 00000000000..650a594aa01 --- /dev/null +++ b/docs/sql/statements/set_variable.md @@ -0,0 +1,84 @@ +--- +layout: docu +title: SET VARIABLE and RESET VARIABLE Statements +railroad: statements/setvariable.js +--- + +DuckDB supports the definition of SQL-level variables using the `SET VARIABLE` and `RESET VARIABLE` statements. + +## `SET VARIABLE` + +The `SET VARIABLE` statement assigns a value to a variable, which can be accessed using the `getvariable` call: + +```sql +SET VARIABLE my_var = 30; +SELECT 20 + getvariable('my_var') AS total; +``` + +| total | +|------:| +| 50 | + +If `SET VARIABLE` is invoked on an existing variable, it will overwrite its value: + +```sql +SET VARIABLE my_var = 30; +SET VARIABLE my_var = 100; +SELECT 20 + getvariable('my_var') AS total; +``` + +| total | +|------:| +| 120 | + +Variables can have different types: + +```sql +SET VARIABLE my_date = DATE '2018-07-13'; +SET VARIABLE my_string = 'Hello world'; +SET VARIABLE my_map = MAP {'k1': 10, 'k2': 20}; +``` + +If the variable is not set, the `getvariable` function returns `NULL`: + +```sql +SELECT getvariable('undefined_var') AS result; +``` + +| result | +|--------| +| NULL | + +The `getvariable` function can also be used in a [`COLUMNS` expression]({% link docs/sql/expressions/star.md %}#columns-expression): + +```sql +SET VARIABLE column_to_exclude = 'col1'; +CREATE TABLE tbl AS SELECT 12 AS col0, 34 AS col1, 56 AS col2; +SELECT COLUMNS(c -> c != getvariable('column_to_exclude')) FROM tbl; +``` + +| col0 | col2 | +|-----:|-----:| +| 12 | 56 | + +### Syntax + +
+ +## `RESET VARIABLE` + +The `RESET VARIABLE` statement unsets a variable. + +```sql +SET VARIABLE my_var = 30; +RESET VARIABLE my_var; +SELECT getvariable('my_var') AS my_var; +``` + +| my_var | +|--------| +| NULL | + +### Syntax + +
diff --git a/js/statements/setvariable.js b/js/statements/setvariable.js new file mode 100644 index 00000000000..86a1ca3fc72 --- /dev/null +++ b/js/statements/setvariable.js @@ -0,0 +1,34 @@ +function GenerateSet(options = {}) { + return Diagram([ + AutomaticStack([ + Keyword("SET VARIABLE"), + Expression("variable-name"), + Choice(0, ["=", Keyword("TO")]), + Expression("variable-value") + ]) + ]) +} + +function GenerateReset(options = {}) { + return Diagram([ + AutomaticStack([ + Keyword("RESET VARIABLE"), + Expression("variable-name") + ]) + ]) +} + +function Initialize(options = {}) { + document.getElementById("rrdiagram1").classList.add("limit-width"); + document.getElementById("rrdiagram1").innerHTML = GenerateSet(options).toString(); + document.getElementById("rrdiagram2").classList.add("limit-width"); + document.getElementById("rrdiagram2").innerHTML = GenerateReset(options).toString(); +} + +function Refresh(node_name, set_node) { + options[node_name] = set_node; + Initialize(options); +} + +options = {} +Initialize() From 648af67ef7d0556c69e38b209287d6aca4336223 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Tue, 27 Aug 2024 16:49:20 +0200 Subject: [PATCH 029/187] Adjust indentation --- docs/sql/query_syntax/with.md | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/docs/sql/query_syntax/with.md b/docs/sql/query_syntax/with.md index 596c2365095..28800be7950 100644 --- a/docs/sql/query_syntax/with.md +++ b/docs/sql/query_syntax/with.md @@ -39,18 +39,20 @@ By default, CTEs are inlined into the main query. Inlining can result in duplica ```sql WITH t(x) AS (⟨Q_t⟩) SELECT * -FROM t AS t1, - t AS t2, - t AS t3; +FROM + t AS t1, + t AS t2, + t AS t3; ``` Inlining duplicates the definition of `t` for each reference which results in the following query: ```sql SELECT * -FROM (⟨Q_t⟩) AS t1(x), - (⟨Q_t⟩) AS t2(x), - (⟨Q_t⟩) AS t3(x); +FROM + (⟨Q_t⟩) AS t1(x), + (⟨Q_t⟩) AS t2(x), + (⟨Q_t⟩) AS t3(x); ``` If `⟨Q_t⟩` is expensive, materializing it with the `MATERIALIZED` keyword can improve performance. In this case, `⟨Q_t⟩` is evaluated only once. @@ -58,9 +60,10 @@ If `⟨Q_t⟩` is expensive, materializing it with the `MATERIALIZED` keyword ca ```sql WITH t(x) AS MATERIALIZED (⟨Q_t⟩) SELECT * -FROM t AS t1, - t AS t2, - t AS t3; +FROM + t AS t1, + t AS t2, + t AS t3; ``` ## Recursive CTEs From a97b98986819f2570602cb27b66a53d70ad94980 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Tue, 27 Aug 2024 17:05:31 +0200 Subject: [PATCH 030/187] Document MATERIALIZED, NOT MATERIALIZED, and optimization heuristics --- docs/sql/query_syntax/with.md | 43 ++++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 13 deletions(-) diff --git a/docs/sql/query_syntax/with.md b/docs/sql/query_syntax/with.md index 28800be7950..6c8cf95b872 100644 --- a/docs/sql/query_syntax/with.md +++ b/docs/sql/query_syntax/with.md @@ -4,11 +4,13 @@ title: WITH Clause railroad: query_syntax/with.js --- -The `WITH` clause allows you to specify common table expressions (CTEs). Regular (non-recursive) common-table-expressions are essentially views that are limited in scope to a particular query. CTEs can reference each-other and can be nested. [Recursive CTEs](#recursive-ctes) can reference themselves. +The `WITH` clause allows you to specify common table expressions (CTEs). +Regular (non-recursive) common-table-expressions are essentially views that are limited in scope to a particular query. +CTEs can reference each-other and can be nested. 
[Recursive CTEs](#recursive-ctes) can reference themselves. ## Basic CTE Examples -Create a CTE called “cte” and use it in the main query: +Create a CTE called `cte` and use it in the main query: ```sql WITH cte AS (SELECT 42 AS x) @@ -19,12 +21,12 @@ SELECT * FROM cte; |---:| | 42 | -Create two CTEs, where the second CTE references the first CTE: +Create two CTEs `cte1` and `cte2`, where the second CTE references the first CTE: ```sql WITH - cte AS (SELECT 42 AS i), - cte2 AS (SELECT i * 100 AS x FROM cte) + cte1 AS (SELECT 42 AS i), + cte2 AS (SELECT i * 100 AS x FROM cte1) SELECT * FROM cte2; ``` @@ -32,12 +34,16 @@ SELECT * FROM cte2; |-----:| | 4200 | -## Materialized CTEs +## CTE Materialization -By default, CTEs are inlined into the main query. Inlining can result in duplicate work, because the definition is copied for each reference. Take this query for example: +DuckDB can employ CTE materialization, i.e., inlining CTEs into the main query. +This is perforemd using heuristics: if the CTE performs a grouped aggregation and is queried more than once, it is materialized. +Materialization can be explicitly activated by defining the CTE using `AS MATERIALIZED` and disabled by using `AS NOT MATERIALIZED`. + +Take the following query for example, which invokes the same CTE three times: ```sql -WITH t(x) AS (⟨Q_t⟩) +WITH t(x) AS (⟨complex_query⟩) SELECT * FROM t AS t1, @@ -50,15 +56,26 @@ Inlining duplicates the definition of `t` for each reference which results in th ```sql SELECT * FROM - (⟨Q_t⟩) AS t1(x), - (⟨Q_t⟩) AS t2(x), - (⟨Q_t⟩) AS t3(x); + (⟨complex_query⟩) AS t1(x), + (⟨complex_query⟩) AS t2(x), + (⟨complex_query⟩) AS t3(x); +``` + +If `⟨complex_query⟩` is expensive, materializing it with the `MATERIALIZED` keyword can improve performance. In this case, `⟨complex_query⟩` is evaluated only once. + +```sql +WITH t(x) AS MATERIALIZED (⟨complex_query⟩) +SELECT * +FROM + t AS t1, + t AS t2, + t AS t3; ``` -If `⟨Q_t⟩` is expensive, materializing it with the `MATERIALIZED` keyword can improve performance. In this case, `⟨Q_t⟩` is evaluated only once. +If one wants to disable materialization, use `NOT MATERIALIZED`: ```sql -WITH t(x) AS MATERIALIZED (⟨Q_t⟩) +WITH t(x) AS NOT MATERIALIZED (⟨complex_query⟩) SELECT * FROM t AS t1, From a299e0ed56ffba73976ec9a474a2c53984f2f596 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Wed, 28 Aug 2024 15:05:45 +0200 Subject: [PATCH 031/187] Document changes in ATTACH over HTTP/S3 Fixes #3042 --- .../duckdb_over_https_or_s3.md | 12 ++++++++---- docs/sql/statements/attach.md | 17 +++++++++++++++++ 2 files changed, 25 insertions(+), 4 deletions(-) diff --git a/docs/guides/network_cloud_storage/duckdb_over_https_or_s3.md b/docs/guides/network_cloud_storage/duckdb_over_https_or_s3.md index 6787dd27a40..b06d02ecbe1 100644 --- a/docs/guides/network_cloud_storage/duckdb_over_https_or_s3.md +++ b/docs/guides/network_cloud_storage/duckdb_over_https_or_s3.md @@ -14,10 +14,12 @@ This guide requires the [`httpfs` extension]({% link docs/extensions/httpfs/over To connect to a DuckDB database via HTTPS, use the [`ATTACH` statement]({% link docs/sql/statements/attach.md %}) as follows: ```sql -LOAD httpfs; -ATTACH 'https://blobs.duckdb.org/databases/stations.duckdb' AS stations_db (READ_ONLY); +ATTACH 'https://blobs.duckdb.org/databases/stations.duckdb' AS stations_db; ``` +> Since DuckDB version 1.1, the `ATTACH` statement creates a read-only connection to HTTP endpoints. +> In prior versions, it is necessary to use the `READ_ONLY` flag. 
+ Then, the database can be queried using: ```sql @@ -35,10 +37,12 @@ To connect to a DuckDB database via the S3 API, [configure the authentication]({ Then, use the [`ATTACH` statement]({% link docs/sql/statements/attach.md %}) as follows: ```sql -LOAD httpfs; -ATTACH 's3://duckdb-blobs/databases/stations.duckdb' AS stations_db (READ_ONLY); +ATTACH 's3://duckdb-blobs/databases/stations.duckdb' AS stations_db; ``` +> Since DuckDB version 1.1, the `ATTACH` statement creates a read-only connection to HTTP endpoints. +> In prior versions, it is necessary to use the `READ_ONLY` flag. + The database can be queried using: ```sql diff --git a/docs/sql/statements/attach.md b/docs/sql/statements/attach.md index 1d4bdec0086..61d6d13cd68 100644 --- a/docs/sql/statements/attach.md +++ b/docs/sql/statements/attach.md @@ -76,6 +76,23 @@ USE file; `ATTACH` allows DuckDB to operate on multiple database files, and allows for transfer of data between different database files. +`ATTACH` supports HTTP and S3 endpoints. For these, it creates a read-only connection by default. +Therefore, the following two commands are equivalent: + +```sql +ATTACH 'https://blobs.duckdb.org/databases/stations.duckdb' AS stations_db; +ATTACH 'https://blobs.duckdb.org/databases/stations.duckdb' AS stations_db (READ_ONLY); +``` + +Similarly, the following two commands connecting to S3 are equivalent: + +```sql +ATTACH 's3://duckdb-blobs/databases/stations.duckdb' AS stations_db; +ATTACH 's3://duckdb-blobs/databases/stations.duckdb' AS stations_db (READ_ONLY); +``` + +> Prior to DuckDB version, it was necessary to specify the `READ_ONLY` flag for HTTP and S3 endpoints. + ## Detach The `DETACH` statement allows previously attached database files to be closed and detached, releasing any locks held on the database file. From 4220fe20d30abfdb57b8f3d57b69836ea611531f Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 29 Aug 2024 14:31:17 +0200 Subject: [PATCH 032/187] Release preparation --- _posts/2024-09-02-announcing-duckdb-110.md | 15 ++++++++++----- images/240902.svg | 16 ++++++++++++++++ images/blog/thumbs/240902.png | Bin 0 -> 21363 bytes images/blog/thumbs/240902.svg | 16 ++++++++++++++++ 4 files changed, 42 insertions(+), 5 deletions(-) create mode 100644 images/240902.svg create mode 100644 images/blog/thumbs/240902.png create mode 100644 images/blog/thumbs/240902.svg diff --git a/_posts/2024-09-02-announcing-duckdb-110.md b/_posts/2024-09-02-announcing-duckdb-110.md index 6081b1d0a73..c9130884b5b 100644 --- a/_posts/2024-09-02-announcing-duckdb-110.md +++ b/_posts/2024-09-02-announcing-duckdb-110.md @@ -9,11 +9,16 @@ excerpt: "The DuckDB team is happy to announce that today we’re releasing Duck To install the new version, please visit the [installation guide]({% link docs/installation/index.html %}). For the release notes, see the [release page](https://github.com/duckdb/duckdb/releases/tag/v1.1.0). -Logos of DuckDB releases +> Some packages (R, Java) take a few extra days to release due to the reviews required in the release pipelines. +This is our first minor release since we have released version 1.0.0 three months ago. +## What's New in 1.1.0 -_For press inquiries, please reach out to [Gabor Szarnyas](mailto:gabor@duckdblabs.com)._ +There have been too many changes to discuss them each in detail, but we would like to highlight several particularly exciting features! + +... + +## Final Thoughts + +These were a few highlights – but there are many more features and improvements in this release. 
Below are a few more highlights. The full release notes can be [found on GitHub](https://github.com/duckdb/duckdb/releases/tag/v1.1.0). diff --git a/images/240902.svg b/images/240902.svg new file mode 100644 index 00000000000..326290edbbe --- /dev/null +++ b/images/240902.svg @@ -0,0 +1,16 @@ + + + + + + + + + + + + + + + + diff --git a/images/blog/thumbs/240902.png b/images/blog/thumbs/240902.png new file mode 100644 index 0000000000000000000000000000000000000000..ade9429c1f8b40f9e99fd23c7f27c44224db99ec GIT binary patch literal 21363 zcmeIacRbba`#AnIqCs9oGD0OIDtl#?q9K%w?CcRo*0G%iQlzpsDKjK1jv3i2WW+hP zW1fS9WBcCEqu2ZW`ThC*=l6L0evi+2Jj(NV&+ERf>%Q)5-Opf6byfPKY)2soqQ8CX z#sdgCQ~^O$(uZk432py{a`5Mf)2&A?5OnD-{Ey2o%?v?Zr!zEEcg8AHZZgvXkRgakX$f8=lydmF6#N&u`A-D}{FndvDCh=3N++lh z@L%ZlSr01suO@XZJNy@Q{xBB)3o%{$-y!~A!-bN=rImek3J7&*=`vul;sTHRfCO#v zfLNgFB%mGY5KO^Ldck{uMga${&cUQRHf8`g+|E3B9|`_mLLn{!wg~-s7gU*IVIter zy#ze)^`ENGfFC5IVqmf|!~BsV$-M2K0YNpC8&Q-HN?))1A-S%hlnH`*pHm5fSp(p? zGXp-uZ1hzL{=b;aFvobn+?vOtFC7Em96A9SvB=QsP(sk|=nw!y#c`Gg@Y`2YxCgg& z;t{~34CH4%80KAg4B-c5)!xk?z&iDW0q>#S>$^V!FRKAy(KWY>1mK3g=i!E;Wi1zU z23dypl|Y~Bo!zzC=i7ktHX^mosp$=0$}~uV63xbPnm<&2>%1GK-s1F z+pEuH^qz&$8)(&rcLcabkzdXukvJILH7PJabU%;wV89FN2>fEGjud)y63jsbBTrb9 zwfr6utg()a49(U?HS!xwRAFJ(^631-0JW*)`C|n41VFz)@`}$k=;&JlOw&cf+%Y1P z35H}j1TU(pu!Z_R&-$Ka+Pq+H8yJw)cFR`oDDsH!;Sp!vB+PMc-vMKWgus}3Zn@A= ziFvbI4dkHdOP{-xpvZUXs5cs*;1?t4p!ZJh)5-@+7z@rY7KoB1-RB|!@_gap+wL~t zPYHsxq)5V85I{dE8vtUaHw=c5hZRkH2c|-m!)y{GRse{QS@;a_px(gPwL3ep&J*3HviLc=EfBxs4SKi1_!Bm*bgHyye$r)JWV23~3XAq* zF1v>SV_ixZrC*B0x=zKOvs?Z+_+l*T(&l(4sDM)Zs1Ye{qNcc71V+jpE`eSH-u&5G zgOOdmJQC$K4VUqMSfFCM$X6m%SwwUnd@>0Z4yE`!m(mPLm^3aJC*B5I1K{H35)IpB zZe+uvF^n20r4o$&-m7Og+6~JrbR2H-PsKB<=8O2&sLm>YN2n$iv^XK^8kSpB0(cri z5g99<=^K2zVD|G4BcKKZ83M?AKX8zrAA${49Nsf*ZgQd=?wtuXiXC@j+RRbj9F`I}EO{{&T6X zrgx)Ne5PK*Q=s0Ttax_l(YFiZEBWL(LjYzjHJ$R8V6TLcVcHAk!s-|w^w#783l#Fa zHMMdIu7rvKO(Ulsk_L3i>$L;33bX=IAP@5nq$&2{lxFk59ascU&2R4*B^k$W0EmvXwV6AM3gQNz!IXC8vx4JlC_#?~CNoau1m zCAUL@x@&*iF@a~}bn;v_Cu`ZIg<$CkBtq~geM0yw8HsU#v>nZ^U?W4&J#G&w6SHz4 zPaqU8K!T7628Kn6@d&mi+K{|~p_Hf8h|y!tk+<{2;U%EbPk?q`(d0hqpigyEv*WxP zsDW$*YS}m5SVog^C>}LRzw>jY0IonafOdJG?{esn;RiB9DH?MC7}#5`Mqco-1u3P0 za%5!i5i(E2fn&t189oCcC={G|PiJ9)cko=$3!e}`8~vvub&A<7=}^=aG74?vYT5M< z7$F5l}r;Lgp^xjfiwXdhg=R&1D;XW4`aKWdBLs++YlU@mwCzTp$TgBpA>*i z4`8}Rjj$=;Ng_|p|7!TqWvd_Lt{*^Gv%g(~L2=1|7-$YadXG-|$?Qo+LCRU@(?tUx zZ<0Izgij6dgZ0PDB^I~gXC)C>z=GfZg$}6)n4{($L(^`yg+yX~P+$0K3&gJO%@Ad0;?5zz9Wb2igX_Jye-9}8rwyTJpojaRgoHAH? 
zFU5xRs$yk^J@?hm%)VTt8`#`4TlT8j&vn^d3$R%2*r;?|3;do<^jn;pH@D>bkiWRY z0zt3zRy+4MW-};?x+6`FK{mN`gEG0ks{{7STk*8xq~v?HQuoyZt*JXpi!r71H=Q8V z+nx;Wy(QzW^MEojqx4D!7EOM;xm3#y@j`Cgqy_yCRt<|~SH+(1uZ5Pnjo?P8ARES7 zv@allNG&_+fn{4q<}O;f(|rU-E%hrDJD<-%pjq-^fHbfEx77!WNx!`s#-~eNu7bSZ zw;xYP@s7t!sn@%iR5z!_KuYha3F{{+Qhx%x$6aCY_8I%kSI#(`{{1v>Mm2&pIV4mH z%j-u40b&iQE(J1!)6C{$!f=wGV|0_e5^FV|BtmMF9uiL)#VGt;g;qjY-i=Alyy3M< zTOUzJ)@oUBmZ~f{0(q&gwnUSH7j?wdhVOfs<|htwOG=hxqY2(-MBf7UwLq|ADE|xk z)t0ZjE2*;{V>OKFW%ez-QcxGubL2M6XW-Oz>mF*aMC_~g4AVbdi_9_aX-{18?ZLSF zQbENRJW7F{a2cGOnzW50^30v?*TK2h9>q9<#8xS9&_F;Y{R@w2gs7|HTAT8+51*^n z@iBHN^MBy@{y*hI*FgI@rT?~%d5L#FSw(@mfAT*lVWA1S|AAK$5f*GfU<6gE|D&pC zb{9R~Y@Pnnu{^KJ*=^XZ@!_zj9}>v%7yrRoO-jYy^s|kON>bxc<8j)A%XK7q1Fq0e zDRpY7=IjydBDaU~8^D>X2I|g=-@NV5$cb|}iL855K`(ecN;}8F&Q)ky<3S<}xZTC9 zJyEj^>5h232Q7s16CR_7cPPqFdJkW)@!Q-0ZLelp<=oyZlx{4(NDr+>9wDrYi~)V9 z0*LGqTEsu-M_(rh9B-N(?8mdLIgdh6j1c3&){YUEqSP~HV-JUX4 zgj8r)SO@BVHdzM~7IQhzIav8SHG=M@0+w^6Q2y;1`7;3Zbwxl*xtIZgfqDb<5eJ6h zv+iQ9om>O#iZSIfZ_mh{fgmRlSfx50A*d!QJso$1B>?|Ayo3HUtv)5F(g7;FlR75y zq{c8;XZ3ffl;cT0UfJHJb1=FJPUQgM)&a{Slu&bK3}dmGlv}ub3O z0*$VGyPT_{_9&yBAXj{uRinBm=maZSOQ$P+F=ZXpbC=Rc8h&Lzjlue=YL0{d%Q;89 z{rNlybT%J`=M(W2zN!G@VEL%1R$IJ@e2Ml#dV#3d{^ZzY5tkN0MxdR*D!)xZ2&U{u z{aA!JrdBt7f>zeiJ%$HTY9%X(k&m5*sHEy)az~uyYW1X2T?IXafEZ2*5IyW(H|pI; zdFMlz?JrOOR2wkW3G~Ip8K4^)1NNx4Ymi6n+DH5K=k@-Xh_*EAZP` zz2hHscx(oEY*RC-wv*rt{{x%i{1F56CcSU4Vi|zXAWPUTJ1P!n=eM=QT)^JZ9g3B` z0cL&x1ANHKg4E}+Jun7UUCIw_Bn@*xP=qQxNSjut91|_WJz6I}n37?HQ(*i%;2edR zE|`Hmtg!fl25L>)8QKS&jwl*`7X~>mIx9R1}aBE4)O?GybHQ#Cl;3 z#bzpF+5UeZs2#TL#E0A3f;Vb=+n?@Kno~UNM+aJ|Cjv_*4ko9~cR3G@g}{Jj`=*== zpiBcU`wA?JO)76Hz3&xKXUvX3T~A<4C|CL}d{q`m?fh>M)f9!d;_g%wv_;TGm;S*4 zq-nxUsip)Nz0r?_SN?c?Vg$>v;e?U*IGsfJR?1GZFe=!y?g;>U2ad$9vt=!lV1HAx zR1CjQ?gt#)4%=JeO<)hcjb%j#JZ4$Z-~Mf`6Mvv}KwbZ=yVosM?oCtV+=C!CSf{r^ z;RV7oGALOT;G#HXa*?4WWk}#}-oNAZ4zKPR>BQE&dgMR}bWzSH12Ip{DJ=~}CM92)nv+mU5=^hS=H~l7 z7)^vHYHHO_3((aKrh4x!b)*JigjH#Gao#Y;vz-!x6+{58+**h>Zx@ry)MDHrDDyNt zJFoKtafTe~xHRvYFBidxe_)-!?#Cy7@o@XoP>KRP3!5;8zTL6~wEp{g?r8fo8x536 z3q~-lS3@cjV^{wB8hD*G$=6XAR%-788~nWpMqV!`f3{*bdax89w%=Ojj}0bSkG$R8 zwkC|k`Z8$m*yHr8h;KPWO3buDFE#*c&Ed+WZpPXlx#j20cB)l}S91q5E62)-jf-V1 zqrMxVgQd}avg~yX)tpYoVq$z|Ui*39iR&B`P>s!{Eu8<(Y7eA8+me3>;+FIz#7XaA z3sU0fhS~d6hf8G4%!ofG)(tIJ6_$nuA-~ZPD}Op`Hg z6N~_(+}-k%>I+GQ#*^6Py1^s(o#n5VddWK-3kEG4yY+~AghI3yme#~m^5C3hQ3wJa zlaQukv|q$x-xzAI%ye?UtvazhUhV3)ll!Y4@6sG**=b(LHI!u1>G#D{ND-=;fp;Xg z)8xz;yccUXE#>^4?mwBt?qRlW;EByygdIIGKMgEvk)H8T9-zlBm_{n5TV+PIWr?ii zF3-ztofmH-<_tC*^G7(mB!Rm?0j3PKv4S{<&;1@{(%?&i`*uH-xA}B$Ffbmz!&E)j zx?AP>m7~yE`!;^&3Xf}Y?&&I*M&zEv;=BG0;`%BTL`SwVEsp2x4A+-h^QE|`B4?c& zk$!oydkQweFY-NjXrY>~uxwIx-NBe2FMQLPQBIQA-ml}|c9;t)92+V^Zm#H;_EG;k zr58}Hez~*fz42HpU5uGiNDjelN_-tc`}AycXdZ)v)nL;9BR{mV;WkoPnA-13T(I!Q zua}H}dSd*G0unZa>)%aY4KEGjcw5qPidWX3Ie|wv;rkKIJUXoBBJi@DAvZZAxB8^$ zzPf1j_i79<^mzL#Ci`!#UgCy*Nf3QxPYDUj zk%1_Bns}RF+K^yu<#>Z2YPpmo<8~e(7O3-(>0x94d8MM;h;IhpbJfA9MA|PBz@tP% zuJa?^@m@#1!5riBipBvs1$c6ljH9KM)b;v2`RbB}Ph4KVWD#e(e9F3;^cf*kE$pJ8 z`Erl8Z*gP$RdD;YWQ$cSDLw@PtUbP%w)kg)0TMn6gN5Rd>wW9SOTw^=tktD-HdLQl zp6Pr+xs%7!34ADN7r^eP8(_a2v(#~WGEwmujm0)n7D1w+kG2Q$hUF3rAgbJ+?r4}{ zrLq9)7Y;?&SC&0fUfkX=0I6DfD2tp~y3( zc(kquN2Gl9aW>d@cL0nM&`XBpi@x1pEzeLxhP7>!p68XMrH|-5!c)CMMK*IG6ouz@ zqH|i`aqmy6_+QDLd6B7}Z_jvN1`DEt7o>D2R3GeYH9tOi%VV;DE$$csG6&*Rk%^RrkOt5Eu!nCW||s^$zjoHb4$8;FkWdpP=<) zf4Z@@=dbSe-ahQFPtz27%j9l4wc(#^lz7c|YAjMq@l93BD}}ZZ64u}TBjX^_XPuNz zQk7ed@7VSa{uuFn{Vgs zwz(w@PlxLM2^E1*S7A}K^`+>Qgh;*eu}3jO`YYa%Ql`ec4n*C>wgEu@F5n9Q?%4LW z3ND_qD%9;iCH#-)%3fDiJG!qbI4=|f`?ng#tAe28=ipsTXu;g;^<6*9RY$uSMO&aq 
zrVtKlZSP&f6Ve|^ehKRvB*hO7jEIYg!6R40d&^g+`z3Nn$08ynRM5sbgOsKi*6`E@ zsCxV4T1l6030dv&U&L$^O7c=-ciyf6i-^27-ELznZ(gtsI>FsBhr^#M^ueDh|G&$U z781M{mzDcVmaigSd}^`l4usl+H|o}xh^+PmyyO^0&7tbP$nv61$cH{I3aFRtsk_)pWi40 z*ZdZ&JdGw$6Q;}4rx-hO@_}fZsv*hiNL& zpW#2PoyT{<+@ImO`PGK_62L(TT$ZpN(Pgb`QhfD}Q=9%_wz(ztUx#%~%B})#gB4$t z%hh%DE=`B3M^8sfVNJg%7WAl*2@Y)kJsl6u$5WsP*vViJLESH}Pg*Q7SH@?xS0iM` zFifAnC{n_v8z5O{Lqp49Y%?dp%C6=3yx-6^d)OlgRdRs5Pg(8#`fTd)Uvdj%GyZUG zJrcijTdEO)=E&aOia%@`@e7Gj>e2+SFP=S^|2Zmby+w>K&0nh%w+(Wuo(zr6N>-g1 z0H#xyKbgrwPl*@X{y=;FaSuPAQC{l9$JTrYh}Q$EJ$bV5n*S#LrsG7tlUDH}Nzk0s z9v|#}-7`KI(*ynK^uhSKmjul?HcC^@ zC1JR7-#xCLEs%{YB5*+|R}Pr;vf0$@5W~o8yZ83*G7(afZw@czPB-Ew|HC^pDMjzr z>JiW5TC;E`&o6CGhDJ;D-&EU`du<9_UVkMFze5<3oi`{xuRi|dX3KNz4df9)wGIn% zgDAiYzPiJlgwDRWFXiVmv#t$CNUwJ+5O%*#nlB{!@9$I_tei`j*NJx1qWcRMg`UB{ zOXQ!m`Bxl@bawm`OJ~!Q(euhjADw$3r3?F@SPlz<zLhKaK`h%N8asg zu=s*Gj_RxXJx>wA{(E#Tm4{ry!!h0*0)M0h8(Z|fsSj{d15Lgp={>t9$nei6XZg%0b{YJ8!Jfimvx%`CUZo{CQ zA&d0ByemY_t6JDK({gm#9be!j%%nP~Fw(*^;ra+P+*Y7wf$%?i3~El+MqbQ#!?oop zd_HJI7=w9u{0sHJr`{dxZ#1!BaH)OCVH5E9ItWA10e>JQti+??oa8ZL7kt{RVcdTjiCCmi4-oh(3$_Ws0wbk*X08+4v;4uL%* zK3pzAK6}VZNs&u&+%2BC|BbGoOzY~1n6yr;xC!a(&?!q!i(`NBhKj6cMB|p&3$I1h z4y@_btLK=r5w?Elo&oX%j+qFXvKxFWlzQ>S7rPrR#aOcMs|jrbXDSUWDOkR`_E~$WRULB;$z(I(YW|N;Nqo&#hEa5d;{UF{}GB#lGD3O;trvwl~RuH)N$!C*7KLPW{)ix@cD zuE5Oj>d%>qo(>b$FJ&DUTv}=>*U%@1_7(UCq6`Qif*D9Y|34mCo)OCy6cd~ z#0_CVzJwvsMNIr$NAZ=Sy7VZ{hd^t+1B64(J4@=;+X$;Q(7hC?GSjhlXDLR`CM6uH*N;X?MZl;!%r%fOGbyiauh{< zq(qo-jl;(<$3H*G4{kr6dFUV{XiB{zEE}Y2tQR;Df@OURboRvlXOoX3dbSmlHBXlh zUCv)D&XEHCgt}4?Qe_D?02J&Wk^9F`57CH>11Gq{NXP78WA|;pCznd#T8hA*Mnv^ zdYLVdC*Kl;C{2r)X}j*cp+=zo#0>bDKs_i zqgXrZdF?I`e`{u{{n>j97gbBz&F;$a}df%1A^k@6W1(r127VB!Flrjl3;9vIBP0@t_88l~5~D|4~j zlDH1d5juvDW@!ipiE+F~**j)HV|;02`EPd0%;W@o)lZ5-5D~iIp?vJJ0W&0Cv!CE2j@l|~EItBh zf_P3**GXXJhJM}qvpfJe99k=|_33s4b>^T>El=#xLJO@;2mvuP0o~0EHneocajee+ zA4BZIq05DBTy_`%!a`k|I2wakZKL87-sICSIwb-Y6-!p{uOMZZn zL(N?dgog4hGwIk;g;%#bx~` zpb?SVOG^MA9Rr9e4Ctr#%6Qed)Js|E$Vx%Q_L%NZut`B=B{elFKgO2|6)lgY5>_FEKh91tcx{zJxK3X~q3Qkn%h z@dA@anbL_bH_zVQgWdQxuf2J>7O4q~Kxw}%`XaDwCuiNAN?4k;(toE$A?VQq@6DE% zhK<}H;OE3!%g$yoHFe@{9EpaVmoVIsw_akQew!uTye=H6MWb z=F#7HS#Q(;y%I--?}F3L>e%I{z$&DuixO0;LJ~Ly)gV-hdh*+iI`3<5^lT|5+y|Wt z*il}30DH*%(o=kNLx9f~=0Sc<2Q?xyYq)W_ab(U8N(;e}G$J41H=9W9*a707`;G&C~ z^*jljF!0_=G4ayElqskJ1~64n^H2^o^t)OA86;h>{VbB&HO*#+Sg0AXi{?Mr?M(0E z?ijA=0>)_XjbHC*9?r?v=gtmpr1({8 z)Jzg9ePU#PNdEX|YKW!*oGFK~!q;1M+wk+o3&qS3(=>4DEN?K93|xN5IDkT8V&Jgfvu#f|OkDI;-Q=~Y6 zIM8bK!s}U_JI!K5s`q%`wpnXwseuVlkR!>_DZ2aWZn&e2kPY{C<&Z7VqlP^}#Nq)Z z#C?t&V^eklsY{X3oglA9Y3Rtv;#pTJ$n7k-ipoM)3Q9;`GU84BPf(5JUC3}*l?9yX zaz$ywv1<6&uHx0y?UHuV5e%f1dzc*QN}k&KkZ(Aqc?ZfAs3Q@D@S&%Tz0Sk#kxa#) z`;?FyA9;M=52>>K(Uabg>0{?V+W}M>_45eM!*kX^=Pj_^!4GVN9)Ty8AO{rn&Vpwe z+@g8V#2|>rO)2+?fItUJBKFUR>Z;`uN1)FB4L!yL$%Q!oeEsTfw1E*NRA)BlEiUSa zSh#_|C#4ZLO#`v1l0$(4O&9TZh2ni(Sa_fi75rAmOk((m42Al>x_dHq!&2aE?o`BjRdP-GzSR!gD(Al%zlE16AdWWnzx97AK+6VKU?tkm+$zFYBF-vshpM zxn2B-?BM3pJhj=qgpLtKBF>~K_Py8CYi-O$5U#KEmn<27jUQ=EN!&N=1)7ffl4CL) znKXPzWBrR2Xlw0~=CJSL|0c}sL=5MHEv%#sJIwhQ5V1X-Mnl0jNNNgfZmS@f<7arM+RbXi zg}7GJDF1xNt=(albJl*LE1MalwX$;dlTtV0C;R29*Kvh^6DH=yM^k5Gu+{Sw`xPDi zTd~IeM5P^kSK+cVj@UfvE7I7x;b#2@oXZ>2p^U{_@|0wEGs?5+mB+eNZ{52Sj*Hfa z`TWW(w_m|?&XmqfVZZ&hcfD1E_B>a4Z;CBVN7$&pdAk+I{@bOqY&9uN(r=9e=N0KtfN!gZ1+~ra{zXv*$c*yAF)q2K_nKWcr43u6B;_)`mHK)Y1n?= z3>&G?5u@S$DG+y63Af;Pv|GRN8_V<9Wb+40xG~u`_+oczV2%l?FW40%@dxueg25 zi%N-g0rX;SihhlA#Bj-f?aX_?^1NbtC6bvmylXR`UuL|D-DSvc7ibs^BlA%)o$Uc~?=}Go)t5`KU-2Mb*st(* z-_OGfUdf)xmCBlLXN621fis5X`3~>yoS<$pF*#9wEJ~5Hw{_n;Jc{znXP5T&mLmh3 
z>N)FG>Z}xEXRq|G?j;s1Hw=|mOVxb<3(A)I*{uE!(cLJ^#af=jsTr#^JTg*}LEMOq z>FMlj+$CLA-0zTecV$@lva*pV@9Jj1v+>NAf^0xR%3lw5Bf2}k`F1_NwJ>aI$R0!5 zeWm_U%AqUF{))M+;`I&w{IbxIm>Ko@b>+a8;inbv@>@ZoLm*p4R&w?=#r#-&u}c!W z-szpV{pSKYy!A>8Z-#oPqy@bEzC(hQ z;pI0$#;e~PHrmSa$|{;yRrprdC*nXdfdP6fFU!j$e#bC<-%=N6|4l5r#U@FLS&mSo zrGgXiMxQ>3?sNtA9wt&!o3jvXlHMWI^rJJ7lKG_6E^MS@F{)^xe&ePho4|_U ztYZl#c`7sg643>uWYr}0E8q>ox?jFHH(Ootcy;HEch1g-s(O5qPn(XX z+ci={$DR>D7@U?(A1TN#`oq$-WjCGpUe@g=Uran!qV(OVb38^ap4_J6eT(;hvm>gQ ziaL)8$=|mvm(4Hq`-xAgu+^UGZ1qi?moRw`Y*$#|(ntTYCDrA3c&`+#Qgx?{ILrUQ zHoooHj?4LW_dDL%u4Jx%jCATF-&Ce3CNV6lqGRha)mBuyOZDyW{9Do+ErkFgc$`B` zh;icQkuty3?1b-v7i(&TSM$m?ZxFC!VV|3tNdj-r7;TK9#aoHA>pF4xsvHg4+@#j} zT#|O&N=Fz6FzP`xUFn=?&F)OC@A8YRctNhbXrZB`0iWc$vaJ+!+D0zb({IE*3<*2C zC%u2w@=SaF@^4+Ep50x2*)VpdUR5ElPEs7&atF^}+pzC$AP2-|?|uA}B*L{+Pc5^7 z0vAi%=Hn_ZghJ=!<9fG;Q#2Y18rTQRymwBqqFq|N^TGxzu0{5fuCJ4*pgNF|taQzb z!V%;Rxhl^at{ImY@zhvSdrah(EskS6CLDZ2(VnBzgyZdtmecmSCL@~!jry}w{pS^v zq8aY#9!+L77H#q^BrZpuRMb*PN>9%58!%u*lut^KD!oarX~cDq5~mxOtj;XeCrl2wPEbpO74o&n2^CHzecKGo`!9S%n zoe}26A2zw$G09pDBNZ3#-&C1MuljM;e`IB;0azwFTW)1{GGC6$k0`s?2KoHFd&={L zYnPW4^EBHy5eGbEU1!{3`8&;c&s%6{`iJO@Rb~b796EkVLM$j4s9R6p3u^ADhMBst zEo*PbJ81De#z3$U?^EGe?kO6#kSFW@@OZUnyTlsF^W2v}n`kL)!uF?~KN9TI^6yQw zBiU0ODs?DQ;ADONUZ3<|$M^hQ3>dq0Se;#r)D)FTWojVXWoQo{c*h=^JFC14l=Voz zdkX4=a&v*WywcO(vkiHAUA6NA+}!(1??mDX=BpJr>%L`;vt^~rb5SY9gEeSo4j(ge z8yDBLcX9-7!=EGWW~8s{lhgiux-TZhJsBq(V^gbMKQKdT;14BrMj(5%6K6Urw~Caj z>@r(ircD?gC*IaD&e1k@(6oHO0kMG8Y3OL=NE>c%PC(Sqb8tP{vd!0rFv{>iHadYf z-+LA4E^{3l-Db!Zdwl#zW{J?=lyZA@^XiXhd;+^W^G62^I)=~4tm}VHu6As9%Gl3Y z=*sspr-fKJ;E{FKmhuhnnTC;0xARS}8ziQYe5w>YrR{7B(AE?FIm{LDe(h6@#+G;F zsV2UQjjTQhOIy1G8tfke3Y$@Tm0rr!~&1Oi^I1f4ndQw!&6-zzONt=^^Tqb_b71IWD(0oeDB@kV%Zq*(N!fp+9Wn9@olchi*t&3 znEj=>E|V?)^pLW!hdWyq3p+oRh2E}Ca7G!nYw~y`C)n!TXl-1W@FozjtC{0@LrChy z2UDHlIRaugrViTfu0 zlB)BlntJ&paw(CkA7qVTl(|&TDRuz;Jt$Ml~T|RQP z%g9_TS(jh!NmZvSvUSqs#%ElxV1DrZCSA+6wFx;qjAi(uUx0zGNLAvK&ATOH3)+u* z-Z_8VMwH+abi3MT65sF8YDsmlZAin-Gh$&xn!QS@^dBMt}i9o4&;bHDY@`&8y;w}JCQ0I*?;7e zLu`Y*!S5I6q_ENjqslB|@2r@T>{mNR(jIGgzQMTYx;=1vrL8sKB$}i+v2w=7XZK({51S}*-^-JX zZ9Tpeal%Uh9f3=;TTo}c+F!itRJu8AVb>zQVCZH}QZ|!XL%Idji7o;O3EamNkb)M^ zS?d_i`g1e^~Aoy_sgHu*D_hyzJbJHi`RH_tDzB0m;B@h_me>&;TTMg z60$YVY#K0{B4Wk;kaX7rSIcO6;Ixt#s~gzuiOtg_olTynos($=v(wF}iX_fK`A4%# zI^aSaKb#q6^6gsyV>Y5;;Vz~U<=6P%sXEz|tn!Xdb`DNT@f+FLzmYbRDyAzj5?Lw|zJ992Ol{>)S zFdGEBj3T+DHtmU#)SE3(i}u>n=&(2UE4ou#CQ3tb2H=XwBU?%+AU@W!y1NrGL2dBc z#$I&IMP_ju5avZE2U(UX_SNVFD?*`H(7+ zW=z?`DQvK+4d1BcAAXO_&gHbJyE0a&U3PJlXVF9w6CACrn;P+Djy{4{F@42+A2`D! 
z(~#=F75qnoQw+5ymNMOKbv*;ZaQ=`DoELT!IcZZoifK3^8J>QOl!Gzw0Ep*8Q*CaT}&K!W~9nkiq)OP;q=2NxMSulJ@T@J$K+oZ-_j;6Mmy(H zDuJj2#6K{PsASU4HR!2A%v7pX`on48zfTYYUVCS`h?YmZib}rwTMWuh5>YCA=(%EUDlB&e>I{BM4@GSB5pL;7SQJ@4 zmjSo;R%@y@u$A5dyXC{}JCi*OH;3f>O%YS}jy)TgUsew2XT6d#r;MhAd$CewCoZt-r}Sjh4FRY=<20v!gvZopVg3 zARHNy6sI~_-EppfDX%58SB} z*(R>}1eA`rhTb6g^m_YJ9TAr-xFF4^TrE%Ti9J@N!?{dQ`qs?<5L zgJsJi1)Ib3cb(Z}5FwN1Px4mmC3c7NV!p3c@oRH-KU#}xZS1%_5q>`|yMJ06`A zPL^8vP3PkK`+lAT>nic4LgREYkIU}}HxsDl5(q?UzDr2YT@hVOx|g`5<5v7bFQB=t zbo)DTYmFk8G3#o;#*e!%;;;JvuK^}SA-dF0g%t7tNItjP8UHZYyLmy)o zVi|=?>%FN%MKAK@3@)BeceDQzZ1ZC6lw+n93)kx2zCeCc6K-J5eUVBj4W>T|JNF7_ zGR&?$B}GJH@B1b;4-A|FWY|quO48^`euHHiH(VU5b~%f-BHByz4}^0O1`g|7-)W!h zXi44Ws~#3D!c^3~>xW$%NhAqm0sFCg;=98gcN7(?kDKen6KY1|eSR&T z)et*c*T5X$YRCsBagF$^{C@sPb5CH7v~7!EKKX9CNL2dkPUQzxHZ8vIWF7h9;W2gA z(Q%sS{@ZsO-$zD3>acsQ&@rybUXvEo z&ZL-@i*c)PrN1@Kr9A5afAzZ%Iqm0rZrnbxz6$OuPOSpl6~y_->bxw9Ebs6%B1>DR zd6lz%Q4t2S`MrHU9jHVvFT z1>4c=TjeDe^GCpePAoe|2Zg|CIrQOAFHS61>&UszKVd!Lw`y8^5hFHLZ@ecvU7mMA z5j*4@=aaO2o^*l1xmb!lmvWcjtVg z&Fj72k6nC<=|`{)JWGDj)oJgjF#lri(uaPsS2H~_-uH)pvzR>(hOZ{?W3?_ztm&m82mW1++I>KpH{Y-@0jPN zFU{{A-&|2oz36|4AV??RUpc)k%@T#3OQ@LMBQ)o)PsO$7{{r`gE}7{U$z}~7!Gx3Y zVs#hPdGg%G##cOZT8KBXGYgG-`#u#qm4i;!I1xLLEfxnNT@c+hnsyD@ee^rzhp36WbHoCP2w`YEsu!@Kg{iLMb86Nu1*F0xPZJ$poD!$To&zMR(*#s?1+{XtF zva6gjVHF+04Lo3o0yqUKkLh7iX?ib?MnlU5tSixCRhjep`o_R+B;^>m&tq=D_rp?6 zdc^L{4yh90MCgpiN1-4i9f@RZjx5slua{-%W!Y(HD+^Ap*1-Y^#m4YNsIaSf467eKL!#92rEJW6L)_tRpJWs7&fHC$p0O?n=Ilo|_AKLb#IOXcW?PKH2yVzHq}+-k zWe^7ESqRk#d)J)+kn&^Pb-{rqEc+L?`Hjf0SGhKew|-gYTxDj<-afUYTFh*IMYI8T zE^~@L21tpWc>Jp-6#_ z=TZYd>B~0x<`;68@=H6+^qlS1OVlG+0_o;o42v1v%=^gtTR)_1ZV{JzUUyhcu~hKR z(`gBCKwqimy?)-h1=o*v6V(ZuzktAE#U45I0~a&6o((baQ(afrQdl1Ng|F(p*6PNr z&A=ERcK^$2Q(}S4Gr4U3p6!I~Vx(`^%Ce3`8?5wR?i;vN|24lXy$KiKv$GZFbft1e zK15OR$^|}jb+rQmoah=aqF{63thGvMLi(A$m$t~npQtn^p`Ev8pBozxYW4Ua6ArJ4 zW|oM3%r%Y$X@1Oa!BbT_W!vPq59#%RZg@)Ri)96b;ru48v&n& z(e;)CcOymUUdRoBlr`9^z{dNcdn-Q-S@o#;VxK#uQ{r<2A+LQqQePDZzfK1KuSA;R zEgLnT?W&Coc~WC*Pn<&FQ(Fu}&b@Gf>~y z#qFI*a>Xo&upWz)+I05q(MpUrIVMB5d3~ib%ja$GcI;tjwUaO70I4coLbJx;0u94EIgwg-i z22u6e8#tX5q;|jGo7fNA*qHeDqd!W|>ha)qFdX-@aC(LHZ2*n8M<}Wlp{V3|=%}YN!Jn*&Xa*_>FpA z4K}Xr$rv_E`QRcJA-g6YU9ff16B;Veb8%SEUt$$ESZ4{(<{DP=R_o^$whw3V!0Of7V<4XoeoJ|8rB34m=gh)YZ)Y7lz-JHOXSPOH zDIEOG8So2T`K6kUmH{7kfZuE)!U?nqfAJ-*qVAroz~Kbn=LEm5!{EN%MKrn$`rG!U zTxPec)akzi6-Vc&oq5>&^Dn@;k1T%}{$1eze0K2lXc1tXhA*RnpWQ{>*fu@O>xcd# zOE%f>z>8H^e$;ZiG<^iuPM+)NofN2RH zQK^h*)|YUA9%Sj>;POASTOuF{uJ(lHUF6opHwSOs>y0P@IzL|lN27I{z z?nHA%8A*d+xp06hmKfmK+ue`B4AFz@(PYF^E&ozI!8d}G=Bb0PD1x7_b-)qwwIcXA zJm>Bn@#QJr5cp;e_^tkIYksB0N2T(2eC7*Cw@x3^?0E8rYXHEWo2%ik!uMI18yTqV z{(jX;pt;L;u6H3dVzZx|un)!&=DRKhvQ@|vp1$DP13t9?5AURYT?+ot1pI8W1R(R5 zlh`(%Nxyx7xKW{=VXj?~@rf`hwu23ain{7s&iH+wX*olEa*wD=0e}AibXK}M$wW$4 zOiCQ@sB!oZ-+jOinwbfsdmEj}jf*~o;qf01sMlr6`c-k$Rq}~{_vU;L*PEdtIMW<_ z3+I7)y5H99lZw`u)5#qT6|2p&sJ=O4_T-!R3Y8lU8{oqqpjC)kV0llO)<$O{$vZv@>^p|xN)m0S zKlnTlRFg238ZO4iy|?igQd&HKc3e*n_o<9~RS-3qbviAoD(o8DJx)sjPgmp392lPNmdL{@R-6C6MGM3Fk6{yV6^et4|gbe6K<&r$7QL zAo?@#HJ3lP$g}>KWD6(lD=*Nb_>za_4@ty8MUiBT9oiiA1;eZ#%<&J5t5tw2doqlg zC9Mc#lZ1uO4b z6#bc?8ta1vr5IX{gQU)l^9PM*>bWQ(X*d7}K1CFsqqeX}gAyWd^^|KMSr-2A4%qAe zkLPm=a+V-f((u(mr*}DO3JFJ`QObkyAzE$9nx8Oa`x%wqBB5t#p7Uq!P?h*@7oVne zDiwSdn0_>(K8=+^>1luYHV$MXBgh<~Ia~}9ov{bg3C}x1f7v!q-?K=%_xp^)gI8$&{{{bi_h64rG)ux#Xfr;BL6o0Ycka z7abvw1A1TYsI!iQmf#CdfPfXo^pM8^y{^N=$rDne%`?he7t{)&bBj zw=q!H0T|~m0R<9{t%3{q3qvDA>`)h3+Mp==iYcJP1DcyY1@{)p51^>A0!(voP{u6C z4HIZkun!S@!{G^r4H8C?M+#pA@YBJU)S()S`EGDQ=Ag-f^naT;sK{CQ*#4eKS*i@L#h=W!$QL3bS{XI!Bk>9NXvIO3$v`P6ge>W5Ze)PT 
z>&Pn<{x$aL2t@Z+0{-(?exCjJJA(gyHt^pM4IZojg8uz<-i?D(0D49KY#jK`)Bg_e e{|g+9o%TKYs>1JKXFcF9==M$Z8+q4FpZ^~Ul_l8# literal 0 HcmV?d00001 diff --git a/images/blog/thumbs/240902.svg b/images/blog/thumbs/240902.svg new file mode 100644 index 00000000000..326290edbbe --- /dev/null +++ b/images/blog/thumbs/240902.svg @@ -0,0 +1,16 @@ + + + + + + + + + + + + + + + + From c898ca318b57723faab57be20062ecf062a106f2 Mon Sep 17 00:00:00 2001 From: Sam Ansmink Date: Wed, 28 Aug 2024 16:15:41 +0200 Subject: [PATCH 033/187] add explanation of extension versioning scheme --- docs/extensions/versioning_of_extensions.md | 107 +++++++++++++++----- docs/extensions/working_with_extensions.md | 9 ++ 2 files changed, 92 insertions(+), 24 deletions(-) diff --git a/docs/extensions/versioning_of_extensions.md b/docs/extensions/versioning_of_extensions.md index c76239bf891..739a703e9ff 100644 --- a/docs/extensions/versioning_of_extensions.md +++ b/docs/extensions/versioning_of_extensions.md @@ -4,40 +4,99 @@ title: Versioning of Extensions --- ## Extension Versioning +Most software has some sort of version number. Version numbers serve a few important goals: +- Tie a binary to a specific state of the source code +- Allow determining the expected feature set +- Allow determining the state of the APIs +- Allow efficient processing of bug reports (e.g. bug `#1337` was introduced in version `v3.4.5` ) +- Allow determining chronological order of releases (e.g. version `v1.2.3` is older than `v1.2.4`) +- Give an indication of expected stability (e.g. `v0.0.1` is likely not very stable, whereas `v13.11.0` probably is stable) -Just like DuckDB itself, DuckDB extensions have a version. This version can be used by users to determine which features are available -in the extension they have installed, and by developers to understand bug reports. DuckDB extensions can be versioned in different ways: +Just like [DuckDB itself](https://github.com/duckdb/duckdb/releases), DuckDB extensions have their own version number. To ensure consistent semantics +of these version numbers across the various extensions, DuckDB's [Core Extensions]({% link docs/extensions/core_extensions.md %}) use +a versioning scheme that prescribes how extensions should be versioned. The versioning scheme for Core Extensions is made up of 3 different stability levels: **unstable**, **pre-release**, and **stable**. +Let's go over each of the 3 levels and describe their format: -**Extensions whose source lives in DuckDB's main repository** (in-tree extensions) are tagged with the short git hash of the repository. -For example, the parquet extension is built into DuckDB version `v0.10.3` (which has commit `70fd6a8a24`): +### Unstable Extensions +Unstable extensions are extensions that can't (or don't want to) give any guarantees regarding their current stability, +or their goals of becoming stable. Unstable extensions are tagged with the **short git hash** of the extension. -```sql -SELECT extension_name, extension_version, install_mode -FROM duckdb_extensions() -WHERE extension_name='parquet'; -``` +For example, at the time of writing this, the version of the `vss` extension is an unstable extension of version `690bfc5`. -
+What to expect from an extension that has a version number in the **unstable** format? +- The state of the source code of the extension can be found by looking up the hash in the extension repository +- Functionality may change or be removed completely with every release +- This extension's API could change with every release +- This extension may not follow a structured release cycle, new (breaking) versions can be pushed at any time -| extension_name | extension_version | install_mode | -|:------------------|:------------------|:---------------------| -| parquet | 70fd6a8a24 | STATICALLY_LINKED | +### Pre-release extensions +Pre-release extensions are the next step up from Unstable extensions. They are tagged with version in the **[SemVer](https://semver.org/)** format, more specifically, those in the `v0.y.z` format. +In semantic versioning, versions starting with `v0` have a special meaning: they indicate that the more strict semantics of regular (`>v1.0.0`) versions do not yet apply. It basically means that an extensions is working +towards becoming a stable extension, but is not quite there yet. -**Extensions whose source lives in a separate repository** (out-of-tree extensions) have their own version. This version is **either** -the short git hash of the separate repository, **or** the git version tag in [Semantic Versioning](https://semver.org/) format. -For example, in DuckDB version `v0.10.3`, the azure extension could be versioned as follows: +For example, at the time of writing this, the version of the `delta` extension is a pre-release extension of version `v0.1.0`. -```sql -SELECT extension_name, extension_version, install_mode -FROM duckdb_extensions() -WHERE extension_name = 'azure'; +What to expect from an extension that has a version number in the **pre-release** format? +- The extension is compiled from the source code corresponding to the tag. +- Semantic Versioning semantics apply. See the [Semantic Versioning](https://semver.org/) specification for details. +- The extension follows a release cycle where new features are tested in nightly builds before being grouped into a release and pushed to the `core` repository. +- Release notes describing what has been added each release should be available to make it easy to understand the difference between versions. + +### Stable extensions +Stable extensions are the final step of extension stability. This is denoted by using a **stable SemVer** of format `vx.y.z` where `x>0`. + +For example, at the time of writing this, the version of the `parquet` extension is a stable extension of version `v1.0.0`. + +What to expect from an extension that has a version number in the **stable** format? Essentially the same as pre-release extensions, but now the more +strict SemVer semantics apply: the API of the extension should now be stable and will only change in backwards incompatible ways when the major version is bumped. +See the SemVer specification for details + +## Release cycle of pre-release and stable core extensions +In general for extensions the release cycle depends on their stability level. **unstable** extensions are often in +sync with DuckDB's release cycle, but may also be quietly updated between DuckDB releases. **pre-release** and **stable** +extensions follow their own release cycle. These may or may not coincide with DuckDB releases. To find out more about the release cycle of a specific +extension, refer to the documentation or GitHub page of the respective extension. 
Generally, **pre-release** and **stable** extensions will document +their releases as GitHub releases, an example of which you can see in the [delta extension](https://github.com/duckdb/duckdb_delta/releases). + +Finally, there is a small exception: All [in-tree]({% link docs/extensions/working_with_extensions.md %}#in-tree-vs-out-of-tree) extensions simply +follow DuckDB's release cycle. + +## Nightly builds +Just like DuckDB itself, DuckDB's core extensions have nightly or dev builds that can be used to try out features before they are officially released. This +can be useful when your workflow depends on a new feature, or when you need to confirm that your stack is compatible with the upcoming version. + +Nightly builds for extensions are slightly complicated due to the fact that currently DuckDB extensions binaries are tightly bound to a single DuckDB version. Because of this tight connection, +there is a potential risk for a combinatory explosion. Therefore, not all combinations of nightly extension build and nightly DuckDB build are available. + +In general, there are 2 ways of using nightly builds: using a nightly DuckDB build and using a stable DuckDB build. Let's go over the differences between the two: + +### From stable DuckDB +In most cases, user's will be interested in a nightly build of a specific extension, but don't necessarily want to switch to using the nightly build of DuckDB itself. This allows using a specific bleeding-edge +feature while limiting the exposure to unstable code. + +To achieve this, Core Extensions tend to regularly push builds to the [`core_nightly` repository]({% link docs/extensions/working_with_extensions.md %}#extension-repositories). Let's look at an example: + +First we install a **stable DuckDB build** from [here]({% link docs/installation/index.html %}). + +Then we can install and load a **nightly** extension like this: +```shell +INSTALL aws FROM core_nightly; +LOAD aws; ``` -
+In this example we are using the latest **nightly** build of the aws extension with the latest **stable** version of DuckDB. + +### From nightly DuckDB +When DuckDB CI produces a nightly binary of DuckDB itself, the binaries are distributed with a set of extensions that are pinned at a specific version. This extension version will be tested for that +specific build of DuckDB, but might not be the latest dev build. Let's look at an example: -| extension_name | extension_version | install_mode | -|:---------------|:------------------|:---------------| -| azure | 49b63dc | REPOSITORY | +First we install a **nightly DuckDB build** from [here]({% link docs/installation/index.html %}). + +Then we can install and load the aws extension as expected: +```SQL +INSTALL aws; +LOAD aws; +``` ## Updating Extensions diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index 4ff780e40e2..57c1bab0267 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -227,3 +227,12 @@ For building and installing extensions from source, see the [building guide]({% ## Statically Linking Extensions To statically link extensions, follow the [developer documentation's "Using extension config files" section](https://github.com/duckdb/duckdb/blob/main/extension/README.md#using-extension-config-files). + +## In-tree vs Out-of-tree +Originally, DuckDB extensions lived exclusively in the DuckDB main repository, `github.com/duckdb/duckdb`. These extensions are called in-tree. Later, the concept +of out-of-tree extensions was added, where extensions where separated into their own repository, which we call out-of-tree. + +While from a user's perspective, there are generally no noticeable differences, there are some minor differences related to versioning: +- in-tree extensions use the version of DuckDB instead of having their own version +- in-tree extensions do not have dedicated release notes, their changes are reflected in the regular [DuckDB release notes](https://github.com/duckdb/duckdb/releases) +- core out-of tree extensions tend to live in a repository in `github.com/duckdb/duckdb_` but the name may vary. See the [full list]({% link docs/extensions/core_extensions.md %}) of core extensions for details. \ No newline at end of file From 673980f375a086efbdf857b70bd6f4cdb9cde0c1 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 29 Aug 2024 15:32:57 +0200 Subject: [PATCH 034/187] Delete images/240902.svg --- images/240902.svg | 16 ---------------- 1 file changed, 16 deletions(-) delete mode 100644 images/240902.svg diff --git a/images/240902.svg b/images/240902.svg deleted file mode 100644 index 326290edbbe..00000000000 --- a/images/240902.svg +++ /dev/null @@ -1,16 +0,0 @@ - - - - - - - - - - - - - - - - From abe4f5d0a887b5fbef51445279a9c877884d4ecb Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 29 Aug 2024 15:44:33 +0200 Subject: [PATCH 035/187] Adjust text --- docs/extensions/versioning_of_extensions.md | 70 ++++++++++++--------- docs/extensions/working_with_extensions.md | 10 +-- 2 files changed, 46 insertions(+), 34 deletions(-) diff --git a/docs/extensions/versioning_of_extensions.md b/docs/extensions/versioning_of_extensions.md index 739a703e9ff..a83b9bdb6a2 100644 --- a/docs/extensions/versioning_of_extensions.md +++ b/docs/extensions/versioning_of_extensions.md @@ -4,32 +4,37 @@ title: Versioning of Extensions --- ## Extension Versioning + Most software has some sort of version number. 
Version numbers serve a few important goals: -- Tie a binary to a specific state of the source code -- Allow determining the expected feature set -- Allow determining the state of the APIs -- Allow efficient processing of bug reports (e.g. bug `#1337` was introduced in version `v3.4.5` ) -- Allow determining chronological order of releases (e.g. version `v1.2.3` is older than `v1.2.4`) -- Give an indication of expected stability (e.g. `v0.0.1` is likely not very stable, whereas `v13.11.0` probably is stable) - -Just like [DuckDB itself](https://github.com/duckdb/duckdb/releases), DuckDB extensions have their own version number. To ensure consistent semantics -of these version numbers across the various extensions, DuckDB's [Core Extensions]({% link docs/extensions/core_extensions.md %}) use + +* Tie a binary to a specific state of the source code +* Allow determining the expected feature set +* Allow determining the state of the APIs +* Allow efficient processing of bug reports (e.g., bug `#1337` was introduced in version `v3.4.5` ) +* Allow determining chronological order of releases (e.g., version `v1.2.3` is older than `v1.2.4`) +* Give an indication of expected stability (e.g., `v0.0.1` is likely not very stable, whereas `v13.11.0` probably is stable) + +Just like [DuckDB itself]({% link docs/dev/release_calendar.md %}), DuckDB extensions have their own version number. To ensure consistent semantics +of these version numbers across the various extensions, DuckDB's [Core Extensions]({% link docs/extensions/core_extensions.md %}) use a versioning scheme that prescribes how extensions should be versioned. The versioning scheme for Core Extensions is made up of 3 different stability levels: **unstable**, **pre-release**, and **stable**. Let's go over each of the 3 levels and describe their format: ### Unstable Extensions + Unstable extensions are extensions that can't (or don't want to) give any guarantees regarding their current stability, or their goals of becoming stable. Unstable extensions are tagged with the **short git hash** of the extension. For example, at the time of writing this, the version of the `vss` extension is an unstable extension of version `690bfc5`. What to expect from an extension that has a version number in the **unstable** format? -- The state of the source code of the extension can be found by looking up the hash in the extension repository -- Functionality may change or be removed completely with every release -- This extension's API could change with every release -- This extension may not follow a structured release cycle, new (breaking) versions can be pushed at any time -### Pre-release extensions +* The state of the source code of the extension can be found by looking up the hash in the extension repository +* Functionality may change or be removed completely with every release +* This extension's API could change with every release +* This extension may not follow a structured release cycle, new (breaking) versions can be pushed at any time + +### Pre-Release Extensions + Pre-release extensions are the next step up from Unstable extensions. They are tagged with version in the **[SemVer](https://semver.org/)** format, more specifically, those in the `v0.y.z` format. In semantic versioning, versions starting with `v0` have a special meaning: they indicate that the more strict semantics of regular (`>v1.0.0`) versions do not yet apply. It basically means that an extensions is working towards becoming a stable extension, but is not quite there yet. 
@@ -37,12 +42,14 @@ towards becoming a stable extension, but is not quite there yet. For example, at the time of writing this, the version of the `delta` extension is a pre-release extension of version `v0.1.0`. What to expect from an extension that has a version number in the **pre-release** format? -- The extension is compiled from the source code corresponding to the tag. -- Semantic Versioning semantics apply. See the [Semantic Versioning](https://semver.org/) specification for details. -- The extension follows a release cycle where new features are tested in nightly builds before being grouped into a release and pushed to the `core` repository. -- Release notes describing what has been added each release should be available to make it easy to understand the difference between versions. -### Stable extensions +* The extension is compiled from the source code corresponding to the tag. +* Semantic Versioning semantics apply. See the [Semantic Versioning](https://semver.org/) specification for details. +* The extension follows a release cycle where new features are tested in nightly builds before being grouped into a release and pushed to the `core` repository. +* Release notes describing what has been added each release should be available to make it easy to understand the difference between versions. + +### Stable Extensions + Stable extensions are the final step of extension stability. This is denoted by using a **stable SemVer** of format `vx.y.z` where `x>0`. For example, at the time of writing this, the version of the `parquet` extension is a stable extension of version `v1.0.0`. @@ -51,7 +58,8 @@ What to expect from an extension that has a version number in the **stable** for strict SemVer semantics apply: the API of the extension should now be stable and will only change in backwards incompatible ways when the major version is bumped. See the SemVer specification for details -## Release cycle of pre-release and stable core extensions +## Release Cycle of Pre-Release and Stable Core Extensions + In general for extensions the release cycle depends on their stability level. **unstable** extensions are often in sync with DuckDB's release cycle, but may also be quietly updated between DuckDB releases. **pre-release** and **stable** extensions follow their own release cycle. These may or may not coincide with DuckDB releases. To find out more about the release cycle of a specific @@ -61,7 +69,8 @@ their releases as GitHub releases, an example of which you can see in the [delta Finally, there is a small exception: All [in-tree]({% link docs/extensions/working_with_extensions.md %}#in-tree-vs-out-of-tree) extensions simply follow DuckDB's release cycle. -## Nightly builds +## Nightly Builds + Just like DuckDB itself, DuckDB's core extensions have nightly or dev builds that can be used to try out features before they are officially released. This can be useful when your workflow depends on a new feature, or when you need to confirm that your stack is compatible with the upcoming version. @@ -70,30 +79,31 @@ there is a potential risk for a combinatory explosion. Therefore, not all combin In general, there are 2 ways of using nightly builds: using a nightly DuckDB build and using a stable DuckDB build. Let's go over the differences between the two: -### From stable DuckDB +### From Stable DuckDB + In most cases, user's will be interested in a nightly build of a specific extension, but don't necessarily want to switch to using the nightly build of DuckDB itself. 
This allows using a specific bleeding-edge feature while limiting the exposure to unstable code. To achieve this, Core Extensions tend to regularly push builds to the [`core_nightly` repository]({% link docs/extensions/working_with_extensions.md %}#extension-repositories). Let's look at an example: -First we install a **stable DuckDB build** from [here]({% link docs/installation/index.html %}). +First we install a [**stable DuckDB build**]({% link docs/installation/index.html %}). Then we can install and load a **nightly** extension like this: -```shell + +```bash INSTALL aws FROM core_nightly; LOAD aws; ``` In this example we are using the latest **nightly** build of the aws extension with the latest **stable** version of DuckDB. -### From nightly DuckDB -When DuckDB CI produces a nightly binary of DuckDB itself, the binaries are distributed with a set of extensions that are pinned at a specific version. This extension version will be tested for that -specific build of DuckDB, but might not be the latest dev build. Let's look at an example: +### From Nightly DuckDB + +When DuckDB CI produces a nightly binary of DuckDB itself, the binaries are distributed with a set of extensions that are pinned at a specific version. This extension version will be tested for that specific build of DuckDB, but might not be the latest dev build. Let's look at an example: -First we install a **nightly DuckDB build** from [here]({% link docs/installation/index.html %}). +First, we install a [**nightly DuckDB build**]({% link docs/installation/index.html %}). Then, we can install and load the `aws` extension as expected: -Then we can install and load the aws extension as expected: -```SQL +```sql INSTALL aws; LOAD aws; ``` diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index ff9148ffe04..5ce20b021f3 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -228,11 +228,13 @@ For building and installing extensions from source, see the [building guide]({% To statically link extensions, follow the [developer documentation's “Using extension config files” section](https://github.com/duckdb/duckdb/blob/main/extension/README.md#using-extension-config-files). -## In-tree vs Out-of-tree +## In-Tree vs. Out-of-Tree + Originally, DuckDB extensions lived exclusively in the DuckDB main repository, `github.com/duckdb/duckdb`. These extensions are called in-tree. Later, the concept of out-of-tree extensions was added, where extensions where separated into their own repository, which we call out-of-tree. While from a user's perspective, there are generally no noticeable differences, there are some minor differences related to versioning: -- in-tree extensions use the version of DuckDB instead of having their own version -- in-tree extensions do not have dedicated release notes, their changes are reflected in the regular [DuckDB release notes](https://github.com/duckdb/duckdb/releases) -- core out-of tree extensions tend to live in a repository in `github.com/duckdb/duckdb_` but the name may vary. See the [full list]({% link docs/extensions/core_extensions.md %}) of core extensions for details. 
\ No newline at end of file + +* in-tree extensions use the version of DuckDB instead of having their own version +* in-tree extensions do not have dedicated release notes, their changes are reflected in the regular [DuckDB release notes](https://github.com/duckdb/duckdb/releases) +* core out-of tree extensions tend to live in a repository in `github.com/duckdb/duckdb_⟨ Date: Thu, 29 Aug 2024 19:18:17 +0200 Subject: [PATCH 036/187] Add release icon --- images/release-icons/1.1.0.svg | 6 ++++++ 1 file changed, 6 insertions(+) create mode 100644 images/release-icons/1.1.0.svg diff --git a/images/release-icons/1.1.0.svg b/images/release-icons/1.1.0.svg new file mode 100644 index 00000000000..40897466d7f --- /dev/null +++ b/images/release-icons/1.1.0.svg @@ -0,0 +1,6 @@ + + + + + + From c067178c9571b4d2515dcc494f49939dae52c564 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 30 Aug 2024 07:57:19 +0200 Subject: [PATCH 037/187] Add n-ary lambda example Fixes #3403 --- docs/sql/functions/list.md | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/docs/sql/functions/list.md b/docs/sql/functions/list.md index 43d7c7af25f..caf2676284a 100644 --- a/docs/sql/functions/list.md +++ b/docs/sql/functions/list.md @@ -403,23 +403,40 @@ The following operators are supported for lists: Python-style list comprehension can be used to compute expressions over elements in a list. For example: ```sql -SELECT [lower(x) FOR x IN strings] +SELECT [lower(x) FOR x IN strings] AS strings FROM (VALUES (['Hello', '', 'World'])) t(strings); ``` -```text -[hello, , world] -``` +
+ +| strings | +|------------------| +| [hello, , world] | ```sql -SELECT [upper(x) FOR x IN strings IF len(x) > 0] +SELECT [upper(x) FOR x IN strings IF len(x) > 0] AS strings FROM (VALUES (['Hello', '', 'World'])) t(strings); ``` -```text -[HELLO, WORLD] +
+ +| strings | +|----------------| +| [HELLO, WORLD] | + +List comprehensions can also use the position of the list elements by adding a second variable. +In the following example, we use `x, i`, where `x` is the value and `i` is the position: + +```sql +SELECT [4, 5, 6] as l, [x FOR x, i IN l IF i != 2] filtered; ``` +
+ +| l | filtered | +|-----------|----------| +| [4, 5, 6] | [4, 6] | + ## Range Functions DuckDB offers two range functions, [`range(start, stop, step)`](#range) and [`generate_series(start, stop, step)`](#generate_series), and their variants with default arguments for `stop` and `step`. The two functions' behavior is different regarding their `stop` argument. This is documented below. From 355d2475b799d3e2ba88c96a9ecca3a6f40ba443 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 30 Aug 2024 08:50:41 +0200 Subject: [PATCH 038/187] Document non-integer literal ordering option Fixes #3446 --- docs/configuration/pragmas.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index cb0071bbc77..cd4c7163385 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -190,6 +190,24 @@ SET default_order = 'ASCENDING'; SET default_order = 'DESCENDING'; ``` +## Ordering by Non-Integer Literals + +By default, ordering by non-integer literals is not allowed: + +```sql +SELECT 42 ORDER BY 'hello world'; +``` + +```console +-- Binder Error: ORDER BY non-integer literal has no effect. +``` + +To allow this behavior, use the `order_by_non_integer_literal` option: + +```sql +SET order_by_non_integer_literal = true; +``` + ## Implicit Casting to `VARCHAR` Prior to version 0.10.0, DuckDB would automatically allow any type to be implicitly cast to `VARCHAR` during function binding. As a result it was possible to e.g., compute the substring of an integer without using an explicit cast. For version v0.10.0 and later an explicit cast is needed instead. To revert to the old behaviour that performs implicit casting, set the `old_implicit_casting` variable to `true`: From cb83e6968b6da870d6fcd954b9a1a319a8a2136e Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Fri, 30 Aug 2024 11:22:53 +0200 Subject: [PATCH 039/187] Document json_exists and json_extract Fixes #3408 --- docs/extensions/json.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/extensions/json.md b/docs/extensions/json.md index abfbf8bf383..a5baae80af8 100644 --- a/docs/extensions/json.md +++ b/docs/extensions/json.md @@ -454,8 +454,6 @@ SELECT json_merge_patch('{"duck": 42}', '{"goose": 123}'); {"goose":123,"duck":42} ``` - - ## JSON Extraction Functions There are two extraction functions, which have their respective operators. The operators can only be used if the string is stored as the `JSON` logical type. @@ -463,8 +461,10 @@ These functions supports the same two location notations as the previous functio | Function | Alias | Operator | Description | |:---|:---|:-| -| `json_extract(json, path)` | `json_extract_path` | `->` | Extract `JSON` from `json` at the given `path`. If `path` is a `LIST`, the result will be a `LIST` of `JSON` | -| `json_extract_string(json, path)` | `json_extract_path_text` | `->>` | Extract `VARCHAR` from `json` at the given `path`. If `path` is a `LIST`, the result will be a `LIST` of `VARCHAR` | +| `json_exists(json, path)` | | | Returns `true` if the supplied path exists in the `json`, and `false` otherwise. | +| `json_extract(json, path)` | `json_extract_path` | `->` | Extracts `JSON` from `json` at the given `path`. If `path` is a `LIST`, the result will be a `LIST` of `JSON`. | +| `json_extract_string(json, path)` | `json_extract_path_text` | `->>` | Extracts `VARCHAR` from `json` at the given `path`. If `path` is a `LIST`, the result will be a `LIST` of `VARCHAR`. 
| +| `json_value(json, path)` | | | Extracts `JSON` from `json` at the given `path`. If the `json` at the supplied path is not a scalar value, it will return `NULL`. | Note that the equality comparison operator (`=`) has a higher precedence than the `->` JSON extract operator. Therefore, surround the uses of the `->` operator with parentheses when making equality comparisons. For example: From aa3ac909e577f2f8f8d18cd2543ad095e2807943 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Mon, 2 Sep 2024 13:33:28 +0200 Subject: [PATCH 040/187] Bump release date --- ...uncing-duckdb-110.md => 2024-09-09-announcing-duckdb-110.md} | 2 +- images/blog/thumbs/{240902.svg => 240909.svg} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename _posts/{2024-09-02-announcing-duckdb-110.md => 2024-09-09-announcing-duckdb-110.md} (96%) rename images/blog/thumbs/{240902.svg => 240909.svg} (100%) diff --git a/_posts/2024-09-02-announcing-duckdb-110.md b/_posts/2024-09-09-announcing-duckdb-110.md similarity index 96% rename from _posts/2024-09-02-announcing-duckdb-110.md rename to _posts/2024-09-09-announcing-duckdb-110.md index c9130884b5b..1d7045d07c6 100644 --- a/_posts/2024-09-02-announcing-duckdb-110.md +++ b/_posts/2024-09-09-announcing-duckdb-110.md @@ -2,7 +2,7 @@ layout: post title: "Announcing DuckDB 1.1.0" author: Mark Raasveldt and Hannes Mühleisen -thumb: "/images/blog/thumbs/240902.svg" +thumb: "/images/blog/thumbs/240909.svg" excerpt: "The DuckDB team is happy to announce that today we’re releasing DuckDB version 1.1.0, codename “Eatoni”." --- diff --git a/images/blog/thumbs/240902.svg b/images/blog/thumbs/240909.svg similarity index 100% rename from images/blog/thumbs/240902.svg rename to images/blog/thumbs/240909.svg From 13cc64ab72cc52e4ec1582ed324da727cc13d01a Mon Sep 17 00:00:00 2001 From: Maia Date: Wed, 4 Sep 2024 13:36:03 +0200 Subject: [PATCH 041/187] Update profiling pages to reflect recent changes --- docs/configuration/pragmas.md | 19 +- docs/dev/profiling.md | 473 +++++++++++++++++++--------- docs/guides/meta/explain.md | 141 ++++----- docs/guides/meta/explain_analyze.md | 139 +++----- 4 files changed, 448 insertions(+), 324 deletions(-) diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index cb0071bbc77..127116dca08 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -301,7 +301,7 @@ PRAGMA enable_profile; ##### Profiling Format -The format of the resulting profiling information can be specified as either `json`, `query_tree`, or `query_tree_optimizer`. The default format is `query_tree`, which prints the physical operator tree together with the timings and cardinalities of each operator in the tree to the screen. +The format of the resulting profiling information can be specified as either `json`, `query_tree`, or `query_tree_optimizer`. The default format is `query_tree`, which prints the logical query plan together with the timings and cardinalities of each operator in the tree to the screen. To return the logical query plan as JSON: @@ -356,6 +356,23 @@ The output of this mode shows how long it takes to apply certain optimizers on t SET profiling_mode = 'detailed'; ``` +```sql +SET profiling_mode = 'standard'; +``` + +#### Custom Profiling Metrics +By default, all metrics are enabled, but they can be toggled on or off individually. This `PRAGMA` accepts a JSON object with the metric names as keys and a boolean value to enable or disable the metric. The metrics set by this `PRAGMA` will override the default settings. 
+ +> Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` formats will always a default set of metrics. + +In the following example the `CPU_TIME` metric is disabled, and the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics are enabled. + +```SQL +SET custom_profiling_settings='{"CPU_TIME": "false", "EXTRA_INFO": "true", "OPERATOR_CARDINALITY": "true", "OPERATOR_TIMING": "true"}'; +``` + +The profiling docs contain an overview of the available [metrics]({% link docs/dev/profiling.md %}#metrics). + ## Query Optimization #### Optimizer diff --git a/docs/dev/profiling.md b/docs/dev/profiling.md index 9d8491ad203..5cfebdccaac 100644 --- a/docs/dev/profiling.md +++ b/docs/dev/profiling.md @@ -7,177 +7,360 @@ redirect_from: Profiling is important to help understand why certain queries exhibit specific performance characteristics. DuckDB contains several built-in features to enable query profiling that will be explained on this page. -For the examples on this page we will use the following example data set: +## `EXPLAIN` Statement + +The first step to profiling a database engine is figuring out what execution plan the engine is using. The [`EXPLAIN`]({% link docs/guides/meta/explain.md %}) statement allows you to peek into the query plan and see what is going on under the hood. + +## `EXPLAIN ANALYZE` Statement + +The query plan helps understand the performance characteristics of the system. However, often it is also necessary to look at the performance numbers of individual operators and the cardinalities that pass through them. This is where the [`EXPLAIN_ANALYZE`]({% link docs/guides/meta/explain_analyze.md %}) statement comes in, which pretty-prints the query plan and also executes the query, providing the actual run-time performance numbers. + +## Pragmas +DuckDB supports several pragmas that can be used to enable and disable profiling, as well as to control the level of detail in the profiling output. +> Tip In the following examples, `PRAGMA` can be used interchangeably with `SET`. They can also be reset using `RESET`, followed by the setting name. + +The following pragmas are available: + +### Enable Profiling +```SQL +PRAGMA enable_profiling; +``` +or ```sql -CREATE TABLE students (sid INTEGER PRIMARY KEY, name VARCHAR); -CREATE TABLE exams (cid INTEGER, sid INTEGER, grade INTEGER, PRIMARY KEY (cid, sid)); +PRAGMA enable_profile; +``` -INSERT INTO students VALUES (1, 'Mark'), (2, 'Hannes'), (3, 'Pedro'); -INSERT INTO exams VALUES (1, 1, 8), (1, 2, 8), (1, 3, 7), (2, 1, 9), (2, 2, 10); +### Profiling Format +The profiling can be output in several formats. When not specified, the default is `query_tree`, which prints the logical query plan with the timings and cardinalities of each operator in the tree to the screen. + +```SQL +PRAGMA enable_profiling = 'json'; ``` -## `EXPLAIN` Statement +```sql +# prints the physical operator tree +PRAGMA enable_profiling = 'query_tree_optimizer'; +``` + +```sql +PRAGMA enable_profiling = 'query_tree'; +``` + +### Disable Profiling +```SQL +PRAGMA disable_profiling; +``` +or +```sql +PRAGMA disable_profile; +``` + +### Profiling Mode +The default profiling mode is `standard`, but can also be set to `detailed` which enables additional metrics that show the time taken by each optimizer, the planner, and the physical planner. 
+ +```SQL +PRAGMA profiling_mode = 'detailed'; +``` + +```sql +PRAGMA profiling_mode = 'standard'; +``` + +### Profiling Output +By default, the profiling output is printed to the console, but can be directed to a file using the following pragma: + +```SQL +PRAGMA profiling_output = 'filename'; +``` + +> Warning The file contents will be overwritten for every new query that is issued, hence the file will only contain the profiling information of the last query that is run. + +### Custom Profiling Metrics +By default, all metrics are enabled, but they can be toggled on or off individually. This `PRAGMA` accepts a JSON object with the metric names as keys and a boolean value to enable or disable the metric. The metrics set by this `PRAGMA` will override the default settings. + +> Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` formats will always a default set of metrics. -The first step to profiling a database engine is figuring out what execution plan the engine is using. The `EXPLAIN` statement allows you to peek into the query plan and see what is going on under the hood. +In the following example the `CPU_TIME` metric is disabled, and the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics are enabled. -The `EXPLAIN` statement displays the physical plan, i.e., the query plan that will get executed. +```SQL +PRAGMA custom_profiling_settings='{"CPU_TIME": "false", "EXTRA_INFO": "true", "OPERATOR_CARDINALITY": "true", "OPERATOR_TIMING": "true"}'; +``` + +For an overview of the available metrics, see the [metrics](#metrics) section below. + +## Metrics + +There are two types of nodes in the query tree; the `QUERY_ROOT`, and `OPERATOR` nodes. The `QUERY_ROOT` refers exclusively to the top level node and the metrics it contains are measured over the entire query. The `OPERATOR` nodes refer to the individual operators in the query plan. Some metrics are only available for `QUERY_ROOT` nodes, while others are only available for `OPERATOR` nodes. The table below describes each metric, as well as which nodes they are available for. + +Other than `QUERY_NAME` and `OPERATOR_TYPE`, all metrics can be turned on or off. + +| Metric | Return Type | Query | Operator | Description | +|-------------------------|-------------|:-----:|:--------:|------------------------------------------------------------------------------------| +| `BLOCKED_THREAD_TIME` | `double` | ✅ | | The total time threads are blocked | +| `EXTRA_INFO` | `string` | ✅ | ✅ | Each operator also has unique metrics, and can be accessed here. | +| `OPERATOR_CARDINALITY` | `uint64` | ✅ | ✅ ️ | The cardinality of each operator, ie. the number of rows it returns to its parent. | +| `OPERATOR_ROWS_SCANNED` | `uint64` | ✅ | ✅ | The total rows scanned by each operator | +| `OPERATOR_TIMING` | `uint64` | ✅ | ✅ | The time taken by the operator | +| `OPERATOR_TYPE` | `string` | | ✅ | The name of the operator | +| `QUERY_NAME` | `string` | ✅ | | The input query | +| `RESULT_SET_SIZE` | `uint64` | ✅ | ✅ | The size of the result in bytes | + +### Cumulative Metrics + +DuckDB also supports several cumulative metrics, which are available in all nodes. In the `QUERY_ROOT` node, these metrics are the sum of the specific metric in all the operators in the query. In the `OPERATOR` nodes, these metrics are the sum of the operator's specific metric as well as those of all its children recursively. + +These metrics can be used without turning on the specific metric. 
+ +| Metric | Metric Calculated Cumulatively | +|---------------------------|--------------------------------| +| `CPU_TIME` | `OPERATOR_TIMING` | +| `CUMULATIVE_CARDINALITY` | `OPERATOR_CARDINALITY` | +| `CUMULATIVE_ROWS_SCANNED` | `OPERATOR_ROWS_SCANNED` | + +## Detailed Profiling + +As explained above, when the `profiling_mode` is set to `detailed`, an extra set of timing metrics are enabled, which are only available at the `QUERY_ROOT` level. These include [`OPTIMIZERS`](#optimizers), [`PLANNER`](#planner), and [`PHYSICAL_PLANNER`](#physical-planner) metrics, which are all measured in milliseconds, and returned as a `double`. + +These metrics are automatically enabled, and added to the enabled settings, when the `profiling_mode` is set to `detailed`, however, they can also be toggled individually. + +### Optimizers +At the `QUERY_ROOT` level, there are also metrics that measure the time taken by each [optimizer]({% link docs/internal/overview.md %}#optimizer). These metrics are only available when the specific optimizer is enabled. The available optimizations can be queried using the [`duckdb_optimizers() table function`]({% link docs/sql/meta/duckdb_table_functions.md %}#duckdb_optimizers). + +Each optimizer has a corresponding metric that follows the template: `OPTIMIZER_`. For example, the `OPTIMIZER_JOIN_ORDER` metric corresponds to the `JOIN_ORDER` optimizer. + +Additionally, the following metrics are available to support the optimizer metrics: +- `ALL_OPTIMIZERS` - Turns on all optimizer metrics, and measures the time taken by the optimizer parent node +- `CUMMULATIVE_OPTIMIZER_TIMING` - The cumulative sum of all optimizer metrics, can be used without turning on all optimizer metrics. + +### Planner +The `PLANNER` is responsible for generating the logical plan. Currently, two metrics are measured in the `PLANNER`: +- `PLANNER` - The time taken to generate the logical plan +- `PLANNER_BINDING` - The time taken to bind the logical plan -To demonstrate, see the below example: +### Physical Planner +The `PHYSICAL_PLANNER` is responsible for generating the physical plan. The following are the metrics supported in the `PHYSICAL_PLANNER`: +- `PHYSICAL_PLANNER` - The time taken to generate the physical plan +- `PHYSICAL_PLANNER_COLUMN_BINDING` - The time taken to bind the columns in the physical plan +- `physical_planner_resolve_types` - The time taken to resolve the types in the physical plan +- `physical_planner_create_plan` - The time taken to create the physical plan + +## Setting Custom Metrics Examples +Using the dataset from the previous example, we can demonstrate how to enable profiling and set the output format to `json`. + +The first example shows how to enable profiling, set the output to a file, and only enable the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics. 
```sql CREATE TABLE students (name VARCHAR, sid INTEGER); CREATE TABLE exams (eid INTEGER, subject VARCHAR, sid INTEGER); INSERT INTO students VALUES ('Mark', 1), ('Joe', 2), ('Matthew', 3); INSERT INTO exams VALUES (10, 'Physics', 1), (20, 'Chemistry', 2), (30, 'Literature', 3); -EXPLAIN SELECT name FROM students JOIN exams USING (sid) WHERE name LIKE 'Ma%'; -``` - -```text -┌─────────────────────────────┐ -│┌───────────────────────────┐│ -││ Physical Plan ││ -│└───────────────────────────┘│ -└─────────────────────────────┘ -┌───────────────────────────┐ -│ PROJECTION │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ name │ -└─────────────┬─────────────┘ -┌─────────────┴─────────────┐ -│ HASH_JOIN │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ INNER │ -│ sid = sid ├──────────────┐ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ -│ EC: 1 │ │ -└─────────────┬─────────────┘ │ -┌─────────────┴─────────────┐┌─────────────┴─────────────┐ -│ SEQ_SCAN ││ FILTER │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ exams ││ prefix(name, 'Ma') │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ sid ││ EC: 1 │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ │ -│ EC: 3 ││ │ -└───────────────────────────┘└─────────────┬─────────────┘ - ┌─────────────┴─────────────┐ - │ SEQ_SCAN │ - │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ - │ students │ - │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ - │ sid │ - │ name │ - │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ - │ Filters: name>=Ma AND name│ - │ =Ma AND name│ - │ This file is overwritten with each query that is issued. If you want to store the profile output for later it should be copied to a different file. +The contents of the outputted file: +```json +{ + "operator_timing": 0.000372, + "operator_cardinality": 0, + "extra_info": {}, + "query_name": "SELECT name\nFROM students\nJOIN exams USING (sid)\nWHERE name LIKE 'Ma%';", + "children": [ + { + "operator_timing": 0.000001, + "operator_cardinality": 2, + "operator_type": "PROJECTION", + "extra_info": { + "Projections": "name", + "Estimated Cardinality": "1" + }, + "children": [ + { + "operator_timing": 0.000031, + "operator_cardinality": 2, + "operator_type": "HASH_JOIN", + "extra_info": { + "Join Type": "INNER", + "Conditions": "sid = sid", + "Build Min": "1", + "Build Max": "3", + "Estimated Cardinality": "1" + }, + "children": [ + { + "operator_timing": 0.0000049999999999999996, + "operator_cardinality": 3, + "operator_type": "TABLE_SCAN", + "extra_info": { + "Text": "exams", + "Projections": "sid", + "Estimated Cardinality": "3" + }, + "children": [] + }, + { + "operator_timing": 0.000013000000000000001, + "operator_cardinality": 2, + "operator_type": "FILTER", + "extra_info": { + "Expression": "prefix(name, 'Ma')", + "Estimated Cardinality": "1" + }, + "children": [ + { + "operator_timing": 0.000017, + "operator_cardinality": 2, + "operator_type": "TABLE_SCAN", + "extra_info": { + "Text": "students", + "Projections": [ + "sid", + "name" + ], + "Filters": "name>='Ma' AND name<'Mb' AND name IS NOT NULL", + "Estimated Cardinality": "1" + }, + "children": [] + } + ] + } + ] + } + ] + } + ] +} +``` -Now let us run the query that we inspected before: +The second example adds the detailed metrics to the output. ```sql +PRAGMA profiling_mode = 'detailed'; + SELECT name FROM students JOIN exams USING (sid) WHERE name LIKE 'Ma%'; ``` -After the query is completed, the JSON file containing the profiling output has been written to the specified file. We can then render the query graph using the Python script, provided we have the `duckdb` python module installed. 
This script will generate a HTML file and open it in your web browser. +The contents of the outputted file: +```json +{ + "all_optimizers": 0.00014299999999999998, + "cumulative_optimizer_timing": 0.00014299999999999998, + "planner": 0.000187, + "planner_binding": 0.000185, + "physical_planner": 0.000034, + "physical_planner_column_binding": 0.000002, + "physical_planner_resolve_types": 0.0, + "physical_planner_create_plan": 0.000031, + "optimizer_expression_rewriter": 0.000012, + "optimizer_filter_pullup": 0.000001, + "optimizer_filter_pushdown": 0.000035, + "optimizer_cte_filter_pusher": 0.0, + "optimizer_regex_range": 0.0, + "optimizer_in_clause": 0.000001, + "optimizer_join_order": 0.000061, + "optimizer_unnest_rewriter": 0.0, + "optimizer_unused_columns": 0.000003, + "optimizer_common_subexpressions": 0.000001, + "optimizer_common_aggregate": 0.000001, + "optimizer_build_side_probe_side": 0.000003, + "optimizer_limit_pushdown": 0.000001, + "optimizer_top_n": 0.0, + "optimizer_duplicate_groups": 0.000002, + "optimizer_reorder_filter": 0.000002, + "optimizer_extension": 0.0, + "optimizer_materialized_cte": 0.0, + "optimizer_column_lifetime": 0.000003, + "operator_timing": 0.001189, + "optimizer_join_filter_pushdown": 0.000006, + "optimizer_statistics_propagation": 0.000011, + "operator_cardinality": 0, + "optimizer_compressed_materialization": 0.0, + "optimizer_deliminator": 0.0, + "extra_info": {}, + "query_name": "SELECT name\nFROM students\nJOIN exams USING (sid)\nWHERE name LIKE 'Ma%';", + "children": [ + { + "operator_timing": 0.0, + "operator_cardinality": 2, + "operator_type": "PROJECTION", + "extra_info": { + "Projections": "name", + "Estimated Cardinality": "1" + }, + "children": [ + { + "operator_timing": 0.00010100000000000002, + "operator_cardinality": 2, + "operator_type": "HASH_JOIN", + "extra_info": { + "Join Type": "INNER", + "Conditions": "sid = sid", + "Build Min": "1", + "Build Max": "3", + "Estimated Cardinality": "1" + }, + "children": [ + { + "operator_timing": 0.000035, + "operator_cardinality": 3, + "operator_type": "TABLE_SCAN", + "extra_info": { + "Text": "exams", + "Projections": "sid", + "Estimated Cardinality": "3" + }, + "children": [] + }, + { + "operator_timing": 0.000023, + "operator_cardinality": 2, + "operator_type": "FILTER", + "extra_info": { + "Expression": "prefix(name, 'Ma')", + "Estimated Cardinality": "1" + }, + "children": [ + { + "operator_timing": 0.000065, + "operator_cardinality": 2, + "operator_type": "TABLE_SCAN", + "extra_info": { + "Text": "students", + "Projections": [ + "sid", + "name" + ], + "Filters": "name>='Ma' AND name<'Mb' AND name IS NOT NULL", + "Estimated Cardinality": "1" + }, + "children": [] + } + ] + } + ] + } + ] + } + ] +} +``` + +## Query Graphs + +It is also possible to render the profiling output as a query graph. The query graph is a visual representation of the query plan, showing the operators and their relationships. The query plan must be output in the `json` format and stored in a file. +After a profiling output is written to its designated file it can then be rendered as a query graph using the Python script, provided the `duckdb` python module is installed. This script will generate an HTML file and open it in your web browser. ```bash python -m duckdb.query_graph /path/to/file.json @@ -193,4 +376,4 @@ Join operators in the query plan show the join type used: * Inner joins are denoted as `INNER`. * Left outer joins and right outer joins are denoted as `LEFT` and `RIGHT`, respectively. 
-* Full outer joins are denoted as `FULL`. +* Full outer joins are denoted as `FULL`. \ No newline at end of file diff --git a/docs/guides/meta/explain.md b/docs/guides/meta/explain.md index f76c84e1b65..09fed2809c1 100644 --- a/docs/guides/meta/explain.md +++ b/docs/guides/meta/explain.md @@ -3,40 +3,29 @@ layout: docu title: "EXPLAIN: Inspect Query Plans" --- -In order to view the query plan of a query, prepend `EXPLAIN` to a query. - ```sql EXPLAIN SELECT * FROM tbl; ``` -By default only the final physical plan is shown. In order to see the unoptimized and optimized logical plans, change the `explain_output` setting: +The `EXPLAIN` statement displays the physical plan, i.e., the query plan that will get executed, +and is enabled by prepending the query with `EXPLAIN`. +The physical plan is a tree of operators that are executed in a specific order to produce the result of the query, +and is generated by the query optimizer. +To generate an efficient physical plan, the query optimizer transforms the logical plan into a physical plan. -```sql -SET explain_output = 'all'; -``` - -Below is an example of running `EXPLAIN` on [`Q13`](https://github.com/duckdb/duckdb/blob/main/extension/tpch/dbgen/queries/q13.sql) of the [TPC-H benchmark]({% link docs/extensions/tpch.md %}) on the scale factor 1 data set. +To demonstrate, see the below example: ```sql -EXPLAIN - SELECT - c_count, - count(*) AS custdist - FROM ( - SELECT - c_custkey, - count(o_orderkey) - FROM - customer - LEFT OUTER JOIN orders ON c_custkey = o_custkey - AND o_comment NOT LIKE '%special%requests%' - GROUP BY c_custkey - ) AS c_orders (c_custkey, c_count) - GROUP BY - c_count - ORDER BY - custdist DESC, - c_count DESC; +CREATE TABLE students (name VARCHAR, sid INTEGER); +CREATE TABLE exams (eid INTEGER, subject VARCHAR, sid INTEGER); +INSERT INTO students VALUES ('Mark', 1), ('Joe', 2), ('Matthew', 3); +INSERT INTO exams VALUES (10, 'Physics', 1), (20, 'Chemistry', 2), (30, 'Literature', 3); + +EXPLAIN ANALYZE + SELECT name + FROM students + JOIN exams USING (sid) + WHERE name LIKE 'Ma%'; ``` ```text @@ -46,70 +35,62 @@ EXPLAIN │└───────────────────────────┘│ └─────────────────────────────┘ ┌───────────────────────────┐ -│ ORDER_BY │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ ORDERS: │ -│ count_star() DESC │ -│ c_orders.c_count DESC │ -└─────────────┬─────────────┘ -┌─────────────┴─────────────┐ -│ HASH_GROUP_BY │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ #0 │ -│ count_star() │ -└─────────────┬─────────────┘ -┌─────────────┴─────────────┐ -│ PROJECTION │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ c_count │ -└─────────────┬─────────────┘ -┌─────────────┴─────────────┐ -│ PROJECTION │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ count(o_orderkey) │ -└─────────────┬─────────────┘ -┌─────────────┴─────────────┐ -│ HASH_GROUP_BY │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ #0 │ -│ count(#1) │ -└─────────────┬─────────────┘ -┌─────────────┴─────────────┐ │ PROJECTION │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ c_custkey │ -│ o_orderkey │ +│ name │ └─────────────┬─────────────┘ ┌─────────────┴─────────────┐ │ HASH_JOIN │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ RIGHT │ -│ o_custkey = c_custkey ├──────────────┐ +│ INNER │ +│ sid = sid ├──────────────┐ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ -│ EC: 300000 │ │ +│ EC: 1 │ │ └─────────────┬─────────────┘ │ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ -│ FILTER ││ SEQ_SCAN │ +│ SEQ_SCAN ││ FILTER │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ (o_comment !~~ '%special ││ customer │ -│ %requests%') ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ c_custkey │ -│ EC: 
300000 ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ ││ EC: 150000 │ -└─────────────┬─────────────┘└───────────────────────────┘ -┌─────────────┴─────────────┐ -│ SEQ_SCAN │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ orders │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ o_custkey │ -│ o_comment │ -│ o_orderkey │ -│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ -│ EC: 1500000 │ -└───────────────────────────┘ +│ exams ││ prefix(name, 'Ma') │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ sid ││ EC: 1 │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ │ +│ EC: 3 ││ │ +└───────────────────────────┘└─────────────┬─────────────┘ + ┌─────────────┴─────────────┐ + │ SEQ_SCAN │ + │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ + │ students │ + │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ + │ sid │ + │ name │ + │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ + │ Filters: name>=Ma AND name│ + │ =Ma AND name│ +│ Date: Wed, 4 Sep 2024 13:43:07 +0200 Subject: [PATCH 042/187] md lint --- docs/configuration/pragmas.md | 1 + docs/dev/profiling.md | 17 ++++++++++++++--- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index 127116dca08..16c9df7df0e 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -361,6 +361,7 @@ SET profiling_mode = 'standard'; ``` #### Custom Profiling Metrics + By default, all metrics are enabled, but they can be toggled on or off individually. This `PRAGMA` accepts a JSON object with the metric names as keys and a boolean value to enable or disable the metric. The metrics set by this `PRAGMA` will override the default settings. > Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` formats will always a default set of metrics. diff --git a/docs/dev/profiling.md b/docs/dev/profiling.md index 5cfebdccaac..95a7e57e859 100644 --- a/docs/dev/profiling.md +++ b/docs/dev/profiling.md @@ -16,6 +16,7 @@ The first step to profiling a database engine is figuring out what execution pla The query plan helps understand the performance characteristics of the system. However, often it is also necessary to look at the performance numbers of individual operators and the cardinalities that pass through them. This is where the [`EXPLAIN_ANALYZE`]({% link docs/guides/meta/explain_analyze.md %}) statement comes in, which pretty-prints the query plan and also executes the query, providing the actual run-time performance numbers. ## Pragmas + DuckDB supports several pragmas that can be used to enable and disable profiling, as well as to control the level of detail in the profiling output. > Tip In the following examples, `PRAGMA` can be used interchangeably with `SET`. They can also be reset using `RESET`, followed by the setting name. @@ -23,6 +24,7 @@ DuckDB supports several pragmas that can be used to enable and disable profiling The following pragmas are available: ### Enable Profiling + ```SQL PRAGMA enable_profiling; ``` @@ -32,6 +34,7 @@ PRAGMA enable_profile; ``` ### Profiling Format + The profiling can be output in several formats. When not specified, the default is `query_tree`, which prints the logical query plan with the timings and cardinalities of each operator in the tree to the screen. 
```SQL @@ -48,6 +51,7 @@ PRAGMA enable_profiling = 'query_tree'; ``` ### Disable Profiling + ```SQL PRAGMA disable_profiling; ``` @@ -57,6 +61,7 @@ PRAGMA disable_profile; ``` ### Profiling Mode + The default profiling mode is `standard`, but can also be set to `detailed` which enables additional metrics that show the time taken by each optimizer, the planner, and the physical planner. ```SQL @@ -68,6 +73,7 @@ PRAGMA profiling_mode = 'standard'; ``` ### Profiling Output + By default, the profiling output is printed to the console, but can be directed to a file using the following pragma: ```SQL @@ -77,6 +83,7 @@ PRAGMA profiling_output = 'filename'; > Warning The file contents will be overwritten for every new query that is issued, hence the file will only contain the profiling information of the last query that is run. ### Custom Profiling Metrics + By default, all metrics are enabled, but they can be toggled on or off individually. This `PRAGMA` accepts a JSON object with the metric names as keys and a boolean value to enable or disable the metric. The metrics set by this `PRAGMA` will override the default settings. > Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` formats will always a default set of metrics. @@ -125,6 +132,7 @@ As explained above, when the `profiling_mode` is set to `detailed`, an extra set These metrics are automatically enabled, and added to the enabled settings, when the `profiling_mode` is set to `detailed`, however, they can also be toggled individually. ### Optimizers + At the `QUERY_ROOT` level, there are also metrics that measure the time taken by each [optimizer]({% link docs/internal/overview.md %}#optimizer). These metrics are only available when the specific optimizer is enabled. The available optimizations can be queried using the [`duckdb_optimizers() table function`]({% link docs/sql/meta/duckdb_table_functions.md %}#duckdb_optimizers). Each optimizer has a corresponding metric that follows the template: `OPTIMIZER_`. For example, the `OPTIMIZER_JOIN_ORDER` metric corresponds to the `JOIN_ORDER` optimizer. @@ -134,11 +142,13 @@ Additionally, the following metrics are available to support the optimizer metri - `CUMMULATIVE_OPTIMIZER_TIMING` - The cumulative sum of all optimizer metrics, can be used without turning on all optimizer metrics. ### Planner + The `PLANNER` is responsible for generating the logical plan. Currently, two metrics are measured in the `PLANNER`: - `PLANNER` - The time taken to generate the logical plan - `PLANNER_BINDING` - The time taken to bind the logical plan ### Physical Planner + The `PHYSICAL_PLANNER` is responsible for generating the physical plan. The following are the metrics supported in the `PHYSICAL_PLANNER`: - `PHYSICAL_PLANNER` - The time taken to generate the physical plan - `PHYSICAL_PLANNER_COLUMN_BINDING` - The time taken to bind the columns in the physical plan @@ -146,6 +156,7 @@ The `PHYSICAL_PLANNER` is responsible for generating the physical plan. The foll - `physical_planner_create_plan` - The time taken to create the physical plan ## Setting Custom Metrics Examples + Using the dataset from the previous example, we can demonstrate how to enable profiling and set the output format to `json`. The first example shows how to enable profiling, set the output to a file, and only enable the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics. @@ -374,6 +385,6 @@ while the _build side_ is the right operand. 
Join operators in the query plan show the join type used:
 
-* Inner joins are denoted as `INNER`.
-* Left outer joins and right outer joins are denoted as `LEFT` and `RIGHT`, respectively.
-* Full outer joins are denoted as `FULL`. \ No newline at end of file
+- Inner joins are denoted as `INNER`.
+- Left outer joins and right outer joins are denoted as `LEFT` and `RIGHT`, respectively.
+- Full outer joins are denoted as `FULL`.

From bc9bed826b5694882bf74833e5e4e2eabeb8b371 Mon Sep 17 00:00:00 2001
From: Maia 
Date: Wed, 4 Sep 2024 13:49:46 +0200
Subject: [PATCH 043/187] fix link

---
 docs/dev/profiling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/dev/profiling.md b/docs/dev/profiling.md
index 95a7e57e859..f83a29b421c 100644
--- a/docs/dev/profiling.md
+++ b/docs/dev/profiling.md
@@ -133,7 +133,7 @@ These metrics are automatically enabled, and added to the enabled settings, when
 
 ### Optimizers
 
-At the `QUERY_ROOT` level, there are also metrics that measure the time taken by each [optimizer]({% link docs/internal/overview.md %}#optimizer). These metrics are only available when the specific optimizer is enabled. The available optimizations can be queried using the [`duckdb_optimizers() table function`]({% link docs/sql/meta/duckdb_table_functions.md %}#duckdb_optimizers).
+At the `QUERY_ROOT` level, there are also metrics that measure the time taken by each [optimizer]({% link docs/internals/overview.md %}#optimizer). These metrics are only available when the specific optimizer is enabled. The available optimizations can be queried using the [`duckdb_optimizers() table function`]({% link docs/sql/meta/duckdb_table_functions.md %}#duckdb_optimizers).
 
 Each optimizer has a corresponding metric that follows the template: `OPTIMIZER_⟨OPTIMIZER_NAME⟩`. For example, the `OPTIMIZER_JOIN_ORDER` metric corresponds to the `JOIN_ORDER` optimizer.
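To make the optimizer metrics concrete, the following is a minimal sketch; it assumes the `detailed` profiling mode and JSON output described above, and uses the `duckdb_optimizers()` table function to list the optimizers whose per-optimizer timings can appear:

```sql
-- List the available optimizers; each has a corresponding OPTIMIZER_⟨...⟩ metric.
SELECT * FROM duckdb_optimizers();

-- Enable profiling with JSON output and collect the per-optimizer timings.
PRAGMA enable_profiling = 'json';
PRAGMA profiling_mode = 'detailed';
```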
From d14fc69369ada1add707b34a56ff4440ac88fb47 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 5 Sep 2024 10:16:00 +0200 Subject: [PATCH 044/187] Fix --- docs/extensions/working_with_extensions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index 5ce20b021f3..288c1d3fd62 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -237,4 +237,4 @@ While from a user's perspective, there are generally no noticeable differences, * in-tree extensions use the version of DuckDB instead of having their own version * in-tree extensions do not have dedicated release notes, their changes are reflected in the regular [DuckDB release notes](https://github.com/duckdb/duckdb/releases) -* core out-of tree extensions tend to live in a repository in `github.com/duckdb/duckdb_⟨ Date: Thu, 5 Sep 2024 10:18:57 +0200 Subject: [PATCH 045/187] Extensions: Add Android to list of unsupported platforms --- docs/extensions/working_with_extensions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index 288c1d3fd62..44e55733d54 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -28,7 +28,7 @@ Some extensions are distributed for the following platforms: * `windows_amd64_rtools` * `wasm_eh` and `wasm_mvp` (see [DuckDB-Wasm's extensions]({% link docs/api/wasm/extensions.md %})) -For platforms outside the ones listed above, we do not officially distribute extensions (e.g., `linux_arm64_gcc4`, `windows_amd64_mingw`). +For platforms outside the ones listed above, we do not officially distribute extensions (e.g., `linux_arm64_android`, `linux_arm64_gcc4`, `windows_amd64_mingw`). ### Sharing Extensions between Clients From 6230f0998e85d8cbc19c8dd3f5b8c1967f1e01b6 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 5 Sep 2024 11:21:24 +0200 Subject: [PATCH 046/187] Typo --- docs/extensions/working_with_extensions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index 44e55733d54..aa9d624f37b 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -32,7 +32,7 @@ For platforms outside the ones listed above, we do not officially distribute ext ### Sharing Extensions between Clients -The shared installation location allows extensions to be shared between the client APIs _of the same DuckDB version_, as long as they share the same `platfrom` or ABI. For example, if an extension is installed with version 0.10.0 of the CLI client on macOS, it is available from the Python, R, etc. client libraries provided that they have access to the user's home directory and use DuckDB version 0.10.0. +The shared installation location allows extensions to be shared between the client APIs _of the same DuckDB version_, as long as they share the same `platform` or ABI. For example, if an extension is installed with version 0.10.0 of the CLI client on macOS, it is available from the Python, R, etc. client libraries provided that they have access to the user's home directory and use DuckDB version 0.10.0. 
## Extension Repositories From 091ebce0faded251ca6c4747bc81d0732d9bf8b7 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 5 Sep 2024 11:21:29 +0200 Subject: [PATCH 047/187] Formatting --- docs/extensions/working_with_extensions.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index aa9d624f37b..fd210a1c9d3 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -89,7 +89,7 @@ SET custom_extension_repository = 'http://nightly-extensions.duckdb.org'; While any url or local path can be used as a repository, currently DuckDB contains the following predefined repositories: -
+
| Alias | Url | Description | |:----------------------|:---------------------------------------|:---------------------------------------------------------------------------------------| @@ -109,7 +109,9 @@ INSTALL aws FROM core_nightly; SELECT extension_name, extension_version, installed_from, install_mode FROM duckdb_extensions(); ``` -Would output: +This outputs: + +
| extensions_name | extensions_version | installed_from | install_mode | |:----------------|:-------------------|:---------------|:-------------| From 41cb33a96a98f7143a118c792da96c68cdd0fbd7 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 5 Sep 2024 11:23:54 +0200 Subject: [PATCH 048/187] Remove extra formatting --- docs/extensions/working_with_extensions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index 24157da6849..e8cf7829d45 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -89,7 +89,7 @@ SET custom_extension_repository = 'http://nightly-extensions.duckdb.org'; While any url or local path can be used as a repository, currently DuckDB contains the following predefined repositories: -
+
| Alias | Url | Description | |:----------------------|:---------------------------------------|:---------------------------------------------------------------------------------------| From 79f05f12683245160ee561c6117a835825681a93 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 5 Sep 2024 11:25:19 +0200 Subject: [PATCH 049/187] Formatting --- docs/extensions/working_with_extensions.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index e8cf7829d45..d90ba2e3ecc 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -78,7 +78,7 @@ INSTALL spatial FROM 'http://nightly-extensions.duckdb.org'; To install an extensions from a custom repository unknown to DuckDB: ```sql -INSTALL custom_extension FROM 'https://my-custom-extension-repository'; +INSTALL ⟨custom_extension⟩ FROM 'https://my-custom-extension-repository'; ``` Alternatively, the `custom_extension_repository` setting can be used to change the default repository used by DuckDB: @@ -87,11 +87,11 @@ Alternatively, the `custom_extension_repository` setting can be used to change t SET custom_extension_repository = 'http://nightly-extensions.duckdb.org'; ``` -While any url or local path can be used as a repository, currently DuckDB contains the following predefined repositories: +While any URL or local path can be used as a repository, DuckDB currently contains the following predefined repositories: -
+
-| Alias | Url | Description | +| Alias | URL | Description | |:----------------------|:---------------------------------------|:---------------------------------------------------------------------------------------| | `core` | `http://extensions.duckdb.org` | DuckDB core extensions | | `core_nightly` | `http://nightly-extensions.duckdb.org` | Nightly builds for `core` | From 6d1c1c122eebc21d0b1323614ddbbbd0df352755 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Thu, 5 Sep 2024 11:26:36 +0200 Subject: [PATCH 050/187] Semantics! --- docs/extensions/working_with_extensions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/extensions/working_with_extensions.md b/docs/extensions/working_with_extensions.md index d90ba2e3ecc..c6608c93be6 100644 --- a/docs/extensions/working_with_extensions.md +++ b/docs/extensions/working_with_extensions.md @@ -122,7 +122,7 @@ This outputs: ### Creating a Custom Repository A DuckDB repository is an HTTP, HTTPS, S3, or local file based directory that serves the extensions files in a specific structure. -This structure is describe [here](#downloading-extensions-directly-from-s3), and is the same +This structure is described in the [“Downloading Extensions Directly from S3” section](#downloading-extensions-directly-from-s3), and is the same for local paths and remote servers, for example: ```text From a259cba550c89ba869abebb96218ecbbb4aa8a64 Mon Sep 17 00:00:00 2001 From: taniabogatsch <44262898+taniabogatsch@users.noreply.github.com> Date: Thu, 5 Sep 2024 14:32:30 +0200 Subject: [PATCH 051/187] add newline --- docs/configuration/overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/configuration/overview.md b/docs/configuration/overview.md index b99464cc45f..1abd848e14e 100644 --- a/docs/configuration/overview.md +++ b/docs/configuration/overview.md @@ -185,4 +185,4 @@ Configuration options come with different default [scopes]({% link docs/sql/stat | `progress_bar_time` | Sets the time (in milliseconds) how long a query needs to take before we start printing a progress bar | `BIGINT` | `2000` | | `schema` | Sets the default search schema. Equivalent to setting search_path to a single value. 
| `VARCHAR` | `main` | | `search_path` | Sets the default catalog search path as a comma-separated list of values | `VARCHAR` | | -| `streaming_buffer_size` | The maximum memory to buffer between fetching from a streaming result (e.g., 1GB) | `VARCHAR` | `976.5 KiB` | \ No newline at end of file +| `streaming_buffer_size` | The maximum memory to buffer between fetching from a streaming result (e.g., 1GB) | `VARCHAR` | `976.5 KiB` | From 39c51c05f5f52094d16c29b67aa050a698316f50 Mon Sep 17 00:00:00 2001 From: taniabogatsch <44262898+taniabogatsch@users.noreply.github.com> Date: Thu, 5 Sep 2024 14:46:45 +0200 Subject: [PATCH 052/187] revert overview.md --- docs/configuration/overview.md | 188 ++++++++++++++++----------------- 1 file changed, 90 insertions(+), 98 deletions(-) diff --git a/docs/configuration/overview.md b/docs/configuration/overview.md index 1abd848e14e..a46a518246c 100644 --- a/docs/configuration/overview.md +++ b/docs/configuration/overview.md @@ -85,104 +85,96 @@ Configuration options come with different default [scopes]({% link docs/sql/stat ### Global Configuration Options -| Name | Description | Type | Default value | -|----------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------------------------------------| -| `Calendar` | The current calendar. | `VARCHAR` | System (locale) calendar | -| `TimeZone` | The current time zone. | `VARCHAR` | System (locale) timezone | -| `access_mode` | Access mode of the database (**AUTOMATIC**, **READ_ONLY** or **READ_WRITE**). | `VARCHAR` | `automatic` | -| `allocator_background_threads` | Whether to enable the allocator background thread. | `BOOLEAN` | `false` | -| `allocator_flush_threshold` | Peak allocation threshold at which to flush the allocator after completing a task. | `VARCHAR` | `128.0 MiB` | -| `allow_community_extensions` | Allow to load community built extensions. | `BOOLEAN` | `true` | -| `allow_extensions_metadata_mismatch` | Allow to load extensions with not compatible metadata. | `BOOLEAN` | `false` | -| `allow_persistent_secrets` | Allow the creation of persistent secrets, that are stored and loaded on restarts. | `BOOLEAN` | `true` | -| `allow_unredacted_secrets` | Allow printing unredacted secrets. | `BOOLEAN` | `false` | -| `allow_unsigned_extensions` | Allow to load extensions with invalid or missing signatures. | `BOOLEAN` | `false` | -| `arrow_large_buffer_size` | If arrow buffers for strings, blobs, uuids and bits should be exported using large buffers. | `BOOLEAN` | `false` | -| `arrow_output_list_view` | If export to arrow format should use ListView as the physical layout for LIST columns. | `BOOLEAN` | `false` | -| `autoinstall_extension_repository` | Overrides the custom endpoint for extension installation on autoloading. | `VARCHAR` | | -| `autoinstall_known_extensions` | Whether known extensions are allowed to be automatically installed when a query depends on them. | `BOOLEAN` | `true` | -| `autoload_known_extensions` | Whether known extensions are allowed to be automatically loaded when a query depends on them. | `BOOLEAN` | `true` | -| `binary_as_string` | In Parquet files, interpret binary data as a string. | `BOOLEAN` | | -| `ca_cert_file` | Path to a custom certificate file for self-signed certificates. 
| `VARCHAR` | | -| `checkpoint_threshold`, `wal_autocheckpoint` | The WAL size threshold at which to automatically trigger a checkpoint (e.g., 1GB). | `VARCHAR` | `16.0 MiB` | -| `custom_extension_repository` | Overrides the custom endpoint for remote extension installation. | `VARCHAR` | | -| `custom_user_agent` | Metadata from DuckDB callers. | `VARCHAR` | | -| `default_block_size` | The default block size for new duckdb database files (new as-in, they do not yet exist). | `UBIGINT` | `262144` | -| `default_collation` | The collation setting used when none is specified. | `VARCHAR` | | -| `default_null_order`, `null_order` | Null ordering used when none is specified (**NULLS_FIRST** or **NULLS_LAST**). | `VARCHAR` | `NULLS_LAST` | -| `default_order` | The order type used when none is specified (**ASC** or **DESC**). | `VARCHAR` | `ASC` | -| `default_secret_storage` | Allows switching the default storage for secrets. | `VARCHAR` | `local_file` | -| `disabled_filesystems` | Disable specific file systems preventing access (e.g., LocalFileSystem). | `VARCHAR` | | -| `duckdb_api` | DuckDB API surface. | `VARCHAR` | `cli` | -| `enable_external_access` | Allow the database to access external state (through e.g., loading/installing modules, COPY TO/FROM, CSV readers, pandas replacement scans, etc). | `BOOLEAN` | `true` | -| `enable_fsst_vectors` | Allow scans on FSST compressed segments to emit compressed vectors to utilize late decompression. | `BOOLEAN` | `false` | -| `enable_http_metadata_cache` | Whether or not the global http metadata is used to cache HTTP metadata. | `BOOLEAN` | `false` | -| `enable_macro_dependencies` | Enable created MACROs to create dependencies on the referenced objects (such as tables). | `BOOLEAN` | `false` | -| `enable_object_cache` | Whether or not object cache is used to cache e.g., Parquet metadata. | `BOOLEAN` | `false` | -| `enable_server_cert_verification` | Enable server side certificate verification. | `BOOLEAN` | `false` | -| `enable_view_dependencies` | Enable created VIEWs to create dependencies on the referenced objects (such as tables). | `BOOLEAN` | `false` | -| `extension_directory` | Set the directory to store extensions in. | `VARCHAR` | | -| `external_threads` | The number of external threads that work on DuckDB tasks. | `BIGINT` | `1` | -| `force_download` | Forces upfront download of file. | `BOOLEAN` | `false` | -| `http_keep_alive` | Keep alive connections. Setting this to false can help when running into connection failures. | `BOOLEAN` | `true` | -| `http_retries` | HTTP retries on I/O error. | `UBIGINT` | `3` | -| `http_retry_backoff` | Backoff factor for exponentially increasing retry wait time. | `FLOAT` | `4` | -| `http_retry_wait_ms` | Time between retries. | `UBIGINT` | `100` | -| `http_timeout` | HTTP timeout read/write/connection/retry. | `UBIGINT` | `30000` | -| `immediate_transaction_mode` | Whether transactions should be started lazily when needed, or immediately when BEGIN TRANSACTION is called. | `BOOLEAN` | `false` | -| `index_scan_max_count` | The maximum index scan count sets a fixed threshold for index scans. | `UBIGINT` | `2048` | -| `index_scan_percentage` | The index scan percentage sets a percental threshold for index scans. | `DOUBLE` | `0.001` | -| `lock_configuration` | Whether or not the configuration can be altered. | `BOOLEAN` | `false` | -| `max_memory`, `memory_limit` | The maximum memory of the system (e.g., 1GB). 
| `VARCHAR` | 80% of RAM | -| `max_temp_directory_size` | The maximum amount of data stored inside the `temp_directory` (e.g., 1GB). No limit is applied when value is zero. | `VARCHAR` | `0 bytes` | -| `old_implicit_casting` | Allow implicit casting to/from VARCHAR. | `BOOLEAN` | `false` | -| `password` | The password to use. Ignored for legacy compatibility. | `VARCHAR` | `NULL` | -| `preserve_insertion_order` | Whether or not to preserve insertion order. If set to false the system is allowed to re-order any results that do not contain ORDER BY clauses. | `BOOLEAN` | `true` | -| `produce_arrow_string_view` | If strings should be produced by DuckDB in Utf8View format instead of Utf8. | `BOOLEAN` | `false` | -| `s3_access_key_id` | S3 Access Key ID. | `VARCHAR` | | -| `s3_endpoint` | S3 Endpoint. | `VARCHAR` | | -| `s3_region` | S3 Region. | `VARCHAR` | `us-east-1` | -| `s3_secret_access_key` | S3 Access Key. | `VARCHAR` | | -| `s3_session_token` | S3 Session Token. | `VARCHAR` | | -| `s3_uploader_max_filesize` | S3 Uploader max filesize (between 50GB and 5TB). | `VARCHAR` | `800GB` | -| `s3_uploader_max_parts_per_file` | S3 Uploader max parts per file (between 1 and 10000). | `UBIGINT` | `10000` | -| `s3_uploader_thread_limit` | S3 Uploader global thread limit. | `UBIGINT` | `50` | -| `s3_url_compatibility_mode` | Disable Globs and Query Parameters on S3 URLs. | `BOOLEAN` | `false` | -| `s3_url_style` | S3 URL style. | `VARCHAR` | `vhost` | -| `s3_use_ssl` | S3 use SSL. | `BOOLEAN` | `true` | -| `secret_directory` | Set the directory to which persistent secrets are stored. | `VARCHAR` | `~/.duckdb/stored_secrets` | -| `storage_compatibility_version` | Serialize on checkpoint with compatibility for a given duckdb version. | `VARCHAR` | `v0.10.2` | -| `temp_directory` | Set the directory to which to write temp files. | `VARCHAR` | `⟨database_name⟩.tmp` or `.tmp` (in in-memory mode) | -| `threads`, `worker_threads` | The number of total threads used by the system. | `BIGINT` | # CPU cores | -| `username`, `user` | The username to use. Ignored for legacy compatibility. | `VARCHAR` | `NULL` | +| Name | Description | Type | Default value | +|----|--------|--|---| +| `Calendar` | The current calendar | `VARCHAR` | System (locale) calendar | +| `TimeZone` | The current time zone | `VARCHAR` | System (locale) timezone | +| `access_mode` | Access mode of the database (**AUTOMATIC**, **READ_ONLY** or **READ_WRITE**) | `VARCHAR` | `automatic` | +| `allocator_flush_threshold` | Peak allocation threshold at which to flush the allocator after completing a task. 
| `VARCHAR` | `128.0 MiB` | +| `allow_community_extensions` | Allow to load community built extensions | `BOOLEAN` | `true` | +| `allow_extensions_metadata_mismatch` | Allow to load extensions with not compatible metadata | `BOOLEAN` | `false` | +| `allow_persistent_secrets` | Allow the creation of persistent secrets, that are stored and loaded on restarts | `BOOLEAN` | `true` | +| `allow_unredacted_secrets` | Allow printing unredacted secrets | `BOOLEAN` | `false` | +| `allow_unsigned_extensions` | Allow to load extensions with invalid or missing signatures | `BOOLEAN` | `false` | +| `arrow_large_buffer_size` | If arrow buffers for strings, blobs, uuids and bits should be exported using large buffers | `BOOLEAN` | `false` | +| `autoinstall_extension_repository` | Overrides the custom endpoint for extension installation on autoloading | `VARCHAR` | | +| `autoinstall_known_extensions` | Whether known extensions are allowed to be automatically installed when a query depends on them | `BOOLEAN` | `true` | +| `autoload_known_extensions` | Whether known extensions are allowed to be automatically loaded when a query depends on them | `BOOLEAN` | `true` | +| `binary_as_string` | In Parquet files, interpret binary data as a string. | `BOOLEAN` | | +| `ca_cert_file` | Path to a custom certificate file for self-signed certificates. | `VARCHAR` | | +| `checkpoint_threshold`, `wal_autocheckpoint` | The WAL size threshold at which to automatically trigger a checkpoint (e.g., 1GB) | `VARCHAR` | `16.0 MiB` | +| `custom_extension_repository` | Overrides the custom endpoint for remote extension installation | `VARCHAR` | | +| `custom_user_agent` | Metadata from DuckDB callers | `VARCHAR` | | +| `default_collation` | The collation setting used when none is specified | `VARCHAR` | | +| `default_null_order`, `null_order` | Null ordering used when none is specified (**NULLS_FIRST** or **NULLS_LAST**) | `VARCHAR` | `NULLS_LAST` | +| `default_order` | The order type used when none is specified (**ASC** or **DESC**) | `VARCHAR` | `ASC` | +| `default_secret_storage` | Allows switching the default storage for secrets | `VARCHAR` | `local_file` | +| `disabled_filesystems` | Disable specific file systems preventing access (e.g., LocalFileSystem) | `VARCHAR` | | +| `duckdb_api` | DuckDB API surface | `VARCHAR` | `cli` | +| `enable_external_access` | Allow the database to access external state (through e.g., loading/installing modules, COPY TO/FROM, CSV readers, pandas replacement scans, etc) | `BOOLEAN` | `true` | +| `enable_fsst_vectors` | Allow scans on FSST compressed segments to emit compressed vectors to utilize late decompression | `BOOLEAN` | `false` | +| `enable_http_metadata_cache` | Whether or not the global http metadata is used to cache HTTP metadata | `BOOLEAN` | `false` | +| `enable_macro_dependencies` | Enable created MACROs to create dependencies on the referenced objects (such as tables) | `BOOLEAN` | `false` | +| `enable_object_cache` | Whether or not object cache is used to cache e.g., Parquet metadata | `BOOLEAN` | `false` | +| `enable_server_cert_verification` | Enable server side certificate verification. | `BOOLEAN` | `false` | +| `enable_view_dependencies` | Enable created VIEWs to create dependencies on the referenced objects (such as tables) | `BOOLEAN` | `false` | +| `extension_directory` | Set the directory to store extensions in | `VARCHAR` | | +| `external_threads` | The number of external threads that work on DuckDB tasks. 
| `BIGINT` | `1` | +| `force_download` | Forces upfront download of file | `BOOLEAN` | `false` | +| `http_keep_alive` | Keep alive connections. Setting this to false can help when running into connection failures | `BOOLEAN` | `true` | +| `http_retries` | HTTP retries on I/O error | `UBIGINT` | `3` | +| `http_retry_backoff` | Backoff factor for exponentially increasing retry wait time | `FLOAT` | `4` | +| `http_retry_wait_ms` | Time between retries | `UBIGINT` | `100` | +| `http_timeout` | HTTP timeout read/write/connection/retry | `UBIGINT` | `30000` | +| `immediate_transaction_mode` | Whether transactions should be started lazily when needed, or immediately when BEGIN TRANSACTION is called | `BOOLEAN` | `false` | +| `lock_configuration` | Whether or not the configuration can be altered | `BOOLEAN` | `false` | +| `max_memory`, `memory_limit` | The maximum memory of the system (e.g., 1GB) | `VARCHAR` | 80% of RAM | +| `max_temp_directory_size` | The maximum amount of data stored inside the `temp_directory` (e.g., 1GB). No limit is applied when value is zero. | `VARCHAR` | `0 bytes` | +| `old_implicit_casting` | Allow implicit casting to/from VARCHAR | `BOOLEAN` | `false` | +| `password` | The password to use. Ignored for legacy compatibility. | `VARCHAR` | `NULL` | +| `preserve_insertion_order` | Whether or not to preserve insertion order. If set to false the system is allowed to re-order any results that do not contain ORDER BY clauses. | `BOOLEAN` | `true` | +| `s3_access_key_id` | S3 Access Key ID | `VARCHAR` | | +| `s3_endpoint` | S3 Endpoint | `VARCHAR` | | +| `s3_region` | S3 Region | `VARCHAR` | `us-east-1` | +| `s3_secret_access_key` | S3 Access Key | `VARCHAR` | | +| `s3_session_token` | S3 Session Token | `VARCHAR` | | +| `s3_uploader_max_filesize` | S3 Uploader max filesize (between 50GB and 5TB) | `VARCHAR` | `800GB` | +| `s3_uploader_max_parts_per_file` | S3 Uploader max parts per file (between 1 and 10000) | `UBIGINT` | `10000` | +| `s3_uploader_thread_limit` | S3 Uploader global thread limit | `UBIGINT` | `50` | +| `s3_url_compatibility_mode` | Disable Globs and Query Parameters on S3 URLs | `BOOLEAN` | `false` | +| `s3_url_style` | S3 URL style | `VARCHAR` | `vhost` | +| `s3_use_ssl` | S3 use SSL | `BOOLEAN` | `true` | +| `secret_directory` | Set the directory to which persistent secrets are stored | `VARCHAR` | `~/.duckdb/stored_secrets` | +| `storage_compatibility_version` | Serialize on checkpoint with compatibility for a given duckdb version | `VARCHAR` | `v0.10.2` | +| `temp_directory` | Set the directory to which to write temp files | `VARCHAR` | `⟨database_name⟩.tmp` or `.tmp` (in in-memory mode) | +| `threads`, `worker_threads` | The number of total threads used by the system. | `BIGINT` | # CPU cores | +| `username`, `user` | The username to use. Ignored for legacy compatibility. 
| `VARCHAR` | `NULL` | ### Local Configuration Options -| Name | Description | Type | Default value | -|--------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|---------------------------------------------------| -| `custom_profiling_settings` | Accepts a **JSON** enabling custom metrics | `VARCHAR` | `{"OPERATOR_TIMING": "true", "CPU_TIME": "true"}` | -| `enable_http_logging` | Enables HTTP logging | `BOOLEAN` | `false` | -| `enable_profiling` | Enables profiling, and sets the output format (**JSON**, **QUERY_TREE**, **QUERY_TREE_OPTIMIZER**) | `VARCHAR` | `NULL` | -| `enable_progress_bar_print` | Controls the printing of the progress bar, when 'enable_progress_bar' is true | `BOOLEAN` | `true` | -| `enable_progress_bar` | Enables the progress bar, printing progress to the terminal for long queries | `BOOLEAN` | `false` | -| `errors_as_json` | Output error messages as structured **JSON** instead of as a raw string | `BOOLEAN` | `false` | -| `explain_output` | Output of EXPLAIN statements (**ALL**, **OPTIMIZED_ONLY**, **PHYSICAL_ONLY**) | `VARCHAR` | `physical_only` | -| `file_search_path` | A comma separated list of directories to search for input files | `VARCHAR` | | -| `home_directory` | Sets the home directory used by the system | `VARCHAR` | | -| `http_logging_output` | The file to which HTTP logging output should be saved, or empty to print to the terminal | `VARCHAR` | | -| `integer_division` | Whether or not the / operator defaults to integer division, or to floating point division | `BOOLEAN` | `false` | -| `log_query_path` | Specifies the path to which queries should be logged (default: NULL, queries are not logged) | `VARCHAR` | `NULL` | -| `max_expression_depth` | The maximum expression depth limit in the parser. WARNING: increasing this setting and using very deep expressions might lead to stack overflow errors. | `UBIGINT` | `1000` | -| `ordered_aggregate_threshold` | The number of rows to accumulate before sorting, used for tuning | `UBIGINT` | `262144` | -| `partitioned_write_flush_threshold` | The threshold in number of rows after which we flush a thread state when writing using **PARTITION_BY** | `BIGINT` | `524288` | -| `perfect_ht_threshold` | Threshold in bytes for when to use a perfect hash table | `BIGINT` | `12` | -| `pivot_filter_threshold` | The threshold to switch from using filtered aggregates to LIST with a dedicated pivot operator | `BIGINT` | `10` | -| `pivot_limit` | The maximum number of pivot columns in a pivot statement | `BIGINT` | `100000` | -| `prefer_range_joins` | Force use of range joins with mixed predicates | `BOOLEAN` | `false` | -| `preserve_identifier_case` | Whether or not to preserve the identifier case, instead of always lowercasing all non-quoted identifiers | `BOOLEAN` | `true` | -| `profile_output`, `profiling_output` | The file to which profile output should be saved, or empty to print to the terminal | `VARCHAR` | | -| `profiling_mode` | The profiling mode (**STANDARD** or **DETAILED**) | `VARCHAR` | `NULL` | -| `progress_bar_time` | Sets the time (in milliseconds) how long a query needs to take before we start printing a progress bar | `BIGINT` | `2000` | -| `schema` | Sets the default search schema. Equivalent to setting search_path to a single value. 
| `VARCHAR` | `main` | -| `search_path` | Sets the default catalog search path as a comma-separated list of values | `VARCHAR` | | -| `streaming_buffer_size` | The maximum memory to buffer between fetching from a streaming result (e.g., 1GB) | `VARCHAR` | `976.5 KiB` | +| Name | Description | Type | Default value | +|----|--------|--|---| +| `enable_http_logging` | Enables HTTP logging | `BOOLEAN` | `false` | +| `enable_profiling` | Enables profiling, and sets the output format (**JSON**, **QUERY_TREE**, **QUERY_TREE_OPTIMIZER**) | `VARCHAR` | `NULL` | +| `enable_progress_bar_print` | Controls the printing of the progress bar, when 'enable_progress_bar' is true | `BOOLEAN` | `true` | +| `enable_progress_bar` | Enables the progress bar, printing progress to the terminal for long queries | `BOOLEAN` | `false` | +| `errors_as_json` | Output error messages as structured **JSON** instead of as a raw string | `BOOLEAN` | `false` | +| `explain_output` | Output of EXPLAIN statements (**ALL**, **OPTIMIZED_ONLY**, **PHYSICAL_ONLY**) | `VARCHAR` | `physical_only` | +| `file_search_path` | A comma separated list of directories to search for input files | `VARCHAR` | | +| `home_directory` | Sets the home directory used by the system | `VARCHAR` | | +| `http_logging_output` | The file to which HTTP logging output should be saved, or empty to print to the terminal | `VARCHAR` | | +| `integer_division` | Whether or not the / operator defaults to integer division, or to floating point division | `BOOLEAN` | `false` | +| `log_query_path` | Specifies the path to which queries should be logged (default: NULL, queries are not logged) | `VARCHAR` | `NULL` | +| `max_expression_depth` | The maximum expression depth limit in the parser. WARNING: increasing this setting and using very deep expressions might lead to stack overflow errors. | `UBIGINT` | `1000` | +| `ordered_aggregate_threshold` | The number of rows to accumulate before sorting, used for tuning | `UBIGINT` | `262144` | +| `partitioned_write_flush_threshold` | The threshold in number of rows after which we flush a thread state when writing using **PARTITION_BY** | `BIGINT` | `524288` | +| `perfect_ht_threshold` | Threshold in bytes for when to use a perfect hash table | `BIGINT` | `12` | +| `pivot_filter_threshold` | The threshold to switch from using filtered aggregates to LIST with a dedicated pivot operator | `BIGINT` | `10` | +| `pivot_limit` | The maximum number of pivot columns in a pivot statement | `BIGINT` | `100000` | +| `prefer_range_joins` | Force use of range joins with mixed predicates | `BOOLEAN` | `false` | +| `preserve_identifier_case` | Whether or not to preserve the identifier case, instead of always lowercasing all non-quoted identifiers | `BOOLEAN` | `true` | +| `profile_output`, `profiling_output` | The file to which profile output should be saved, or empty to print to the terminal | `VARCHAR` | | +| `profiling_mode` | The profiling mode (**STANDARD** or **DETAILED**) | `VARCHAR` | `NULL` | +| `progress_bar_time` | Sets the time (in milliseconds) how long a query needs to take before we start printing a progress bar | `BIGINT` | `2000` | +| `schema` | Sets the default search schema. Equivalent to setting search_path to a single value. 
| `VARCHAR` | `main` | +| `search_path` | Sets the default catalog search path as a comma-separated list of values | `VARCHAR` | | From b8c97d32162989837d7a03f11a00af72d220f62d Mon Sep 17 00:00:00 2001 From: Maia Date: Thu, 5 Sep 2024 16:50:24 +0200 Subject: [PATCH 053/187] edits and nits --- docs/configuration/pragmas.md | 24 ++++++++++++++++++------ docs/dev/profiling.md | 34 +++++++++++++++++++++++++--------- 2 files changed, 43 insertions(+), 15 deletions(-) diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index 16c9df7df0e..fe2379372be 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -301,26 +301,34 @@ PRAGMA enable_profile; ##### Profiling Format -The format of the resulting profiling information can be specified as either `json`, `query_tree`, or `query_tree_optimizer`. The default format is `query_tree`, which prints the logical query plan together with the timings and cardinalities of each operator in the tree to the screen. +The format of the resulting profiling information can be specified as either `json`, `query_tree`, or `query_tree_optimizer`. The default format is `query_tree`, which prints the physical query plan together with the timings and cardinalities of each operator in the tree to the screen. -To return the logical query plan as JSON: +To return the physical query plan as JSON: ```sql SET enable_profiling = 'json'; ``` -To return the logical query plan: +To return the physical query plan: ```sql SET enable_profiling = 'query_tree'; ``` -To return the physical query plan: +To return the physical query plan with optimizer and planner timings, see [profiling mode](#profiling-mode): ```sql SET enable_profiling = 'query_tree_optimizer'; ``` +###### Disabling Output + +It is also possible to disable outputting profiling information. This is specifically useful when accessing the profiling through API functions: + +```sql +SET enable_profiling = 'no_output'; +``` + ##### Disable Profiling To disable profiling: @@ -362,9 +370,13 @@ SET profiling_mode = 'standard'; #### Custom Profiling Metrics -By default, all metrics are enabled, but they can be toggled on or off individually. This `PRAGMA` accepts a JSON object with the metric names as keys and a boolean value to enable or disable the metric. The metrics set by this `PRAGMA` will override the default settings. +By default, all metrics are enabled except those activated by detailed profiling. +All metrics, including those from detailed profiling, +can be individually enabled or disabled using the `custom_profiling_settings` `PRAGMA`. +This `PRAGMA` accepts a JSON object with metric names as keys and boolean values to toggle them on or off. +Settings specified by this `PRAGMA` will override the default behavior. -> Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` formats will always a default set of metrics. +> Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` always use a default set of metrics. In the following example the `CPU_TIME` metric is disabled, and the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics are enabled. 
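For reference, the example referred to above is the `custom_profiling_settings` call documented earlier in this series; a sketch of it in the `SET` form used on this page:

```sql
-- Disable CPU_TIME and enable EXTRA_INFO, OPERATOR_CARDINALITY, and OPERATOR_TIMING.
SET custom_profiling_settings = '{"CPU_TIME": "false", "EXTRA_INFO": "true", "OPERATOR_CARDINALITY": "true", "OPERATOR_TIMING": "true"}';
```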
diff --git a/docs/dev/profiling.md b/docs/dev/profiling.md index f83a29b421c..fcccd6e66d1 100644 --- a/docs/dev/profiling.md +++ b/docs/dev/profiling.md @@ -9,7 +9,7 @@ Profiling is important to help understand why certain queries exhibit specific p ## `EXPLAIN` Statement -The first step to profiling a database engine is figuring out what execution plan the engine is using. The [`EXPLAIN`]({% link docs/guides/meta/explain.md %}) statement allows you to peek into the query plan and see what is going on under the hood. +A first step to profiling a duckdb can include examining the plan of a query. The [`EXPLAIN`]({% link docs/guides/meta/explain.md %}) statement allows you to peek into the query plan and see what is going on under the hood. ## `EXPLAIN ANALYZE` Statement @@ -35,21 +35,32 @@ PRAGMA enable_profile; ### Profiling Format -The profiling can be output in several formats. When not specified, the default is `query_tree`, which prints the logical query plan with the timings and cardinalities of each operator in the tree to the screen. +The profiling can be output in several formats. When not specified, the default is `query_tree`, which prints the physical query plan with the timings and cardinalities of each operator in the tree to the screen. +Outputs the physical query plan in JSON format: ```SQL PRAGMA enable_profiling = 'json'; ``` +Outputs the physical query plan in a tree format with optimizer and planner timing, +see [profiling mode](#profiling-mode): ```sql -# prints the physical operator tree PRAGMA enable_profiling = 'query_tree_optimizer'; ``` +Outputs the physical query plan: ```sql PRAGMA enable_profiling = 'query_tree'; ``` +#### Disabling Output + +It is also possible to disable outputting profiling information. This is specifically useful when accessing the profiling through API functions: + +```sql +PRAGMA enable_profiling = 'no_output'; +``` + ### Disable Profiling ```SQL @@ -84,9 +95,13 @@ PRAGMA profiling_output = 'filename'; ### Custom Profiling Metrics -By default, all metrics are enabled, but they can be toggled on or off individually. This `PRAGMA` accepts a JSON object with the metric names as keys and a boolean value to enable or disable the metric. The metrics set by this `PRAGMA` will override the default settings. +By default, all metrics are enabled except those activated by detailed profiling. +All metrics, including those from detailed profiling, +can be individually enabled or disabled using the `custom_profiling_settings` `PRAGMA`. +This `PRAGMA` accepts a JSON object with metric names as keys and boolean values to toggle them on or off. +Settings specified by this `PRAGMA` will override the default behavior. -> Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` formats will always a default set of metrics. +> Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` always use a default set of metrics. In the following example the `CPU_TIME` metric is disabled, and the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics are enabled. @@ -115,9 +130,10 @@ Other than `QUERY_NAME` and `OPERATOR_TYPE`, all metrics can be turned on or off ### Cumulative Metrics -DuckDB also supports several cumulative metrics, which are available in all nodes. In the `QUERY_ROOT` node, these metrics are the sum of the specific metric in all the operators in the query. 
In the `OPERATOR` nodes, these metrics are the sum of the operator's specific metric as well as those of all its children recursively. +DuckDB also supports several cumulative metrics, available in all nodes. In the QUERY_ROOT node, these metrics represent the sum of the corresponding metrics across all operators in the query. In the OPERATOR nodes, they represent the sum of the operator's specific metric along with those of all its children recursively. -These metrics can be used without turning on the specific metric. +These cumulative metrics can be enabled independently, even if the underlying specific metrics are disabled. +The table below shows the cumulative metrics and the specific metrics they are calculated from. | Metric | Metric Calculated Cumulatively | |---------------------------|--------------------------------| @@ -152,8 +168,8 @@ The `PLANNER` is responsible for generating the logical plan. Currently, two met The `PHYSICAL_PLANNER` is responsible for generating the physical plan. The following are the metrics supported in the `PHYSICAL_PLANNER`: - `PHYSICAL_PLANNER` - The time taken to generate the physical plan - `PHYSICAL_PLANNER_COLUMN_BINDING` - The time taken to bind the columns in the physical plan -- `physical_planner_resolve_types` - The time taken to resolve the types in the physical plan -- `physical_planner_create_plan` - The time taken to create the physical plan +- `PHYSICAL_PLANNER_RESOLVE_TYPES` - The time taken to resolve the types in the physical plan +- `PHYSICAL_PLANNER_CREATE_PLAN` - The time taken to create the physical plan ## Setting Custom Metrics Examples From e40fcc4d274d93cf06be0fad1bd05db48b253972 Mon Sep 17 00:00:00 2001 From: Maia Date: Fri, 6 Sep 2024 10:04:00 +0200 Subject: [PATCH 054/187] move pragmas to a table --- docs/dev/profiling.md | 108 ++++++------------------------------------ 1 file changed, 15 insertions(+), 93 deletions(-) diff --git a/docs/dev/profiling.md b/docs/dev/profiling.md index fcccd6e66d1..ca8c47710aa 100644 --- a/docs/dev/profiling.md +++ b/docs/dev/profiling.md @@ -17,99 +17,21 @@ The query plan helps understand the performance characteristics of the system. H ## Pragmas -DuckDB supports several pragmas that can be used to enable and disable profiling, as well as to control the level of detail in the profiling output. - -> Tip In the following examples, `PRAGMA` can be used interchangeably with `SET`. They can also be reset using `RESET`, followed by the setting name. - -The following pragmas are available: - -### Enable Profiling - -```SQL -PRAGMA enable_profiling; -``` -or -```sql -PRAGMA enable_profile; -``` - -### Profiling Format - -The profiling can be output in several formats. When not specified, the default is `query_tree`, which prints the physical query plan with the timings and cardinalities of each operator in the tree to the screen. - -Outputs the physical query plan in JSON format: -```SQL -PRAGMA enable_profiling = 'json'; -``` - -Outputs the physical query plan in a tree format with optimizer and planner timing, -see [profiling mode](#profiling-mode): -```sql -PRAGMA enable_profiling = 'query_tree_optimizer'; -``` - -Outputs the physical query plan: -```sql -PRAGMA enable_profiling = 'query_tree'; -``` - -#### Disabling Output - -It is also possible to disable outputting profiling information. 
This is specifically useful when accessing the profiling through API functions: - -```sql -PRAGMA enable_profiling = 'no_output'; -``` - -### Disable Profiling - -```SQL -PRAGMA disable_profiling; -``` -or -```sql -PRAGMA disable_profile; -``` - -### Profiling Mode - -The default profiling mode is `standard`, but can also be set to `detailed` which enables additional metrics that show the time taken by each optimizer, the planner, and the physical planner. - -```SQL -PRAGMA profiling_mode = 'detailed'; -``` - -```sql -PRAGMA profiling_mode = 'standard'; -``` - -### Profiling Output - -By default, the profiling output is printed to the console, but can be directed to a file using the following pragma: - -```SQL -PRAGMA profiling_output = 'filename'; -``` - -> Warning The file contents will be overwritten for every new query that is issued, hence the file will only contain the profiling information of the last query that is run. - -### Custom Profiling Metrics - -By default, all metrics are enabled except those activated by detailed profiling. -All metrics, including those from detailed profiling, -can be individually enabled or disabled using the `custom_profiling_settings` `PRAGMA`. -This `PRAGMA` accepts a JSON object with metric names as keys and boolean values to toggle them on or off. -Settings specified by this `PRAGMA` will override the default behavior. - -> Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` always use a default set of metrics. - -In the following example the `CPU_TIME` metric is disabled, and the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics are enabled. - -```SQL -PRAGMA custom_profiling_settings='{"CPU_TIME": "false", "EXTRA_INFO": "true", "OPERATOR_CARDINALITY": "true", "OPERATOR_TIMING": "true"}'; -``` - -For an overview of the available metrics, see the [metrics](#metrics) section below. +DuckDB supports several pragmas that can be used to enable and disable profiling, as well as to control the level of detail in the profiling output. + +The following pragmas are available, and can be set using either `PRAGMA` or `SET`. +They can also be reset using `RESET`, followed by the setting name. +For more information on the profiling pragmas and their usage, +see the [Profiling Queries]({% link docs/configuration/pragmas.md %}#profiling_queries) +section of the pragmas page. 
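A small sketch of that interchangeability, using the `enable_profiling` setting from the table below as an example (`query_tree` is its default value):

```sql
-- PRAGMA and SET are interchangeable for these settings
PRAGMA enable_profiling = 'query_tree';
SET enable_profiling = 'query_tree';
-- RESET restores the default
RESET enable_profiling;
```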
+ +| Setting | Description | Default | Options | +|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------|-----------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------| +| [`enable_profiling`]({% link docs/configuration/pragmas.md %}#enable_profiling) , [`enable_profile`]({% link docs/configuration/pragmas.md %}#enable_profiling) | Turn on profiling | `query_tree` | `query_tree`, `json`, `query_tree_optimizer`, [`no_output`]({% link docs/configuration/pragmas.md %}#disabling_output) | +| [`disable_profiling`]({% link docs/configuration/pragmas.md %}#disable_profiling), [`disable_profile`]({% link docs/configuration/pragmas.md %}#disable_profiling) | Turn off profiling | | | +| [`profiling_mode`]({% link docs/configuration/pragmas.md %}#profiling_mode) | Toggle additional optimizer, planner, and physical planner metrics | `standard` | `standard`, `detailed` | +| [`profiling_output`]({% link docs/configuration/pragmas.md %}#profiling_output) | Set a file to output the profiling to | Console | A path to a file | +| [`custom_profiling_settings`]({% link docs/configuration/pragmas.md %}#custom_profiling_metrics) | Enable or disable specific metrics. | All metrics except those activated by detailed profiling. | A JSON object that matches the following: `{"METRIC_NAME": "boolean", ...}`. See the [metrics](#metrics) section below | ## Metrics From 50aef65d4f9f7029ef4d98faba2954430ecf45f4 Mon Sep 17 00:00:00 2001 From: Maia Date: Fri, 6 Sep 2024 15:39:07 +0200 Subject: [PATCH 055/187] nits and comments --- docs/configuration/pragmas.md | 12 +- docs/dev/profiling.md | 212 +++++++++------------------------- docs/guides/meta/explain.md | 5 +- 3 files changed, 65 insertions(+), 164 deletions(-) diff --git a/docs/configuration/pragmas.md b/docs/configuration/pragmas.md index 64263acfa67..93da3cb0610 100644 --- a/docs/configuration/pragmas.md +++ b/docs/configuration/pragmas.md @@ -315,15 +315,17 @@ To return the physical query plan: SET enable_profiling = 'query_tree'; ``` -To return the physical query plan with optimizer and planner timings, see [profiling mode](#profiling-mode): + +To return the physical query plan with optimizer and planner timings: ```sql SET enable_profiling = 'query_tree_optimizer'; ``` +For more information on the profiling mode, see [profiling mode](#profiling-mode). ###### Disabling Output -It is also possible to disable outputting profiling information. This is specifically useful when accessing the profiling through API functions: +It is also possible to disable outputting profiling information. This is specifically useful when accessing the profiling information: ```sql SET enable_profiling = 'no_output'; @@ -371,10 +373,10 @@ SET profiling_mode = 'standard'; #### Custom Profiling Metrics By default, all metrics are enabled except those activated by detailed profiling. -All metrics, including those from detailed profiling, +Each metrics, including those from detailed profiling, can be individually enabled or disabled using the `custom_profiling_settings` `PRAGMA`. This `PRAGMA` accepts a JSON object with metric names as keys and boolean values to toggle them on or off. -Settings specified by this `PRAGMA` will override the default behavior. 
+Settings specified by this `PRAGMA` override the default behavior. > Note This only affects the metrics when the `enable_profiling` is set to `json`. The `query_tree` and `query_tree_optimizer` always use a default set of metrics. @@ -384,7 +386,7 @@ In the following example the `CPU_TIME` metric is disabled, and the `EXTRA_INFO` SET custom_profiling_settings='{"CPU_TIME": "false", "EXTRA_INFO": "true", "OPERATOR_CARDINALITY": "true", "OPERATOR_TIMING": "true"}'; ``` -The profiling docs contain an overview of the available [metrics]({% link docs/dev/profiling.md %}#metrics). +The profiling documentation contains an overview of the available [metrics]({% link docs/dev/profiling.md %}#metrics). ## Query Optimization diff --git a/docs/dev/profiling.md b/docs/dev/profiling.md index ca8c47710aa..20f6e2fa3cd 100644 --- a/docs/dev/profiling.md +++ b/docs/dev/profiling.md @@ -9,7 +9,7 @@ Profiling is important to help understand why certain queries exhibit specific p ## `EXPLAIN` Statement -A first step to profiling a duckdb can include examining the plan of a query. The [`EXPLAIN`]({% link docs/guides/meta/explain.md %}) statement allows you to peek into the query plan and see what is going on under the hood. +A first step to profiling a duckdb can include examining the query plan. The [`EXPLAIN`]({% link docs/guides/meta/explain.md %}) statement allows you to peek into the query plan and see what is going on under the hood. ## `EXPLAIN ANALYZE` Statement @@ -35,20 +35,20 @@ section of the pragmas page. ## Metrics -There are two types of nodes in the query tree; the `QUERY_ROOT`, and `OPERATOR` nodes. The `QUERY_ROOT` refers exclusively to the top level node and the metrics it contains are measured over the entire query. The `OPERATOR` nodes refer to the individual operators in the query plan. Some metrics are only available for `QUERY_ROOT` nodes, while others are only available for `OPERATOR` nodes. The table below describes each metric, as well as which nodes they are available for. +There are two types of nodes in the query tree: the `QUERY_ROOT`, and `OPERATOR` nodes. The `QUERY_ROOT` refers exclusively to the top level node and the metrics it contains are measured over the entire query. The `OPERATOR` nodes refer to the individual operators in the query plan. Some metrics are only available for `QUERY_ROOT` nodes, while others are only available for `OPERATOR` nodes. The table below describes each metric, as well as which nodes they are available for. Other than `QUERY_NAME` and `OPERATOR_TYPE`, all metrics can be turned on or off. -| Metric | Return Type | Query | Operator | Description | -|-------------------------|-------------|:-----:|:--------:|------------------------------------------------------------------------------------| -| `BLOCKED_THREAD_TIME` | `double` | ✅ | | The total time threads are blocked | -| `EXTRA_INFO` | `string` | ✅ | ✅ | Each operator also has unique metrics, and can be accessed here. | -| `OPERATOR_CARDINALITY` | `uint64` | ✅ | ✅ ️ | The cardinality of each operator, ie. the number of rows it returns to its parent. 
| -| `OPERATOR_ROWS_SCANNED` | `uint64` | ✅ | ✅ | The total rows scanned by each operator | -| `OPERATOR_TIMING` | `uint64` | ✅ | ✅ | The time taken by the operator | -| `OPERATOR_TYPE` | `string` | | ✅ | The name of the operator | -| `QUERY_NAME` | `string` | ✅ | | The input query | -| `RESULT_SET_SIZE` | `uint64` | ✅ | ✅ | The size of the result in bytes | +| Metric | Return Type | Query | Operator | Description | +|-------------------------|-------------|:-----:|:--------:|--------------------------------------------------------------------------------------| +| `BLOCKED_THREAD_TIME` | `double` | ✅ | | The total time threads are blocked | +| `EXTRA_INFO` | `string` | ✅ | ✅ | Each operator also has unique metrics, which can be accessed here. | +| `OPERATOR_CARDINALITY` | `uint64` | ✅ | ✅ ️ | The cardinality of each operator, i.e., the number of rows it returns to its parent. | +| `OPERATOR_ROWS_SCANNED` | `uint64` | ✅ | ✅ | The total rows scanned by each operator | +| `OPERATOR_TIMING` | `uint64` | ✅ | ✅ | The time taken by each operator | +| `OPERATOR_TYPE` | `string` | | ✅ | The name of each operator | +| `QUERY_NAME` | `string` | ✅ | | The input query | +| `RESULT_SET_SIZE` | `uint64` | ✅ | ✅ | The size of the result in bytes | ### Cumulative Metrics @@ -82,22 +82,23 @@ Additionally, the following metrics are available to support the optimizer metri ### Planner The `PLANNER` is responsible for generating the logical plan. Currently, two metrics are measured in the `PLANNER`: -- `PLANNER` - The time taken to generate the logical plan +- `PLANNER` - The time taken to generate the logical plan from the parsed SQL nodes. - `PLANNER_BINDING` - The time taken to bind the logical plan ### Physical Planner -The `PHYSICAL_PLANNER` is responsible for generating the physical plan. The following are the metrics supported in the `PHYSICAL_PLANNER`: +The `PHYSICAL_PLANNER` is responsible for generating the physical plan from the logical plan. +The following are the metrics supported in the `PHYSICAL_PLANNER`: - `PHYSICAL_PLANNER` - The time taken to generate the physical plan -- `PHYSICAL_PLANNER_COLUMN_BINDING` - The time taken to bind the columns in the physical plan -- `PHYSICAL_PLANNER_RESOLVE_TYPES` - The time taken to resolve the types in the physical plan +- `PHYSICAL_PLANNER_COLUMN_BINDING` - The time taken to bind the columns in the logical plan to physical columns +- `PHYSICAL_PLANNER_RESOLVE_TYPES` - The time taken to resolve the types in the logical plan to physical types - `PHYSICAL_PLANNER_CREATE_PLAN` - The time taken to create the physical plan ## Setting Custom Metrics Examples -Using the dataset from the previous example, we can demonstrate how to enable profiling and set the output format to `json`. +The following examples demonstrate how to enable profiling and set the output format to `json`. -The first example shows how to enable profiling, set the output to a file, and only enable the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics. +In the first example, profiling is enabled, the output is set to a file, and only the `EXTRA_INFO`, `OPERATOR_CARDINALITY`, and `OPERATOR_TIMING` metrics are enabled. 
```sql CREATE TABLE students (name VARCHAR, sid INTEGER); @@ -118,75 +119,33 @@ WHERE name LIKE 'Ma%'; The contents of the outputted file: ```json -{ - "operator_timing": 0.000372, - "operator_cardinality": 0, - "extra_info": {}, - "query_name": "SELECT name\nFROM students\nJOIN exams USING (sid)\nWHERE name LIKE 'Ma%';", - "children": [ +"operator_timing": 0.000372, +"operator_cardinality": 0, +"extra_info": {}, +"query_name": "SELECT name\nFROM students\nJOIN exams USING (sid)\nWHERE name LIKE 'Ma%';", +"children": [ { - "operator_timing": 0.000001, - "operator_cardinality": 2, - "operator_type": "PROJECTION", - "extra_info": { - "Projections": "name", - "Estimated Cardinality": "1" - }, - "children": [ - { - "operator_timing": 0.000031, - "operator_cardinality": 2, - "operator_type": "HASH_JOIN", - "extra_info": { - "Join Type": "INNER", - "Conditions": "sid = sid", - "Build Min": "1", - "Build Max": "3", + "operator_timing": 0.000001, + "operator_cardinality": 2, + "operator_type": "PROJECTION", + "extra_info": { + "Projections": "name", "Estimated Cardinality": "1" - }, - "children": [ - { - "operator_timing": 0.0000049999999999999996, - "operator_cardinality": 3, - "operator_type": "TABLE_SCAN", - "extra_info": { - "Text": "exams", - "Projections": "sid", - "Estimated Cardinality": "3" - }, - "children": [] - }, + }, + "children": [ { - "operator_timing": 0.000013000000000000001, - "operator_cardinality": 2, - "operator_type": "FILTER", - "extra_info": { - "Expression": "prefix(name, 'Ma')", - "Estimated Cardinality": "1" - }, - "children": [ - { - "operator_timing": 0.000017, - "operator_cardinality": 2, - "operator_type": "TABLE_SCAN", - "extra_info": { - "Text": "students", - "Projections": [ - "sid", - "name" - ], - "Filters": "name>='Ma' AND name<'Mb' AND name IS NOT NULL", + "operator_timing": 0.000031, + "operator_cardinality": 2, + "operator_type": "HASH_JOIN", + "extra_info": { + "Join Type": "INNER", + "Conditions": "sid = sid", + "Build Min": "1", + "Build Max": "3", "Estimated Cardinality": "1" - }, - "children": [] - } - ] - } - ] - } - ] - } - ] + }, + "children": [ + ... } ``` @@ -215,25 +174,7 @@ The contents of the outputted file: "optimizer_expression_rewriter": 0.000012, "optimizer_filter_pullup": 0.000001, "optimizer_filter_pushdown": 0.000035, - "optimizer_cte_filter_pusher": 0.0, - "optimizer_regex_range": 0.0, - "optimizer_in_clause": 0.000001, - "optimizer_join_order": 0.000061, - "optimizer_unnest_rewriter": 0.0, - "optimizer_unused_columns": 0.000003, - "optimizer_common_subexpressions": 0.000001, - "optimizer_common_aggregate": 0.000001, - "optimizer_build_side_probe_side": 0.000003, - "optimizer_limit_pushdown": 0.000001, - "optimizer_top_n": 0.0, - "optimizer_duplicate_groups": 0.000002, - "optimizer_reorder_filter": 0.000002, - "optimizer_extension": 0.0, - "optimizer_materialized_cte": 0.0, - "optimizer_column_lifetime": 0.000003, - "operator_timing": 0.001189, - "optimizer_join_filter_pushdown": 0.000006, - "optimizer_statistics_propagation": 0.000011, + ... 
"operator_cardinality": 0, "optimizer_compressed_materialization": 0.0, "optimizer_deliminator": 0.0, @@ -249,60 +190,19 @@ The contents of the outputted file: "Estimated Cardinality": "1" }, "children": [ - { - "operator_timing": 0.00010100000000000002, - "operator_cardinality": 2, - "operator_type": "HASH_JOIN", - "extra_info": { - "Join Type": "INNER", - "Conditions": "sid = sid", - "Build Min": "1", - "Build Max": "3", - "Estimated Cardinality": "1" - }, - "children": [ - { - "operator_timing": 0.000035, - "operator_cardinality": 3, - "operator_type": "TABLE_SCAN", - "extra_info": { - "Text": "exams", - "Projections": "sid", - "Estimated Cardinality": "3" - }, - "children": [] - }, - { - "operator_timing": 0.000023, - "operator_cardinality": 2, - "operator_type": "FILTER", - "extra_info": { - "Expression": "prefix(name, 'Ma')", - "Estimated Cardinality": "1" - }, - "children": [ - { - "operator_timing": 0.000065, - "operator_cardinality": 2, - "operator_type": "TABLE_SCAN", - "extra_info": { - "Text": "students", - "Projections": [ - "sid", - "name" - ], - "Filters": "name>='Ma' AND name<'Mb' AND name IS NOT NULL", - "Estimated Cardinality": "1" - }, - "children": [] - } - ] - } - ] - } - ] - } - ] + { + "operator_timing": 0.00010100000000000002, + "operator_cardinality": 2, + "operator_type": "HASH_JOIN", + "extra_info": { + "Join Type": "INNER", + "Conditions": "sid = sid", + "Build Min": "1", + "Build Max": "3", + "Estimated Cardinality": "1" + }, + "children": [ + ... } ``` diff --git a/docs/guides/meta/explain.md b/docs/guides/meta/explain.md index 09fed2809c1..4bd5beefddd 100644 --- a/docs/guides/meta/explain.md +++ b/docs/guides/meta/explain.md @@ -9,9 +9,8 @@ EXPLAIN SELECT * FROM tbl; The `EXPLAIN` statement displays the physical plan, i.e., the query plan that will get executed, and is enabled by prepending the query with `EXPLAIN`. -The physical plan is a tree of operators that are executed in a specific order to produce the result of the query, -and is generated by the query optimizer. -To generate an efficient physical plan, the query optimizer transforms the logical plan into a physical plan. +The physical plan is a tree of operators that are executed in a specific order to produce the result of the query. +To generate an efficient physical plan, the query optimizer transforms the existing physical plan into a better physical plan. 
To demonstrate, see the below example: From fe13f658048a930c0a975afe6aa96798d3aa91e8 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sat, 7 Sep 2024 16:44:21 +0200 Subject: [PATCH 056/187] Add DuckCon5 talk --- media.md | 1 + 1 file changed, 1 insertion(+) diff --git a/media.md b/media.md index d01a81220ff..66696c4d476 100644 --- a/media.md +++ b/media.md @@ -6,6 +6,7 @@ title: Media | Title | Venue | Presenter | Duration | |-|-|-|-|-:| | _**DuckDB Announcements and Project Updates**_ | | | | +| [DuckDB – Overview and latest developments](https://www.youtube.com/watch?v=xX6qnP2H5wkl) ([pdf](https://blobs.duckdb.org/events/duckcon5/hannes-muhleisen-mark-raasveldt-introduction-and-state-of-project.pdf)) | DuckCon #5 (2024) | Hannes Mühleisen and Mark Raasveldt | 30min | | [Announcing DuckDB support for Delta Lake and the Unity Catalog extension](https://www.youtube.com/watch?v=wuP6iEYH11E) | Data + AI Summit 2024 | Hannes Mühleisen | 5min | | [State of the Duck](https://www.youtube.com/watch?v=cyZfpXxXojEl) ([pdf](https://blobs.duckdb.org/events/duckcon4/duckcon4-mark-raasveldt-hannes-muhleisen-state-of-the-duck.pdf)) | DuckCon #4 (2024) | Hannes Mühleisen and Mark Raasveldt | 20min | | _**Overview Talks on DuckDB**_ | | | | From 0f813e8ac86f23fa4cd34cf8488607ca71fe2662 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sat, 7 Sep 2024 16:47:11 +0200 Subject: [PATCH 057/187] Add pg_duckdb demo --- _posts/2024-08-15-duckcon5.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_posts/2024-08-15-duckcon5.md b/_posts/2024-08-15-duckcon5.md index 936bbc9bed5..46fbb19f59a 100644 --- a/_posts/2024-08-15-duckcon5.md +++ b/_posts/2024-08-15-duckcon5.md @@ -34,7 +34,7 @@ As is traditional in DuckCons, we will start with a talk from DuckDB's creators | 3:10PM | _Break_ | | | | 3:30PM | _Second session_ | | | | 3:30PM | [**A duck for your dashboard: Performant data apps in the browser with DuckDB**](https://youtu.be/blYQhiOMhwA) | [Robert Kosara](https://www.linkedin.com/in/rkosara/)
_([Observable](https://observablehq.com/))_ | [pdf](https://blobs.duckdb.org/events/duckcon5/robert-kosara-a-duck-for-your-dashboard.pdf) | -| 4:00PM | **pg_duckdb, DuckDB-powered Postgres** | [Joseph Sciarrino](https://www.linkedin.com/in/jsciarrino12/), [Jonathan Dance](https://www.linkedin.com/in/jonathandance/)
_([Hydra](https://www.hydra.so/))_ | [pdf](https://blobs.duckdb.org/events/duckcon5/joseph-sciarrino-jonathan-dance-hydra-duckdb-powered-postgres.pdf) | +| 4:00PM | **pg_duckdb, DuckDB-powered Postgres** ([demo](https://youtu.be/NbnVkSwTyeU)) | [Joseph Sciarrino](https://www.linkedin.com/in/jsciarrino12/), [Jonathan Dance](https://www.linkedin.com/in/jonathandance/)
_([Hydra](https://www.hydra.so/))_ | [pdf](https://blobs.duckdb.org/events/duckcon5/joseph-sciarrino-jonathan-dance-hydra-duckdb-powered-postgres.pdf) | | 4:15PM | [**Delighting users with RESTful APIs and DuckDB**](https://youtu.be/yNL4MPbOuZc) | [Miguel Filipe](https://www.linkedin.com/in/miguelmfilipe/)
_([Dune Analytics](https://dune.com/))_ | [pdf](https://blobs.duckdb.org/events/duckcon5/miguel-filipe-delighting-users-with-restful-apis-and-duckdb.pdf) | | 4:23PM | [**Aerodynamic data models: Flying fast at scale with DuckDB**](https://youtu.be/OkKpnORjlVo) | [Brian Holmes](https://github.com/briangregoryholmes)
_([Rill Data](https://www.rilldata.com/))_ | [pdf](https://blobs.duckdb.org/events/duckcon5/brian-holmes-flying-fast-at-scale-with-duckdb.pdf) | | 4:31PM | [**Double glazing: Two years of windowing improvements**](https://youtu.be/QubE0u8Kq7Y) | [Richard Wesley](https://www.linkedin.com/in/riwesley/)
_([DuckDB Labs](https://duckdblabs.com/))_ | [pdf](https://blobs.duckdb.org/events/duckcon5/richard-wesley-double-glazing-two-years-of-windowing-improvements.pdf) | From aca2472a45f28f5967d9318c6dd969e5bdea292b Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sat, 7 Sep 2024 17:19:09 +0200 Subject: [PATCH 058/187] Fix quotes --- faq.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/faq.md b/faq.md index 4d5e569d37f..0f517e34fa1 100644 --- a/faq.md +++ b/faq.md @@ -306,7 +306,7 @@ See the most recent [overview talk at DuckCon #5](https://blobs.duckdb.org/event
The DuckDB Website is hosted by GitHub Pages, its repository is at [`duckdb/duckdb-web`](https://github.com/duckdb/duckdb-web). -When the documentation is browsed from a desktop computer, every page has a "Page Source" button on the top that navigates you to its Markdown source file. +When the documentation is browsed from a desktop computer, every page has a “Page Source” button on the top that navigates you to its Markdown source file. Pull requests to fix issues or to expand the documentation section on DuckDB's features are very welcome. Before opening a pull request, please consult our [Contributor Guide](https://github.com/duckdb/duckdb-web/blob/main/CONTRIBUTING.md). From a309c413d0cb0dcc6e175c931d58c2a26c7c5315 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sat, 7 Sep 2024 17:26:26 +0200 Subject: [PATCH 059/187] Fix quotes --- docs/api/cpp.md | 2 +- docs/sql/functions/date.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/api/cpp.md b/docs/api/cpp.md index 2c56dd31375..8987b25253b 100644 --- a/docs/api/cpp.md +++ b/docs/api/cpp.md @@ -41,7 +41,7 @@ if (result->HasError()) { } ``` -The `MaterializedQueryResult` instance contains firstly two fields that indicate whether the query was successful. `Query` will not throw exceptions under normal circumstances. Instead, invalid queries or other issues will lead to the `success` boolean field in the query result instance to be set to `false`. In this case an error message may be available in `error` as a string. If successful, other fields are set: the type of statement that was just executed (e.g., `StatementType::INSERT_STATEMENT`) is contained in `statement_type`. The high-level ("Logical type"/"SQL type") types of the result set columns are in `types`. The names of the result columns are in the `names` string vector. In case multiple result sets are returned, for example because the result set contained multiple statements, the result set can be chained using the `next` field. +The `MaterializedQueryResult` instance contains firstly two fields that indicate whether the query was successful. `Query` will not throw exceptions under normal circumstances. Instead, invalid queries or other issues will lead to the `success` boolean field in the query result instance to be set to `false`. In this case an error message may be available in `error` as a string. If successful, other fields are set: the type of statement that was just executed (e.g., `StatementType::INSERT_STATEMENT`) is contained in `statement_type`. The high-level (“Logical type”/“SQL type”) types of the result set columns are in `types`. The names of the result columns are in the `names` string vector. In case multiple result sets are returned, for example because the result set contained multiple statements, the result set can be chained using the `next` field. DuckDB also supports prepared statements in the C++ API with the `Prepare()` method. This returns an instance of `PreparedStatement`. This instance can be used to execute the prepared statement with parameters. Below is an example: diff --git a/docs/sql/functions/date.md b/docs/sql/functions/date.md index 8b1113f122e..e821b53cd93 100644 --- a/docs/sql/functions/date.md +++ b/docs/sql/functions/date.md @@ -249,5 +249,5 @@ There are also dedicated extraction functions to get the [subfields]({% link doc A few examples include extracting the day from a date, or the day of the week from a date. 
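For illustration, a small sketch of such extraction using `date_part` (also mentioned in the next paragraph); the `'day'` and `'dow'` part names are assumed here, and the full list of subfields is on the linked page:

```sql
-- Extract the day of the month and the day of the week from a date
SELECT
    date_part('day', DATE '1992-09-20') AS day_of_month,
    date_part('dow', DATE '1992-09-20') AS day_of_week;
```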
Functions applied to infinite dates will either return the same infinite dates -(e.g, `greatest`) or `NULL` (e.g., `date_part`) depending on what "makes sense". +(e.g, `greatest`) or `NULL` (e.g., `date_part`) depending on what “makes sense”. In general, if the function needs to examine the parts of the infinite date, the result will be `NULL`. From 82f2965060bebc5f85725f0c19b90a64cc935760 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sat, 7 Sep 2024 22:16:09 +0200 Subject: [PATCH 060/187] Set release date --- _data/past_releases.csv | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_data/past_releases.csv b/_data/past_releases.csv index ef33501fc96..90b2c1e7eb9 100644 --- a/_data/past_releases.csv +++ b/_data/past_releases.csv @@ -1,5 +1,5 @@ release_date,version_number,codename,duck_species_primary,duck_species_secondary,duck_wikipage,blog_post -2024-09-02,1.1.0,Eatoni,Anas eatoni,Eaton's_pintail,,https://duckdb.org/2024/09/02/announcing-duckdb-110 +2024-09-09,1.1.0,Eatoni,Anas eatoni,Eaton's_pintail,,https://duckdb.org/2024/09/09/announcing-duckdb-110 2024-06-03,1.0.0,Nivis,Anas nivis,Snow duck,,https://duckdb.org/2024/06/03/announcing-duckdb-100 2024-05-22,0.10.3,,,,, 2024-04-17,0.10.2,,,,, From 8fc643e3803de7599bb23153509816beaff0e48b Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sat, 7 Sep 2024 22:19:31 +0200 Subject: [PATCH 061/187] Add banner --- index.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/index.html b/index.html index 947fe0c5e58..5107c13b8fe 100644 --- a/index.html +++ b/index.html @@ -8,7 +8,7 @@ @@ -19,7 +19,7 @@

DuckDB is a fast in-process analytical database

DuckDB supports a feature-rich SQL dialect complemented with deep integrations into client APIs.
- DuckDB v1.1.0 was released in September 2024. + DuckDB v1.1.0 was released in September 2024.

Installation Documentation From 92f42688fe3e3012cd062418759f0c8fed56d4e7 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 8 Sep 2024 12:50:49 +0200 Subject: [PATCH 062/187] Add v1.0.0 to past releases --- _data/versions.csv | 1 + 1 file changed, 1 insertion(+) diff --git a/_data/versions.csv b/_data/versions.csv index bc5ecdaf565..2b633e1b1a5 100644 --- a/_data/versions.csv +++ b/_data/versions.csv @@ -1,4 +1,5 @@ version +1.0 0.10 0.9 0.8 From d3587395ee567f272ce3ff283896541180804702 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 8 Sep 2024 12:55:11 +0200 Subject: [PATCH 063/187] Update archive script --- scripts/archive_docs.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/archive_docs.py b/scripts/archive_docs.py index 39e7eec550b..afb057530db 100644 --- a/scripts/archive_docs.py +++ b/scripts/archive_docs.py @@ -10,7 +10,7 @@ # Update _config.yml to specify the new version number # Add a new row to /_data/versions.csv with the new version number # Run this script. More options below, but as an example: -# run this script in the top level docs directory like: python3 scripts/archive_docs.py 0.3.4 +# run this script in the top level docs directory like: python3 scripts/archive_docs.py 1.0 # When testing, run a clean (non-incremental) serve: jekyll serve if len(sys.argv) < 2: From 5f2e823753217d558e33329950b825050865f7e7 Mon Sep 17 00:00:00 2001 From: Gabor Szarnyas Date: Sun, 8 Sep 2024 12:55:31 +0200 Subject: [PATCH 064/187] Archive v1.0 page --- _data/menu_docs_10.json | 1527 +++ docs/archive/1.0/api/adbc.md | 359 + docs/archive/1.0/api/c/api.md | 6588 ++++++++++++ docs/archive/1.0/api/c/appender.md | 595 ++ docs/archive/1.0/api/c/config.md | 182 + docs/archive/1.0/api/c/connect.md | 232 + docs/archive/1.0/api/c/data_chunk.md | 188 + docs/archive/1.0/api/c/overview.md | 15 + docs/archive/1.0/api/c/prepared.md | 246 + docs/archive/1.0/api/c/query.md | 486 + docs/archive/1.0/api/c/replacement_scans.md | 117 + docs/archive/1.0/api/c/table_functions.md | 832 ++ docs/archive/1.0/api/c/types.md | 1263 +++ docs/archive/1.0/api/c/value.md | 241 + docs/archive/1.0/api/c/vector.md | 803 ++ docs/archive/1.0/api/cli/arguments.md | 54 + docs/archive/1.0/api/cli/autocomplete.md | 57 + docs/archive/1.0/api/cli/dot_commands.md | 243 + docs/archive/1.0/api/cli/editing.md | 100 + docs/archive/1.0/api/cli/output_formats.md | 81 + docs/archive/1.0/api/cli/overview.md | 313 + .../1.0/api/cli/syntax_highlighting.md | 65 + docs/archive/1.0/api/cpp.md | 236 + docs/archive/1.0/api/go.md | 117 + docs/archive/1.0/api/java.md | 304 + docs/archive/1.0/api/julia.md | 160 + docs/archive/1.0/api/nodejs/overview.md | 176 + docs/archive/1.0/api/nodejs/reference.md | 890 ++ docs/archive/1.0/api/odbc/configuration.md | 54 + docs/archive/1.0/api/odbc/linux.md | 88 + docs/archive/1.0/api/odbc/macos.md | 70 + docs/archive/1.0/api/odbc/overview.md | 24 + docs/archive/1.0/api/odbc/windows.md | 93 + docs/archive/1.0/api/overview.md | 32 + docs/archive/1.0/api/python/conversion.md | 212 + docs/archive/1.0/api/python/data_ingestion.md | 195 + docs/archive/1.0/api/python/dbapi.md | 172 + docs/archive/1.0/api/python/expression.md | 172 + docs/archive/1.0/api/python/function.md | 232 + docs/archive/1.0/api/python/known_issues.md | 104 + docs/archive/1.0/api/python/overview.md | 227 + .../1.0/api/python/reference/.gitignore | 7 + .../archive/1.0/api/python/reference/index.md | 3206 ++++++ .../api/python/reference/templates/index.rst | 24 + docs/archive/1.0/api/python/relational_api.md | 
314 + docs/archive/1.0/api/python/spark_api.md | 52 + docs/archive/1.0/api/python/types.md | 184 + docs/archive/1.0/api/r.md | 163 + docs/archive/1.0/api/rust.md | 66 + docs/archive/1.0/api/swift.md | 151 + docs/archive/1.0/api/wasm/data_ingestion.md | 181 + docs/archive/1.0/api/wasm/extensions.md | 82 + docs/archive/1.0/api/wasm/instantiation.md | 107 + docs/archive/1.0/api/wasm/overview.md | 28 + docs/archive/1.0/api/wasm/query.md | 83 + docs/archive/1.0/configuration/overview.md | 186 + docs/archive/1.0/configuration/pragmas.md | 534 + .../1.0/configuration/secrets_manager.md | 112 + docs/archive/1.0/connect/concurrency.md | 37 + docs/archive/1.0/connect/overview.md | 31 + docs/archive/1.0/data/appender.md | 85 + docs/archive/1.0/data/csv/auto_detection.md | 185 + docs/archive/1.0/data/csv/overview.md | 224 + .../1.0/data/csv/reading_faulty_csv_files.md | 235 + docs/archive/1.0/data/csv/tips.md | 46 + docs/archive/1.0/data/insert.md | 23 + docs/archive/1.0/data/json/overview.md | 320 + .../data/multiple_files/combining_schemas.md | 96 + .../1.0/data/multiple_files/overview.md | 169 + docs/archive/1.0/data/overview.md | 100 + docs/archive/1.0/data/parquet/encryption.md | 68 + docs/archive/1.0/data/parquet/metadata.md | 121 + docs/archive/1.0/data/parquet/overview.md | 302 + docs/archive/1.0/data/parquet/tips.md | 54 + .../data/partitioning/hive_partitioning.md | 106 + .../data/partitioning/partitioned_writes.md | 73 + docs/archive/1.0/dev/benchmark.md | 101 + .../1.0/dev/building/build_configuration.md | 104 + .../1.0/dev/building/build_instructions.md | 87 + .../1.0/dev/building/building_extensions.md | 139 + docs/archive/1.0/dev/building/overview.md | 10 + .../1.0/dev/building/supported_platforms.md | 18 + .../1.0/dev/building/troubleshooting.md | 129 + docs/archive/1.0/dev/internal_errors.md | 15 + docs/archive/1.0/dev/profiling.md | 196 + docs/archive/1.0/dev/release_calendar.md | 54 + docs/archive/1.0/dev/repositories.md | 39 + docs/archive/1.0/dev/sqllogictest/catch.md | 46 + .../archive/1.0/dev/sqllogictest/debugging.md | 73 + docs/archive/1.0/dev/sqllogictest/intro.md | 75 + docs/archive/1.0/dev/sqllogictest/loops.md | 77 + .../dev/sqllogictest/multiple_connections.md | 53 + docs/archive/1.0/dev/sqllogictest/overview.md | 12 + .../dev/sqllogictest/persistent_testing.md | 43 + .../dev/sqllogictest/result_verification.md | 159 + .../1.0/dev/sqllogictest/writing_tests.md | 98 + docs/archive/1.0/extensions/arrow.md | 26 + docs/archive/1.0/extensions/autocomplete.md | 54 + docs/archive/1.0/extensions/aws.md | 75 + docs/archive/1.0/extensions/azure.md | 355 + .../1.0/extensions/community_extensions.md | 71 + .../archive/1.0/extensions/core_extensions.md | 56 + docs/archive/1.0/extensions/delta.md | 87 + docs/archive/1.0/extensions/excel.md | 45 + .../1.0/extensions/full_text_search.md | 167 + docs/archive/1.0/extensions/httpfs/https.md | 52 + .../1.0/extensions/httpfs/hugging_face.md | 150 + .../archive/1.0/extensions/httpfs/overview.md | 30 + docs/archive/1.0/extensions/httpfs/s3api.md | 256 + .../httpfs/s3api_legacy_authentication.md | 86 + docs/archive/1.0/extensions/iceberg.md | 75 + docs/archive/1.0/extensions/icu.md | 24 + docs/archive/1.0/extensions/inet.md | 86 + docs/archive/1.0/extensions/jemalloc.md | 36 + docs/archive/1.0/extensions/json.md | 1041 ++ docs/archive/1.0/extensions/mysql.md | 263 + docs/archive/1.0/extensions/overview.md | 179 + docs/archive/1.0/extensions/postgres.md | 343 + docs/archive/1.0/extensions/spatial.md | 308 + docs/archive/1.0/extensions/sqlite.md | 
270 + docs/archive/1.0/extensions/substrait.md | 147 + docs/archive/1.0/extensions/tpcds.md | 44 + docs/archive/1.0/extensions/tpch.md | 108 + .../extensions/versioning_of_extensions.md | 80 + docs/archive/1.0/extensions/vss.md | 114 + .../1.0/extensions/working_with_extensions.md | 233 + .../1.0/guides/data_viewers/tableau.md | 137 + .../1.0/guides/data_viewers/youplot.md | 77 + .../1.0/guides/database_integration/mysql.md | 53 + .../guides/database_integration/overview.md | 4 + .../guides/database_integration/postgres.md | 60 + .../1.0/guides/database_integration/sqlite.md | 46 + .../1.0/guides/file_formats/csv_export.md | 20 + .../1.0/guides/file_formats/csv_import.md | 47 + .../1.0/guides/file_formats/excel_export.md | 38 + .../1.0/guides/file_formats/excel_import.md | 106 + .../1.0/guides/file_formats/json_export.md | 20 + .../1.0/guides/file_formats/json_import.md | 37 + .../1.0/guides/file_formats/overview.md | 4 + .../1.0/guides/file_formats/parquet_export.md | 20 + .../1.0/guides/file_formats/parquet_import.md | 40 + .../1.0/guides/file_formats/query_parquet.md | 16 + .../1.0/guides/file_formats/read_file.md | 65 + docs/archive/1.0/guides/glossary.md | 24 + docs/archive/1.0/guides/meta/describe.md | 75 + .../1.0/guides/meta/duckdb_environment.md | 82 + docs/archive/1.0/guides/meta/explain.md | 115 + .../1.0/guides/meta/explain_analyze.md | 152 + docs/archive/1.0/guides/meta/list_tables.md | 39 + docs/archive/1.0/guides/meta/summarize.md | 70 + .../cloudflare_r2_import.md | 33 + .../duckdb_over_https_or_s3.md | 57 + .../network_cloud_storage/gcs_import.md | 43 + .../network_cloud_storage/http_import.md | 42 + .../guides/network_cloud_storage/overview.md | 4 + .../guides/network_cloud_storage/s3_export.md | 62 + .../network_cloud_storage/s3_express_one.md | 80 + .../s3_iceberg_import.md | 60 + .../guides/network_cloud_storage/s3_import.md | 57 + docs/archive/1.0/guides/odbc/general.md | 353 + docs/archive/1.0/guides/offline-copy.md | 16 + docs/archive/1.0/guides/overview.md | 124 + .../1.0/guides/performance/benchmarks.md | 29 + .../1.0/guides/performance/environment.md | 34 + .../1.0/guides/performance/file_formats.md | 105 + .../performance/how_to_tune_workloads.md | 139 + docs/archive/1.0/guides/performance/import.md | 23 + .../1.0/guides/performance/indexing.md | 63 + .../guides/performance/my_workload_is_slow.md | 19 + .../1.0/guides/performance/overview.md | 10 + docs/archive/1.0/guides/performance/schema.md | 86 + docs/archive/1.0/guides/python/execute_sql.md | 46 + .../archive/1.0/guides/python/export_arrow.md | 65 + .../archive/1.0/guides/python/export_numpy.md | 34 + .../1.0/guides/python/export_pandas.md | 23 + docs/archive/1.0/guides/python/filesystems.md | 31 + docs/archive/1.0/guides/python/ibis.md | 679 ++ .../archive/1.0/guides/python/import_arrow.md | 20 + .../archive/1.0/guides/python/import_numpy.md | 32 + .../1.0/guides/python/import_pandas.md | 34 + docs/archive/1.0/guides/python/install.md | 30 + docs/archive/1.0/guides/python/jupyter.md | 201 + .../1.0/guides/python/multiple_threads.md | 115 + docs/archive/1.0/guides/python/polars.md | 61 + .../guides/python/relational_api_pandas.md | 32 + .../archive/1.0/guides/python/sql_on_arrow.md | 113 + .../1.0/guides/python/sql_on_pandas.md | 20 + .../guides/snippets/create_synthetic_data.md | 57 + .../archive/1.0/guides/sql_editors/dbeaver.md | 65 + .../1.0/guides/sql_features/asof_join.md | 174 + .../guides/sql_features/full_text_search.md | 77 + docs/archive/1.0/index.md | 22 + 
docs/archive/1.0/installation/index.html | 110 + docs/archive/1.0/internals/overview.md | 80 + docs/archive/1.0/internals/storage.md | 138 + docs/archive/1.0/internals/vector.md | 141 + .../files_created_by_duckdb.md | 31 + .../gitignore_for_duckdb.md | 32 + docs/archive/1.0/operations_manual/limits.md | 16 + .../non-deterministic_behavior.md | 88 + .../archive/1.0/operations_manual/overview.md | 10 + .../securing_duckdb/overview.md | 92 + .../securing_duckdb/securing_extensions.md | 78 + docs/archive/1.0/search.md | 19 + docs/archive/1.0/sitemap.md | 6 + docs/archive/1.0/sql/constraints.md | 112 + docs/archive/1.0/sql/data_types/array.md | 123 + docs/archive/1.0/sql/data_types/bitstring.md | 56 + docs/archive/1.0/sql/data_types/blob.md | 38 + docs/archive/1.0/sql/data_types/boolean.md | 68 + docs/archive/1.0/sql/data_types/date.md | 47 + docs/archive/1.0/sql/data_types/enum.md | 228 + docs/archive/1.0/sql/data_types/interval.md | 130 + docs/archive/1.0/sql/data_types/list.md | 133 + .../1.0/sql/data_types/literal_types.md | 186 + docs/archive/1.0/sql/data_types/map.md | 96 + docs/archive/1.0/sql/data_types/nulls.md | 131 + docs/archive/1.0/sql/data_types/numeric.md | 114 + docs/archive/1.0/sql/data_types/overview.md | 96 + docs/archive/1.0/sql/data_types/struct.md | 293 + docs/archive/1.0/sql/data_types/text.md | 61 + docs/archive/1.0/sql/data_types/time.md | 51 + docs/archive/1.0/sql/data_types/timestamp.md | 190 + docs/archive/1.0/sql/data_types/timezones.md | 655 ++ .../archive/1.0/sql/data_types/typecasting.md | 110 + docs/archive/1.0/sql/data_types/union.md | 114 + docs/archive/1.0/sql/dialect/friendly_sql.md | 99 + .../sql/dialect/keywords_and_identifiers.md | 119 + .../1.0/sql/dialect/order_preservation.md | 17 + docs/archive/1.0/sql/dialect/overview.md | 9 + .../sql/dialect/postgresql_compatibility.md | 171 + docs/archive/1.0/sql/expressions/case.md | 77 + docs/archive/1.0/sql/expressions/cast.md | 60 + .../archive/1.0/sql/expressions/collations.md | 178 + .../sql/expressions/comparison_operators.md | 54 + docs/archive/1.0/sql/expressions/in.md | 51 + .../1.0/sql/expressions/logical_operators.md | 34 + docs/archive/1.0/sql/expressions/overview.md | 6 + docs/archive/1.0/sql/expressions/star.md | 225 + .../archive/1.0/sql/expressions/subqueries.md | 219 + docs/archive/1.0/sql/functions/aggregates.md | 623 ++ docs/archive/1.0/sql/functions/array.md | 67 + docs/archive/1.0/sql/functions/bitstring.md | 158 + docs/archive/1.0/sql/functions/blob.md | 62 + docs/archive/1.0/sql/functions/char.md | 1044 ++ docs/archive/1.0/sql/functions/date.md | 253 + docs/archive/1.0/sql/functions/dateformat.md | 128 + docs/archive/1.0/sql/functions/datepart.md | 289 + docs/archive/1.0/sql/functions/enum.md | 66 + docs/archive/1.0/sql/functions/interval.md | 176 + docs/archive/1.0/sql/functions/lambda.md | 269 + docs/archive/1.0/sql/functions/list.md | 887 ++ docs/archive/1.0/sql/functions/map.md | 90 + docs/archive/1.0/sql/functions/nested.md | 16 + docs/archive/1.0/sql/functions/numeric.md | 540 + docs/archive/1.0/sql/functions/overview.md | 57 + .../1.0/sql/functions/pattern_matching.md | 210 + .../1.0/sql/functions/regular_expressions.md | 204 + docs/archive/1.0/sql/functions/struct.md | 81 + docs/archive/1.0/sql/functions/time.md | 120 + docs/archive/1.0/sql/functions/timestamp.md | 400 + docs/archive/1.0/sql/functions/timestamptz.md | 495 + docs/archive/1.0/sql/functions/union.md | 45 + docs/archive/1.0/sql/functions/utility.md | 309 + .../1.0/sql/functions/window_functions.md | 437 + 
docs/archive/1.0/sql/indexes.md | 70 + docs/archive/1.0/sql/introduction.md | 403 + .../1.0/sql/meta/duckdb_table_functions.md | 340 + .../1.0/sql/meta/information_schema.md | 131 + docs/archive/1.0/sql/query_syntax/filter.md | 143 + docs/archive/1.0/sql/query_syntax/from.md | 534 + docs/archive/1.0/sql/query_syntax/groupby.md | 64 + .../1.0/sql/query_syntax/grouping_sets.md | 213 + docs/archive/1.0/sql/query_syntax/having.md | 31 + docs/archive/1.0/sql/query_syntax/limit.md | 42 + docs/archive/1.0/sql/query_syntax/orderby.md | 134 + .../sql/query_syntax/prepared_statements.md | 100 + docs/archive/1.0/sql/query_syntax/qualify.md | 102 + docs/archive/1.0/sql/query_syntax/sample.md | 37 + docs/archive/1.0/sql/query_syntax/select.md | 173 + docs/archive/1.0/sql/query_syntax/setops.md | 158 + docs/archive/1.0/sql/query_syntax/unnest.md | 167 + docs/archive/1.0/sql/query_syntax/values.md | 41 + docs/archive/1.0/sql/query_syntax/where.md | 37 + docs/archive/1.0/sql/query_syntax/window.md | 11 + docs/archive/1.0/sql/query_syntax/with.md | 302 + docs/archive/1.0/sql/samples.md | 131 + .../archive/1.0/sql/statements/alter_table.md | 166 + docs/archive/1.0/sql/statements/alter_view.md | 16 + docs/archive/1.0/sql/statements/analyze.md | 16 + docs/archive/1.0/sql/statements/attach.md | 227 + docs/archive/1.0/sql/statements/call.md | 33 + docs/archive/1.0/sql/statements/checkpoint.md | 50 + docs/archive/1.0/sql/statements/comment_on.md | 127 + docs/archive/1.0/sql/statements/copy.md | 350 + .../1.0/sql/statements/create_index.md | 80 + .../1.0/sql/statements/create_macro.md | 246 + .../1.0/sql/statements/create_schema.md | 40 + .../1.0/sql/statements/create_secret.md | 15 + .../1.0/sql/statements/create_sequence.md | 164 + .../1.0/sql/statements/create_table.md | 289 + .../archive/1.0/sql/statements/create_type.md | 45 + .../archive/1.0/sql/statements/create_view.md | 43 + docs/archive/1.0/sql/statements/delete.md | 41 + docs/archive/1.0/sql/statements/describe.md | 26 + docs/archive/1.0/sql/statements/drop.md | 136 + docs/archive/1.0/sql/statements/export.md | 79 + docs/archive/1.0/sql/statements/insert.md | 429 + docs/archive/1.0/sql/statements/overview.md | 4 + docs/archive/1.0/sql/statements/pivot.md | 420 + docs/archive/1.0/sql/statements/profiling.md | 27 + docs/archive/1.0/sql/statements/select.md | 170 + docs/archive/1.0/sql/statements/set.md | 77 + docs/archive/1.0/sql/statements/summarize.md | 22 + .../1.0/sql/statements/transactions.md | 71 + docs/archive/1.0/sql/statements/unpivot.md | 374 + docs/archive/1.0/sql/statements/update.md | 138 + docs/archive/1.0/sql/statements/use.md | 24 + docs/archive/1.0/sql/statements/vacuum.md | 42 + .../1.0/sql/tutorial/css/bootstrap.min.css | 6 + .../1.0/sql/tutorial/css/codemirror.css | 341 + .../archive/1.0/sql/tutorial/css/docs.min.css | 11 + .../fonts/glyphicons-halflings-regular.eot | Bin 0 -> 20127 bytes .../fonts/glyphicons-halflings-regular.svg | 1 + .../fonts/glyphicons-halflings-regular.ttf | Bin 0 -> 45404 bytes .../fonts/glyphicons-halflings-regular.woff | Bin 0 -> 23424 bytes .../fonts/glyphicons-halflings-regular.woff2 | Bin 0 -> 18028 bytes docs/archive/1.0/sql/tutorial/index.html | 517 + .../1.0/sql/tutorial/js/bootstrap.min.js | 7 + .../1.0/sql/tutorial/js/codemirror-sql.js | 413 + .../archive/1.0/sql/tutorial/js/codemirror.js | 9231 +++++++++++++++++ docs/archive/1.0/sql/tutorial/js/docs.min.js | 26 + .../archive/1.0/sql/tutorial/js/jquery.min.js | 5 + docs/archive/1.0/sql/tutorial/js/sql.js | 505 + 
docs/archive/1.0/sql/tutorial/js/vocdata.js | 219 + 335 files changed, 67857 insertions(+) create mode 100644 _data/menu_docs_10.json create mode 100644 docs/archive/1.0/api/adbc.md create mode 100644 docs/archive/1.0/api/c/api.md create mode 100644 docs/archive/1.0/api/c/appender.md create mode 100644 docs/archive/1.0/api/c/config.md create mode 100644 docs/archive/1.0/api/c/connect.md create mode 100644 docs/archive/1.0/api/c/data_chunk.md create mode 100644 docs/archive/1.0/api/c/overview.md create mode 100644 docs/archive/1.0/api/c/prepared.md create mode 100644 docs/archive/1.0/api/c/query.md create mode 100644 docs/archive/1.0/api/c/replacement_scans.md create mode 100644 docs/archive/1.0/api/c/table_functions.md create mode 100644 docs/archive/1.0/api/c/types.md create mode 100644 docs/archive/1.0/api/c/value.md create mode 100644 docs/archive/1.0/api/c/vector.md create mode 100644 docs/archive/1.0/api/cli/arguments.md create mode 100644 docs/archive/1.0/api/cli/autocomplete.md create mode 100644 docs/archive/1.0/api/cli/dot_commands.md create mode 100644 docs/archive/1.0/api/cli/editing.md create mode 100644 docs/archive/1.0/api/cli/output_formats.md create mode 100644 docs/archive/1.0/api/cli/overview.md create mode 100644 docs/archive/1.0/api/cli/syntax_highlighting.md create mode 100644 docs/archive/1.0/api/cpp.md create mode 100644 docs/archive/1.0/api/go.md create mode 100644 docs/archive/1.0/api/java.md create mode 100644 docs/archive/1.0/api/julia.md create mode 100644 docs/archive/1.0/api/nodejs/overview.md create mode 100644 docs/archive/1.0/api/nodejs/reference.md create mode 100644 docs/archive/1.0/api/odbc/configuration.md create mode 100644 docs/archive/1.0/api/odbc/linux.md create mode 100644 docs/archive/1.0/api/odbc/macos.md create mode 100644 docs/archive/1.0/api/odbc/overview.md create mode 100644 docs/archive/1.0/api/odbc/windows.md create mode 100644 docs/archive/1.0/api/overview.md create mode 100644 docs/archive/1.0/api/python/conversion.md create mode 100644 docs/archive/1.0/api/python/data_ingestion.md create mode 100644 docs/archive/1.0/api/python/dbapi.md create mode 100644 docs/archive/1.0/api/python/expression.md create mode 100644 docs/archive/1.0/api/python/function.md create mode 100644 docs/archive/1.0/api/python/known_issues.md create mode 100644 docs/archive/1.0/api/python/overview.md create mode 100644 docs/archive/1.0/api/python/reference/.gitignore create mode 100644 docs/archive/1.0/api/python/reference/index.md create mode 100644 docs/archive/1.0/api/python/reference/templates/index.rst create mode 100644 docs/archive/1.0/api/python/relational_api.md create mode 100644 docs/archive/1.0/api/python/spark_api.md create mode 100644 docs/archive/1.0/api/python/types.md create mode 100644 docs/archive/1.0/api/r.md create mode 100644 docs/archive/1.0/api/rust.md create mode 100644 docs/archive/1.0/api/swift.md create mode 100644 docs/archive/1.0/api/wasm/data_ingestion.md create mode 100644 docs/archive/1.0/api/wasm/extensions.md create mode 100644 docs/archive/1.0/api/wasm/instantiation.md create mode 100644 docs/archive/1.0/api/wasm/overview.md create mode 100644 docs/archive/1.0/api/wasm/query.md create mode 100644 docs/archive/1.0/configuration/overview.md create mode 100644 docs/archive/1.0/configuration/pragmas.md create mode 100644 docs/archive/1.0/configuration/secrets_manager.md create mode 100644 docs/archive/1.0/connect/concurrency.md create mode 100644 docs/archive/1.0/connect/overview.md create mode 100644 
docs/archive/1.0/data/appender.md create mode 100644 docs/archive/1.0/data/csv/auto_detection.md create mode 100644 docs/archive/1.0/data/csv/overview.md create mode 100644 docs/archive/1.0/data/csv/reading_faulty_csv_files.md create mode 100644 docs/archive/1.0/data/csv/tips.md create mode 100644 docs/archive/1.0/data/insert.md create mode 100644 docs/archive/1.0/data/json/overview.md create mode 100644 docs/archive/1.0/data/multiple_files/combining_schemas.md create mode 100644 docs/archive/1.0/data/multiple_files/overview.md create mode 100644 docs/archive/1.0/data/overview.md create mode 100644 docs/archive/1.0/data/parquet/encryption.md create mode 100644 docs/archive/1.0/data/parquet/metadata.md create mode 100644 docs/archive/1.0/data/parquet/overview.md create mode 100644 docs/archive/1.0/data/parquet/tips.md create mode 100644 docs/archive/1.0/data/partitioning/hive_partitioning.md create mode 100644 docs/archive/1.0/data/partitioning/partitioned_writes.md create mode 100644 docs/archive/1.0/dev/benchmark.md create mode 100644 docs/archive/1.0/dev/building/build_configuration.md create mode 100644 docs/archive/1.0/dev/building/build_instructions.md create mode 100644 docs/archive/1.0/dev/building/building_extensions.md create mode 100644 docs/archive/1.0/dev/building/overview.md create mode 100644 docs/archive/1.0/dev/building/supported_platforms.md create mode 100644 docs/archive/1.0/dev/building/troubleshooting.md create mode 100644 docs/archive/1.0/dev/internal_errors.md create mode 100644 docs/archive/1.0/dev/profiling.md create mode 100644 docs/archive/1.0/dev/release_calendar.md create mode 100644 docs/archive/1.0/dev/repositories.md create mode 100644 docs/archive/1.0/dev/sqllogictest/catch.md create mode 100644 docs/archive/1.0/dev/sqllogictest/debugging.md create mode 100644 docs/archive/1.0/dev/sqllogictest/intro.md create mode 100644 docs/archive/1.0/dev/sqllogictest/loops.md create mode 100644 docs/archive/1.0/dev/sqllogictest/multiple_connections.md create mode 100644 docs/archive/1.0/dev/sqllogictest/overview.md create mode 100644 docs/archive/1.0/dev/sqllogictest/persistent_testing.md create mode 100644 docs/archive/1.0/dev/sqllogictest/result_verification.md create mode 100644 docs/archive/1.0/dev/sqllogictest/writing_tests.md create mode 100644 docs/archive/1.0/extensions/arrow.md create mode 100644 docs/archive/1.0/extensions/autocomplete.md create mode 100644 docs/archive/1.0/extensions/aws.md create mode 100644 docs/archive/1.0/extensions/azure.md create mode 100644 docs/archive/1.0/extensions/community_extensions.md create mode 100644 docs/archive/1.0/extensions/core_extensions.md create mode 100644 docs/archive/1.0/extensions/delta.md create mode 100644 docs/archive/1.0/extensions/excel.md create mode 100644 docs/archive/1.0/extensions/full_text_search.md create mode 100644 docs/archive/1.0/extensions/httpfs/https.md create mode 100644 docs/archive/1.0/extensions/httpfs/hugging_face.md create mode 100644 docs/archive/1.0/extensions/httpfs/overview.md create mode 100644 docs/archive/1.0/extensions/httpfs/s3api.md create mode 100644 docs/archive/1.0/extensions/httpfs/s3api_legacy_authentication.md create mode 100644 docs/archive/1.0/extensions/iceberg.md create mode 100644 docs/archive/1.0/extensions/icu.md create mode 100644 docs/archive/1.0/extensions/inet.md create mode 100644 docs/archive/1.0/extensions/jemalloc.md create mode 100644 docs/archive/1.0/extensions/json.md create mode 100644 docs/archive/1.0/extensions/mysql.md create mode 100644 
docs/archive/1.0/extensions/overview.md create mode 100644 docs/archive/1.0/extensions/postgres.md create mode 100644 docs/archive/1.0/extensions/spatial.md create mode 100644 docs/archive/1.0/extensions/sqlite.md create mode 100644 docs/archive/1.0/extensions/substrait.md create mode 100644 docs/archive/1.0/extensions/tpcds.md create mode 100644 docs/archive/1.0/extensions/tpch.md create mode 100644 docs/archive/1.0/extensions/versioning_of_extensions.md create mode 100644 docs/archive/1.0/extensions/vss.md create mode 100644 docs/archive/1.0/extensions/working_with_extensions.md create mode 100644 docs/archive/1.0/guides/data_viewers/tableau.md create mode 100644 docs/archive/1.0/guides/data_viewers/youplot.md create mode 100644 docs/archive/1.0/guides/database_integration/mysql.md create mode 100644 docs/archive/1.0/guides/database_integration/overview.md create mode 100644 docs/archive/1.0/guides/database_integration/postgres.md create mode 100644 docs/archive/1.0/guides/database_integration/sqlite.md create mode 100644 docs/archive/1.0/guides/file_formats/csv_export.md create mode 100644 docs/archive/1.0/guides/file_formats/csv_import.md create mode 100644 docs/archive/1.0/guides/file_formats/excel_export.md create mode 100644 docs/archive/1.0/guides/file_formats/excel_import.md create mode 100644 docs/archive/1.0/guides/file_formats/json_export.md create mode 100644 docs/archive/1.0/guides/file_formats/json_import.md create mode 100644 docs/archive/1.0/guides/file_formats/overview.md create mode 100644 docs/archive/1.0/guides/file_formats/parquet_export.md create mode 100644 docs/archive/1.0/guides/file_formats/parquet_import.md create mode 100644 docs/archive/1.0/guides/file_formats/query_parquet.md create mode 100644 docs/archive/1.0/guides/file_formats/read_file.md create mode 100644 docs/archive/1.0/guides/glossary.md create mode 100644 docs/archive/1.0/guides/meta/describe.md create mode 100644 docs/archive/1.0/guides/meta/duckdb_environment.md create mode 100644 docs/archive/1.0/guides/meta/explain.md create mode 100644 docs/archive/1.0/guides/meta/explain_analyze.md create mode 100644 docs/archive/1.0/guides/meta/list_tables.md create mode 100644 docs/archive/1.0/guides/meta/summarize.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/cloudflare_r2_import.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/duckdb_over_https_or_s3.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/gcs_import.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/http_import.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/overview.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/s3_export.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/s3_express_one.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/s3_iceberg_import.md create mode 100644 docs/archive/1.0/guides/network_cloud_storage/s3_import.md create mode 100644 docs/archive/1.0/guides/odbc/general.md create mode 100644 docs/archive/1.0/guides/offline-copy.md create mode 100644 docs/archive/1.0/guides/overview.md create mode 100644 docs/archive/1.0/guides/performance/benchmarks.md create mode 100644 docs/archive/1.0/guides/performance/environment.md create mode 100644 docs/archive/1.0/guides/performance/file_formats.md create mode 100644 docs/archive/1.0/guides/performance/how_to_tune_workloads.md create mode 100644 docs/archive/1.0/guides/performance/import.md create mode 100644 
docs/archive/1.0/guides/performance/indexing.md create mode 100644 docs/archive/1.0/guides/performance/my_workload_is_slow.md create mode 100644 docs/archive/1.0/guides/performance/overview.md create mode 100644 docs/archive/1.0/guides/performance/schema.md create mode 100644 docs/archive/1.0/guides/python/execute_sql.md create mode 100644 docs/archive/1.0/guides/python/export_arrow.md create mode 100644 docs/archive/1.0/guides/python/export_numpy.md create mode 100644 docs/archive/1.0/guides/python/export_pandas.md create mode 100644 docs/archive/1.0/guides/python/filesystems.md create mode 100644 docs/archive/1.0/guides/python/ibis.md create mode 100644 docs/archive/1.0/guides/python/import_arrow.md create mode 100644 docs/archive/1.0/guides/python/import_numpy.md create mode 100644 docs/archive/1.0/guides/python/import_pandas.md create mode 100644 docs/archive/1.0/guides/python/install.md create mode 100644 docs/archive/1.0/guides/python/jupyter.md create mode 100644 docs/archive/1.0/guides/python/multiple_threads.md create mode 100644 docs/archive/1.0/guides/python/polars.md create mode 100644 docs/archive/1.0/guides/python/relational_api_pandas.md create mode 100644 docs/archive/1.0/guides/python/sql_on_arrow.md create mode 100644 docs/archive/1.0/guides/python/sql_on_pandas.md create mode 100644 docs/archive/1.0/guides/snippets/create_synthetic_data.md create mode 100644 docs/archive/1.0/guides/sql_editors/dbeaver.md create mode 100644 docs/archive/1.0/guides/sql_features/asof_join.md create mode 100644 docs/archive/1.0/guides/sql_features/full_text_search.md create mode 100644 docs/archive/1.0/index.md create mode 100644 docs/archive/1.0/installation/index.html create mode 100644 docs/archive/1.0/internals/overview.md create mode 100644 docs/archive/1.0/internals/storage.md create mode 100644 docs/archive/1.0/internals/vector.md create mode 100644 docs/archive/1.0/operations_manual/footprint_of_duckdb/files_created_by_duckdb.md create mode 100644 docs/archive/1.0/operations_manual/footprint_of_duckdb/gitignore_for_duckdb.md create mode 100644 docs/archive/1.0/operations_manual/limits.md create mode 100644 docs/archive/1.0/operations_manual/non-deterministic_behavior.md create mode 100644 docs/archive/1.0/operations_manual/overview.md create mode 100644 docs/archive/1.0/operations_manual/securing_duckdb/overview.md create mode 100644 docs/archive/1.0/operations_manual/securing_duckdb/securing_extensions.md create mode 100644 docs/archive/1.0/search.md create mode 100644 docs/archive/1.0/sitemap.md create mode 100644 docs/archive/1.0/sql/constraints.md create mode 100644 docs/archive/1.0/sql/data_types/array.md create mode 100644 docs/archive/1.0/sql/data_types/bitstring.md create mode 100644 docs/archive/1.0/sql/data_types/blob.md create mode 100644 docs/archive/1.0/sql/data_types/boolean.md create mode 100644 docs/archive/1.0/sql/data_types/date.md create mode 100644 docs/archive/1.0/sql/data_types/enum.md create mode 100644 docs/archive/1.0/sql/data_types/interval.md create mode 100644 docs/archive/1.0/sql/data_types/list.md create mode 100644 docs/archive/1.0/sql/data_types/literal_types.md create mode 100644 docs/archive/1.0/sql/data_types/map.md create mode 100644 docs/archive/1.0/sql/data_types/nulls.md create mode 100644 docs/archive/1.0/sql/data_types/numeric.md create mode 100644 docs/archive/1.0/sql/data_types/overview.md create mode 100644 docs/archive/1.0/sql/data_types/struct.md create mode 100644 docs/archive/1.0/sql/data_types/text.md create mode 100644 
docs/archive/1.0/sql/data_types/time.md create mode 100644 docs/archive/1.0/sql/data_types/timestamp.md create mode 100644 docs/archive/1.0/sql/data_types/timezones.md create mode 100644 docs/archive/1.0/sql/data_types/typecasting.md create mode 100644 docs/archive/1.0/sql/data_types/union.md create mode 100644 docs/archive/1.0/sql/dialect/friendly_sql.md create mode 100644 docs/archive/1.0/sql/dialect/keywords_and_identifiers.md create mode 100644 docs/archive/1.0/sql/dialect/order_preservation.md create mode 100644 docs/archive/1.0/sql/dialect/overview.md create mode 100644 docs/archive/1.0/sql/dialect/postgresql_compatibility.md create mode 100644 docs/archive/1.0/sql/expressions/case.md create mode 100644 docs/archive/1.0/sql/expressions/cast.md create mode 100644 docs/archive/1.0/sql/expressions/collations.md create mode 100644 docs/archive/1.0/sql/expressions/comparison_operators.md create mode 100644 docs/archive/1.0/sql/expressions/in.md create mode 100644 docs/archive/1.0/sql/expressions/logical_operators.md create mode 100644 docs/archive/1.0/sql/expressions/overview.md create mode 100644 docs/archive/1.0/sql/expressions/star.md create mode 100644 docs/archive/1.0/sql/expressions/subqueries.md create mode 100644 docs/archive/1.0/sql/functions/aggregates.md create mode 100644 docs/archive/1.0/sql/functions/array.md create mode 100644 docs/archive/1.0/sql/functions/bitstring.md create mode 100644 docs/archive/1.0/sql/functions/blob.md create mode 100644 docs/archive/1.0/sql/functions/char.md create mode 100644 docs/archive/1.0/sql/functions/date.md create mode 100644 docs/archive/1.0/sql/functions/dateformat.md create mode 100644 docs/archive/1.0/sql/functions/datepart.md create mode 100644 docs/archive/1.0/sql/functions/enum.md create mode 100644 docs/archive/1.0/sql/functions/interval.md create mode 100644 docs/archive/1.0/sql/functions/lambda.md create mode 100644 docs/archive/1.0/sql/functions/list.md create mode 100644 docs/archive/1.0/sql/functions/map.md create mode 100644 docs/archive/1.0/sql/functions/nested.md create mode 100644 docs/archive/1.0/sql/functions/numeric.md create mode 100644 docs/archive/1.0/sql/functions/overview.md create mode 100644 docs/archive/1.0/sql/functions/pattern_matching.md create mode 100644 docs/archive/1.0/sql/functions/regular_expressions.md create mode 100644 docs/archive/1.0/sql/functions/struct.md create mode 100644 docs/archive/1.0/sql/functions/time.md create mode 100644 docs/archive/1.0/sql/functions/timestamp.md create mode 100644 docs/archive/1.0/sql/functions/timestamptz.md create mode 100644 docs/archive/1.0/sql/functions/union.md create mode 100644 docs/archive/1.0/sql/functions/utility.md create mode 100644 docs/archive/1.0/sql/functions/window_functions.md create mode 100644 docs/archive/1.0/sql/indexes.md create mode 100644 docs/archive/1.0/sql/introduction.md create mode 100644 docs/archive/1.0/sql/meta/duckdb_table_functions.md create mode 100644 docs/archive/1.0/sql/meta/information_schema.md create mode 100644 docs/archive/1.0/sql/query_syntax/filter.md create mode 100644 docs/archive/1.0/sql/query_syntax/from.md create mode 100644 docs/archive/1.0/sql/query_syntax/groupby.md create mode 100644 docs/archive/1.0/sql/query_syntax/grouping_sets.md create mode 100644 docs/archive/1.0/sql/query_syntax/having.md create mode 100644 docs/archive/1.0/sql/query_syntax/limit.md create mode 100644 docs/archive/1.0/sql/query_syntax/orderby.md create mode 100644 docs/archive/1.0/sql/query_syntax/prepared_statements.md create mode 100644 
docs/archive/1.0/sql/query_syntax/qualify.md create mode 100644 docs/archive/1.0/sql/query_syntax/sample.md create mode 100644 docs/archive/1.0/sql/query_syntax/select.md create mode 100644 docs/archive/1.0/sql/query_syntax/setops.md create mode 100644 docs/archive/1.0/sql/query_syntax/unnest.md create mode 100644 docs/archive/1.0/sql/query_syntax/values.md create mode 100644 docs/archive/1.0/sql/query_syntax/where.md create mode 100644 docs/archive/1.0/sql/query_syntax/window.md create mode 100644 docs/archive/1.0/sql/query_syntax/with.md create mode 100644 docs/archive/1.0/sql/samples.md create mode 100644 docs/archive/1.0/sql/statements/alter_table.md create mode 100644 docs/archive/1.0/sql/statements/alter_view.md create mode 100644 docs/archive/1.0/sql/statements/analyze.md create mode 100644 docs/archive/1.0/sql/statements/attach.md create mode 100644 docs/archive/1.0/sql/statements/call.md create mode 100644 docs/archive/1.0/sql/statements/checkpoint.md create mode 100644 docs/archive/1.0/sql/statements/comment_on.md create mode 100644 docs/archive/1.0/sql/statements/copy.md create mode 100644 docs/archive/1.0/sql/statements/create_index.md create mode 100644 docs/archive/1.0/sql/statements/create_macro.md create mode 100644 docs/archive/1.0/sql/statements/create_schema.md create mode 100644 docs/archive/1.0/sql/statements/create_secret.md create mode 100644 docs/archive/1.0/sql/statements/create_sequence.md create mode 100644 docs/archive/1.0/sql/statements/create_table.md create mode 100644 docs/archive/1.0/sql/statements/create_type.md create mode 100644 docs/archive/1.0/sql/statements/create_view.md create mode 100644 docs/archive/1.0/sql/statements/delete.md create mode 100644 docs/archive/1.0/sql/statements/describe.md create mode 100644 docs/archive/1.0/sql/statements/drop.md create mode 100644 docs/archive/1.0/sql/statements/export.md create mode 100644 docs/archive/1.0/sql/statements/insert.md create mode 100644 docs/archive/1.0/sql/statements/overview.md create mode 100644 docs/archive/1.0/sql/statements/pivot.md create mode 100644 docs/archive/1.0/sql/statements/profiling.md create mode 100644 docs/archive/1.0/sql/statements/select.md create mode 100644 docs/archive/1.0/sql/statements/set.md create mode 100644 docs/archive/1.0/sql/statements/summarize.md create mode 100644 docs/archive/1.0/sql/statements/transactions.md create mode 100644 docs/archive/1.0/sql/statements/unpivot.md create mode 100644 docs/archive/1.0/sql/statements/update.md create mode 100644 docs/archive/1.0/sql/statements/use.md create mode 100644 docs/archive/1.0/sql/statements/vacuum.md create mode 100644 docs/archive/1.0/sql/tutorial/css/bootstrap.min.css create mode 100644 docs/archive/1.0/sql/tutorial/css/codemirror.css create mode 100644 docs/archive/1.0/sql/tutorial/css/docs.min.css create mode 100644 docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.eot create mode 100644 docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.svg create mode 100644 docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.ttf create mode 100644 docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.woff create mode 100644 docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.woff2 create mode 100644 docs/archive/1.0/sql/tutorial/index.html create mode 100644 docs/archive/1.0/sql/tutorial/js/bootstrap.min.js create mode 100644 docs/archive/1.0/sql/tutorial/js/codemirror-sql.js create mode 100644 docs/archive/1.0/sql/tutorial/js/codemirror.js create mode 100644 
docs/archive/1.0/sql/tutorial/js/docs.min.js create mode 100644 docs/archive/1.0/sql/tutorial/js/jquery.min.js create mode 100644 docs/archive/1.0/sql/tutorial/js/sql.js create mode 100644 docs/archive/1.0/sql/tutorial/js/vocdata.js diff --git a/_data/menu_docs_10.json b/_data/menu_docs_10.json new file mode 100644 index 00000000000..e717923d22c --- /dev/null +++ b/_data/menu_docs_10.json @@ -0,0 +1,1527 @@ +{ + "docsmenu": [ + { + "page": "Installation", + "slug": "installation/", + "url": "index" + }, + { + "page": "Documentation", + "slug": "", + "mainfolderitems": [ + { + "page": "Getting Started", + "url": "index" + }, + { + "page": "Connect", + "slug": "connect", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Concurrency", + "url": "concurrency" + } + ] + }, + { + "page": "Data Import", + "slug": "data", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "CSV Files", + "slug": "csv", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Auto Detection", + "url": "auto_detection" + }, + { + "page": "Reading Faulty CSV Files", + "url": "reading_faulty_csv_files" + }, + { + "page": "Tips", + "url": "tips" + } + ] + }, + { + "page": "JSON Files", + "slug": "json", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + } + ] + }, + { + "page": "Multiple Files", + "slug": "multiple_files", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Combining Schemas", + "url": "combining_schemas" + } + ] + }, + { + "page": "Parquet Files", + "slug": "parquet", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Metadata", + "url": "metadata" + }, + { + "page": "Encryption", + "url": "encryption" + }, + { + "page": "Tips", + "url": "tips" + } + ] + }, + { + "page": "Partitioning", + "slug": "partitioning", + "subsubfolderitems": [ + { + "page": "Hive Partitioning", + "url": "hive_partitioning" + }, + { + "page": "Partitioned Writes", + "url": "partitioned_writes" + } + ] + }, + { + "page": "Appender", + "url": "appender" + }, + { + "page": "INSERT Statements", + "url": "insert" + } + ] + }, + { + "page": "Client APIs", + "slug": "api", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "C", + "slug": "c", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Startup", + "url": "connect" + }, + { + "page": "Configuration", + "url": "config" + }, + { + "page": "Query", + "url": "query" + }, + { + "page": "Data Chunks", + "url": "data_chunk" + }, + { + "page": "Vectors", + "url": "vector" + }, + { + "page": "Values", + "url": "value" + }, + { + "page": "Types", + "url": "types" + }, + { + "page": "Prepared Statements", + "url": "prepared" + }, + { + "page": "Appender", + "url": "appender" + }, + { + "page": "Table Functions", + "url": "table_functions" + }, + { + "page": "Replacement Scans", + "url": "replacement_scans" + }, + { + "page": "API Reference", + "url": "api" + } + ] + }, + { + "page": "C++", + "url": "cpp" + }, + { + "page": "CLI", + "slug": "cli", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Arguments", + "url": "arguments" + }, + { + "page": "Dot Commands", + "url": "dot_commands" + }, + { + "page": "Output Formats", + "url": "output_formats" + }, + { + "page": "Editing", + "url": "editing" + }, + { + "page": "Autocomplete", + "url": "autocomplete" + }, + { + 
"page": "Syntax Highlighting", + "url": "syntax_highlighting" + } + ] + }, + { + "page": "Go", + "url": "go" + }, + { + "page": "Java", + "url": "java" + }, + { + "page": "Julia", + "url": "julia" + }, + { + "page": "Node.js", + "slug": "nodejs", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "API Reference", + "url": "reference" + } + ] + }, + { + "page": "Python", + "slug": "python", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Data Ingestion", + "url": "data_ingestion" + }, + { + "page": "Conversion between DuckDB and Python", + "url": "conversion" + }, + { + "page": "DB API", + "url": "dbapi" + }, + { + "page": "Relational API", + "url": "relational_api" + }, + { + "page": "Function API", + "url": "function" + }, + { + "page": "Types API", + "url": "types" + }, + { + "page": "Expression API", + "url": "expression" + }, + { + "page": "Spark API", + "url": "spark_api" + }, + { + "page": "API Reference", + "url": "reference" + }, + { + "page": "Known Python Issues", + "url": "known_issues" + } + ] + }, + { + "page": "R", + "url": "r" + }, + { + "page": "Rust", + "url": "rust" + }, + { + "page": "Swift", + "url": "swift" + }, + { + "page": "Wasm", + "slug": "wasm", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Instantiation", + "url": "instantiation" + }, + { + "page": "Data Ingestion", + "url": "data_ingestion" + }, + { + "page": "Query", + "url": "query" + }, + { + "page": "Extensions", + "url": "extensions" + } + ] + }, + { + "page": "ADBC", + "url": "adbc" + }, + { + "page": "ODBC", + "slug": "odbc", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Linux Setup", + "url": "linux" + }, + { + "page": "Windows Setup", + "url": "windows" + }, + { + "page": "macOS Setup", + "url": "macos" + }, + { + "page": "Configuration", + "url": "configuration" + } + ] + } + ] + }, + { + "page": "SQL", + "slug": "sql", + "subfolderitems": [ + { + "page": "Introduction", + "url": "introduction" + }, + { + "page": "Statements", + "slug": "statements", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "ANALYZE", + "url": "analyze" + }, + { + "page": "ALTER TABLE", + "url": "alter_table" + }, + { + "page": "ALTER VIEW", + "url": "alter_view" + }, + { + "page": "ATTACH/DETACH", + "url": "attach" + }, + { + "page": "CALL", + "url": "call" + }, + { + "page": "CHECKPOINT", + "url": "checkpoint" + }, + { + "page": "COMMENT ON", + "url": "comment_on" + }, + { + "page": "COPY", + "url": "copy" + }, + { + "page": "CREATE INDEX", + "url": "create_index" + }, + { + "page": "CREATE MACRO", + "url": "create_macro" + }, + { + "page": "CREATE SCHEMA", + "url": "create_schema" + }, + { + "page": "CREATE SECRET", + "url": "create_secret" + }, + { + "page": "CREATE SEQUENCE", + "url": "create_sequence" + }, + { + "page": "CREATE TABLE", + "url": "create_table" + }, + { + "page": "CREATE VIEW", + "url": "create_view" + }, + { + "page": "CREATE TYPE", + "url": "create_type" + }, + { + "page": "DELETE", + "url": "delete" + }, + { + "page": "DESCRIBE", + "url": "describe" + }, + { + "page": "DROP", + "url": "drop" + }, + { + "page": "EXPORT/IMPORT DATABASE", + "url": "export" + }, + { + "page": "INSERT", + "url": "insert" + }, + { + "page": "PIVOT", + "url": "pivot" + }, + { + "page": "Profiling", + "url": "profiling" + }, + { + "page": "SELECT", + "url": "select" + }, + { + "page": "SET/RESET", + "url": "set" + }, + 
{ + "page": "SUMMARIZE", + "url": "summarize" + }, + { + "page": "Transaction Management", + "url": "transactions" + }, + { + "page": "UNPIVOT", + "url": "unpivot" + }, + { + "page": "UPDATE", + "url": "update" + }, + { + "page": "USE", + "url": "use" + }, + { + "page": "VACUUM", + "url": "vacuum" + } + ] + }, + { + "page": "Query Syntax", + "slug": "query_syntax", + "subsubfolderitems": [ + { + "page": "SELECT", + "url": "select" + }, + { + "page": "FROM & JOIN", + "url": "from" + }, + { + "page": "WHERE", + "url": "where" + }, + { + "page": "GROUP BY", + "url": "groupby" + }, + { + "page": "GROUPING SETS", + "url": "grouping_sets" + }, + { + "page": "HAVING", + "url": "having" + }, + { + "page": "ORDER BY", + "url": "orderby" + }, + { + "page": "LIMIT and OFFSET", + "url": "limit" + }, + { + "page": "SAMPLE", + "url": "sample" + }, + { + "page": "Unnesting", + "url": "unnest" + }, + { + "page": "WITH", + "url": "with" + }, + { + "page": "WINDOW", + "url": "window" + }, + { + "page": "QUALIFY", + "url": "qualify" + }, + { + "page": "VALUES", + "url": "values" + }, + { + "page": "FILTER", + "url": "filter" + }, + { + "page": "Set Operations", + "url": "setops" + }, + { + "page": "Prepared Statements", + "url": "prepared_statements" + } + ] + }, + { + "page": "Data Types", + "slug": "data_types", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Array", + "url": "array" + }, + { + "page": "Bitstring", + "url": "bitstring" + }, + { + "page": "Blob", + "url": "blob" + }, + { + "page": "Boolean", + "url": "boolean" + }, + { + "page": "Date", + "url": "date" + }, + { + "page": "Enum", + "url": "enum" + }, + { + "page": "Interval", + "url": "interval" + }, + { + "page": "List", + "url": "list" + }, + { + "page": "Literal Types", + "url": "literal_types" + }, + { + "page": "Map", + "url": "map" + }, + { + "page": "NULL Values", + "url": "nulls" + }, + { + "page": "Numeric", + "url": "numeric" + }, + { + "page": "Struct", + "url": "struct" + }, + { + "page": "Text", + "url": "text" + }, + { + "page": "Time", + "url": "time" + }, + { + "page": "Timestamp", + "url": "timestamp" + }, + { + "page": "Time Zones", + "url": "timezones" + }, + { + "page": "Union", + "url": "union" + }, + { + "page": "Typecasting", + "url": "typecasting" + } + ] + }, + { + "page": "Expressions", + "slug": "expressions", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "CASE Statement", + "url": "case" + }, + { + "page": "Casting", + "url": "cast" + }, + { + "page": "Collations", + "url": "collations" + }, + { + "page": "Comparisons", + "url": "comparison_operators" + }, + { + "page": "IN Operator", + "url": "in" + }, + { + "page": "Logical Operators", + "url": "logical_operators" + }, + { + "page": "Star Expression", + "url": "star" + }, + { + "page": "Subqueries", + "url": "subqueries" + } + ] + }, + { + "page": "Functions", + "slug": "functions", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Aggregate Functions", + "url": "aggregates" + }, + { + "page": "Array Functions", + "url": "array" + }, + { + "page": "Bitstring Functions", + "url": "bitstring" + }, + { + "page": "Blob Functions", + "url": "blob" + }, + { + "page": "Date Format Functions", + "url": "dateformat" + }, + { + "page": "Date Functions", + "url": "date" + }, + { + "page": "Date Part Functions", + "url": "datepart" + }, + { + "page": "Enum Functions", + "url": "enum" + }, + { + "page": "Interval Functions", + "url": "interval" + }, 
+ { + "page": "Lambda Functions", + "url": "lambda" + }, + { + "page": "List Functions", + "url": "list" + }, + { + "page": "Map Functions", + "url": "map" + }, + { + "page": "Nested Functions", + "url": "nested" + }, + { + "page": "Numeric Functions", + "url": "numeric" + }, + { + "page": "Pattern Matching", + "url": "pattern_matching" + }, + { + "page": "Regular Expressions", + "url": "regular_expressions" + }, + { + "page": "Struct Functions", + "url": "struct" + }, + { + "page": "Text Functions", + "url": "char" + }, + { + "page": "Time Functions", + "url": "time" + }, + { + "page": "Timestamp Functions", + "url": "timestamp" + }, + { + "page": "Timestamp with Time Zone Functions", + "url": "timestamptz" + }, + { + "page": "Union Functions", + "url": "union" + }, + { + "page": "Utility Functions", + "url": "utility" + }, + { + "page": "Window Functions", + "url": "window_functions" + } + ] + }, + { + "page": "Constraints", + "url": "constraints" + }, + { + "page": "Indexes", + "url": "indexes" + }, + { + "page": "Meta Queries", + "slug": "meta", + "subsubfolderitems": [ + { + "page": "Information Schema", + "url": "information_schema" + }, + { + "page": "Metadata Functions", + "url": "duckdb_table_functions" + } + ] + }, + { + "page": "DuckDB's SQL Dialect", + "slug": "dialect", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Friendly SQL", + "url": "friendly_sql" + }, + { + "page": "Keywords and Identifiers", + "url": "keywords_and_identifiers" + }, + { + "page": "Order Preservation", + "url": "order_preservation" + }, + { + "page": "PostgreSQL Compatibility", + "url": "postgresql_compatibility" + } + ] + }, + { + "page": "Samples", + "url": "samples" + } + ] + }, + { + "page": "Configuration", + "slug": "configuration", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Pragmas", + "url": "pragmas" + }, + { + "page": "Secrets Manager", + "url": "secrets_manager" + } + ] + }, + { + "page": "Extensions", + "slug": "extensions", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Core Extensions", + "url": "core_extensions" + }, + { + "page": "Community Extensions", + "url": "community_extensions" + }, + { + "page": "Working with Extensions", + "url": "working_with_extensions" + }, + { + "page": "Versioning of Extensions", + "url": "versioning_of_extensions" + }, + { + "page": "Arrow", + "url": "arrow" + }, + { + "page": "AutoComplete", + "url": "autocomplete" + }, + { + "page": "AWS", + "url": "aws" + }, + { + "page": "Azure", + "url": "azure" + }, + { + "page": "Delta", + "url": "delta" + }, + { + "page": "Excel", + "url": "excel" + }, + { + "page": "Full Text Search", + "url": "full_text_search" + }, + { + "page": "httpfs (HTTP and S3)", + "slug": "httpfs", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "HTTP(S) Support", + "url": "https" + }, + { + "page": "Hugging Face", + "url": "hugging_face" + }, + { + "page": "S3 API Support", + "url": "s3api" + }, + { + "page": "Legacy Authentication Scheme for S3 API", + "url": "s3api_legacy_authentication" + } + ] + }, + { + "page": "Iceberg", + "url": "iceberg" + }, + { + "page": "ICU", + "url": "icu" + }, + { + "page": "inet", + "url": "inet" + }, + { + "page": "jemalloc", + "url": "jemalloc" + }, + { + "page": "JSON", + "url": "json" + }, + { + "page": "MySQL", + "url": "mysql" + }, + { + "page": "PostgreSQL", + "url": "postgres" + }, + { + "page": "Spatial", + "url": "spatial" + }, 
+ { + "page": "SQLite", + "url": "sqlite" + }, + { + "page": "Substrait", + "url": "substrait" + }, + { + "page": "TPC-DS", + "url": "tpcds" + }, + { + "page": "TPC-H", + "url": "tpch" + }, + { + "page": "VSS", + "url": "vss" + } + ] + }, + { + "page": "Guides", + "slug": "guides", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Data Viewers", + "slug": "data_viewers", + "subsubfolderitems": [ + { + "page": "Tableau", + "url": "tableau" + }, + { + "page": "CLI Charting with YouPlot", + "url": "youplot" + } + ] + }, + { + "page": "Database Integration", + "slug": "database_integration", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "MySQL Import", + "url": "mysql" + }, + { + "page": "PostgreSQL Import", + "url": "postgres" + }, + { + "page": "SQLite Import", + "url": "sqlite" + } + ] + }, + { + "page": "File Formats", + "slug": "file_formats", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "CSV Import", + "url": "csv_import" + }, + { + "page": "CSV Export", + "url": "csv_export" + }, + { + "page": "Directly Reading Files", + "url": "read_file" + }, + { + "page": "Excel Import", + "url": "excel_import" + }, + { + "page": "Excel Export", + "url": "excel_export" + }, + { + "page": "JSON Import", + "url": "json_import" + }, + { + "page": "JSON Export", + "url": "json_export" + }, + { + "page": "Parquet Import", + "url": "parquet_import" + }, + { + "page": "Parquet Export", + "url": "parquet_export" + }, + { + "page": "Querying Parquet Files", + "url": "query_parquet" + } + ] + }, + { + "page": "Network & Cloud Storage", + "slug": "network_cloud_storage", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "HTTP Parquet Import", + "url": "http_import" + }, + { + "page": "S3 Parquet Import", + "url": "s3_import" + }, + { + "page": "S3 Parquet Export", + "url": "s3_export" + }, + { + "page": "S3 Iceberg Import", + "url": "s3_iceberg_import" + }, + { + "page": "S3 Express One", + "url": "s3_express_one" + }, + { + "page": "GCS Import", + "url": "gcs_import" + }, + { + "page": "Cloudflare R2 Import", + "url": "cloudflare_r2_import" + }, + { + "page": "DuckDB over HTTPS/S3", + "url": "duckdb_over_https_or_s3" + } + ] + }, + { + "page": "Meta Queries", + "slug": "meta", + "subsubfolderitems": [ + { + "page": "Describe Table", + "url": "describe" + }, + { + "page": "EXPLAIN: Inspect Query Plans", + "url": "explain" + }, + { + "page": "EXPLAIN ANALYZE: Profile Queries", + "url": "explain_analyze" + }, + { + "page": "List Tables", + "url": "list_tables" + }, + { + "page": "Summarize", + "url": "summarize" + }, + { + "page": "DuckDB Environment", + "url": "duckdb_environment" + } + ] + }, + { + "page": "ODBC", + "slug": "odbc", + "subsubfolderitems": [ + { + "page": "ODBC Guide", + "url": "general" + } + ] + }, + { + "page": "Performance", + "slug": "performance", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Import", + "url": "import" + }, + { + "page": "Schema", + "url": "schema" + }, + { + "page": "Indexing", + "url": "indexing" + }, + { + "page": "Environment", + "url": "environment" + }, + { + "page": "File Formats", + "url": "file_formats" + }, + { + "page": "How to Tune Workloads", + "url": "how_to_tune_workloads" + }, + { + "page": "My Workload Is Slow", + "url": "my_workload_is_slow" + }, + { + "page": "Benchmarks", + "url": "benchmarks" + } + ] + }, + { + "page": "Python", + "slug": 
"python", + "subsubfolderitems": [ + { + "page": "Installation", + "url": "install" + }, + { + "page": "Executing SQL", + "url": "execute_sql" + }, + { + "page": "Jupyter Notebooks", + "url": "jupyter" + }, + { + "page": "SQL on Pandas", + "url": "sql_on_pandas" + }, + { + "page": "Import from Pandas", + "url": "import_pandas" + }, + { + "page": "Export to Pandas", + "url": "export_pandas" + }, + { + "page": "Import from Numpy", + "url": "import_numpy" + }, + { + "page": "Export to Numpy", + "url": "export_numpy" + }, + { + "page": "SQL on Arrow", + "url": "sql_on_arrow" + }, + { + "page": "Import from Arrow", + "url": "import_arrow" + }, + { + "page": "Export to Arrow", + "url": "export_arrow" + }, + { + "page": "Relational API on Pandas", + "url": "relational_api_pandas" + }, + { + "page": "Multiple Python Threads", + "url": "multiple_threads" + }, + { + "page": "Integration with Ibis", + "url": "ibis" + }, + { + "page": "Integration with Polars", + "url": "polars" + }, + { + "page": "Using fsspec Filesystems", + "url": "filesystems" + } + ] + }, + { + "page": "SQL Editors", + "slug": "sql_editors", + "subsubfolderitems": [ + { + "page": "DBeaver SQL IDE", + "url": "dbeaver" + } + ] + }, + { + "page": "SQL Features", + "slug": "sql_features", + "subsubfolderitems": [ + { + "page": "AsOf Join", + "url": "asof_join" + }, + { + "page": "Full-Text Search", + "url": "full_text_search" + } + ] + }, + { + "page": "Snippets", + "slug": "snippets", + "subsubfolderitems": [ + { + "page": "Create Synthetic Data", + "url": "create_synthetic_data" + } + ] + }, + { + "page": "Glossary of Terms", + "url": "glossary" + }, + { + "page": "Browse Offline", + "url": "offline-copy" + } + ] + }, + { + "page": "Operations Manual", + "slug": "operations_manual", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Limits", + "url": "limits" + }, + { + "page": "Non-Deterministic Behavior", + "url": "non-deterministic_behavior" + }, + { + "page": "DuckDB's Footprint", + "slug": "footprint_of_duckdb", + "subsubfolderitems": [ + { + "page": "Files Created by DuckDB", + "url": "files_created_by_duckdb" + }, + { + "page": "Gitignore for DuckDB", + "url": "gitignore_for_duckdb" + } + ] + }, + { + "page": "Securing DuckDB", + "slug": "securing_duckdb", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Securing Extensions", + "url": "securing_extensions" + } + ] + } + ] + }, + { + "page": "Development", + "slug": "dev", + "subfolderitems": [ + { + "page": "DuckDB Repositories", + "url": "repositories" + }, + { + "page": "Testing", + "slug": "sqllogictest", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "sqllogictest Introduction", + "url": "intro" + }, + { + "page": "Writing Tests", + "url": "writing_tests" + }, + { + "page": "Debugging", + "url": "debugging" + }, + { + "page": "Result Verification", + "url": "result_verification" + }, + { + "page": "Persistent Testing", + "url": "persistent_testing" + }, + { + "page": "Loops", + "url": "loops" + }, + { + "page": "Multiple Connections", + "url": "multiple_connections" + }, + { + "page": "Catch", + "url": "catch" + } + ] + }, + { + "page": "Profiling", + "url": "profiling" + }, + { + "page": "Release Calendar", + "url": "release_calendar" + }, + { + "page": "Building", + "slug": "building", + "subsubfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Build Instructions", + "url": "build_instructions" + }, + { + "page": 
"Build Configuration", + "url": "build_configuration" + }, + { + "page": "Building Extensions", + "url": "building_extensions" + }, + { + "page": "Supported Platforms", + "url": "supported_platforms" + }, + { + "page": "Troubleshooting", + "url": "troubleshooting" + } + ] + }, + { + "page": "Benchmark Suite", + "url": "benchmark" + } + ] + }, + { + "page": "Internals", + "slug": "internals", + "subfolderitems": [ + { + "page": "Overview", + "url": "overview" + }, + { + "page": "Storage Versions & Format", + "url": "storage" + }, + { + "page": "Execution Format", + "url": "vector" + } + ] + } + ] + } + ] +} diff --git a/docs/archive/1.0/api/adbc.md b/docs/archive/1.0/api/adbc.md new file mode 100644 index 00000000000..a088943ee9f --- /dev/null +++ b/docs/archive/1.0/api/adbc.md @@ -0,0 +1,359 @@ +--- +layout: docu +title: ADBC API +--- + +[Arrow Database Connectivity (ADBC)](https://arrow.apache.org/adbc/), similarly to ODBC and JDBC, is a C-style API that enables code portability between different database systems. This allows developers to effortlessly build applications that communicate with database systems without using code specific to that system. The main difference between ADBC and ODBC/JDBC is that ADBC uses [Arrow](https://arrow.apache.org/) to transfer data between the database system and the application. DuckDB has an ADBC driver, which takes advantage of the [zero-copy integration between DuckDB and Arrow]({% post_url 2021-12-03-duck-arrow %}) to efficiently transfer data. + +DuckDB's ADBC driver currently supports version 0.7 of ADBC. + +Please refer to the [ADBC documentation page](https://arrow.apache.org/adbc/0.7.0/cpp/index.html) for a more extensive discussion on ADBC and a detailed API explanation. + +## Implemented Functionality + +The DuckDB-ADBC driver implements the full ADBC specification, with the exception of the `ConnectionReadPartition` and `StatementExecutePartitions` functions. Both of these functions exist to support systems that internally partition the query results, which does not apply to DuckDB. +In this section, we will describe the main functions that exist in ADBC, along with the arguments they take and provide examples for each function. + +### Database + +Set of functions that operate on a database. + +| Function name | Description | Arguments | Example | +|:---|:-|:---|:----| +| `DatabaseNew` | Allocate a new (but uninitialized) database. | `(AdbcDatabase *database, AdbcError *error)` | `AdbcDatabaseNew(&adbc_database, &adbc_error)` | +| `DatabaseSetOption` | Set a char* option. | `(AdbcDatabase *database, const char *key, const char *value, AdbcError *error)` | `AdbcDatabaseSetOption(&adbc_database, "path", "test.db", &adbc_error)` | +| `DatabaseInit` | Finish setting options and initialize the database. | `(AdbcDatabase *database, AdbcError *error)` | `AdbcDatabaseInit(&adbc_database, &adbc_error)` | +| `DatabaseRelease` | Destroy the database. | `(AdbcDatabase *database, AdbcError *error)` | `AdbcDatabaseRelease(&adbc_database, &adbc_error)` | + +### Connection + +A set of functions that create and destroy a connection to interact with a database. + +| Function name | Description | Arguments | Example | +|:---|:-|:---|:----| +| `ConnectionNew` | Allocate a new (but uninitialized) connection. | `(AdbcConnection*, AdbcError*)` | `AdbcConnectionNew(&adbc_connection, &adbc_error)` | +| `ConnectionSetOption` | Options may be set before ConnectionInit. 
| `(AdbcConnection*, const char*, const char*, AdbcError*)` | `AdbcConnectionSetOption(&adbc_connection, ADBC_CONNECTION_OPTION_AUTOCOMMIT, ADBC_OPTION_VALUE_DISABLED, &adbc_error)` | +| `ConnectionInit` | Finish setting options and initialize the connection. | `(AdbcConnection*, AdbcDatabase*, AdbcError*)` | `AdbcConnectionInit(&adbc_connection, &adbc_database, &adbc_error)` | +| `ConnectionRelease` | Destroy this connection. | `(AdbcConnection*, AdbcError*)` | `AdbcConnectionRelease(&adbc_connection, &adbc_error)` | + +A set of functions that retrieve metadata about the database. In general, these functions will return Arrow objects, specifically an ArrowArrayStream. + +| Function name | Description | Arguments | Example | +|:---|:-|:---|:----| +| `ConnectionGetObjects` | Get a hierarchical view of all catalogs, database schemas, tables, and columns. | `(AdbcConnection*, int, const char*, const char*, const char*, const char**, const char*, ArrowArrayStream*, AdbcError*)` | `AdbcDatabaseInit(&adbc_database, &adbc_error)` | +| `ConnectionGetTableSchema` | Get the Arrow schema of a table. | `(AdbcConnection*, const char*, const char*, const char*, ArrowSchema*, AdbcError*)` | `AdbcDatabaseRelease(&adbc_database, &adbc_error)` | +| `ConnectionGetTableTypes` | Get a list of table types in the database. | `(AdbcConnection*, ArrowArrayStream*, AdbcError*)` | `AdbcDatabaseNew(&adbc_database, &adbc_error)` | + +A set of functions with transaction semantics for the connection. By default, all connections start with auto-commit mode on, but this can be turned off via the ConnectionSetOption function. + +| Function name | Description | Arguments | Example | +|:---|:-|:---|:----| +| `ConnectionCommit` | Commit any pending transactions. | `(AdbcConnection*, AdbcError*)` | `AdbcConnectionCommit(&adbc_connection, &adbc_error)` | +| `ConnectionRollback` | Rollback any pending transactions. | `(AdbcConnection*, AdbcError*)` | `AdbcConnectionRollback(&adbc_connection, &adbc_error)` | + +### Statement + +Statements hold state related to query execution. They represent both one-off queries and prepared statements. They can be reused; however, doing so will invalidate prior result sets from that statement. + +The functions used to create, destroy, and set options for a statement: + +| Function name | Description | Arguments | Example | +|:---|:-|:---|:----| +| `StatementNew` | Create a new statement for a given connection. | `(AdbcConnection*, AdbcStatement*, AdbcError*)` | `AdbcStatementNew(&adbc_connection, &adbc_statement, &adbc_error)` | +| `StatementRelease` | Destroy a statement. | `(AdbcStatement*, AdbcError*)` | `AdbcStatementRelease(&adbc_statement, &adbc_error)` | +| `StatementSetOption` | Set a string option on a statement. | `(AdbcStatement*, const char*, const char*, AdbcError*)` | `StatementSetOption(&adbc_statement, ADBC_INGEST_OPTION_TARGET_TABLE, "TABLE_NAME", &adbc_error)` | + +Functions related to query execution: + +| Function name | Description | Arguments | Example | +|:---|:-|:---|:----| +| `StatementSetSqlQuery` | Set the SQL query to execute. The query can then be executed with StatementExecuteQuery. | `(AdbcStatement*, const char*, AdbcError*)` | `AdbcStatementSetSqlQuery(&adbc_statement, "SELECT * FROM TABLE", &adbc_error)` | +| `StatementSetSubstraitPlan` | Set a substrait plan to execute. The query can then be executed with StatementExecuteQuery. 
| `(AdbcStatement*, const uint8_t*, size_t, AdbcError*)` | `AdbcStatementSetSubstraitPlan(&adbc_statement, substrait_plan, length, &adbc_error)` |
+| `StatementExecuteQuery` | Execute a statement and get the results. | `(AdbcStatement*, ArrowArrayStream*, int64_t*, AdbcError*)` | `AdbcStatementExecuteQuery(&adbc_statement, &arrow_stream, &rows_affected, &adbc_error)` |
+| `StatementPrepare` | Turn this statement into a prepared statement to be executed multiple times. | `(AdbcStatement*, AdbcError*)` | `AdbcStatementPrepare(&adbc_statement, &adbc_error)` |
+
+Functions related to binding, used for bulk insertion or with prepared statements:
+
+ +| Function name | Description | Arguments | Example | +|:---|:-|:---|:----| +| `StatementBindStream` | Bind Arrow Stream. This can be used for bulk inserts or prepared statements. | `(AdbcStatement*, ArrowArrayStream*, AdbcError*)` | `StatementBindStream(&adbc_statement, &input_data, &adbc_error)` | + +## Examples + +Regardless of the programming language being used, there are two database options which will be required to utilize ADBC with DuckDB. The first one is the `driver`, which takes a path to the DuckDB library. The second option is the `entrypoint`, which is an exported function from the DuckDB-ADBC driver that initializes all the ADBC functions. Once we have configured these two options, we can optionally set the `path` option, providing a path on disk to store our DuckDB database. If not set, an in-memory database is created. After configuring all the necessary options, we can proceed to initialize our database. Below is how you can do so with various different language environments. + +### C++ + +We begin our C++ example by declaring the essential variables for querying data through ADBC. These variables include Error, Database, Connection, Statement handling, and an Arrow Stream to transfer data between DuckDB and the application. + +```cpp +AdbcError adbc_error; +AdbcDatabase adbc_database; +AdbcConnection adbc_connection; +AdbcStatement adbc_statement; +ArrowArrayStream arrow_stream; +``` + +We can then initialize our database variable. Before initializing the database, we need to set the `driver` and `entrypoint` options as mentioned above. Then we set the `path` option and initialize the database. With the example below, the string `"path/to/libduckdb.dylib"` should be the path to the dynamic library for DuckDB. This will be `.dylib` on macOS, and `.so` on Linux. + +```cpp +AdbcDatabaseNew(&adbc_database, &adbc_error); +AdbcDatabaseSetOption(&adbc_database, "driver", "path/to/libduckdb.dylib", &adbc_error); +AdbcDatabaseSetOption(&adbc_database, "entrypoint", "duckdb_adbc_init", &adbc_error); +// By default, we start an in-memory database, but you can optionally define a path to store it on disk. +AdbcDatabaseSetOption(&adbc_database, "path", "test.db", &adbc_error); +AdbcDatabaseInit(&adbc_database, &adbc_error); +``` + +After initializing the database, we must create and initialize a connection to it. + +```cpp +AdbcConnectionNew(&adbc_connection, &adbc_error); +AdbcConnectionInit(&adbc_connection, &adbc_database, &adbc_error); +``` + +We can now initialize our statement and run queries through our connection. After the `AdbcStatementExecuteQuery` the `arrow_stream` is populated with the result. + +```cpp +AdbcStatementNew(&adbc_connection, &adbc_statement, &adbc_error); +AdbcStatementSetSqlQuery(&adbc_statement, "SELECT 42", &adbc_error); +int64_t rows_affected; +AdbcStatementExecuteQuery(&adbc_statement, &arrow_stream, &rows_affected, &adbc_error); +arrow_stream.release(arrow_stream) +``` + +Besides running queries, we can also ingest data via `arrow_streams`. For this we need to set an option with the table name we want to insert to, bind the stream and then execute the query. + +```cpp +StatementSetOption(&adbc_statement, ADBC_INGEST_OPTION_TARGET_TABLE, "AnswerToEverything", &adbc_error); +StatementBindStream(&adbc_statement, &arrow_stream, &adbc_error); +StatementExecuteQuery(&adbc_statement, nullptr, nullptr, &adbc_error); +``` + +### Python + +The first thing to do is to use `pip` and install the ADBC Driver manager. 
You will also need to install the `pyarrow` to directly access Apache Arrow formatted result sets (such as using `fetch_arrow_table`). + +```bash +pip install adbc_driver_manager pyarrow +``` + +> For details on the `adbc_driver_manager` package, see the [`adbc_driver_manager` package documentation](https://arrow.apache.org/adbc/current/python/api/adbc_driver_manager.html). + +As with C++, we need to provide initialization options consisting of the location of the libduckdb shared object and entrypoint function. Notice that the `path` argument for DuckDB is passed in through the `db_kwargs` dictionary. + +```python +import adbc_driver_duckdb.dbapi + +with adbc_driver_duckdb.dbapi.connect("test.db") as conn, conn.cursor() as cur: + cur.execute("SELECT 42") + # fetch a pyarrow table + tbl = cur.fetch_arrow_table() + print(tbl) +``` + +Alongside `fetch_arrow_table`, other methods from DBApi are also implemented on the cursor, such as `fetchone` and `fetchall`. Data can also be ingested via `arrow_streams`. We just need to set options on the statement to bind the stream of data and execute the query. + +```python +import adbc_driver_duckdb.dbapi +import pyarrow + +data = pyarrow.record_batch( + [[1, 2, 3, 4], ["a", "b", "c", "d"]], + names = ["ints", "strs"], +) + +with adbc_driver_duckdb.dbapi.connect("test.db") as conn, conn.cursor() as cur: + cur.adbc_ingest("AnswerToEverything", data) +``` + +### Go + +Make sure to download the `libduckdb` library first (i.e., the `.so` on Linux, `.dylib` on Mac or `.dll` on Windows) from the [releases page](https://github.com/duckdb/duckdb/releases), and put it on your `LD_LIBRARY_PATH` before you run the code (but if you don't, the error will explain your options regarding the location of this file.) + +The following example uses an in-memory DuckDB database to modify in-memory Arrow RecordBatches via SQL queries: + +{% raw %} +```go +package main + +import ( + "bytes" + "context" + "fmt" + "io" + + "github.com/apache/arrow-adbc/go/adbc" + "github.com/apache/arrow-adbc/go/adbc/drivermgr" + "github.com/apache/arrow/go/v17/arrow" + "github.com/apache/arrow/go/v17/arrow/array" + "github.com/apache/arrow/go/v17/arrow/ipc" + "github.com/apache/arrow/go/v17/arrow/memory" +) + +func _makeSampleArrowRecord() arrow.Record { + b := array.NewFloat64Builder(memory.DefaultAllocator) + b.AppendValues([]float64{1, 2, 3}, nil) + col := b.NewArray() + + defer col.Release() + defer b.Release() + + schema := arrow.NewSchema([]arrow.Field{{Name: "column1", Type: arrow.PrimitiveTypes.Float64}}, nil) + return array.NewRecord(schema, []arrow.Array{col}, int64(col.Len())) +} + +type DuckDBSQLRunner struct { + ctx context.Context + conn adbc.Connection + db adbc.Database +} + +func NewDuckDBSQLRunner(ctx context.Context) (*DuckDBSQLRunner, error) { + var drv drivermgr.Driver + db, err := drv.NewDatabase(map[string]string{ + "driver": "duckdb", + "entrypoint": "duckdb_adbc_init", + "path": ":memory:", + }) + if err != nil { + return nil, fmt.Errorf("failed to create new in-memory DuckDB database: %w", err) + } + conn, err := db.Open(ctx) + if err != nil { + return nil, fmt.Errorf("failed to open connection to new in-memory DuckDB database: %w", err) + } + return &DuckDBSQLRunner{ctx: ctx, conn: conn, db: db}, nil +} + +func serializeRecord(record arrow.Record) (io.Reader, error) { + buf := new(bytes.Buffer) + wr := ipc.NewWriter(buf, ipc.WithSchema(record.Schema())) + if err := wr.Write(record); err != nil { + return nil, fmt.Errorf("failed to write record: %w", err) + } + if err 
:= wr.Close(); err != nil { + return nil, fmt.Errorf("failed to close writer: %w", err) + } + return buf, nil +} + +func (r *DuckDBSQLRunner) importRecord(sr io.Reader) error { + rdr, err := ipc.NewReader(sr) + if err != nil { + return fmt.Errorf("failed to create IPC reader: %w", err) + } + defer rdr.Release() + stmt, err := r.conn.NewStatement() + if err != nil { + return fmt.Errorf("failed to create new statement: %w", err) + } + if err := stmt.SetOption(adbc.OptionKeyIngestMode, adbc.OptionValueIngestModeCreate); err != nil { + return fmt.Errorf("failed to set ingest mode: %w", err) + } + if err := stmt.SetOption(adbc.OptionKeyIngestTargetTable, "temp_table"); err != nil { + return fmt.Errorf("failed to set ingest target table: %w", err) + } + if err := stmt.BindStream(r.ctx, rdr); err != nil { + return fmt.Errorf("failed to bind stream: %w", err) + } + if _, err := stmt.ExecuteUpdate(r.ctx); err != nil { + return fmt.Errorf("failed to execute update: %w", err) + } + return stmt.Close() +} + +func (r *DuckDBSQLRunner) runSQL(sql string) ([]arrow.Record, error) { + stmt, err := r.conn.NewStatement() + if err != nil { + return nil, fmt.Errorf("failed to create new statement: %w", err) + } + defer stmt.Close() + + if err := stmt.SetSqlQuery(sql); err != nil { + return nil, fmt.Errorf("failed to set SQL query: %w", err) + } + out, n, err := stmt.ExecuteQuery(r.ctx) + if err != nil { + return nil, fmt.Errorf("failed to execute query: %w", err) + } + defer out.Release() + + result := make([]arrow.Record, 0, n) + for out.Next() { + rec := out.Record() + rec.Retain() // .Next() will release the record, so we need to retain it + result = append(result, rec) + } + if out.Err() != nil { + return nil, out.Err() + } + return result, nil +} + +func (r *DuckDBSQLRunner) RunSQLOnRecord(record arrow.Record, sql string) ([]arrow.Record, error) { + serializedRecord, err := serializeRecord(record) + if err != nil { + return nil, fmt.Errorf("failed to serialize record: %w", err) + } + if err := r.importRecord(serializedRecord); err != nil { + return nil, fmt.Errorf("failed to import record: %w", err) + } + result, err := r.runSQL(sql) + if err != nil { + return nil, fmt.Errorf("failed to run SQL: %w", err) + } + + if _, err := r.runSQL("DROP TABLE temp_table"); err != nil { + return nil, fmt.Errorf("failed to drop temp table after running query: %w", err) + } + return result, nil +} + +func (r *DuckDBSQLRunner) Close() { + r.conn.Close() + r.db.Close() +} + +func main() { + rec := _makeSampleArrowRecord() + fmt.Println(rec) + + runner, err := NewDuckDBSQLRunner(context.Background()) + if err != nil { + panic(err) + } + defer runner.Close() + + resultRecords, err := runner.RunSQLOnRecord(rec, "SELECT column1+1 FROM temp_table") + if err != nil { + panic(err) + } + + for _, resultRecord := range resultRecords { + fmt.Println(resultRecord) + resultRecord.Release() + } +} +``` +{% endraw %} + +Running it produces the following output: + +```go +record: + schema: + fields: 1 + - column1: type=float64 + rows: 3 + col[0][column1]: [1 2 3] + +record: + schema: + fields: 1 + - (column1 + 1): type=float64, nullable + rows: 3 + col[0][(column1 + 1)]: [2 3 4] +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/c/api.md b/docs/archive/1.0/api/c/api.md new file mode 100644 index 00000000000..73695e87317 --- /dev/null +++ b/docs/archive/1.0/api/c/api.md @@ -0,0 +1,6588 @@ +--- +layout: docu +title: Complete API +--- + + + +This page contains the reference for DuckDB's C API. 
+
+> Deprecated The reference contains several deprecation notices. These concern methods whose long-term availability is not guaranteed, as they may be removed in the future. That said, DuckDB's developers plan to carry out deprecations slowly, as several of the deprecated methods do not yet have a fully functional alternative. Therefore, they will not be removed before the alternative is available, and even then, there will be a grace period of a few minor versions before they are removed. The reason that the methods are already deprecated in v1.0 is to denote that they are not part of the v1.0 stable API, which contains methods that are available long-term.
+
+## API Reference Overview
+
+### Open/Connect
+
duckdb_state duckdb_open(const char *path, duckdb_database *out_database);
+duckdb_state duckdb_open_ext(const char *path, duckdb_database *out_database, duckdb_config config, char **out_error);
+void duckdb_close(duckdb_database *database);
+duckdb_state duckdb_connect(duckdb_database database, duckdb_connection *out_connection);
+void duckdb_interrupt(duckdb_connection connection);
+duckdb_query_progress_type duckdb_query_progress(duckdb_connection connection);
+void duckdb_disconnect(duckdb_connection *connection);
+const char *duckdb_library_version();
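
The listing above only gives the signatures. As a quick orientation, here is a minimal, self-contained sketch (not part of the original reference) showing the typical open/connect/disconnect/close sequence; it assumes `duckdb.h` is on the include path and the program is linked against `libduckdb`.

```c
#include "duckdb.h"
#include <stdio.h>

int main(void) {
    duckdb_database db;
    duckdb_connection con;

    // Passing NULL as the path opens an in-memory database.
    if (duckdb_open(NULL, &db) == DuckDBError) {
        fprintf(stderr, "failed to open database\n");
        return 1;
    }
    if (duckdb_connect(db, &con) == DuckDBError) {
        fprintf(stderr, "failed to connect\n");
        duckdb_close(&db);
        return 1;
    }

    printf("DuckDB library version: %s\n", duckdb_library_version());

    // Tear down in reverse order of creation.
    duckdb_disconnect(&con);
    duckdb_close(&db);
    return 0;
}
```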
+
+ +### Configuration + +
duckdb_state duckdb_create_config(duckdb_config *out_config);
+size_t duckdb_config_count();
+duckdb_state duckdb_get_config_flag(size_t index, const char **out_name, const char **out_description);
+duckdb_state duckdb_set_config(duckdb_config config, const char *name, const char *option);
+void duckdb_destroy_config(duckdb_config *config);
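
To show how the configuration functions are meant to be combined with `duckdb_open_ext`, here is a short sketch (not part of the original listing). The option names `"access_mode"` and `"threads"` are used as examples; the full set of available flags can be enumerated at runtime with `duckdb_config_count` and `duckdb_get_config_flag`.

```c
#include "duckdb.h"
#include <stdio.h>

int main(void) {
    duckdb_config config;
    duckdb_database db;
    char *open_error = NULL;

    if (duckdb_create_config(&config) == DuckDBError) {
        return 1;
    }
    // Options are plain key/value strings; unknown keys cause duckdb_set_config to fail.
    duckdb_set_config(config, "access_mode", "READ_WRITE");
    duckdb_set_config(config, "threads", "4");

    if (duckdb_open_ext(NULL, &db, config, &open_error) == DuckDBError) {
        fprintf(stderr, "open failed: %s\n", open_error ? open_error : "(no message)");
        duckdb_free(open_error);
        duckdb_destroy_config(&config);
        return 1;
    }

    // The config object can be destroyed once the database has been opened.
    duckdb_destroy_config(&config);
    duckdb_close(&db);
    return 0;
}
```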
+
+ +### Query Execution + +
duckdb_state duckdb_query(duckdb_connection connection, const char *query, duckdb_result *out_result);
+void duckdb_destroy_result(duckdb_result *result);
+const char *duckdb_column_name(duckdb_result *result, idx_t col);
+duckdb_type duckdb_column_type(duckdb_result *result, idx_t col);
+duckdb_statement_type duckdb_result_statement_type(duckdb_result result);
+duckdb_logical_type duckdb_column_logical_type(duckdb_result *result, idx_t col);
+idx_t duckdb_column_count(duckdb_result *result);
+idx_t duckdb_row_count(duckdb_result *result);
+idx_t duckdb_rows_changed(duckdb_result *result);
+void *duckdb_column_data(duckdb_result *result, idx_t col);
+bool *duckdb_nullmask_data(duckdb_result *result, idx_t col);
+const char *duckdb_result_error(duckdb_result *result);
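
As an illustration of the materialized query interface, the following sketch (added here for orientation, not part of the original reference) runs a query, checks for errors, and inspects the result dimensions before destroying it. It assumes an open connection obtained as in the Open/Connect example.

```c
#include "duckdb.h"
#include <stdio.h>

// Runs a query on an existing connection (see the Open/Connect example)
// and prints the result's shape. Returns 0 on success, 1 on failure.
int run_query(duckdb_connection con, const char *sql) {
    duckdb_result result;
    if (duckdb_query(con, sql, &result) == DuckDBError) {
        // The error message is owned by the result and is freed together with it.
        fprintf(stderr, "query failed: %s\n", duckdb_result_error(&result));
        duckdb_destroy_result(&result);
        return 1;
    }
    printf("columns: %llu, rows: %llu, first column: %s\n",
           (unsigned long long) duckdb_column_count(&result),
           (unsigned long long) duckdb_row_count(&result),
           duckdb_column_name(&result, 0));
    duckdb_destroy_result(&result);
    return 0;
}
```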
+
+ +### Result Functions + +
duckdb_data_chunk duckdb_result_get_chunk(duckdb_result result, idx_t chunk_index);
+bool duckdb_result_is_streaming(duckdb_result result);
+idx_t duckdb_result_chunk_count(duckdb_result result);
+duckdb_result_type duckdb_result_return_type(duckdb_result result);
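
The chunk-based interface is the preferred way to consume larger results. The sketch below (not part of the original reference) iterates over the chunks of a materialized result; in addition to the functions listed above, it assumes `duckdb_data_chunk_get_size` and `duckdb_destroy_data_chunk` from the data chunk interface documented elsewhere in this reference.

```c
#include "duckdb.h"
#include <stdio.h>

// Iterates over all chunks of a materialized result.
// Assumes the data chunk helpers from the data chunk interface.
void scan_chunks(duckdb_result result) {
    idx_t chunk_count = duckdb_result_chunk_count(result);
    for (idx_t i = 0; i < chunk_count; i++) {
        duckdb_data_chunk chunk = duckdb_result_get_chunk(result, i);
        printf("chunk %llu has %llu rows\n",
               (unsigned long long) i,
               (unsigned long long) duckdb_data_chunk_get_size(chunk));
        duckdb_destroy_data_chunk(&chunk);
    }
}
```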
+
+ +### Safe Fetch Functions + +
bool duckdb_value_boolean(duckdb_result *result, idx_t col, idx_t row);
+int8_t duckdb_value_int8(duckdb_result *result, idx_t col, idx_t row);
+int16_t duckdb_value_int16(duckdb_result *result, idx_t col, idx_t row);
+int32_t duckdb_value_int32(duckdb_result *result, idx_t col, idx_t row);
+int64_t duckdb_value_int64(duckdb_result *result, idx_t col, idx_t row);
+duckdb_hugeint duckdb_value_hugeint(duckdb_result *result, idx_t col, idx_t row);
+duckdb_uhugeint duckdb_value_uhugeint(duckdb_result *result, idx_t col, idx_t row);
+duckdb_decimal duckdb_value_decimal(duckdb_result *result, idx_t col, idx_t row);
+uint8_t duckdb_value_uint8(duckdb_result *result, idx_t col, idx_t row);
+uint16_t duckdb_value_uint16(duckdb_result *result, idx_t col, idx_t row);
+uint32_t duckdb_value_uint32(duckdb_result *result, idx_t col, idx_t row);
+uint64_t duckdb_value_uint64(duckdb_result *result, idx_t col, idx_t row);
+float duckdb_value_float(duckdb_result *result, idx_t col, idx_t row);
+double duckdb_value_double(duckdb_result *result, idx_t col, idx_t row);
+duckdb_date duckdb_value_date(duckdb_result *result, idx_t col, idx_t row);
+duckdb_time duckdb_value_time(duckdb_result *result, idx_t col, idx_t row);
+duckdb_timestamp duckdb_value_timestamp(duckdb_result *result, idx_t col, idx_t row);
+duckdb_interval duckdb_value_interval(duckdb_result *result, idx_t col, idx_t row);
+char *duckdb_value_varchar(duckdb_result *result, idx_t col, idx_t row);
+duckdb_string duckdb_value_string(duckdb_result *result, idx_t col, idx_t row);
+char *duckdb_value_varchar_internal(duckdb_result *result, idx_t col, idx_t row);
+duckdb_string duckdb_value_string_internal(duckdb_result *result, idx_t col, idx_t row);
+duckdb_blob duckdb_value_blob(duckdb_result *result, idx_t col, idx_t row);
+bool duckdb_value_is_null(duckdb_result *result, idx_t col, idx_t row);
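
Some of these accessors are affected by the deprecation note above, but they remain a convenient way to inspect small results in examples and tests. A minimal sketch (added for illustration, assuming a connection `con` opened as in the earlier examples):

```c
#include "duckdb.h"
#include <stdio.h>

// Prints a tiny result using the row/column value accessors.
// con: an open connection, as in the Open/Connect example.
void print_small_result(duckdb_connection con) {
    duckdb_result result;
    if (duckdb_query(con, "SELECT 42 AS i, 'hello' AS s", &result) == DuckDBError) {
        duckdb_destroy_result(&result);
        return;
    }
    int32_t i = duckdb_value_int32(&result, 0, 0);  // column 0, row 0
    char *s = duckdb_value_varchar(&result, 1, 0);  // column 1, row 0
    printf("i = %d, s = %s\n", i, s);
    duckdb_free(s);  // strings returned by duckdb_value_varchar must be freed with duckdb_free
    duckdb_destroy_result(&result);
}
```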
+
+ +### Helpers + +
void *duckdb_malloc(size_t size);
+void duckdb_free(void *ptr);
+idx_t duckdb_vector_size();
+bool duckdb_string_is_inlined(duckdb_string_t string);
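
A tiny sketch (not from the original reference) of the memory helpers and the vector size query. Memory allocated by DuckDB, for example strings returned by `duckdb_value_varchar`, must be released with `duckdb_free`; likewise, buffers created with `duckdb_malloc` are freed with `duckdb_free`.

```c
#include "duckdb.h"
#include <stdio.h>
#include <string.h>

int main(void) {
    // duckdb_vector_size() reports the number of rows in one internal vector.
    printf("vector size: %llu\n", (unsigned long long) duckdb_vector_size());

    // Memory obtained from duckdb_malloc must be released with duckdb_free.
    char *buffer = (char *) duckdb_malloc(64);
    strcpy(buffer, "scratch space");
    printf("%s\n", buffer);
    duckdb_free(buffer);
    return 0;
}
```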
+
+ +### Date/Time/Timestamp Helpers + +
duckdb_date_struct duckdb_from_date(duckdb_date date);
+duckdb_date duckdb_to_date(duckdb_date_struct date);
+bool duckdb_is_finite_date(duckdb_date date);
+duckdb_time_struct duckdb_from_time(duckdb_time time);
+duckdb_time_tz duckdb_create_time_tz(int64_t micros, int32_t offset);
+duckdb_time_tz_struct duckdb_from_time_tz(duckdb_time_tz micros);
+duckdb_time duckdb_to_time(duckdb_time_struct time);
+duckdb_timestamp_struct duckdb_from_timestamp(duckdb_timestamp ts);
+duckdb_timestamp duckdb_to_timestamp(duckdb_timestamp_struct ts);
+bool duckdb_is_finite_timestamp(duckdb_timestamp ts);
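
To show how the conversion helpers pair up, here is a short sketch (not part of the original listing) that round-trips a calendar date through `duckdb_to_date` and `duckdb_from_date`. The `duckdb_date_struct` field names used here (`year`, `month`, `day`) follow the struct definition in `duckdb.h`.

```c
#include "duckdb.h"
#include <stdio.h>

int main(void) {
    // Field names as defined for duckdb_date_struct in duckdb.h.
    duckdb_date_struct ds;
    ds.year = 1992;
    ds.month = 9;
    ds.day = 20;

    // Convert the struct to the internal days-since-epoch representation and back.
    duckdb_date d = duckdb_to_date(ds);
    duckdb_date_struct round_trip = duckdb_from_date(d);

    printf("finite: %d, date: %d-%d-%d\n",
           (int) duckdb_is_finite_date(d),
           (int) round_trip.year, (int) round_trip.month, (int) round_trip.day);
    return 0;
}
```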
+
+ +### Hugeint Helpers + +
double duckdb_hugeint_to_double(duckdb_hugeint val);
+duckdb_hugeint duckdb_double_to_hugeint(double val);
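
A minimal round-trip sketch (illustrative only): converting a `double` to a `duckdb_hugeint` and back. Note that the conversion to `double` is lossy for very large 128-bit values.

```c
#include "duckdb.h"
#include <stdio.h>

int main(void) {
    duckdb_hugeint big = duckdb_double_to_hugeint(1234567890.0);
    double back = duckdb_hugeint_to_double(big);
    printf("round-tripped value: %.1f\n", back);
    return 0;
}
```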
+
+ +### Unsigned Hugeint Helpers + +
double duckdb_uhugeint_to_double(duckdb_uhugeint val);
+duckdb_uhugeint duckdb_double_to_uhugeint(double val);
+
+ +### Decimal Helpers + +
duckdb_decimal duckdb_double_to_decimal(double val, uint8_t width, uint8_t scale);
+double duckdb_decimal_to_double(duckdb_decimal val);
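
Similarly, a small sketch (not from the original reference) for the decimal helpers; the `width` and `scale` arguments describe the target `DECIMAL(width, scale)` type, here `DECIMAL(8, 3)`.

```c
#include "duckdb.h"
#include <stdio.h>

int main(void) {
    // Convert a double into a DECIMAL(8, 3) value and back.
    duckdb_decimal dec = duckdb_double_to_decimal(3.141592, 8, 3);
    double back = duckdb_decimal_to_double(dec);
    printf("decimal round-trip: %.3f\n", back);  // the scale limits the value to three fractional digits
    return 0;
}
```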
+
+ +### Prepared Statements + +
duckdb_state duckdb_prepare(duckdb_connection connection, const char *query, duckdb_prepared_statement *out_prepared_statement);
+void duckdb_destroy_prepare(duckdb_prepared_statement *prepared_statement);
+const char *duckdb_prepare_error(duckdb_prepared_statement prepared_statement);
+idx_t duckdb_nparams(duckdb_prepared_statement prepared_statement);
+const char *duckdb_parameter_name(duckdb_prepared_statement prepared_statement, idx_t index);
+duckdb_type duckdb_param_type(duckdb_prepared_statement prepared_statement, idx_t param_idx);
+duckdb_state duckdb_clear_bindings(duckdb_prepared_statement prepared_statement);
+duckdb_statement_type duckdb_prepared_statement_type(duckdb_prepared_statement statement);
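
The following sketch (added for illustration, assuming an open connection `con` as in the earlier examples) prepares a parameterized query, inspects it, and cleans it up; binding and execution are covered by the next two sections.

```c
#include "duckdb.h"
#include <stdio.h>

// Prepares a parameterized statement and reports how many parameters it expects.
// con: an open connection, as in the Open/Connect example.
int prepare_example(duckdb_connection con) {
    duckdb_prepared_statement stmt;
    if (duckdb_prepare(con, "SELECT ?::INTEGER + ?::INTEGER", &stmt) == DuckDBError) {
        // The error message is owned by the prepared statement object.
        fprintf(stderr, "prepare failed: %s\n", duckdb_prepare_error(stmt));
        duckdb_destroy_prepare(&stmt);
        return 1;
    }
    printf("statement expects %llu parameters\n",
           (unsigned long long) duckdb_nparams(stmt));
    duckdb_destroy_prepare(&stmt);
    return 0;
}
```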
+
+ +### Bind Values to Prepared Statements + +
duckdb_state duckdb_bind_value(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_value val);
+duckdb_state duckdb_bind_parameter_index(duckdb_prepared_statement prepared_statement, idx_t *param_idx_out, const char *name);
+duckdb_state duckdb_bind_boolean(duckdb_prepared_statement prepared_statement, idx_t param_idx, bool val);
+duckdb_state duckdb_bind_int8(duckdb_prepared_statement prepared_statement, idx_t param_idx, int8_t val);
+duckdb_state duckdb_bind_int16(duckdb_prepared_statement prepared_statement, idx_t param_idx, int16_t val);
+duckdb_state duckdb_bind_int32(duckdb_prepared_statement prepared_statement, idx_t param_idx, int32_t val);
+duckdb_state duckdb_bind_int64(duckdb_prepared_statement prepared_statement, idx_t param_idx, int64_t val);
+duckdb_state duckdb_bind_hugeint(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_hugeint val);
+duckdb_state duckdb_bind_uhugeint(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_uhugeint val);
+duckdb_state duckdb_bind_decimal(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_decimal val);
+duckdb_state duckdb_bind_uint8(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint8_t val);
+duckdb_state duckdb_bind_uint16(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint16_t val);
+duckdb_state duckdb_bind_uint32(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint32_t val);
+duckdb_state duckdb_bind_uint64(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint64_t val);
+duckdb_state duckdb_bind_float(duckdb_prepared_statement prepared_statement, idx_t param_idx, float val);
+duckdb_state duckdb_bind_double(duckdb_prepared_statement prepared_statement, idx_t param_idx, double val);
+duckdb_state duckdb_bind_date(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_date val);
+duckdb_state duckdb_bind_time(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_time val);
+duckdb_state duckdb_bind_timestamp(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_timestamp val);
+duckdb_state duckdb_bind_interval(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_interval val);
+duckdb_state duckdb_bind_varchar(duckdb_prepared_statement prepared_statement, idx_t param_idx, const char *val);
+duckdb_state duckdb_bind_varchar_length(duckdb_prepared_statement prepared_statement, idx_t param_idx, const char *val, idx_t length);
+duckdb_state duckdb_bind_blob(duckdb_prepared_statement prepared_statement, idx_t param_idx, const void *data, idx_t length);
+duckdb_state duckdb_bind_null(duckdb_prepared_statement prepared_statement, idx_t param_idx);
+
+ +### Execute Prepared Statements + +
duckdb_state duckdb_execute_prepared(duckdb_prepared_statement prepared_statement, duckdb_result *out_result);
+duckdb_state duckdb_execute_prepared_streaming(duckdb_prepared_statement prepared_statement, duckdb_result *out_result);
+
+ +### Extract Statements + +
idx_t duckdb_extract_statements(duckdb_connection connection, const char *query, duckdb_extracted_statements *out_extracted_statements);
+duckdb_state duckdb_prepare_extracted_statement(duckdb_connection connection, duckdb_extracted_statements extracted_statements, idx_t index, duckdb_prepared_statement *out_prepared_statement);
+const char *duckdb_extract_statements_error(duckdb_extracted_statements extracted_statements);
+void duckdb_destroy_extracted(duckdb_extracted_statements *extracted_statements);
+
+ +### Pending Result Interface + +
duckdb_state duckdb_pending_prepared(duckdb_prepared_statement prepared_statement, duckdb_pending_result *out_result);
+duckdb_state duckdb_pending_prepared_streaming(duckdb_prepared_statement prepared_statement, duckdb_pending_result *out_result);
+void duckdb_destroy_pending(duckdb_pending_result *pending_result);
+const char *duckdb_pending_error(duckdb_pending_result pending_result);
+duckdb_pending_state duckdb_pending_execute_task(duckdb_pending_result pending_result);
+duckdb_pending_state duckdb_pending_execute_check_state(duckdb_pending_result pending_result);
+duckdb_state duckdb_execute_pending(duckdb_pending_result pending_result, duckdb_result *out_result);
+bool duckdb_pending_execution_is_finished(duckdb_pending_state pending_state);
+
+ +### Value Interface + +
void duckdb_destroy_value(duckdb_value *value);
+duckdb_value duckdb_create_varchar(const char *text);
+duckdb_value duckdb_create_varchar_length(const char *text, idx_t length);
+duckdb_value duckdb_create_int64(int64_t val);
+duckdb_value duckdb_create_struct_value(duckdb_logical_type type, duckdb_value *values);
+duckdb_value duckdb_create_list_value(duckdb_logical_type type, duckdb_value *values, idx_t value_count);
+duckdb_value duckdb_create_array_value(duckdb_logical_type type, duckdb_value *values, idx_t value_count);
+char *duckdb_get_varchar(duckdb_value value);
+int64_t duckdb_get_int64(duckdb_value value);
+
+ +### Logical Type Interface + +
duckdb_logical_type duckdb_create_logical_type(duckdb_type type);
+char *duckdb_logical_type_get_alias(duckdb_logical_type type);
+duckdb_logical_type duckdb_create_list_type(duckdb_logical_type type);
+duckdb_logical_type duckdb_create_array_type(duckdb_logical_type type, idx_t array_size);
+duckdb_logical_type duckdb_create_map_type(duckdb_logical_type key_type, duckdb_logical_type value_type);
+duckdb_logical_type duckdb_create_union_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
+duckdb_logical_type duckdb_create_struct_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
+duckdb_logical_type duckdb_create_enum_type(const char **member_names, idx_t member_count);
+duckdb_logical_type duckdb_create_decimal_type(uint8_t width, uint8_t scale);
+duckdb_type duckdb_get_type_id(duckdb_logical_type type);
+uint8_t duckdb_decimal_width(duckdb_logical_type type);
+uint8_t duckdb_decimal_scale(duckdb_logical_type type);
+duckdb_type duckdb_decimal_internal_type(duckdb_logical_type type);
+duckdb_type duckdb_enum_internal_type(duckdb_logical_type type);
+uint32_t duckdb_enum_dictionary_size(duckdb_logical_type type);
+char *duckdb_enum_dictionary_value(duckdb_logical_type type, idx_t index);
+duckdb_logical_type duckdb_list_type_child_type(duckdb_logical_type type);
+duckdb_logical_type duckdb_array_type_child_type(duckdb_logical_type type);
+idx_t duckdb_array_type_array_size(duckdb_logical_type type);
+duckdb_logical_type duckdb_map_type_key_type(duckdb_logical_type type);
+duckdb_logical_type duckdb_map_type_value_type(duckdb_logical_type type);
+idx_t duckdb_struct_type_child_count(duckdb_logical_type type);
+char *duckdb_struct_type_child_name(duckdb_logical_type type, idx_t index);
+duckdb_logical_type duckdb_struct_type_child_type(duckdb_logical_type type, idx_t index);
+idx_t duckdb_union_type_member_count(duckdb_logical_type type);
+char *duckdb_union_type_member_name(duckdb_logical_type type, idx_t index);
+duckdb_logical_type duckdb_union_type_member_type(duckdb_logical_type type, idx_t index);
+void duckdb_destroy_logical_type(duckdb_logical_type *type);
+
+ +### Data Chunk Interface + +
duckdb_data_chunk duckdb_create_data_chunk(duckdb_logical_type *types, idx_t column_count);
+void duckdb_destroy_data_chunk(duckdb_data_chunk *chunk);
+void duckdb_data_chunk_reset(duckdb_data_chunk chunk);
+idx_t duckdb_data_chunk_get_column_count(duckdb_data_chunk chunk);
+duckdb_vector duckdb_data_chunk_get_vector(duckdb_data_chunk chunk, idx_t col_idx);
+idx_t duckdb_data_chunk_get_size(duckdb_data_chunk chunk);
+void duckdb_data_chunk_set_size(duckdb_data_chunk chunk, idx_t size);
+
+ +### Vector Interface + +
duckdb_logical_type duckdb_vector_get_column_type(duckdb_vector vector);
+void *duckdb_vector_get_data(duckdb_vector vector);
+uint64_t *duckdb_vector_get_validity(duckdb_vector vector);
+void duckdb_vector_ensure_validity_writable(duckdb_vector vector);
+void duckdb_vector_assign_string_element(duckdb_vector vector, idx_t index, const char *str);
+void duckdb_vector_assign_string_element_len(duckdb_vector vector, idx_t index, const char *str, idx_t str_len);
+duckdb_vector duckdb_list_vector_get_child(duckdb_vector vector);
+idx_t duckdb_list_vector_get_size(duckdb_vector vector);
+duckdb_state duckdb_list_vector_set_size(duckdb_vector vector, idx_t size);
+duckdb_state duckdb_list_vector_reserve(duckdb_vector vector, idx_t required_capacity);
+duckdb_vector duckdb_struct_vector_get_child(duckdb_vector vector, idx_t index);
+duckdb_vector duckdb_array_vector_get_child(duckdb_vector vector);
+
+ +### Validity Mask Functions + +
bool duckdb_validity_row_is_valid(uint64_t *validity, idx_t row);
+void duckdb_validity_set_row_validity(uint64_t *validity, idx_t row, bool valid);
+void duckdb_validity_set_row_invalid(uint64_t *validity, idx_t row);
+void duckdb_validity_set_row_valid(uint64_t *validity, idx_t row);
+
+ +### Table Functions + +
duckdb_table_function duckdb_create_table_function();
+void duckdb_destroy_table_function(duckdb_table_function *table_function);
+void duckdb_table_function_set_name(duckdb_table_function table_function, const char *name);
+void duckdb_table_function_add_parameter(duckdb_table_function table_function, duckdb_logical_type type);
+void duckdb_table_function_add_named_parameter(duckdb_table_function table_function, const char *name, duckdb_logical_type type);
+void duckdb_table_function_set_extra_info(duckdb_table_function table_function, void *extra_info, duckdb_delete_callback_t destroy);
+void duckdb_table_function_set_bind(duckdb_table_function table_function, duckdb_table_function_bind_t bind);
+void duckdb_table_function_set_init(duckdb_table_function table_function, duckdb_table_function_init_t init);
+void duckdb_table_function_set_local_init(duckdb_table_function table_function, duckdb_table_function_init_t init);
+void duckdb_table_function_set_function(duckdb_table_function table_function, duckdb_table_function_t function);
+void duckdb_table_function_supports_projection_pushdown(duckdb_table_function table_function, bool pushdown);
+duckdb_state duckdb_register_table_function(duckdb_connection con, duckdb_table_function function);
+
+ +### Table Function Bind + +
void *duckdb_bind_get_extra_info(duckdb_bind_info info);
+void duckdb_bind_add_result_column(duckdb_bind_info info, const char *name, duckdb_logical_type type);
+idx_t duckdb_bind_get_parameter_count(duckdb_bind_info info);
+duckdb_value duckdb_bind_get_parameter(duckdb_bind_info info, idx_t index);
+duckdb_value duckdb_bind_get_named_parameter(duckdb_bind_info info, const char *name);
+void duckdb_bind_set_bind_data(duckdb_bind_info info, void *bind_data, duckdb_delete_callback_t destroy);
+void duckdb_bind_set_cardinality(duckdb_bind_info info, idx_t cardinality, bool is_exact);
+void duckdb_bind_set_error(duckdb_bind_info info, const char *error);
+
+ +### Table Function Init + +
void *duckdb_init_get_extra_info(duckdb_init_info info);
+void *duckdb_init_get_bind_data(duckdb_init_info info);
+void duckdb_init_set_init_data(duckdb_init_info info, void *init_data, duckdb_delete_callback_t destroy);
+idx_t duckdb_init_get_column_count(duckdb_init_info info);
+idx_t duckdb_init_get_column_index(duckdb_init_info info, idx_t column_index);
+void duckdb_init_set_max_threads(duckdb_init_info info, idx_t max_threads);
+void duckdb_init_set_error(duckdb_init_info info, const char *error);
+
+ +### Table Function + +
void *duckdb_function_get_extra_info(duckdb_function_info info);
+void *duckdb_function_get_bind_data(duckdb_function_info info);
+void *duckdb_function_get_init_data(duckdb_function_info info);
+void *duckdb_function_get_local_init_data(duckdb_function_info info);
+void duckdb_function_set_error(duckdb_function_info info, const char *error);
+
+ +### Replacement Scans + +
void duckdb_add_replacement_scan(duckdb_database db, duckdb_replacement_callback_t replacement, void *extra_data, duckdb_delete_callback_t delete_callback);
+void duckdb_replacement_scan_set_function_name(duckdb_replacement_scan_info info, const char *function_name);
+void duckdb_replacement_scan_add_parameter(duckdb_replacement_scan_info info, duckdb_value parameter);
+void duckdb_replacement_scan_set_error(duckdb_replacement_scan_info info, const char *error);
+
+ +### Appender + +
duckdb_state duckdb_appender_create(duckdb_connection connection, const char *schema, const char *table, duckdb_appender *out_appender);
+idx_t duckdb_appender_column_count(duckdb_appender appender);
+duckdb_logical_type duckdb_appender_column_type(duckdb_appender appender, idx_t col_idx);
+const char *duckdb_appender_error(duckdb_appender appender);
+duckdb_state duckdb_appender_flush(duckdb_appender appender);
+duckdb_state duckdb_appender_close(duckdb_appender appender);
+duckdb_state duckdb_appender_destroy(duckdb_appender *appender);
+duckdb_state duckdb_appender_begin_row(duckdb_appender appender);
+duckdb_state duckdb_appender_end_row(duckdb_appender appender);
+duckdb_state duckdb_append_bool(duckdb_appender appender, bool value);
+duckdb_state duckdb_append_int8(duckdb_appender appender, int8_t value);
+duckdb_state duckdb_append_int16(duckdb_appender appender, int16_t value);
+duckdb_state duckdb_append_int32(duckdb_appender appender, int32_t value);
+duckdb_state duckdb_append_int64(duckdb_appender appender, int64_t value);
+duckdb_state duckdb_append_hugeint(duckdb_appender appender, duckdb_hugeint value);
+duckdb_state duckdb_append_uint8(duckdb_appender appender, uint8_t value);
+duckdb_state duckdb_append_uint16(duckdb_appender appender, uint16_t value);
+duckdb_state duckdb_append_uint32(duckdb_appender appender, uint32_t value);
+duckdb_state duckdb_append_uint64(duckdb_appender appender, uint64_t value);
+duckdb_state duckdb_append_uhugeint(duckdb_appender appender, duckdb_uhugeint value);
+duckdb_state duckdb_append_float(duckdb_appender appender, float value);
+duckdb_state duckdb_append_double(duckdb_appender appender, double value);
+duckdb_state duckdb_append_date(duckdb_appender appender, duckdb_date value);
+duckdb_state duckdb_append_time(duckdb_appender appender, duckdb_time value);
+duckdb_state duckdb_append_timestamp(duckdb_appender appender, duckdb_timestamp value);
+duckdb_state duckdb_append_interval(duckdb_appender appender, duckdb_interval value);
+duckdb_state duckdb_append_varchar(duckdb_appender appender, const char *val);
+duckdb_state duckdb_append_varchar_length(duckdb_appender appender, const char *val, idx_t length);
+duckdb_state duckdb_append_blob(duckdb_appender appender, const void *data, idx_t length);
+duckdb_state duckdb_append_null(duckdb_appender appender);
+duckdb_state duckdb_append_data_chunk(duckdb_appender appender, duckdb_data_chunk chunk);
+
+ +### Arrow Interface + +
duckdb_state duckdb_query_arrow(duckdb_connection connection, const char *query, duckdb_arrow *out_result);
+duckdb_state duckdb_query_arrow_schema(duckdb_arrow result, duckdb_arrow_schema *out_schema);
+duckdb_state duckdb_prepared_arrow_schema(duckdb_prepared_statement prepared, duckdb_arrow_schema *out_schema);
+void duckdb_result_arrow_array(duckdb_result result, duckdb_data_chunk chunk, duckdb_arrow_array *out_array);
+duckdb_state duckdb_query_arrow_array(duckdb_arrow result, duckdb_arrow_array *out_array);
+idx_t duckdb_arrow_column_count(duckdb_arrow result);
+idx_t duckdb_arrow_row_count(duckdb_arrow result);
+idx_t duckdb_arrow_rows_changed(duckdb_arrow result);
+const char *duckdb_query_arrow_error(duckdb_arrow result);
+void duckdb_destroy_arrow(duckdb_arrow *result);
+void duckdb_destroy_arrow_stream(duckdb_arrow_stream *stream_p);
+duckdb_state duckdb_execute_prepared_arrow(duckdb_prepared_statement prepared_statement, duckdb_arrow *out_result);
+duckdb_state duckdb_arrow_scan(duckdb_connection connection, const char *table_name, duckdb_arrow_stream arrow);
+duckdb_state duckdb_arrow_array_scan(duckdb_connection connection, const char *table_name, duckdb_arrow_schema arrow_schema, duckdb_arrow_array arrow_array, duckdb_arrow_stream *out_stream);
+
+ +### Threading Information + +
void duckdb_execute_tasks(duckdb_database database, idx_t max_tasks);
+duckdb_task_state duckdb_create_task_state(duckdb_database database);
+void duckdb_execute_tasks_state(duckdb_task_state state);
+idx_t duckdb_execute_n_tasks_state(duckdb_task_state state, idx_t max_tasks);
+void duckdb_finish_execution(duckdb_task_state state);
+bool duckdb_task_state_is_finished(duckdb_task_state state);
+void duckdb_destroy_task_state(duckdb_task_state state);
+bool duckdb_execution_is_finished(duckdb_connection con);
+
+ +### Streaming Result Interface + +
duckdb_data_chunk duckdb_stream_fetch_chunk(duckdb_result result);
+duckdb_data_chunk duckdb_fetch_chunk(duckdb_result result);
+
+ +## Detailed API Reference + +#### `duckdb_open` + +Creates a new database or opens an existing database file stored at the given path. +If no path is given, a new in-memory database is created instead. +The instantiated database should be closed with 'duckdb_close'. + +##### Syntax + +
duckdb_state duckdb_open(
+  const char *path,
+  duckdb_database *out_database
+);
+
+ +##### Parameters + +* `path` + +Path to the database file on disk, or `nullptr` or `:memory:` to open an in-memory database. +* `out_database` + +The result database object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
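+As a minimal sketch (error handling elided; assumes `duckdb.h` is included), an in-memory database can be opened and closed as follows:
+
+```c
+duckdb_database db;
+// passing NULL (or ":memory:") opens an in-memory database
+if (duckdb_open(NULL, &db) == DuckDBError) {
+    // handle the start-up failure
+}
+// ... use the database ...
+duckdb_close(&db);
+```
+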
+ +#### `duckdb_open_ext` + +Extended version of duckdb_open. Creates a new database or opens an existing database file stored at the given path. +The instantiated database should be closed with 'duckdb_close'. + +##### Syntax + +
duckdb_state duckdb_open_ext(
+  const char *path,
+  duckdb_database *out_database,
+  duckdb_config config,
+  char **out_error
+);
+
+ +##### Parameters + +* `path` + +Path to the database file on disk, or `nullptr` or `:memory:` to open an in-memory database. +* `out_database` + +The result database object. +* `config` + +(Optional) configuration used to start up the database system. +* `out_error` + +If set and the function returns DuckDBError, this will contain the reason why the start-up failed. +Note that the error must be freed using `duckdb_free`. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_close` + +Closes the specified database and de-allocates all memory allocated for that database. +This should be called after you are done with any database allocated through `duckdb_open` or `duckdb_open_ext`. +Note that failing to call `duckdb_close` (in case of e.g., a program crash) will not cause data corruption. +Still, it is recommended to always correctly close a database object after you are done with it. + +##### Syntax + +
void duckdb_close(
+  duckdb_database *database
+);
+
+ +##### Parameters + +* `database` + +The database object to shut down. + +
+ +#### `duckdb_connect` + +Opens a connection to a database. Connections are required to query the database, and store transactional state +associated with the connection. +The instantiated connection should be closed using 'duckdb_disconnect'. + +##### Syntax + +
duckdb_state duckdb_connect(
+  duckdb_database database,
+  duckdb_connection *out_connection
+);
+
+ +##### Parameters + +* `database` + +The database file to connect to. +* `out_connection` + +The result connection object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
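+The typical lifecycle, sketched below under the assumption that `duckdb.h` is included and errors are only checked, is to open a database, connect to it, and tear both down in reverse order:
+
+```c
+duckdb_database db;
+duckdb_connection con;
+if (duckdb_open(NULL, &db) == DuckDBError) {
+    // handle error
+}
+if (duckdb_connect(db, &con) == DuckDBError) {
+    // handle error
+}
+// ... issue queries on con ...
+duckdb_disconnect(&con);
+duckdb_close(&db);
+```
+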
+ +#### `duckdb_interrupt` + +Interrupts the running query. + +##### Syntax + +
void duckdb_interrupt(
+  duckdb_connection connection
+);
+
+ +##### Parameters + +* `connection` + +The connection to interrupt. + +
+ +#### `duckdb_query_progress` + +Gets the progress of the running query. + +##### Syntax + +
duckdb_query_progress_type duckdb_query_progress(
+  duckdb_connection connection
+);
+
+ +##### Parameters + +* `connection` + +The working connection. +* `returns` + +-1 if no progress is available, otherwise the progress of the query as a percentage. + +
+ +#### `duckdb_disconnect` + +Closes the specified connection and de-allocates all memory allocated for that connection. + +##### Syntax + +
void duckdb_disconnect(
+  duckdb_connection *connection
+);
+
+ +##### Parameters + +* `connection` + +The connection to close. + +
+ +#### `duckdb_library_version` + +Returns the version of the linked DuckDB, with a version postfix for dev versions. + +This is usually used for developing C extensions that must return this version for a compatibility check. + +##### Syntax + +
const char *duckdb_library_version(
+  
+);
+
+
+ +#### `duckdb_create_config` + +Initializes an empty configuration object that can be used to provide start-up options for the DuckDB instance +through `duckdb_open_ext`. +The duckdb_config must be destroyed using 'duckdb_destroy_config'. + +This will always succeed unless there is a malloc failure. + +##### Syntax + +
duckdb_state duckdb_create_config(
+  duckdb_config *out_config
+);
+
+ +##### Parameters + +* `out_config` + +The result configuration object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_config_count` + +This returns the total number of configuration options available for use with `duckdb_get_config_flag`. + +This should not be called in a loop as it internally loops over all the options. + +##### Syntax + +
size_t duckdb_config_count(
+  
+);
+
+ +##### Parameters + +* `returns` + +The number of configuration options available. + +
+ +#### `duckdb_get_config_flag` + +Obtains a human-readable name and description of a specific configuration option. This can be used to e.g. +display configuration options. This will succeed unless `index` is out of range (i.e., `>= duckdb_config_count`). + +The result name or description MUST NOT be freed. + +##### Syntax + +
duckdb_state duckdb_get_config_flag(
+  size_t index,
+  const char **out_name,
+  const char **out_description
+);
+
+ +##### Parameters + +* `index` + +The index of the configuration option (between 0 and `duckdb_config_count`). +* `out_name` + +The name of the configuration flag. +* `out_description` + +A description of the configuration flag. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
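+A sketch of listing every available option (assumes `<stdio.h>` is included); note that `duckdb_config_count` is called once, outside the loop:
+
+```c
+size_t option_count = duckdb_config_count();
+for (size_t i = 0; i < option_count; i++) {
+    const char *name = NULL;
+    const char *description = NULL;
+    if (duckdb_get_config_flag(i, &name, &description) == DuckDBSuccess) {
+        printf("%s: %s\n", name, description);
+    }
+}
+```
+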
+ +#### `duckdb_set_config` + +Sets the specified option for the specified configuration. The configuration option is indicated by name. +To obtain a list of config options, see `duckdb_get_config_flag`. + +In the source code, configuration options are defined in `config.cpp`. + +This can fail if either the name is invalid, or if the value provided for the option is invalid. + +##### Syntax + +
duckdb_state duckdb_set_config(
+  duckdb_config config,
+  const char *name,
+  const char *option
+);
+
+ +##### Parameters + +* `config` + +The configuration object to set the option on. +* `name` + +The name of the configuration flag to set. +* `option` + +The value to set the configuration flag to. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_destroy_config` + +Destroys the specified configuration object and de-allocates all memory allocated for the object. + +##### Syntax + +
void duckdb_destroy_config(
+  duckdb_config *config
+);
+
+ +##### Parameters + +* `config` + +The configuration object to destroy. + +
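+Putting the configuration functions together, a sketch of starting a database with an option set (the path `my_database.duckdb` and the `access_mode` option are only examples; assumes `<stdio.h>` is included):
+
+```c
+duckdb_config config;
+duckdb_database db;
+char *open_error = NULL;
+if (duckdb_create_config(&config) == DuckDBError) {
+    // handle malloc failure
+}
+duckdb_set_config(config, "access_mode", "READ_ONLY"); // example option
+if (duckdb_open_ext("my_database.duckdb", &db, config, &open_error) == DuckDBError) {
+    fprintf(stderr, "failed to open database: %s\n", open_error);
+    duckdb_free(open_error);
+}
+duckdb_destroy_config(&config);
+```
+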
+ +#### `duckdb_query` + +Executes a SQL query within a connection and stores the full (materialized) result in the out_result pointer. +If the query fails to execute, DuckDBError is returned and the error message can be retrieved by calling +`duckdb_result_error`. + +Note that after running `duckdb_query`, `duckdb_destroy_result` must be called on the result object even if the +query fails, otherwise the error stored within the result will not be freed correctly. + +##### Syntax + +
duckdb_state duckdb_query(
+  duckdb_connection connection,
+  const char *query,
+  duckdb_result *out_result
+);
+
+ +##### Parameters + +* `connection` + +The connection to perform the query in. +* `query` + +The SQL query to run. +* `out_result` + +The query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
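+A sketch of running a query on an existing connection `con` (an assumption here), destroying the result even on failure:
+
+```c
+duckdb_result result;
+if (duckdb_query(con, "SELECT 42 AS answer", &result) == DuckDBError) {
+    fprintf(stderr, "query failed: %s\n", duckdb_result_error(&result));
+}
+duckdb_destroy_result(&result);
+```
+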
+ +#### `duckdb_destroy_result` + +Closes the result and de-allocates all memory allocated for that result. + +##### Syntax + +
void duckdb_destroy_result(
+  duckdb_result *result
+);
+
+ +##### Parameters + +* `result` + +The result to destroy. + +
+ +#### `duckdb_column_name` + +Returns the column name of the specified column. The result should not need to be freed; the column names will +automatically be destroyed when the result is destroyed. + +Returns `NULL` if the column is out of range. + +##### Syntax + +
const char *duckdb_column_name(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the column name from. +* `col` + +The column index. +* `returns` + +The column name of the specified column. + +
+ +#### `duckdb_column_type` + +Returns the column type of the specified column. + +Returns `DUCKDB_TYPE_INVALID` if the column is out of range. + +##### Syntax + +
duckdb_type duckdb_column_type(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the column type from. +* `col` + +The column index. +* `returns` + +The column type of the specified column. + +
+ +#### `duckdb_result_statement_type` + +Returns the statement type of the statement that was executed + +##### Syntax + +
duckdb_statement_type duckdb_result_statement_type(
+  duckdb_result result
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the statement type from. +* `returns` + +duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID + +
+ +#### `duckdb_column_logical_type` + +Returns the logical column type of the specified column. + +The return type of this call should be destroyed with `duckdb_destroy_logical_type`. + +Returns `NULL` if the column is out of range. + +##### Syntax + +
duckdb_logical_type duckdb_column_logical_type(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the column type from. +* `col` + +The column index. +* `returns` + +The logical column type of the specified column. + +
+ +#### `duckdb_column_count` + +Returns the number of columns present in the result object. + +##### Syntax + +
idx_t duckdb_column_count(
+  duckdb_result *result
+);
+
+ +##### Parameters + +* `result` + +The result object. +* `returns` + +The number of columns present in the result object. + +
+ +#### `duckdb_row_count` + +> Deprecated This method is scheduled for removal in a future release. + +Returns the number of rows present in the result object. + +##### Syntax + +
idx_t duckdb_row_count(
+  duckdb_result *result
+);
+
+ +##### Parameters + +* `result` + +The result object. +* `returns` + +The number of rows present in the result object. + +
+ +#### `duckdb_rows_changed` + +Returns the number of rows changed by the query stored in the result. This is relevant only for INSERT/UPDATE/DELETE +queries. For other queries the rows_changed will be 0. + +##### Syntax + +
idx_t duckdb_rows_changed(
+  duckdb_result *result
+);
+
+ +##### Parameters + +* `result` + +The result object. +* `returns` + +The number of rows changed. + +
+ +#### `duckdb_column_data` + +> Deprecated This method is deprecated. Prefer using `duckdb_result_get_chunk` instead. + +Returns the data of a specific column of a result in columnar format. + +The function returns a dense array which contains the result data. The exact type stored in the array depends on the +corresponding duckdb_type (as provided by `duckdb_column_type`). For the exact type by which the data should be +accessed, see the comments in the [Types section]({% link docs/archive/1.0/api/c/types.md %}) or the `DUCKDB_TYPE` enum. + +For example, for a column of type `DUCKDB_TYPE_INTEGER`, rows can be accessed in the following manner: + +```c +int32_t *data = (int32_t *) duckdb_column_data(&result, 0); +printf("Data for row %d: %d\n", row, data[row]); +``` + +##### Syntax + +
void *duckdb_column_data(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the column data from. +* `col` + +The column index. +* `returns` + +The column data of the specified column. + +
+ +#### `duckdb_nullmask_data` + +> Deprecated This method is deprecated. Prefer using `duckdb_result_get_chunk` instead. + +Returns the nullmask of a specific column of a result in columnar format. The nullmask indicates for every row +whether or not the corresponding row is `NULL`. If a row is `NULL`, the values present in the array provided +by `duckdb_column_data` are undefined. + +```c +int32_t *data = (int32_t *) duckdb_column_data(&result, 0); +bool *nullmask = duckdb_nullmask_data(&result, 0); +if (nullmask[row]) { + printf("Data for row %d: NULL\n", row); +} else { + printf("Data for row %d: %d\n", row, data[row]); +} +``` + +##### Syntax + +
bool *duckdb_nullmask_data(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the nullmask from. +* `col` + +The column index. +* `returns` + +The nullmask of the specified column. + +
+ +#### `duckdb_result_error` + +Returns the error message contained within the result. The error is only set if `duckdb_query` returns `DuckDBError`. + +The result of this function must not be freed. It will be cleaned up when `duckdb_destroy_result` is called. + +##### Syntax + +
const char *duckdb_result_error(
+  duckdb_result *result
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the error from. +* `returns` + +The error of the result. + +
+ +#### `duckdb_result_get_chunk` + +> Deprecated This method is scheduled for removal in a future release. + +Fetches a data chunk from the duckdb_result. This function should be called repeatedly until the result is exhausted. + +The result must be destroyed with `duckdb_destroy_data_chunk`. + +This function supersedes all `duckdb_value` functions, as well as the `duckdb_column_data` and `duckdb_nullmask_data` +functions. It results in significantly better performance, and should be preferred in newer code-bases. + +If this function is used, none of the other result functions can be used and vice versa (i.e., this function cannot be +mixed with the legacy result functions). + +Use `duckdb_result_chunk_count` to figure out how many chunks there are in the result. + +##### Syntax + +
duckdb_data_chunk duckdb_result_get_chunk(
+  duckdb_result result,
+  idx_t chunk_index
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the data chunk from. +* `chunk_index` + +The chunk index to fetch from. +* `returns` + +The resulting data chunk. Returns `NULL` if the chunk index is out of bounds. + +
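+A sketch of iterating over a materialized `result` (assumed to come from `duckdb_query`) chunk by chunk:
+
+```c
+idx_t chunk_count = duckdb_result_chunk_count(result);
+for (idx_t chunk_idx = 0; chunk_idx < chunk_count; chunk_idx++) {
+    duckdb_data_chunk chunk = duckdb_result_get_chunk(result, chunk_idx);
+    idx_t row_count = duckdb_data_chunk_get_size(chunk); // rows in this chunk
+    // ... read the chunk's vectors via duckdb_data_chunk_get_vector here ...
+    duckdb_destroy_data_chunk(&chunk);
+}
+```
+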
+ +#### `duckdb_result_is_streaming` + +> Deprecated This method is scheduled for removal in a future release. + +Checks if the type of the internal result is StreamQueryResult. + +##### Syntax + +
bool duckdb_result_is_streaming(
+  duckdb_result result
+);
+
+ +##### Parameters + +* `result` + +The result object to check. +* `returns` + +Whether or not the result object is of the type StreamQueryResult + +
+ +#### `duckdb_result_chunk_count` + +> Deprecated This method is scheduled for removal in a future release. + +Returns the number of data chunks present in the result. + +##### Syntax + +
idx_t duckdb_result_chunk_count(
+  duckdb_result result
+);
+
+ +##### Parameters + +* `result` + +The result object +* `returns` + +Number of data chunks present in the result. + +
+ +#### `duckdb_result_return_type` + +Returns the return_type of the given result, or DUCKDB_RETURN_TYPE_INVALID on error + +##### Syntax + +
duckdb_result_type duckdb_result_return_type(
+  duckdb_result result
+);
+
+ +##### Parameters + +* `result` + +The result object +* `returns` + +The return_type + +
+ +#### `duckdb_value_boolean` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
bool duckdb_value_boolean(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The boolean value at the specified location, or false if the value cannot be converted. + +
+ +#### `duckdb_value_int8` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
int8_t duckdb_value_int8(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The int8_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_int16` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
int16_t duckdb_value_int16(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The int16_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_int32` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
int32_t duckdb_value_int32(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The int32_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_int64` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
int64_t duckdb_value_int64(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The int64_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_hugeint` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_hugeint duckdb_value_hugeint(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_hugeint value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_uhugeint` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_uhugeint duckdb_value_uhugeint(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_uhugeint value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_decimal` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_decimal duckdb_value_decimal(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_decimal value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_uint8` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
uint8_t duckdb_value_uint8(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The uint8_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_uint16` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
uint16_t duckdb_value_uint16(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The uint16_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_uint32` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
uint32_t duckdb_value_uint32(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The uint32_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_uint64` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
uint64_t duckdb_value_uint64(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The uint64_t value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_float` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
float duckdb_value_float(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The float value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_double` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
double duckdb_value_double(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The double value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_date` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_date duckdb_value_date(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_date value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_time` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_time duckdb_value_time(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_time value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_timestamp` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_timestamp duckdb_value_timestamp(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_timestamp value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_interval` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_interval duckdb_value_interval(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_interval value at the specified location, or 0 if the value cannot be converted. + +
+ +#### `duckdb_value_varchar` + + +##### Syntax + +
char *duckdb_value_varchar(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `DEPRECATED` + +use duckdb_value_string instead. This function does not work correctly if the string contains null bytes. +* `returns` + +The text value at the specified location as a null-terminated string, or nullptr if the value cannot be +converted. The result must be freed with `duckdb_free`. + +
+ +#### `duckdb_value_string` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_string duckdb_value_string(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The string value at the specified location. Attempts to cast the result value to string. +* Nested types and other complex types are not supported. +* The resulting field "string.data" must be freed with `duckdb_free`. + +
+ +#### `duckdb_value_varchar_internal` + + +##### Syntax + +
char *duckdb_value_varchar_internal(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `DEPRECATED` + +use duckdb_value_string_internal instead. This function does not work correctly if the string contains +null bytes. +* `returns` + +The char* value at the specified location. ONLY works on VARCHAR columns and does not auto-cast. +If the column is NOT a VARCHAR column this function will return NULL. + +The result must NOT be freed. + +
+ +#### `duckdb_value_string_internal` + + +##### Syntax + +
duckdb_string duckdb_value_string_internal(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `DEPRECATED` + +use duckdb_value_string_internal instead. This function does not work correctly if the string contains +null bytes. +* `returns` + +The char* value at the specified location. ONLY works on VARCHAR columns and does not auto-cast. +If the column is NOT a VARCHAR column this function will return NULL. + +The result must NOT be freed. + +
+ +#### `duckdb_value_blob` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
duckdb_blob duckdb_value_blob(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +The duckdb_blob value at the specified location. Returns a blob with blob.data set to nullptr if the +value cannot be converted. The resulting field "blob.data" must be freed with `duckdb_free`. + +
+ +#### `duckdb_value_is_null` + +> Deprecated This method is scheduled for removal in a future release. + +##### Syntax + +
bool duckdb_value_is_null(
+  duckdb_result *result,
+  idx_t col,
+  idx_t row
+);
+
+ +##### Parameters + +* `returns` + +Returns true if the value at the specified index is NULL, and false otherwise. + +
+ +#### `duckdb_malloc` + +Allocate `size` bytes of memory using the duckdb internal malloc function. Any memory allocated in this manner +should be freed using `duckdb_free`. + +##### Syntax + +
void *duckdb_malloc(
+  size_t size
+);
+
+ +##### Parameters + +* `size` + +The number of bytes to allocate. +* `returns` + +A pointer to the allocated memory region. + +
+ +#### `duckdb_free` + +Free a value returned from `duckdb_malloc`, `duckdb_value_varchar`, `duckdb_value_blob`, or +`duckdb_value_string`. + +##### Syntax + +
void duckdb_free(
+  void *ptr
+);
+
+ +##### Parameters + +* `ptr` + +The memory region to de-allocate. + +
+ +#### `duckdb_vector_size` + +The internal vector size used by DuckDB. +This is the amount of tuples that will fit into a data chunk created by `duckdb_create_data_chunk`. + +##### Syntax + +
idx_t duckdb_vector_size(
+  
+);
+
+ +##### Parameters + +* `returns` + +The vector size. + +
+ +#### `duckdb_string_is_inlined` + +Whether or not the duckdb_string_t value is inlined. +This means that the data of the string does not have a separate allocation. + +##### Syntax + +
bool duckdb_string_is_inlined(
+  duckdb_string_t string
+);
+
+
+ +#### `duckdb_from_date` + +Decompose a `duckdb_date` object into year, month and date (stored as `duckdb_date_struct`). + +##### Syntax + +
duckdb_date_struct duckdb_from_date(
+  duckdb_date date
+);
+
+ +##### Parameters + +* `date` + +The date object, as obtained from a `DUCKDB_TYPE_DATE` column. +* `returns` + +The `duckdb_date_struct` with the decomposed elements. + +
+ +#### `duckdb_to_date` + +Re-compose a `duckdb_date` from year, month and date (`duckdb_date_struct`). + +##### Syntax + +
duckdb_date duckdb_to_date(
+  duckdb_date_struct date
+);
+
+ +##### Parameters + +* `date` + +The year, month and date stored in a `duckdb_date_struct`. +* `returns` + +The `duckdb_date` element. + +
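+A sketch of a round trip through the date helpers, assuming `my_date` was read from a `DUCKDB_TYPE_DATE` column and `<stdio.h>` is included:
+
+```c
+duckdb_date_struct parts = duckdb_from_date(my_date);
+printf("%d-%d-%d\n", parts.year, parts.month, parts.day);
+duckdb_date recomposed = duckdb_to_date(parts); // recomposes the same date
+```
+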
+ +#### `duckdb_is_finite_date` + +Test a `duckdb_date` to see if it is a finite value. + +##### Syntax + +
bool duckdb_is_finite_date(
+  duckdb_date date
+);
+
+ +##### Parameters + +* `date` + +The date object, as obtained from a `DUCKDB_TYPE_DATE` column. +* `returns` + +True if the date is finite, false if it is ±infinity. + +
+ +#### `duckdb_from_time` + +Decompose a `duckdb_time` object into hour, minute, second and microsecond (stored as `duckdb_time_struct`). + +##### Syntax + +
duckdb_time_struct duckdb_from_time(
+  duckdb_time time
+);
+
+ +##### Parameters + +* `time` + +The time object, as obtained from a `DUCKDB_TYPE_TIME` column. +* `returns` + +The `duckdb_time_struct` with the decomposed elements. + +
+ +#### `duckdb_create_time_tz` + +Create a `duckdb_time_tz` object from micros and a timezone offset. + +##### Syntax + +
duckdb_time_tz duckdb_create_time_tz(
+  int64_t micros,
+  int32_t offset
+);
+
+ +##### Parameters + +* `micros` + +The microsecond component of the time. +* `offset` + +The timezone offset component of the time. +* `returns` + +The `duckdb_time_tz` element. + +
+ +#### `duckdb_from_time_tz` + +Decompose a TIME_TZ object into micros and a timezone offset. + +Use `duckdb_from_time` to further decompose the micros into hour, minute, second and microsecond. + +##### Syntax + +
duckdb_time_tz_struct duckdb_from_time_tz(
+  duckdb_time_tz micros
+);
+
+ +##### Parameters + +* `micros` + +The time object, as obtained from a `DUCKDB_TYPE_TIME_TZ` column. +* `returns` + +The `duckdb_time_tz_struct` with the decomposed elements. + +
+ +#### `duckdb_to_time` + +Re-compose a `duckdb_time` from hour, minute, second and microsecond (`duckdb_time_struct`). + +##### Syntax + +
duckdb_time duckdb_to_time(
+  duckdb_time_struct time
+);
+
+ +##### Parameters + +* `time` + +The hour, minute, second and microsecond in a `duckdb_time_struct`. +* `returns` + +The `duckdb_time` element. + +
+ +#### `duckdb_from_timestamp` + +Decompose a `duckdb_timestamp` object into a `duckdb_timestamp_struct`. + +##### Syntax + +
duckdb_timestamp_struct duckdb_from_timestamp(
+  duckdb_timestamp ts
+);
+
+ +##### Parameters + +* `ts` + +The ts object, as obtained from a `DUCKDB_TYPE_TIMESTAMP` column. +* `returns` + +The `duckdb_timestamp_struct` with the decomposed elements. + +
+ +#### `duckdb_to_timestamp` + +Re-compose a `duckdb_timestamp` from a duckdb_timestamp_struct. + +##### Syntax + +
duckdb_timestamp duckdb_to_timestamp(
+  duckdb_timestamp_struct ts
+);
+
+ +##### Parameters + +* `ts` + +The de-composed elements in a `duckdb_timestamp_struct`. +* `returns` + +The `duckdb_timestamp` element. + +
+ +#### `duckdb_is_finite_timestamp` + +Test a `duckdb_timestamp` to see if it is a finite value. + +##### Syntax + +
bool duckdb_is_finite_timestamp(
+  duckdb_timestamp ts
+);
+
+ +##### Parameters + +* `ts` + +The timestamp object, as obtained from a `DUCKDB_TYPE_TIMESTAMP` column. +* `returns` + +True if the timestamp is finite, false if it is ±infinity. + +
+ +#### `duckdb_hugeint_to_double` + +Converts a duckdb_hugeint object (as obtained from a `DUCKDB_TYPE_HUGEINT` column) into a double. + +##### Syntax + +
double duckdb_hugeint_to_double(
+  duckdb_hugeint val
+);
+
+ +##### Parameters + +* `val` + +The hugeint value. +* `returns` + +The converted `double` element. + +
+ +#### `duckdb_double_to_hugeint` + +Converts a double value to a duckdb_hugeint object. + +If the conversion fails because the double value is too big, the result will be 0. + +##### Syntax + +
duckdb_hugeint duckdb_double_to_hugeint(
+  double val
+);
+
+ +##### Parameters + +* `val` + +The double value. +* `returns` + +The converted `duckdb_hugeint` element. + +
+ +#### `duckdb_uhugeint_to_double` + +Converts a duckdb_uhugeint object (as obtained from a `DUCKDB_TYPE_UHUGEINT` column) into a double. + +##### Syntax + +
double duckdb_uhugeint_to_double(
+  duckdb_uhugeint val
+);
+
+ +##### Parameters + +* `val` + +The uhugeint value. +* `returns` + +The converted `double` element. + +
+ +#### `duckdb_double_to_uhugeint` + +Converts a double value to a duckdb_uhugeint object. + +If the conversion fails because the double value is too big, the result will be 0. + +##### Syntax + +
duckdb_uhugeint duckdb_double_to_uhugeint(
+  double val
+);
+
+ +##### Parameters + +* `val` + +The double value. +* `returns` + +The converted `duckdb_uhugeint` element. + +
+ +#### `duckdb_double_to_decimal` + +Converts a double value to a duckdb_decimal object. + +If the conversion fails because the double value is too big, or the width/scale are invalid, the result will be 0. + +##### Syntax + +
duckdb_decimal duckdb_double_to_decimal(
+  double val,
+  uint8_t width,
+  uint8_t scale
+);
+
+ +##### Parameters + +* `val` + +The double value. +* `returns` + +The converted `duckdb_decimal` element. + +
+ +#### `duckdb_decimal_to_double` + +Converts a duckdb_decimal object (as obtained from a `DUCKDB_TYPE_DECIMAL` column) into a double. + +##### Syntax + +
double duckdb_decimal_to_double(
+  duckdb_decimal val
+);
+
+ +##### Parameters + +* `val` + +The decimal value. +* `returns` + +The converted `double` element. + +
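+A sketch of converting a double into a `DECIMAL(18, 3)` value and back (the width and scale are arbitrary examples):
+
+```c
+duckdb_decimal dec = duckdb_double_to_decimal(123.456, 18, 3);
+double approx = duckdb_decimal_to_double(dec); // approximately 123.456 again
+```
+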
+ +#### `duckdb_prepare` + +Create a prepared statement object from a query. + +Note that after calling `duckdb_prepare`, the prepared statement should always be destroyed using +`duckdb_destroy_prepare`, even if the prepare fails. + +If the prepare fails, `duckdb_prepare_error` can be called to obtain the reason why the prepare failed. + +##### Syntax + +
duckdb_state duckdb_prepare(
+  duckdb_connection connection,
+  const char *query,
+  duckdb_prepared_statement *out_prepared_statement
+);
+
+ +##### Parameters + +* `connection` + +The connection object +* `query` + +The SQL query to prepare +* `out_prepared_statement` + +The resulting prepared statement object +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
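+A sketch of the full prepared statement flow on an existing connection `con` (the table `my_table` is an example; assumes `<stdio.h>` is included):
+
+```c
+duckdb_prepared_statement stmt;
+duckdb_result result;
+if (duckdb_prepare(con, "SELECT * FROM my_table WHERE i = ?", &stmt) == DuckDBError) {
+    fprintf(stderr, "prepare failed: %s\n", duckdb_prepare_error(stmt));
+}
+duckdb_bind_int32(stmt, 1, 42); // parameter indices start at 1
+duckdb_execute_prepared(stmt, &result);
+// ... consume the result ...
+duckdb_destroy_result(&result);
+duckdb_destroy_prepare(&stmt);
+```
+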
+ +#### `duckdb_destroy_prepare` + +Closes the prepared statement and de-allocates all memory allocated for the statement. + +##### Syntax + +
void duckdb_destroy_prepare(
+  duckdb_prepared_statement *prepared_statement
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to destroy. + +
+ +#### `duckdb_prepare_error` + +Returns the error message associated with the given prepared statement. +If the prepared statement has no error message, this returns `nullptr` instead. + +The error message should not be freed. It will be de-allocated when `duckdb_destroy_prepare` is called. + +##### Syntax + +
const char *duckdb_prepare_error(
+  duckdb_prepared_statement prepared_statement
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to obtain the error from. +* `returns` + +The error message, or `nullptr` if there is none. + +
+ +#### `duckdb_nparams` + +Returns the number of parameters that can be provided to the given prepared statement. + +Returns 0 if the query was not successfully prepared. + +##### Syntax + +
idx_t duckdb_nparams(
+  duckdb_prepared_statement prepared_statement
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to obtain the number of parameters for. + +
+ +#### `duckdb_parameter_name` + +Returns the name used to identify the parameter. +The returned string should be freed using `duckdb_free`. + +Returns NULL if the index is out of range for the provided prepared statement. + +##### Syntax + +
const char *duckdb_parameter_name(
+  duckdb_prepared_statement prepared_statement,
+  idx_t index
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to get the parameter name from. + +
+ +#### `duckdb_param_type` + +Returns the parameter type for the parameter at the given index. + +Returns `DUCKDB_TYPE_INVALID` if the parameter index is out of range or the statement was not successfully prepared. + +##### Syntax + +
duckdb_type duckdb_param_type(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement. +* `param_idx` + +The parameter index. +* `returns` + +The parameter type + +
+ +#### `duckdb_clear_bindings` + +Clears the parameters bound to the prepared statement. + +##### Syntax + +
duckdb_state duckdb_clear_bindings(
+  duckdb_prepared_statement prepared_statement
+);
+
+
+ +#### `duckdb_prepared_statement_type` + +Returns the statement type of the statement to be executed + +##### Syntax + +
duckdb_statement_type duckdb_prepared_statement_type(
+  duckdb_prepared_statement statement
+);
+
+ +##### Parameters + +* `statement` + +The prepared statement. +* `returns` + +duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID + +
+ +#### `duckdb_bind_value` + +Binds a value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_value(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_value val
+);
+
+
+ +#### `duckdb_bind_parameter_index` + +Retrieves the index of the parameter for the prepared statement, identified by name. + +##### Syntax + +
duckdb_state duckdb_bind_parameter_index(
+  duckdb_prepared_statement prepared_statement,
+  idx_t *param_idx_out,
+  const char *name
+);
+
+
+ +#### `duckdb_bind_boolean` + +Binds a bool value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_boolean(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  bool val
+);
+
+
+ +#### `duckdb_bind_int8` + +Binds an int8_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_int8(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  int8_t val
+);
+
+
+ +#### `duckdb_bind_int16` + +Binds an int16_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_int16(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  int16_t val
+);
+
+
+ +#### `duckdb_bind_int32` + +Binds an int32_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_int32(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  int32_t val
+);
+
+
+ +#### `duckdb_bind_int64` + +Binds an int64_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_int64(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  int64_t val
+);
+
+
+ +#### `duckdb_bind_hugeint` + +Binds a duckdb_hugeint value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_hugeint(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_hugeint val
+);
+
+
+ +#### `duckdb_bind_uhugeint` + +Binds a duckdb_uhugeint value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_uhugeint(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_uhugeint val
+);
+
+
+ +#### `duckdb_bind_decimal` + +Binds a duckdb_decimal value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_decimal(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_decimal val
+);
+
+
+ +#### `duckdb_bind_uint8` + +Binds a uint8_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_uint8(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  uint8_t val
+);
+
+
+ +#### `duckdb_bind_uint16` + +Binds a uint16_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_uint16(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  uint16_t val
+);
+
+
+ +#### `duckdb_bind_uint32` + +Binds a uint32_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_uint32(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  uint32_t val
+);
+
+
+ +#### `duckdb_bind_uint64` + +Binds a uint64_t value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_uint64(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  uint64_t val
+);
+
+
+ +#### `duckdb_bind_float` + +Binds a float value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_float(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  float val
+);
+
+
+ +#### `duckdb_bind_double` + +Binds a double value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_double(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  double val
+);
+
+
+ +#### `duckdb_bind_date` + +Binds a duckdb_date value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_date(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_date val
+);
+
+
+ +#### `duckdb_bind_time` + +Binds a duckdb_time value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_time(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_time val
+);
+
+
+ +#### `duckdb_bind_timestamp` + +Binds a duckdb_timestamp value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_timestamp(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_timestamp val
+);
+
+
+ +#### `duckdb_bind_interval` + +Binds a duckdb_interval value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_interval(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  duckdb_interval val
+);
+
+
+ +#### `duckdb_bind_varchar` + +Binds a null-terminated varchar value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_varchar(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  const char *val
+);
+
+
+ +#### `duckdb_bind_varchar_length` + +Binds a varchar value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_varchar_length(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  const char *val,
+  idx_t length
+);
+
+
+ +#### `duckdb_bind_blob` + +Binds a blob value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_blob(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx,
+  const void *data,
+  idx_t length
+);
+
+
+ +#### `duckdb_bind_null` + +Binds a NULL value to the prepared statement at the specified index. + +##### Syntax + +
duckdb_state duckdb_bind_null(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx
+);
+
+
+ +#### `duckdb_execute_prepared` + +Executes the prepared statement with the given bound parameters, and returns a materialized query result. + +This method can be called multiple times for each prepared statement, and the parameters can be modified +between calls to this function. + +Note that the result must be freed with `duckdb_destroy_result`. + +##### Syntax + +
duckdb_state duckdb_execute_prepared(
+  duckdb_prepared_statement prepared_statement,
+  duckdb_result *out_result
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to execute. +* `out_result` + +The query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
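As a rough usage sketch (not taken from the header itself): assuming `duckdb.h` is included, `con` is an open `duckdb_connection`, and a table `people(id INTEGER, name VARCHAR)` exists, a prepared statement can be bound and executed like this:

```c
duckdb_prepared_statement stmt;
if (duckdb_prepare(con, "SELECT name FROM people WHERE id = ?", &stmt) == DuckDBError) {
    // the statement still holds the error message and must still be destroyed
    fprintf(stderr, "prepare failed: %s\n", duckdb_prepare_error(stmt));
}

duckdb_bind_int32(stmt, 1, 42);                 // parameter indices are 1-based

duckdb_result result;
if (duckdb_execute_prepared(stmt, &result) == DuckDBSuccess) {
    // ... read the materialized result ...
}
duckdb_destroy_result(&result);

// the same statement can be re-bound and executed again
duckdb_bind_int32(stmt, 1, 43);
duckdb_execute_prepared(stmt, &result);
duckdb_destroy_result(&result);

duckdb_destroy_prepare(&stmt);
```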
+ +#### `duckdb_execute_prepared_streaming` + +> Deprecated This method is scheduled for removal in a future release. + +Executes the prepared statement with the given bound parameters, and returns an optionally-streaming query result. +To determine if the resulting query was in fact streamed, use `duckdb_result_is_streaming` + +This method can be called multiple times for each prepared statement, and the parameters can be modified +between calls to this function. + +Note that the result must be freed with `duckdb_destroy_result`. + +##### Syntax + +
duckdb_state duckdb_execute_prepared_streaming(
+  duckdb_prepared_statement prepared_statement,
+  duckdb_result *out_result
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to execute. +* `out_result` + +The query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_extract_statements` + +Extract all statements from a query. +Note that after calling `duckdb_extract_statements`, the extracted statements should always be destroyed using +`duckdb_destroy_extracted`, even if no statements were extracted. + +If the extract fails, `duckdb_extract_statements_error` can be called to obtain the reason why the extract failed. + +##### Syntax + +
idx_t duckdb_extract_statements(
+  duckdb_connection connection,
+  const char *query,
+  duckdb_extracted_statements *out_extracted_statements
+);
+
+ +##### Parameters + +* `connection` + +The connection object +* `query` + +The SQL query to extract +* `out_extracted_statements` + +The resulting extracted statements object +* `returns` + +The number of extracted statements or 0 on failure. + +
+ +#### `duckdb_prepare_extracted_statement` + +Prepare an extracted statement. +Note that after calling `duckdb_prepare_extracted_statement`, the prepared statement should always be destroyed using +`duckdb_destroy_prepare`, even if the prepare fails. + +If the prepare fails, `duckdb_prepare_error` can be called to obtain the reason why the prepare failed. + +##### Syntax + +
duckdb_state duckdb_prepare_extracted_statement(
+  duckdb_connection connection,
+  duckdb_extracted_statements extracted_statements,
+  idx_t index,
+  duckdb_prepared_statement *out_prepared_statement
+);
+
+ +##### Parameters + +* `connection` + +The connection object +* `extracted_statements` + +The extracted statements object +* `index` + +The index of the extracted statement to prepare +* `out_prepared_statement` + +The resulting prepared statement object +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_extract_statements_error` + +Returns the error message contained within the extracted statements. +The result of this function must not be freed. It will be cleaned up when `duckdb_destroy_extracted` is called. + +##### Syntax + +
const char *duckdb_extract_statements_error(
+  duckdb_extracted_statements extracted_statements
+);
+
+ +##### Parameters + +* `extracted_statements` + +The extracted statements to fetch the error from. +* `returns` + +The error of the extracted statements. + +
+ +#### `duckdb_destroy_extracted` + +De-allocates all memory allocated for the extracted statements. + +##### Syntax + +
void duckdb_destroy_extracted(
+  duckdb_extracted_statements *extracted_statements
+);
+
+ +##### Parameters + +* `extracted_statements` + +The extracted statements to destroy. + +
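A minimal sketch of running a multi-statement string with these functions (assuming an open connection `con`; error handling abbreviated):

```c
const char *sql = "CREATE TABLE t (i INTEGER); INSERT INTO t VALUES (1), (2); SELECT * FROM t;";

duckdb_extracted_statements stmts;
idx_t count = duckdb_extract_statements(con, sql, &stmts);
if (count == 0) {
    fprintf(stderr, "extract failed: %s\n", duckdb_extract_statements_error(stmts));
}
for (idx_t i = 0; i < count; i++) {
    duckdb_prepared_statement stmt;
    if (duckdb_prepare_extracted_statement(con, stmts, i, &stmt) == DuckDBSuccess) {
        duckdb_result res;
        duckdb_execute_prepared(stmt, &res);
        duckdb_destroy_result(&res);
    }
    duckdb_destroy_prepare(&stmt);   // always destroy, even if the prepare failed
}
duckdb_destroy_extracted(&stmts);    // always destroy, even if nothing was extracted
```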
+ +#### `duckdb_pending_prepared` + +Executes the prepared statement with the given bound parameters, and returns a pending result. +The pending result represents an intermediate structure for a query that is not yet fully executed. +The pending result can be used to incrementally execute a query, returning control to the client between tasks. + +Note that after calling `duckdb_pending_prepared`, the pending result should always be destroyed using +`duckdb_destroy_pending`, even if this function returns DuckDBError. + +##### Syntax + +
duckdb_state duckdb_pending_prepared(
+  duckdb_prepared_statement prepared_statement,
+  duckdb_pending_result *out_result
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to execute. +* `out_result` + +The pending query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_pending_prepared_streaming` + +> Deprecated This method is scheduled for removal in a future release. + +Executes the prepared statement with the given bound parameters, and returns a pending result. +This pending result will create a streaming duckdb_result when executed. +The pending result represents an intermediate structure for a query that is not yet fully executed. + +Note that after calling `duckdb_pending_prepared_streaming`, the pending result should always be destroyed using +`duckdb_destroy_pending`, even if this function returns DuckDBError. + +##### Syntax + +
duckdb_state duckdb_pending_prepared_streaming(
+  duckdb_prepared_statement prepared_statement,
+  duckdb_pending_result *out_result
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to execute. +* `out_result` + +The pending query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_destroy_pending` + +Closes the pending result and de-allocates all memory allocated for the result. + +##### Syntax + +
void duckdb_destroy_pending(
+  duckdb_pending_result *pending_result
+);
+
+ +##### Parameters + +* `pending_result` + +The pending result to destroy. + +
+ +#### `duckdb_pending_error` + +Returns the error message contained within the pending result. + +The result of this function must not be freed. It will be cleaned up when `duckdb_destroy_pending` is called. + +##### Syntax + +
const char *duckdb_pending_error(
+  duckdb_pending_result pending_result
+);
+
+ +##### Parameters + +* `pending_result` + +The pending result to fetch the error from. +* `returns` + +The error of the pending result. + +
+ +#### `duckdb_pending_execute_task` + +Executes a single task within the query, returning whether or not the query is ready. + +If this returns DUCKDB_PENDING_RESULT_READY, the duckdb_execute_pending function can be called to obtain the result. +If this returns DUCKDB_PENDING_RESULT_NOT_READY, the duckdb_pending_execute_task function should be called again. +If this returns DUCKDB_PENDING_ERROR, an error occurred during execution. + +The error message can be obtained by calling duckdb_pending_error on the pending_result. + +##### Syntax + +
duckdb_pending_state duckdb_pending_execute_task(
+  duckdb_pending_result pending_result
+);
+
+ +##### Parameters + +* `pending_result` + +The pending result to execute a task within. +* `returns` + +The state of the pending result after the execution. + +
+ +#### `duckdb_pending_execute_check_state` + +If this returns DUCKDB_PENDING_RESULT_READY, the duckdb_execute_pending function can be called to obtain the result. +If this returns DUCKDB_PENDING_RESULT_NOT_READY, the duckdb_pending_execute_check_state function should be called again. +If this returns DUCKDB_PENDING_ERROR, an error occurred during execution. + +The error message can be obtained by calling duckdb_pending_error on the pending_result. + +##### Syntax + +
duckdb_pending_state duckdb_pending_execute_check_state(
+  duckdb_pending_result pending_result
+);
+
+ +##### Parameters + +* `pending_result` + +The pending result. +* `returns` + +The state of the pending result. + +
+ +#### `duckdb_execute_pending` + +Fully execute a pending query result, returning the final query result. + +If duckdb_pending_execute_task has been called until DUCKDB_PENDING_RESULT_READY was returned, this will return fast. +Otherwise, all remaining tasks must be executed first. + +Note that the result must be freed with `duckdb_destroy_result`. + +##### Syntax + +
duckdb_state duckdb_execute_pending(
+  duckdb_pending_result pending_result,
+  duckdb_result *out_result
+);
+
+ +##### Parameters + +* `pending_result` + +The pending result to execute. +* `out_result` + +The result object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_pending_execution_is_finished` + +Returns whether a duckdb_pending_state is finished executing. For example if `pending_state` is +DUCKDB_PENDING_RESULT_READY, this function will return true. + +##### Syntax + +
bool duckdb_pending_execution_is_finished(
+  duckdb_pending_state pending_state
+);
+
+ +##### Parameters + +* `pending_state` + +The pending state on which to decide whether to finish execution. +* `returns` + +Boolean indicating pending execution should be considered finished. + +
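Putting the pending-result functions together, an incremental execution loop might look like the following sketch (assuming `stmt` is a prepared statement; between tasks the application is free to do other work):

```c
duckdb_pending_result pending;
if (duckdb_pending_prepared(stmt, &pending) == DuckDBError) {
    fprintf(stderr, "%s\n", duckdb_pending_error(pending));
}

duckdb_pending_state state = DUCKDB_PENDING_RESULT_NOT_READY;
while (state == DUCKDB_PENDING_RESULT_NOT_READY) {
    state = duckdb_pending_execute_task(pending);   // executes one task, then returns
}

duckdb_result result;
if (state == DUCKDB_PENDING_RESULT_READY) {
    duckdb_execute_pending(pending, &result);
    // ... read the result ...
    duckdb_destroy_result(&result);
} else {
    fprintf(stderr, "%s\n", duckdb_pending_error(pending));
}
duckdb_destroy_pending(&pending);    // always destroy, even on error
```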
+ +#### `duckdb_destroy_value` + +Destroys the value and de-allocates all memory allocated for that type. + +##### Syntax + +
void duckdb_destroy_value(
+  duckdb_value *value
+);
+
+ +##### Parameters + +* `value` + +The value to destroy. + +
+ +#### `duckdb_create_varchar` + +Creates a value from a null-terminated string + +##### Syntax + +
duckdb_value duckdb_create_varchar(
+  const char *text
+);
+
+ +##### Parameters + +* `text` + +The null-terminated string +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_create_varchar_length` + +Creates a value from a string + +##### Syntax + +
duckdb_value duckdb_create_varchar_length(
+  const char *text,
+  idx_t length
+);
+
+ +##### Parameters + +* `text` + +The text +* `length` + +The length of the text +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_create_int64` + +Creates a value from an int64 + +##### Syntax + +
duckdb_value duckdb_create_int64(
+  int64_t val
+);
+
+ +##### Parameters + +* `val` + +The bigint value +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_create_struct_value` + +Creates a struct value from a type and an array of values + +##### Syntax + +
duckdb_value duckdb_create_struct_value(
+  duckdb_logical_type type,
+  duckdb_value *values
+);
+
+ +##### Parameters + +* `type` + +The type of the struct +* `values` + +The values for the struct fields +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_create_list_value` + +Creates a list value from a type and an array of values of length `value_count` + +##### Syntax + +
duckdb_value duckdb_create_list_value(
+  duckdb_logical_type type,
+  duckdb_value *values,
+  idx_t value_count
+);
+
+ +##### Parameters + +* `type` + +The type of the list +* `values` + +The values for the list +* `value_count` + +The number of values in the list +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_create_array_value` + +Creates an array value from a type and an array of values of length `value_count` + +##### Syntax + +
duckdb_value duckdb_create_array_value(
+  duckdb_logical_type type,
+  duckdb_value *values,
+  idx_t value_count
+);
+
+ +##### Parameters + +* `type` + +The type of the array +* `values` + +The values for the array +* `value_count` + +The number of values in the array +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_get_varchar` + +Obtains a string representation of the given value. +The result must be destroyed with `duckdb_free`. + +##### Syntax + +
char *duckdb_get_varchar(
+  duckdb_value value
+);
+
+ +##### Parameters + +* `value` + +The value +* `returns` + +The string value. This must be destroyed with `duckdb_free`. + +
+ +#### `duckdb_get_int64` + +Obtains an int64 of the given value. + +##### Syntax + +
int64_t duckdb_get_int64(
+  duckdb_value value
+);
+
+ +##### Parameters + +* `value` + +The value +* `returns` + +The int64 value, or 0 if no conversion is possible + +
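A short round-trip sketch for the value API:

```c
duckdb_value v = duckdb_create_int64(42);

int64_t as_int = duckdb_get_int64(v);    // 42
char *as_text = duckdb_get_varchar(v);   // string representation, e.g. "42"
printf("%lld %s\n", (long long) as_int, as_text);

duckdb_free(as_text);                    // strings from duckdb_get_varchar are freed with duckdb_free
duckdb_destroy_value(&v);                // the value itself is destroyed with duckdb_destroy_value
```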
+ +#### `duckdb_create_logical_type` + +Creates a `duckdb_logical_type` from a standard primitive type. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +This should not be used with `DUCKDB_TYPE_DECIMAL`. + +##### Syntax + +
duckdb_logical_type duckdb_create_logical_type(
+  duckdb_type type
+);
+
+ +##### Parameters + +* `type` + +The primitive type to create. +* `returns` + +The logical type. + +
+ +#### `duckdb_logical_type_get_alias` + +Returns the alias of a duckdb_logical_type, if one is set, else `NULL`. +The result must be destroyed with `duckdb_free`. + +##### Syntax + +
char *duckdb_logical_type_get_alias(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type to return the alias of +* `returns` + +The alias or `NULL` + +
+ +#### `duckdb_create_list_type` + +Creates a list type from its child type. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_create_list_type(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The child type of list type to create. +* `returns` + +The logical type. + +
+ +#### `duckdb_create_array_type` + +Creates an array type from its child type. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_create_array_type(
+  duckdb_logical_type type,
+  idx_t array_size
+);
+
+ +##### Parameters + +* `type` + +The child type of array type to create. +* `array_size` + +The number of elements in the array. +* `returns` + +The logical type. + +
+ +#### `duckdb_create_map_type` + +Creates a map type from its key type and value type. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_create_map_type(
+  duckdb_logical_type key_type,
+  duckdb_logical_type value_type
+);
+
+ +##### Parameters + +* `key_type` + +The key type of the map type to create. +* `value_type` + +The value type of the map type to create. +* `returns` + +The logical type. + +
+ +#### `duckdb_create_union_type` + +Creates a UNION type from the passed types array. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_create_union_type(
+  duckdb_logical_type *member_types,
+  const char **member_names,
+  idx_t member_count
+);
+
+ +##### Parameters + +* `member_types` + +The array of types that the union should consist of. +* `member_names` + +The array of names of the union members. +* `member_count` + +The number of members that were specified for both arrays. +* `returns` + +The logical type. + +
+ +#### `duckdb_create_struct_type` + +Creates a STRUCT type from the passed member name and type arrays. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_create_struct_type(
+  duckdb_logical_type *member_types,
+  const char **member_names,
+  idx_t member_count
+);
+
+ +##### Parameters + +* `member_types` + +The array of types that the struct should consist of. +* `member_names` + +The array of names that the struct should consist of. +* `member_count` + +The number of members that were specified for both arrays. +* `returns` + +The logical type. + +
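For example, a `STRUCT(id INTEGER, name VARCHAR)` type could be built like this (a sketch; it assumes the struct type keeps its own copies of the member types, so they can be destroyed right after creation):

```c
duckdb_logical_type member_types[2] = {
    duckdb_create_logical_type(DUCKDB_TYPE_INTEGER),
    duckdb_create_logical_type(DUCKDB_TYPE_VARCHAR)
};
const char *member_names[2] = {"id", "name"};

duckdb_logical_type struct_type = duckdb_create_struct_type(member_types, member_names, 2);

duckdb_destroy_logical_type(&member_types[0]);
duckdb_destroy_logical_type(&member_types[1]);
// ... use struct_type ...
duckdb_destroy_logical_type(&struct_type);
```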
+ +#### `duckdb_create_enum_type` + +Creates an ENUM type from the passed member name array. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_create_enum_type(
+  const char **member_names,
+  idx_t member_count
+);
+
+ +##### Parameters + +* `member_names` + +The array of names that the enum should consist of. +* `member_count` + +The number of elements that were specified in the array. +* `returns` + +The logical type. + +
+ +#### `duckdb_create_decimal_type` + +Creates a `duckdb_logical_type` of type decimal with the specified width and scale. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_create_decimal_type(
+  uint8_t width,
+  uint8_t scale
+);
+
+ +##### Parameters + +* `width` + +The width of the decimal type +* `scale` + +The scale of the decimal type +* `returns` + +The logical type. + +
+ +#### `duckdb_get_type_id` + +Retrieves the enum type class of a `duckdb_logical_type`. + +##### Syntax + +
duckdb_type duckdb_get_type_id(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The type id + +
+ +#### `duckdb_decimal_width` + +Retrieves the width of a decimal type. + +##### Syntax + +
uint8_t duckdb_decimal_width(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The width of the decimal type + +
+ +#### `duckdb_decimal_scale` + +Retrieves the scale of a decimal type. + +##### Syntax + +
uint8_t duckdb_decimal_scale(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The scale of the decimal type + +
+ +#### `duckdb_decimal_internal_type` + +Retrieves the internal storage type of a decimal type. + +##### Syntax + +
duckdb_type duckdb_decimal_internal_type(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The internal type of the decimal type + +
+ +#### `duckdb_enum_internal_type` + +Retrieves the internal storage type of an enum type. + +##### Syntax + +
duckdb_type duckdb_enum_internal_type(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The internal type of the enum type + +
+ +#### `duckdb_enum_dictionary_size` + +Retrieves the dictionary size of the enum type. + +##### Syntax + +
uint32_t duckdb_enum_dictionary_size(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The dictionary size of the enum type + +
+ +#### `duckdb_enum_dictionary_value` + +Retrieves the dictionary value at the specified position from the enum. + +The result must be freed with `duckdb_free`. + +##### Syntax + +
char *duckdb_enum_dictionary_value(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `index` + +The index in the dictionary +* `returns` + +The string value of the enum type. Must be freed with `duckdb_free`. + +
+ +#### `duckdb_list_type_child_type` + +Retrieves the child type of the given list type. + +The result must be freed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_list_type_child_type(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The child type of the list type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +#### `duckdb_array_type_child_type` + +Retrieves the child type of the given array type. + +The result must be freed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_array_type_child_type(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The child type of the array type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +#### `duckdb_array_type_array_size` + +Retrieves the array size of the given array type. + +##### Syntax + +
idx_t duckdb_array_type_array_size(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The fixed number of elements the values of this array type can store. + +
+ +#### `duckdb_map_type_key_type` + +Retrieves the key type of the given map type. + +The result must be freed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_map_type_key_type(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The key type of the map type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +#### `duckdb_map_type_value_type` + +Retrieves the value type of the given map type. + +The result must be freed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_map_type_value_type(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The value type of the map type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +#### `duckdb_struct_type_child_count` + +Returns the number of children of a struct type. + +##### Syntax + +
idx_t duckdb_struct_type_child_count(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `returns` + +The number of children of a struct type. + +
+ +#### `duckdb_struct_type_child_name` + +Retrieves the name of the struct child. + +The result must be freed with `duckdb_free`. + +##### Syntax + +
char *duckdb_struct_type_child_name(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The name of the struct type. Must be freed with `duckdb_free`. + +
+ +#### `duckdb_struct_type_child_type` + +Retrieves the child type of the given struct type at the specified index. + +The result must be freed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_struct_type_child_type(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The child type of the struct type. Must be destroyed with `duckdb_destroy_logical_type`. + +
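The struct accessors compose naturally; for instance, listing the members of a struct type (sketch, assuming `struct_type` is a struct logical type):

```c
idx_t n = duckdb_struct_type_child_count(struct_type);
for (idx_t i = 0; i < n; i++) {
    char *name = duckdb_struct_type_child_name(struct_type, i);
    duckdb_logical_type child = duckdb_struct_type_child_type(struct_type, i);

    printf("member %llu: %s (type id %d)\n", (unsigned long long) i, name, (int) duckdb_get_type_id(child));

    duckdb_free(name);                    // names are freed with duckdb_free
    duckdb_destroy_logical_type(&child);  // child types are destroyed like any logical type
}
```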
+ +#### `duckdb_union_type_member_count` + +Returns the number of members that the union type has. + +##### Syntax + +
idx_t duckdb_union_type_member_count(
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `type` + +The logical type (union) object +* `returns` + +The number of members of a union type. + +
+ +#### `duckdb_union_type_member_name` + +Retrieves the name of the union member. + +The result must be freed with `duckdb_free`. + +##### Syntax + +
char *duckdb_union_type_member_name(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The name of the union member. Must be freed with `duckdb_free`. + +
+ +#### `duckdb_union_type_member_type` + +Retrieves the child type of the given union member at the specified index. + +The result must be freed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_union_type_member_type(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +##### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The child type of the union member. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +#### `duckdb_destroy_logical_type` + +Destroys the logical type and de-allocates all memory allocated for that type. + +##### Syntax + +
void duckdb_destroy_logical_type(
+  duckdb_logical_type *type
+);
+
+ +##### Parameters + +* `type` + +The logical type to destroy. + +
+ +#### `duckdb_create_data_chunk` + +Creates an empty DataChunk with the specified set of types. + +Note that the result must be destroyed with `duckdb_destroy_data_chunk`. + +##### Syntax + +
duckdb_data_chunk duckdb_create_data_chunk(
+  duckdb_logical_type *types,
+  idx_t column_count
+);
+
+ +##### Parameters + +* `types` + +An array of types of the data chunk. +* `column_count` + +The number of columns. +* `returns` + +The data chunk. + +
+ +#### `duckdb_destroy_data_chunk` + +Destroys the data chunk and de-allocates all memory allocated for that chunk. + +##### Syntax + +
void duckdb_destroy_data_chunk(
+  duckdb_data_chunk *chunk
+);
+
+ +##### Parameters + +* `chunk` + +The data chunk to destroy. + +
+ +#### `duckdb_data_chunk_reset` + +Resets a data chunk, clearing the validity masks and setting the cardinality of the data chunk to 0. + +##### Syntax + +
void duckdb_data_chunk_reset(
+  duckdb_data_chunk chunk
+);
+
+ +##### Parameters + +* `chunk` + +The data chunk to reset. + +
+ +#### `duckdb_data_chunk_get_column_count` + +Retrieves the number of columns in a data chunk. + +##### Syntax + +
idx_t duckdb_data_chunk_get_column_count(
+  duckdb_data_chunk chunk
+);
+
+ +##### Parameters + +* `chunk` + +The data chunk to get the data from +* `returns` + +The number of columns in the data chunk + +
+ +#### `duckdb_data_chunk_get_vector` + +Retrieves the vector at the specified column index in the data chunk. + +The pointer to the vector is valid for as long as the chunk is alive. +It does NOT need to be destroyed. + +##### Syntax + +
duckdb_vector duckdb_data_chunk_get_vector(
+  duckdb_data_chunk chunk,
+  idx_t col_idx
+);
+
+ +##### Parameters + +* `chunk` + +The data chunk to get the data from +* `col_idx` + +The column index of the vector to retrieve +* `returns` + +The vector + +
+ +#### `duckdb_data_chunk_get_size` + +Retrieves the current number of tuples in a data chunk. + +##### Syntax + +
idx_t duckdb_data_chunk_get_size(
+  duckdb_data_chunk chunk
+);
+
+ +##### Parameters + +* `chunk` + +The data chunk to get the data from +* `returns` + +The number of tuples in the data chunk + +
+ +#### `duckdb_data_chunk_set_size` + +Sets the current number of tuples in a data chunk. + +##### Syntax + +
void duckdb_data_chunk_set_size(
+  duckdb_data_chunk chunk,
+  idx_t size
+);
+
+ +##### Parameters + +* `chunk` + +The data chunk to set the size in +* `size` + +The number of tuples in the data chunk + +
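A sketch of building and filling a single-column `BIGINT` chunk with these functions:

```c
duckdb_logical_type types[1] = { duckdb_create_logical_type(DUCKDB_TYPE_BIGINT) };
duckdb_data_chunk chunk = duckdb_create_data_chunk(types, 1);
duckdb_destroy_logical_type(&types[0]);

duckdb_vector col = duckdb_data_chunk_get_vector(chunk, 0);
int64_t *data = (int64_t *) duckdb_vector_get_data(col);
data[0] = 1;
data[1] = 2;
data[2] = 3;
duckdb_data_chunk_set_size(chunk, 3);   // the size must be set explicitly

// ... use the chunk, e.g. with duckdb_append_data_chunk ...
duckdb_destroy_data_chunk(&chunk);
```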
+ +#### `duckdb_vector_get_column_type` + +Retrieves the column type of the specified vector. + +The result must be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_vector_get_column_type(
+  duckdb_vector vector
+);
+
+ +##### Parameters + +* `vector` + +The vector to get the data from +* `returns` + +The type of the vector + +
+ +#### `duckdb_vector_get_data` + +Retrieves the data pointer of the vector. + +The data pointer can be used to read or write values from the vector. +How to read or write values depends on the type of the vector. + +##### Syntax + +
void *duckdb_vector_get_data(
+  duckdb_vector vector
+);
+
+ +##### Parameters + +* `vector` + +The vector to get the data from +* `returns` + +The data pointer + +
+ +#### `duckdb_vector_get_validity` + +Retrieves the validity mask pointer of the specified vector. + +If all values are valid, this function MIGHT return NULL! + +The validity mask is a bitset that signifies null-ness within the data chunk. +It is a series of uint64_t values, where each uint64_t value contains validity for 64 tuples. +The bit is set to 1 if the value is valid (i.e., not NULL) or 0 if the value is invalid (i.e., NULL). + +Validity of a specific value can be obtained like this: + +```c +idx_t entry_idx = row_idx / 64; +idx_t idx_in_entry = row_idx % 64; +bool is_valid = validity_mask[entry_idx] & (1ULL << idx_in_entry); +``` + +Alternatively, the (slower) duckdb_validity_row_is_valid function can be used. + +##### Syntax + +
uint64_t *duckdb_vector_get_validity(
+  duckdb_vector vector
+);
+
+ +##### Parameters + +* `vector` + +The vector to get the data from +* `returns` + +The pointer to the validity mask, or NULL if no validity mask is present + +
+ +#### `duckdb_vector_ensure_validity_writable` + +Ensures the validity mask is writable by allocating it. + +After this function is called, `duckdb_vector_get_validity` will ALWAYS return non-NULL. +This allows null values to be written to the vector, regardless of whether a validity mask was present before. + +##### Syntax + +
void duckdb_vector_ensure_validity_writable(
+  duckdb_vector vector
+);
+
+ +##### Parameters + +* `vector` + +The vector to alter + +
+ +#### `duckdb_vector_assign_string_element` + +Assigns a string element in the vector at the specified location. + +##### Syntax + +
void duckdb_vector_assign_string_element(
+  duckdb_vector vector,
+  idx_t index,
+  const char *str
+);
+
+ +##### Parameters + +* `vector` + +The vector to alter +* `index` + +The row position in the vector to assign the string to +* `str` + +The null-terminated string + +
+ +#### `duckdb_vector_assign_string_element_len` + +Assigns a string element in the vector at the specified location. You may also use this function to assign BLOBs. + +##### Syntax + +
void duckdb_vector_assign_string_element_len(
+  duckdb_vector vector,
+  idx_t index,
+  const char *str,
+  idx_t str_len
+);
+
+ +##### Parameters + +* `vector` + +The vector to alter +* `index` + +The row position in the vector to assign the string to +* `str` + +The string +* `str_len` + +The length of the string (in bytes) + +
+ +#### `duckdb_list_vector_get_child` + +Retrieves the child vector of a list vector. + +The resulting vector is valid as long as the parent vector is valid. + +##### Syntax + +
duckdb_vector duckdb_list_vector_get_child(
+  duckdb_vector vector
+);
+
+ +##### Parameters + +* `vector` + +The vector +* `returns` + +The child vector + +
+ +#### `duckdb_list_vector_get_size` + +Returns the size of the child vector of the list. + +##### Syntax + +
idx_t duckdb_list_vector_get_size(
+  duckdb_vector vector
+);
+
+ +##### Parameters + +* `vector` + +The vector +* `returns` + +The size of the child list + +
+ +#### `duckdb_list_vector_set_size` + +Sets the total size of the underlying child-vector of a list vector. + +##### Syntax + +
duckdb_state duckdb_list_vector_set_size(
+  duckdb_vector vector,
+  idx_t size
+);
+
+ +##### Parameters + +* `vector` + +The list vector. +* `size` + +The size of the child list. +* `returns` + +The duckdb state. Returns DuckDBError if the vector is nullptr. + +
+ +#### `duckdb_list_vector_reserve` + +Sets the total capacity of the underlying child-vector of a list. + +##### Syntax + +
duckdb_state duckdb_list_vector_reserve(
+  duckdb_vector vector,
+  idx_t required_capacity
+);
+
+ +##### Parameters + +* `vector` + +The list vector. +* `required_capacity` + +the total capacity to reserve. +* `return` + +The duckdb state. Returns DuckDBError if the vector is nullptr. + +
+ +#### `duckdb_struct_vector_get_child` + +Retrieves the child vector of a struct vector. + +The resulting vector is valid as long as the parent vector is valid. + +##### Syntax + +
duckdb_vector duckdb_struct_vector_get_child(
+  duckdb_vector vector,
+  idx_t index
+);
+
+ +##### Parameters + +* `vector` + +The vector +* `index` + +The child index +* `returns` + +The child vector + +
+ +#### `duckdb_array_vector_get_child` + +Retrieves the child vector of an array vector. + +The resulting vector is valid as long as the parent vector is valid. +The resulting vector has the size of the parent vector multiplied by the array size. + +##### Syntax + +
duckdb_vector duckdb_array_vector_get_child(
+  duckdb_vector vector
+);
+
+ +##### Parameters + +* `vector` + +The vector +* `returns` + +The child vector + +
+ +#### `duckdb_validity_row_is_valid` + +Returns whether or not a row is valid (i.e., not NULL) in the given validity mask. + +##### Syntax + +
bool duckdb_validity_row_is_valid(
+  uint64_t *validity,
+  idx_t row
+);
+
+ +##### Parameters + +* `validity` + +The validity mask, as obtained through `duckdb_vector_get_validity` +* `row` + +The row index +* `returns` + +true if the row is valid, false otherwise + +
+ +#### `duckdb_validity_set_row_validity` + +In a validity mask, sets a specific row to either valid or invalid. + +Note that `duckdb_vector_ensure_validity_writable` should be called before calling `duckdb_vector_get_validity`, +to ensure that there is a validity mask to write to. + +##### Syntax + +
void duckdb_validity_set_row_validity(
+  uint64_t *validity,
+  idx_t row,
+  bool valid
+);
+
+ +##### Parameters + +* `validity` + +The validity mask, as obtained through `duckdb_vector_get_validity`. +* `row` + +The row index +* `valid` + +Whether to set the row to valid (true) or invalid (false) + +
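For example, marking row 1 of a vector `col` as NULL (sketch):

```c
duckdb_vector_ensure_validity_writable(col);             // guarantees a writable mask
uint64_t *validity = duckdb_vector_get_validity(col);    // now guaranteed to be non-NULL
duckdb_validity_set_row_validity(validity, 1, false);    // row 1 becomes NULL
// shorthand for the same thing:
// duckdb_validity_set_row_invalid(validity, 1);
```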
+ +#### `duckdb_validity_set_row_invalid` + +In a validity mask, sets a specific row to invalid. + +Equivalent to `duckdb_validity_set_row_validity` with valid set to false. + +##### Syntax + +
void duckdb_validity_set_row_invalid(
+  uint64_t *validity,
+  idx_t row
+);
+
+ +##### Parameters + +* `validity` + +The validity mask +* `row` + +The row index + +
+ +#### `duckdb_validity_set_row_valid` + +In a validity mask, sets a specific row to valid. + +Equivalent to `duckdb_validity_set_row_validity` with valid set to true. + +##### Syntax + +
void duckdb_validity_set_row_valid(
+  uint64_t *validity,
+  idx_t row
+);
+
+ +##### Parameters + +* `validity` + +The validity mask +* `row` + +The row index + +
+ +#### `duckdb_create_table_function` + +Creates a new empty table function. + +The return value should be destroyed with `duckdb_destroy_table_function`. + +##### Syntax + +
duckdb_table_function duckdb_create_table_function(
+  
+);
+
+ +##### Parameters + +* `returns` + +The table function object. + +
+ +#### `duckdb_destroy_table_function` + +Destroys the given table function object. + +##### Syntax + +
void duckdb_destroy_table_function(
+  duckdb_table_function *table_function
+);
+
+ +##### Parameters + +* `table_function` + +The table function to destroy + +
+ +#### `duckdb_table_function_set_name` + +Sets the name of the given table function. + +##### Syntax + +
void duckdb_table_function_set_name(
+  duckdb_table_function table_function,
+  const char *name
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `name` + +The name of the table function + +
+ +#### `duckdb_table_function_add_parameter` + +Adds a parameter to the table function. + +##### Syntax + +
void duckdb_table_function_add_parameter(
+  duckdb_table_function table_function,
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `type` + +The type of the parameter to add. + +
+ +#### `duckdb_table_function_add_named_parameter` + +Adds a named parameter to the table function. + +##### Syntax + +
void duckdb_table_function_add_named_parameter(
+  duckdb_table_function table_function,
+  const char *name,
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `name` + +The name of the parameter +* `type` + +The type of the parameter to add. + +
+ +#### `duckdb_table_function_set_extra_info` + +Assigns extra information to the table function that can be fetched during binding, etc. + +##### Syntax + +
void duckdb_table_function_set_extra_info(
+  duckdb_table_function table_function,
+  void *extra_info,
+  duckdb_delete_callback_t destroy
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `extra_info` + +The extra information +* `destroy` + +The callback that will be called to destroy the bind data (if any) + +
+ +#### `duckdb_table_function_set_bind` + +Sets the bind function of the table function. + +##### Syntax + +
void duckdb_table_function_set_bind(
+  duckdb_table_function table_function,
+  duckdb_table_function_bind_t bind
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `bind` + +The bind function + +
+ +#### `duckdb_table_function_set_init` + +Sets the init function of the table function. + +##### Syntax + +
void duckdb_table_function_set_init(
+  duckdb_table_function table_function,
+  duckdb_table_function_init_t init
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `init` + +The init function + +
+ +#### `duckdb_table_function_set_local_init` + +Sets the thread-local init function of the table function. + +##### Syntax + +
void duckdb_table_function_set_local_init(
+  duckdb_table_function table_function,
+  duckdb_table_function_init_t init
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `init` + +The init function + +
+ +#### `duckdb_table_function_set_function` + +Sets the main function of the table function. + +##### Syntax + +
void duckdb_table_function_set_function(
+  duckdb_table_function table_function,
+  duckdb_table_function_t function
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `function` + +The function + +
+ +#### `duckdb_table_function_supports_projection_pushdown` + +Sets whether or not the given table function supports projection pushdown. + +If this is set to true, the system will provide a list of all required columns in the `init` stage through +the `duckdb_init_get_column_count` and `duckdb_init_get_column_index` functions. +If this is set to false (the default), the system will expect all columns to be projected. + +##### Syntax + +
void duckdb_table_function_supports_projection_pushdown(
+  duckdb_table_function table_function,
+  bool pushdown
+);
+
+ +##### Parameters + +* `table_function` + +The table function +* `pushdown` + +True if the table function supports projection pushdown, false otherwise. + +
+ +#### `duckdb_register_table_function` + +Register the table function object within the given connection. + +The function requires at least a name, a bind function, an init function and a main function. + +If the function is incomplete or a function with this name already exists DuckDBError is returned. + +##### Syntax + +
duckdb_state duckdb_register_table_function(
+  duckdb_connection con,
+  duckdb_table_function function
+);
+
+ +##### Parameters + +* `con` + +The connection to register it in. +* `function` + +The function pointer +* `returns` + +Whether or not the registration was successful. + +
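As an end-to-end sketch (not taken from the header): a hypothetical table function `my_range(count)` that emits the numbers `0 .. count-1`. It assumes `duckdb.h` and `<stdlib.h>` are included and `con` is an open connection; all names are illustrative.

```c
typedef struct { int64_t count; } my_bind_data;   // set during bind
typedef struct { int64_t next;  } my_init_data;   // per-scan state

static void my_bind(duckdb_bind_info info) {
    duckdb_logical_type bigint = duckdb_create_logical_type(DUCKDB_TYPE_BIGINT);
    duckdb_bind_add_result_column(info, "i", bigint);
    duckdb_destroy_logical_type(&bigint);

    duckdb_value param = duckdb_bind_get_parameter(info, 0);
    my_bind_data *bind = malloc(sizeof(my_bind_data));
    bind->count = duckdb_get_int64(param);
    duckdb_destroy_value(&param);
    duckdb_bind_set_bind_data(info, bind, free);
}

static void my_init(duckdb_init_info info) {
    my_init_data *init = malloc(sizeof(my_init_data));
    init->next = 0;
    duckdb_init_set_init_data(info, init, free);
}

static void my_function(duckdb_function_info info, duckdb_data_chunk output) {
    my_bind_data *bind = (my_bind_data *) duckdb_function_get_bind_data(info);
    my_init_data *init = (my_init_data *) duckdb_function_get_init_data(info);

    int64_t *col = (int64_t *) duckdb_vector_get_data(duckdb_data_chunk_get_vector(output, 0));
    idx_t rows = 0;
    while (rows < duckdb_vector_size() && init->next < bind->count) {
        col[rows++] = init->next++;
    }
    duckdb_data_chunk_set_size(output, rows);   // emitting 0 rows ends the scan
}

// registration:
duckdb_table_function tf = duckdb_create_table_function();
duckdb_table_function_set_name(tf, "my_range");
duckdb_logical_type bigint = duckdb_create_logical_type(DUCKDB_TYPE_BIGINT);
duckdb_table_function_add_parameter(tf, bigint);
duckdb_destroy_logical_type(&bigint);
duckdb_table_function_set_bind(tf, my_bind);
duckdb_table_function_set_init(tf, my_init);
duckdb_table_function_set_function(tf, my_function);
duckdb_register_table_function(con, tf);
duckdb_destroy_table_function(&tf);
```

After registration, the function can be used like any other table function, e.g. `SELECT * FROM my_range(5);`.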
+ +#### `duckdb_bind_get_extra_info` + +Retrieves the extra info of the function as set in `duckdb_table_function_set_extra_info`. + +##### Syntax + +
void *duckdb_bind_get_extra_info(
+  duckdb_bind_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The extra info + +
+ +#### `duckdb_bind_add_result_column` + +Adds a result column to the output of the table function. + +##### Syntax + +
void duckdb_bind_add_result_column(
+  duckdb_bind_info info,
+  const char *name,
+  duckdb_logical_type type
+);
+
+ +##### Parameters + +* `info` + +The info object +* `name` + +The name of the column +* `type` + +The logical type of the column + +
+ +#### `duckdb_bind_get_parameter_count` + +Retrieves the number of regular (non-named) parameters to the function. + +##### Syntax + +
idx_t duckdb_bind_get_parameter_count(
+  duckdb_bind_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The number of parameters + +
+ +#### `duckdb_bind_get_parameter` + +Retrieves the parameter at the given index. + +The result must be destroyed with `duckdb_destroy_value`. + +##### Syntax + +
duckdb_value duckdb_bind_get_parameter(
+  duckdb_bind_info info,
+  idx_t index
+);
+
+ +##### Parameters + +* `info` + +The info object +* `index` + +The index of the parameter to get +* `returns` + +The value of the parameter. Must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_bind_get_named_parameter` + +Retrieves a named parameter with the given name. + +The result must be destroyed with `duckdb_destroy_value`. + +##### Syntax + +
duckdb_value duckdb_bind_get_named_parameter(
+  duckdb_bind_info info,
+  const char *name
+);
+
+ +##### Parameters + +* `info` + +The info object +* `name` + +The name of the parameter +* `returns` + +The value of the parameter. Must be destroyed with `duckdb_destroy_value`. + +
+ +#### `duckdb_bind_set_bind_data` + +Sets the user-provided bind data in the bind object. This object can be retrieved again during execution. + +##### Syntax + +
void duckdb_bind_set_bind_data(
+  duckdb_bind_info info,
+  void *bind_data,
+  duckdb_delete_callback_t destroy
+);
+
+ +##### Parameters + +* `info` + +The info object +* `bind_data` + +The bind data object. +* `destroy` + +The callback that will be called to destroy the bind data (if any) + +
+ +#### `duckdb_bind_set_cardinality` + +Sets the cardinality estimate for the table function, used for optimization. + +##### Syntax + +
void duckdb_bind_set_cardinality(
+  duckdb_bind_info info,
+  idx_t cardinality,
+  bool is_exact
+);
+
+ +##### Parameters + +* `info` + +The bind data object. +* `cardinality` + +The estimated number of rows the table function will produce. +* `is_exact` + +Whether the cardinality estimate is exact or an approximation + +
+ +#### `duckdb_bind_set_error` + +Report that an error has occurred while calling bind. + +##### Syntax + +
void duckdb_bind_set_error(
+  duckdb_bind_info info,
+  const char *error
+);
+
+ +##### Parameters + +* `info` + +The info object +* `error` + +The error message + +
+ +#### `duckdb_init_get_extra_info` + +Retrieves the extra info of the function as set in `duckdb_table_function_set_extra_info`. + +##### Syntax + +
void *duckdb_init_get_extra_info(
+  duckdb_init_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The extra info + +
+ +#### `duckdb_init_get_bind_data` + +Gets the bind data set by `duckdb_bind_set_bind_data` during the bind. + +Note that the bind data should be considered as read-only. +For tracking state, use the init data instead. + +##### Syntax + +
void *duckdb_init_get_bind_data(
+  duckdb_init_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The bind data object + +
+ +#### `duckdb_init_set_init_data` + +Sets the user-provided init data in the init object. This object can be retrieved again during execution. + +##### Syntax + +
void duckdb_init_set_init_data(
+  duckdb_init_info info,
+  void *init_data,
+  duckdb_delete_callback_t destroy
+);
+
+ +##### Parameters + +* `info` + +The info object +* `init_data` + +The init data object. +* `destroy` + +The callback that will be called to destroy the init data (if any) + +
+ +#### `duckdb_init_get_column_count` + +Returns the number of projected columns. + +This function must be used if projection pushdown is enabled to figure out which columns to emit. + +##### Syntax + +
idx_t duckdb_init_get_column_count(
+  duckdb_init_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The number of projected columns. + +
+ +#### `duckdb_init_get_column_index` + +Returns the column index of the projected column at the specified position. + +This function must be used if projection pushdown is enabled to figure out which columns to emit. + +##### Syntax + +
idx_t duckdb_init_get_column_index(
+  duckdb_init_info info,
+  idx_t column_index
+);
+
+ +##### Parameters + +* `info` + +The info object +* `column_index` + +The index at which to get the projected column index, from 0..duckdb_init_get_column_count(info) +* `returns` + +The column index of the projected column. + +
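When projection pushdown is enabled, the init callback can record the column mapping, roughly like this sketch:

```c
// inside the init callback: map output columns to the table function's own columns
idx_t projected = duckdb_init_get_column_count(info);
for (idx_t i = 0; i < projected; i++) {
    idx_t source_column = duckdb_init_get_column_index(info, i);
    // output column i of every emitted chunk must be filled from source_column
}
```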
+ +#### `duckdb_init_set_max_threads` + +Sets how many threads can process this table function in parallel (default: 1) + +##### Syntax + +
void duckdb_init_set_max_threads(
+  duckdb_init_info info,
+  idx_t max_threads
+);
+
+ +##### Parameters + +* `info` + +The info object +* `max_threads` + +The maximum number of threads that can process this table function + +
+ +#### `duckdb_init_set_error` + +Report that an error has occurred while calling init. + +##### Syntax + +
void duckdb_init_set_error(
+  duckdb_init_info info,
+  const char *error
+);
+
+ +##### Parameters + +* `info` + +The info object +* `error` + +The error message + +
+ +#### `duckdb_function_get_extra_info` + +Retrieves the extra info of the function as set in `duckdb_table_function_set_extra_info`. + +##### Syntax + +
void *duckdb_function_get_extra_info(
+  duckdb_function_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The extra info + +
+ +#### `duckdb_function_get_bind_data` + +Gets the bind data set by `duckdb_bind_set_bind_data` during the bind. + +Note that the bind data should be considered as read-only. +For tracking state, use the init data instead. + +##### Syntax + +
void *duckdb_function_get_bind_data(
+  duckdb_function_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The bind data object + +
+ +#### `duckdb_function_get_init_data` + +Gets the init data set by `duckdb_init_set_init_data` during the init. + +##### Syntax + +
void *duckdb_function_get_init_data(
+  duckdb_function_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The init data object + +
+ +#### `duckdb_function_get_local_init_data` + +Gets the thread-local init data set by `duckdb_init_set_init_data` during the local_init. + +##### Syntax + +
void *duckdb_function_get_local_init_data(
+  duckdb_function_info info
+);
+
+ +##### Parameters + +* `info` + +The info object +* `returns` + +The init data object + +
+ +#### `duckdb_function_set_error` + +Report that an error has occurred while executing the function. + +##### Syntax + +
void duckdb_function_set_error(
+  duckdb_function_info info,
+  const char *error
+);
+
+ +##### Parameters + +* `info` + +The info object +* `error` + +The error message + +
+ +#### `duckdb_add_replacement_scan` + +Add a replacement scan definition to the specified database. + +##### Syntax + +
void duckdb_add_replacement_scan(
+  duckdb_database db,
+  duckdb_replacement_callback_t replacement,
+  void *extra_data,
+  duckdb_delete_callback_t delete_callback
+);
+
+ +##### Parameters + +* `db` + +The database object to add the replacement scan to +* `replacement` + +The replacement scan callback +* `extra_data` + +Extra data that is passed back into the specified callback +* `delete_callback` + +The delete callback to call on the extra data, if any + +
+ +#### `duckdb_replacement_scan_set_function_name` + +Sets the replacement function name. If this function is called in the replacement callback, +the replacement scan is performed. If it is not called, no replacement scan is performed. + +##### Syntax + +
void duckdb_replacement_scan_set_function_name(
+  duckdb_replacement_scan_info info,
+  const char *function_name
+);
+
+ +##### Parameters + +* `info` + +The info object +* `function_name` + +The function name to substitute. + +
+ +#### `duckdb_replacement_scan_add_parameter` + +Adds a parameter to the replacement scan function. + +##### Syntax + +
void duckdb_replacement_scan_add_parameter(
+  duckdb_replacement_scan_info info,
+  duckdb_value parameter
+);
+
+ +##### Parameters + +* `info` + +The info object +* `parameter` + +The parameter to add. + +
+ +#### `duckdb_replacement_scan_set_error` + +Report that an error has occurred while executing the replacement scan. + +##### Syntax + +
void duckdb_replacement_scan_set_error(
+  duckdb_replacement_scan_info info,
+  const char *error
+);
+
+ +##### Parameters + +* `info` + +The info object +* `error` + +The error message + +
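A common use is redirecting unknown table names to a table function. The sketch below (hypothetical, not from the header) rewrites any name ending in `.csv` into a `read_csv_auto` call; it assumes `<string.h>` is included, `db` is an open `duckdb_database`, and that parameters are copied when added.

```c
static void csv_replacement(duckdb_replacement_scan_info info, const char *table_name, void *data) {
    size_t len = strlen(table_name);
    if (len < 4 || strcmp(table_name + len - 4, ".csv") != 0) {
        return;   // not calling set_function_name means no replacement is performed
    }
    duckdb_replacement_scan_set_function_name(info, "read_csv_auto");
    duckdb_value path = duckdb_create_varchar(table_name);
    duckdb_replacement_scan_add_parameter(info, path);
    duckdb_destroy_value(&path);   // assumed to be copied when added as a parameter
}

// registration on the database handle:
duckdb_add_replacement_scan(db, csv_replacement, NULL, NULL);
```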
+ +#### `duckdb_appender_create` + +Creates an appender object. + +Note that the object must be destroyed with `duckdb_appender_destroy`. + +##### Syntax + +
duckdb_state duckdb_appender_create(
+  duckdb_connection connection,
+  const char *schema,
+  const char *table,
+  duckdb_appender *out_appender
+);
+
+ +##### Parameters + +* `connection` + +The connection context to create the appender in. +* `schema` + +The schema of the table to append to, or `nullptr` for the default schema. +* `table` + +The table name to append to. +* `out_appender` + +The resulting appender object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_appender_column_count` + +Returns the number of columns in the table that belongs to the appender. + +##### Syntax + +
idx_t duckdb_appender_column_count(
+  duckdb_appender appender
+);
+
+ +##### Parameters + +* `appender` + +The appender to get the column count from. +* `returns` + +The number of columns in the table. + +
+ +#### `duckdb_appender_column_type` + +Returns the type of the column at the specified index. + +Note: The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +##### Syntax + +
duckdb_logical_type duckdb_appender_column_type(
+  duckdb_appender appender,
+  idx_t col_idx
+);
+
+ +##### Parameters + +* `appender` + +The appender to get the column type from. +* `col_idx` + +The index of the column to get the type of. +* `returns` + +The duckdb_logical_type of the column. + +
+ +#### `duckdb_appender_error` + +Returns the error message associated with the given appender. +If the appender has no error message, this returns `nullptr` instead. + +The error message should not be freed. It will be de-allocated when `duckdb_appender_destroy` is called. + +##### Syntax + +
const char *duckdb_appender_error(
+  duckdb_appender appender
+);
+
+ +##### Parameters + +* `appender` + +The appender to get the error from. +* `returns` + +The error message, or `nullptr` if there is none. + +
+ +#### `duckdb_appender_flush` + +Flush the appender to the table, forcing the cache of the appender to be cleared. If flushing the data triggers a +constraint violation or any other error, then all data is invalidated, and this function returns DuckDBError. +It is not possible to append more values. Call duckdb_appender_error to obtain the error message followed by +duckdb_appender_destroy to destroy the invalidated appender. + +##### Syntax + +
duckdb_state duckdb_appender_flush(
+  duckdb_appender appender
+);
+
+ +##### Parameters + +* `appender` + +The appender to flush. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_appender_close` + +Closes the appender by flushing all intermediate states and closing it for further appends. If flushing the data +triggers a constraint violation or any other error, then all data is invalidated, and this function returns DuckDBError. +Call duckdb_appender_error to obtain the error message followed by duckdb_appender_destroy to destroy the invalidated +appender. + +##### Syntax + +
duckdb_state duckdb_appender_close(
+  duckdb_appender appender
+);
+
+ +##### Parameters + +* `appender` + +The appender to flush and close. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_appender_destroy` + +Closes the appender by flushing all intermediate states to the table and destroying it. By destroying it, this function +de-allocates all memory associated with the appender. If flushing the data triggers a constraint violation, +then all data is invalidated, and this function returns DuckDBError. Due to the destruction of the appender, it is no +longer possible to obtain the specific error message with duckdb_appender_error. Therefore, call duckdb_appender_close +before destroying the appender, if you need insights into the specific error. + +##### Syntax + +
duckdb_state duckdb_appender_destroy(
+  duckdb_appender *appender
+);
+
+ +##### Parameters + +* `appender` + +The appender to flush, close and destroy. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_appender_begin_row` + +A nop function, provided for backwards compatibility reasons. Does nothing. Only `duckdb_appender_end_row` is required. + +##### Syntax + +
duckdb_state duckdb_appender_begin_row(
+  duckdb_appender appender
+);
+
+
+ +#### `duckdb_appender_end_row` + +Finish the current row of appends. After end_row is called, the next row can be appended. + +##### Syntax + +
duckdb_state duckdb_appender_end_row(
+  duckdb_appender appender
+);
+
+ +##### Parameters + +* `appender` + +The appender. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
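A row-wise appender sketch, assuming `con` is an open connection and a table `CREATE TABLE people (id INTEGER, name VARCHAR)` exists:

```c
duckdb_appender appender;
if (duckdb_appender_create(con, NULL, "people", &appender) == DuckDBError) {
    fprintf(stderr, "%s\n", duckdb_appender_error(appender));
}
for (int32_t i = 0; i < 1000; i++) {
    duckdb_append_int32(appender, i);
    duckdb_append_varchar(appender, "someone");
    duckdb_appender_end_row(appender);     // one call per finished row
}
duckdb_appender_destroy(&appender);        // flushes remaining rows and frees the appender
```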
+ +#### `duckdb_append_bool` + +Append a bool value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_bool(
+  duckdb_appender appender,
+  bool value
+);
+
+
+ +#### `duckdb_append_int8` + +Append an int8_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_int8(
+  duckdb_appender appender,
+  int8_t value
+);
+
+
+ +#### `duckdb_append_int16` + +Append an int16_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_int16(
+  duckdb_appender appender,
+  int16_t value
+);
+
+
+ +#### `duckdb_append_int32` + +Append an int32_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_int32(
+  duckdb_appender appender,
+  int32_t value
+);
+
+
+ +#### `duckdb_append_int64` + +Append an int64_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_int64(
+  duckdb_appender appender,
+  int64_t value
+);
+
+
+ +#### `duckdb_append_hugeint` + +Append a duckdb_hugeint value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_hugeint(
+  duckdb_appender appender,
+  duckdb_hugeint value
+);
+
+
+ +#### `duckdb_append_uint8` + +Append a uint8_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_uint8(
+  duckdb_appender appender,
+  uint8_t value
+);
+
+
+ +#### `duckdb_append_uint16` + +Append a uint16_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_uint16(
+  duckdb_appender appender,
+  uint16_t value
+);
+
+
+ +#### `duckdb_append_uint32` + +Append a uint32_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_uint32(
+  duckdb_appender appender,
+  uint32_t value
+);
+
+
+ +#### `duckdb_append_uint64` + +Append a uint64_t value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_uint64(
+  duckdb_appender appender,
+  uint64_t value
+);
+
+
+ +#### `duckdb_append_uhugeint` + +Append a duckdb_uhugeint value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_uhugeint(
+  duckdb_appender appender,
+  duckdb_uhugeint value
+);
+
+
+ +#### `duckdb_append_float` + +Append a float value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_float(
+  duckdb_appender appender,
+  float value
+);
+
+
+ +#### `duckdb_append_double` + +Append a double value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_double(
+  duckdb_appender appender,
+  double value
+);
+
+
+ +#### `duckdb_append_date` + +Append a duckdb_date value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_date(
+  duckdb_appender appender,
+  duckdb_date value
+);
+
+
+ +#### `duckdb_append_time` + +Append a duckdb_time value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_time(
+  duckdb_appender appender,
+  duckdb_time value
+);
+
+
+ +#### `duckdb_append_timestamp` + +Append a duckdb_timestamp value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_timestamp(
+  duckdb_appender appender,
+  duckdb_timestamp value
+);
+
+
+ +#### `duckdb_append_interval` + +Append a duckdb_interval value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_interval(
+  duckdb_appender appender,
+  duckdb_interval value
+);
+
+
+ +#### `duckdb_append_varchar` + +Append a varchar value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_varchar(
+  duckdb_appender appender,
+  const char *val
+);
+
+
+ +#### `duckdb_append_varchar_length` + +Append a varchar value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_varchar_length(
+  duckdb_appender appender,
+  const char *val,
+  idx_t length
+);
+
+
+ +#### `duckdb_append_blob` + +Append a blob value to the appender. + +##### Syntax + +
duckdb_state duckdb_append_blob(
+  duckdb_appender appender,
+  const void *data,
+  idx_t length
+);
+
+
+ +#### `duckdb_append_null` + +Append a NULL value to the appender (of any type). + +##### Syntax + +
duckdb_state duckdb_append_null(
+  duckdb_appender appender
+);
+
+
+ +#### `duckdb_append_data_chunk` + +Appends a pre-filled data chunk to the specified appender. + +The types of the data chunk must exactly match the types of the table, no casting is performed. +If the types do not match or the appender is in an invalid state, DuckDBError is returned. +If the append is successful, DuckDBSuccess is returned. + +##### Syntax + +
duckdb_state duckdb_append_data_chunk(
+  duckdb_appender appender,
+  duckdb_data_chunk chunk
+);
+
+ +##### Parameters + +* `appender` + +The appender to append to. +* `chunk` + +The data chunk to append. +* `returns` + +The return state. + +
+ +#### `duckdb_query_arrow` + +> Deprecated This method is scheduled for removal in a future release. + +Executes a SQL query within a connection and stores the full (materialized) result in an arrow structure. +If the query fails to execute, DuckDBError is returned and the error message can be retrieved by calling +`duckdb_query_arrow_error`. + +Note that after running `duckdb_query_arrow`, `duckdb_destroy_arrow` must be called on the result object even if the +query fails, otherwise the error stored within the result will not be freed correctly. + +##### Syntax + +
duckdb_state duckdb_query_arrow(
+  duckdb_connection connection,
+  const char *query,
+  duckdb_arrow *out_result
+);
+
+ +##### Parameters + +* `connection` + +The connection to perform the query in. +* `query` + +The SQL query to run. +* `out_result` + +The query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_query_arrow_schema` + +> Deprecated This method is scheduled for removal in a future release. + +Fetch the internal arrow schema from the arrow result. Remember to call release on the respective +ArrowSchema object. + +##### Syntax + +
duckdb_state duckdb_query_arrow_schema(
+  duckdb_arrow result,
+  duckdb_arrow_schema *out_schema
+);
+
+ +##### Parameters + +* `result` + +The result to fetch the schema from. +* `out_schema` + +The output schema. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_prepared_arrow_schema` + +> Deprecated This method is scheduled for removal in a future release. + +Fetch the internal arrow schema from the prepared statement. Remember to call release on the respective +ArrowSchema object. + +##### Syntax + +
duckdb_state duckdb_prepared_arrow_schema(
+  duckdb_prepared_statement prepared,
+  duckdb_arrow_schema *out_schema
+);
+
+ +##### Parameters + +* `result` + +The prepared statement to fetch the schema from. +* `out_schema` + +The output schema. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_result_arrow_array` + +> Deprecated This method is scheduled for removal in a future release. + +Convert a data chunk into an arrow struct array. Remember to call release on the respective +ArrowArray object. + +##### Syntax + +
void duckdb_result_arrow_array(
+  duckdb_result result,
+  duckdb_data_chunk chunk,
+  duckdb_arrow_array *out_array
+);
+
+ +##### Parameters + +* `result` + +The result object the data chunk have been fetched from. +* `chunk` + +The data chunk to convert. +* `out_array` + +The output array. + +
+ +#### `duckdb_query_arrow_array` + +> Deprecated This method is scheduled for removal in a future release. + +Fetch an internal arrow struct array from the arrow result. Remember to call release on the respective +ArrowArray object. + +This function can be called multiple time to get next chunks, which will free the previous out_array. +So consume the out_array before calling this function again. + +##### Syntax + +
duckdb_state duckdb_query_arrow_array(
+  duckdb_arrow result,
+  duckdb_arrow_array *out_array
+);
+
+ +##### Parameters + +* `result` + +The result to fetch the array from. +* `out_array` + +The output array. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_arrow_column_count` + +> Deprecated This method is scheduled for removal in a future release. + +Returns the number of columns present in the arrow result object. + +##### Syntax + +
idx_t duckdb_arrow_column_count(
+  duckdb_arrow result
+);
+
+ +##### Parameters + +* `result` + +The result object. +* `returns` + +The number of columns present in the result object. + +
+ +#### `duckdb_arrow_row_count` + +> Deprecated This method is scheduled for removal in a future release. + +Returns the number of rows present in the arrow result object. + +##### Syntax + +
idx_t duckdb_arrow_row_count(
+  duckdb_arrow result
+);
+
+ +##### Parameters + +* `result` + +The result object. +* `returns` + +The number of rows present in the result object. + +
+ +#### `duckdb_arrow_rows_changed` + +> Deprecated This method is scheduled for removal in a future release. + +Returns the number of rows changed by the query stored in the arrow result. This is relevant only for +INSERT/UPDATE/DELETE queries. For other queries the rows_changed will be 0. + +##### Syntax + +
idx_t duckdb_arrow_rows_changed(
+  duckdb_arrow result
+);
+
+ +##### Parameters + +* `result` + +The result object. +* `returns` + +The number of rows changed. + +
+ +#### `duckdb_query_arrow_error` + +> Deprecated This method is scheduled for removal in a future release. + +Returns the error message contained within the result. The error is only set if `duckdb_query_arrow` returns +`DuckDBError`. + +The error message should not be freed. It will be de-allocated when `duckdb_destroy_arrow` is called. + +##### Syntax + +
const char *duckdb_query_arrow_error(
+  duckdb_arrow result
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the error from. +* `returns` + +The error of the result. + +
+ +#### `duckdb_destroy_arrow` + +> Deprecated This method is scheduled for removal in a future release. + +Closes the result and de-allocates all memory allocated for the arrow result. + +##### Syntax + +
void duckdb_destroy_arrow(
+  duckdb_arrow *result
+);
+
+ +##### Parameters + +* `result` + +The result to destroy. + +
+ +#### `duckdb_destroy_arrow_stream` + +> Deprecated This method is scheduled for removal in a future release. + +Releases the arrow array stream and de-allocates its memory. + +##### Syntax + +
void duckdb_destroy_arrow_stream(
+  duckdb_arrow_stream *stream_p
+);
+
+ +##### Parameters + +* `stream` + +The arrow array stream to destroy. + +
+ +#### `duckdb_execute_prepared_arrow` + +> Deprecated This method is scheduled for removal in a future release. + +Executes the prepared statement with the given bound parameters, and returns an arrow query result. +Note that after running `duckdb_execute_prepared_arrow`, `duckdb_destroy_arrow` must be called on the result object. + +##### Syntax + +
duckdb_state duckdb_execute_prepared_arrow(
+  duckdb_prepared_statement prepared_statement,
+  duckdb_arrow *out_result
+);
+
+ +##### Parameters + +* `prepared_statement` + +The prepared statement to execute. +* `out_result` + +The query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_arrow_scan` + +> Deprecated This method is scheduled for removal in a future release. + +Scans the Arrow stream and creates a view with the given name. + +##### Syntax + +
duckdb_state duckdb_arrow_scan(
+  duckdb_connection connection,
+  const char *table_name,
+  duckdb_arrow_stream arrow
+);
+
+ +##### Parameters + +* `connection` + +The connection on which to execute the scan. +* `table_name` + +Name of the temporary view to create. +* `arrow` + +Arrow stream wrapper. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_arrow_array_scan` + +> Deprecated This method is scheduled for removal in a future release. + +Scans the Arrow array and creates a view with the given name. +Note that after running `duckdb_arrow_array_scan`, `duckdb_destroy_arrow_stream` must be called on the out stream. + +##### Syntax + +
duckdb_state duckdb_arrow_array_scan(
+  duckdb_connection connection,
+  const char *table_name,
+  duckdb_arrow_schema arrow_schema,
+  duckdb_arrow_array arrow_array,
+  duckdb_arrow_stream *out_stream
+);
+
+ +##### Parameters + +* `connection` + +The connection on which to execute the scan. +* `table_name` + +Name of the temporary view to create. +* `arrow_schema` + +Arrow schema wrapper. +* `arrow_array` + +Arrow array wrapper. +* `out_stream` + +Output array stream that wraps around the passed schema, for releasing/deleting once done. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +#### `duckdb_execute_tasks` + +Execute DuckDB tasks on this thread. + +Will return after `max_tasks` have been executed, or if there are no more tasks present. + +##### Syntax + +
void duckdb_execute_tasks(
+  duckdb_database database,
+  idx_t max_tasks
+);
+
+ +##### Parameters + +* `database` + +The database object to execute tasks for +* `max_tasks` + +The maximum amount of tasks to execute + +
+ +#### `duckdb_create_task_state` + +Creates a task state that can be used with duckdb_execute_tasks_state to execute tasks until +`duckdb_finish_execution` is called on the state. + +`duckdb_destroy_state` must be called on the result. + +##### Syntax + +
duckdb_task_state duckdb_create_task_state(
+  duckdb_database database
+);
+
+ +##### Parameters + +* `database` + +The database object to create the task state for +* `returns` + +The task state that can be used with duckdb_execute_tasks_state. + +
+ +#### `duckdb_execute_tasks_state` + +Execute DuckDB tasks on this thread. + +The thread will keep on executing tasks forever, until duckdb_finish_execution is called on the state. +Multiple threads can share the same duckdb_task_state. + +##### Syntax + +
void duckdb_execute_tasks_state(
+  duckdb_task_state state
+);
+
+ +##### Parameters + +* `state` + +The task state of the executor + +
+ +#### `duckdb_execute_n_tasks_state` + +Execute DuckDB tasks on this thread. + +The thread will keep on executing tasks until either duckdb_finish_execution is called on the state, +max_tasks tasks have been executed or there are no more tasks to be executed. + +Multiple threads can share the same duckdb_task_state. + +##### Syntax + +
idx_t duckdb_execute_n_tasks_state(
+  duckdb_task_state state,
+  idx_t max_tasks
+);
+
+ +##### Parameters + +* `state` + +The task state of the executor +* `max_tasks` + +The maximum amount of tasks to execute +* `returns` + +The amount of tasks that have actually been executed + +
+ +#### `duckdb_finish_execution` + +Finish execution on a specific task. + +##### Syntax + +
void duckdb_finish_execution(
+  duckdb_task_state state
+);
+
+ +##### Parameters + +* `state` + +The task state to finish execution + +
+ +#### `duckdb_task_state_is_finished` + +Check if the provided duckdb_task_state has finished execution + +##### Syntax + +
bool duckdb_task_state_is_finished(
+  duckdb_task_state state
+);
+
+ +##### Parameters + +* `state` + +The task state to inspect +* `returns` + +Whether or not duckdb_finish_execution has been called on the task state + +
+ +#### `duckdb_destroy_task_state` + +Destroys the task state returned from duckdb_create_task_state. + +Note that this should not be called while there is an active duckdb_execute_tasks_state running +on the task state. + +##### Syntax + +
void duckdb_destroy_task_state(
+  duckdb_task_state state
+);
+
+ +##### Parameters + +* `state` + +The task state to clean up + +
+ +#### `duckdb_execution_is_finished` + +Returns true if the execution of the current query is finished. + +##### Syntax + +
bool duckdb_execution_is_finished(
+  duckdb_connection con
+);
+
+ +##### Parameters + +* `con` + +The connection on which to check + +
+ +#### `duckdb_stream_fetch_chunk` + +> Deprecated This method is scheduled for removal in a future release. + +Fetches a data chunk from the (streaming) duckdb_result. This function should be called repeatedly until the result is +exhausted. + +The result must be destroyed with `duckdb_destroy_data_chunk`. + +This function can only be used on duckdb_results created with 'duckdb_pending_prepared_streaming' + +If this function is used, none of the other result functions can be used and vice versa (i.e., this function cannot be +mixed with the legacy result functions or the materialized result functions). + +It is not known beforehand how many chunks will be returned by this result. + +##### Syntax + +
duckdb_data_chunk duckdb_stream_fetch_chunk(
+  duckdb_result result
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the data chunk from. +* `returns` + +The resulting data chunk. Returns `NULL` if the result has an error. + +
+ +#### `duckdb_fetch_chunk` + +Fetches a data chunk from a duckdb_result. This function should be called repeatedly until the result is exhausted. + +The result must be destroyed with `duckdb_destroy_data_chunk`. + +It is not known beforehand how many chunks will be returned by this result. + +##### Syntax + +
duckdb_data_chunk duckdb_fetch_chunk(
+  duckdb_result result
+);
+
+ +##### Parameters + +* `result` + +The result object to fetch the data chunk from. +* `returns` + +The resulting data chunk. Returns `NULL` if the result has an error. + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/appender.md b/docs/archive/1.0/api/c/appender.md new file mode 100644 index 00000000000..cfc76e1ac4f --- /dev/null +++ b/docs/archive/1.0/api/c/appender.md @@ -0,0 +1,595 @@ +--- +layout: docu +title: Appender +--- + +Appenders are the most efficient way of loading data into DuckDB from within the C interface, and are recommended for +fast data loading. The appender is much faster than using prepared statements or individual `INSERT INTO` statements. + +Appends are made in row-wise format. For every column, a `duckdb_append_[type]` call should be made, after which +the row should be finished by calling `duckdb_appender_end_row`. After all rows have been appended, +`duckdb_appender_destroy` should be used to finalize the appender and clean up the resulting memory. + +Note that `duckdb_appender_destroy` should always be called on the resulting appender, even if the function returns +`DuckDBError`. + +## Example + +```c +duckdb_query(con, "CREATE TABLE people (id INTEGER, name VARCHAR)", NULL); + +duckdb_appender appender; +if (duckdb_appender_create(con, NULL, "people", &appender) == DuckDBError) { + // handle error +} +// append the first row (1, Mark) +duckdb_append_int32(appender, 1); +duckdb_append_varchar(appender, "Mark"); +duckdb_appender_end_row(appender); + +// append the second row (2, Hannes) +duckdb_append_int32(appender, 2); +duckdb_append_varchar(appender, "Hannes"); +duckdb_appender_end_row(appender); + +// finish appending and flush all the rows to the table +duckdb_appender_destroy(&appender); +``` + +## API Reference Overview + + + +
duckdb_state duckdb_appender_create(duckdb_connection connection, const char *schema, const char *table, duckdb_appender *out_appender);
+idx_t duckdb_appender_column_count(duckdb_appender appender);
+duckdb_logical_type duckdb_appender_column_type(duckdb_appender appender, idx_t col_idx);
+const char *duckdb_appender_error(duckdb_appender appender);
+duckdb_state duckdb_appender_flush(duckdb_appender appender);
+duckdb_state duckdb_appender_close(duckdb_appender appender);
+duckdb_state duckdb_appender_destroy(duckdb_appender *appender);
+duckdb_state duckdb_appender_begin_row(duckdb_appender appender);
+duckdb_state duckdb_appender_end_row(duckdb_appender appender);
+duckdb_state duckdb_append_bool(duckdb_appender appender, bool value);
+duckdb_state duckdb_append_int8(duckdb_appender appender, int8_t value);
+duckdb_state duckdb_append_int16(duckdb_appender appender, int16_t value);
+duckdb_state duckdb_append_int32(duckdb_appender appender, int32_t value);
+duckdb_state duckdb_append_int64(duckdb_appender appender, int64_t value);
+duckdb_state duckdb_append_hugeint(duckdb_appender appender, duckdb_hugeint value);
+duckdb_state duckdb_append_uint8(duckdb_appender appender, uint8_t value);
+duckdb_state duckdb_append_uint16(duckdb_appender appender, uint16_t value);
+duckdb_state duckdb_append_uint32(duckdb_appender appender, uint32_t value);
+duckdb_state duckdb_append_uint64(duckdb_appender appender, uint64_t value);
+duckdb_state duckdb_append_uhugeint(duckdb_appender appender, duckdb_uhugeint value);
+duckdb_state duckdb_append_float(duckdb_appender appender, float value);
+duckdb_state duckdb_append_double(duckdb_appender appender, double value);
+duckdb_state duckdb_append_date(duckdb_appender appender, duckdb_date value);
+duckdb_state duckdb_append_time(duckdb_appender appender, duckdb_time value);
+duckdb_state duckdb_append_timestamp(duckdb_appender appender, duckdb_timestamp value);
+duckdb_state duckdb_append_interval(duckdb_appender appender, duckdb_interval value);
+duckdb_state duckdb_append_varchar(duckdb_appender appender, const char *val);
+duckdb_state duckdb_append_varchar_length(duckdb_appender appender, const char *val, idx_t length);
+duckdb_state duckdb_append_blob(duckdb_appender appender, const void *data, idx_t length);
+duckdb_state duckdb_append_null(duckdb_appender appender);
+duckdb_state duckdb_append_data_chunk(duckdb_appender appender, duckdb_data_chunk chunk);
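+
+// A usage sketch (not part of the generated signature list above), assuming duckdb.h and
+// stdio.h are included; the function name is illustrative only. It flushes pending rows
+// explicitly and reads the error message while the appender is still alive.
+static void flush_appender_sketch(duckdb_appender appender) {
+  if (duckdb_appender_flush(appender) == DuckDBError) {
+    fprintf(stderr, "append failed: %s\n", duckdb_appender_error(appender));
+  }
+}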
+
+ +### `duckdb_appender_create` + +Creates an appender object. + +Note that the object must be destroyed with `duckdb_appender_destroy`. + +#### Syntax + +
duckdb_state duckdb_appender_create(
+  duckdb_connection connection,
+  const char *schema,
+  const char *table,
+  duckdb_appender *out_appender
+);
+
+ +#### Parameters + +* `connection` + +The connection context to create the appender in. +* `schema` + +The schema of the table to append to, or `nullptr` for the default schema. +* `table` + +The table name to append to. +* `out_appender` + +The resulting appender object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_appender_column_count` + +Returns the number of columns in the table that belongs to the appender. + +* appender The appender to get the column count from. + +#### Syntax + +
idx_t duckdb_appender_column_count(
+  duckdb_appender appender
+);
+
+ +#### Parameters + +* `returns` + +The number of columns in the table. + +
+ +### `duckdb_appender_column_type` + +Returns the type of the column at the specified index. + +Note: The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +* appender The appender to get the column type from. +* col_idx The index of the column to get the type of. + +#### Syntax + +
duckdb_logical_type duckdb_appender_column_type(
+  duckdb_appender appender,
+  idx_t col_idx
+);
+
+ +#### Parameters + +* `returns` + +The duckdb_logical_type of the column. + +
+ +### `duckdb_appender_error` + +Returns the error message associated with the given appender. +If the appender has no error message, this returns `nullptr` instead. + +The error message should not be freed. It will be de-allocated when `duckdb_appender_destroy` is called. + +#### Syntax + +
const char *duckdb_appender_error(
+  duckdb_appender appender
+);
+
+ +#### Parameters + +* `appender` + +The appender to get the error from. +* `returns` + +The error message, or `nullptr` if there is none. + +
+ +### `duckdb_appender_flush` + +Flush the appender to the table, forcing the cache of the appender to be cleared. If flushing the data triggers a +constraint violation or any other error, then all data is invalidated, and this function returns DuckDBError. +It is not possible to append more values. Call duckdb_appender_error to obtain the error message followed by +duckdb_appender_destroy to destroy the invalidated appender. + +#### Syntax + +
duckdb_state duckdb_appender_flush(
+  duckdb_appender appender
+);
+
+ +#### Parameters + +* `appender` + +The appender to flush. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_appender_close` + +Closes the appender by flushing all intermediate states and closing it for further appends. If flushing the data +triggers a constraint violation or any other error, then all data is invalidated, and this function returns DuckDBError. +Call duckdb_appender_error to obtain the error message followed by duckdb_appender_destroy to destroy the invalidated +appender. + +#### Syntax + +
duckdb_state duckdb_appender_close(
+  duckdb_appender appender
+);
+
+ +#### Parameters + +* `appender` + +The appender to flush and close. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_appender_destroy` + +Closes the appender by flushing all intermediate states to the table and destroying it. By destroying it, this function +de-allocates all memory associated with the appender. If flushing the data triggers a constraint violation, +then all data is invalidated, and this function returns DuckDBError. Due to the destruction of the appender, it is no +longer possible to obtain the specific error message with duckdb_appender_error. Therefore, call duckdb_appender_close +before destroying the appender, if you need insights into the specific error. + +#### Syntax + +
duckdb_state duckdb_appender_destroy(
+  duckdb_appender *appender
+);
+
+ +#### Parameters + +* `appender` + +The appender to flush, close and destroy. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_appender_begin_row` + +A nop function, provided for backwards compatibility reasons. Does nothing. Only `duckdb_appender_end_row` is required. + +#### Syntax + +
duckdb_state duckdb_appender_begin_row(
+  duckdb_appender appender
+);
+
+
+ +### `duckdb_appender_end_row` + +Finish the current row of appends. After end_row is called, the next row can be appended. + +#### Syntax + +
duckdb_state duckdb_appender_end_row(
+  duckdb_appender appender
+);
+
+ +#### Parameters + +* `appender` + +The appender. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_append_bool` + +Append a bool value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_bool(
+  duckdb_appender appender,
+  bool value
+);
+
+
+ +### `duckdb_append_int8` + +Append an int8_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_int8(
+  duckdb_appender appender,
+  int8_t value
+);
+
+
+ +### `duckdb_append_int16` + +Append an int16_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_int16(
+  duckdb_appender appender,
+  int16_t value
+);
+
+
+ +### `duckdb_append_int32` + +Append an int32_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_int32(
+  duckdb_appender appender,
+  int32_t value
+);
+
+
+ +### `duckdb_append_int64` + +Append an int64_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_int64(
+  duckdb_appender appender,
+  int64_t value
+);
+
+
+ +### `duckdb_append_hugeint` + +Append a duckdb_hugeint value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_hugeint(
+  duckdb_appender appender,
+  duckdb_hugeint value
+);
+
+
+ +### `duckdb_append_uint8` + +Append a uint8_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_uint8(
+  duckdb_appender appender,
+  uint8_t value
+);
+
+
+ +### `duckdb_append_uint16` + +Append a uint16_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_uint16(
+  duckdb_appender appender,
+  uint16_t value
+);
+
+
+ +### `duckdb_append_uint32` + +Append a uint32_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_uint32(
+  duckdb_appender appender,
+  uint32_t value
+);
+
+
+ +### `duckdb_append_uint64` + +Append a uint64_t value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_uint64(
+  duckdb_appender appender,
+  uint64_t value
+);
+
+
+ +### `duckdb_append_uhugeint` + +Append a duckdb_uhugeint value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_uhugeint(
+  duckdb_appender appender,
+  duckdb_uhugeint value
+);
+
+
+ +### `duckdb_append_float` + +Append a float value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_float(
+  duckdb_appender appender,
+  float value
+);
+
+
+ +### `duckdb_append_double` + +Append a double value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_double(
+  duckdb_appender appender,
+  double value
+);
+
+
+ +### `duckdb_append_date` + +Append a duckdb_date value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_date(
+  duckdb_appender appender,
+  duckdb_date value
+);
+
+
+ +### `duckdb_append_time` + +Append a duckdb_time value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_time(
+  duckdb_appender appender,
+  duckdb_time value
+);
+
+
+ +### `duckdb_append_timestamp` + +Append a duckdb_timestamp value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_timestamp(
+  duckdb_appender appender,
+  duckdb_timestamp value
+);
+
+
+ +### `duckdb_append_interval` + +Append a duckdb_interval value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_interval(
+  duckdb_appender appender,
+  duckdb_interval value
+);
+
+
+ +### `duckdb_append_varchar` + +Append a varchar value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_varchar(
+  duckdb_appender appender,
+  const char *val
+);
+
+
+ +### `duckdb_append_varchar_length` + +Append a varchar value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_varchar_length(
+  duckdb_appender appender,
+  const char *val,
+  idx_t length
+);
+
+
+ +### `duckdb_append_blob` + +Append a blob value to the appender. + +#### Syntax + +
duckdb_state duckdb_append_blob(
+  duckdb_appender appender,
+  const void *data,
+  idx_t length
+);
+
+
+ +### `duckdb_append_null` + +Append a NULL value to the appender (of any type). + +#### Syntax + +
duckdb_state duckdb_append_null(
+  duckdb_appender appender
+);
+
+
+ +### `duckdb_append_data_chunk` + +Appends a pre-filled data chunk to the specified appender. + +The types of the data chunk must exactly match the types of the table, no casting is performed. +If the types do not match or the appender is in an invalid state, DuckDBError is returned. +If the append is successful, DuckDBSuccess is returned. + +#### Syntax + +
duckdb_state duckdb_append_data_chunk(
+  duckdb_appender appender,
+  duckdb_data_chunk chunk
+);
+
+ +#### Parameters + +* `appender` + +The appender to append to. +* `chunk` + +The data chunk to append. +* `returns` + +The return state. + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/config.md b/docs/archive/1.0/api/c/config.md new file mode 100644 index 00000000000..f63331ce514 --- /dev/null +++ b/docs/archive/1.0/api/c/config.md @@ -0,0 +1,182 @@ +--- +layout: docu +title: Configuration +--- + +Configuration options can be provided to change different settings of the database system. Note that many of these +settings can be changed later on using [`PRAGMA` statements](../../configuration/pragmas) as well. The configuration object +should be created, filled with values and passed to `duckdb_open_ext`. + +## Example + +```c +duckdb_database db; +duckdb_config config; + +// create the configuration object +if (duckdb_create_config(&config) == DuckDBError) { + // handle error +} +// set some configuration options +duckdb_set_config(config, "access_mode", "READ_WRITE"); // or READ_ONLY +duckdb_set_config(config, "threads", "8"); +duckdb_set_config(config, "max_memory", "8GB"); +duckdb_set_config(config, "default_order", "DESC"); + +// open the database using the configuration +if (duckdb_open_ext(NULL, &db, config, NULL) == DuckDBError) { + // handle error +} +// cleanup the configuration object +duckdb_destroy_config(&config); + +// run queries... + +// cleanup +duckdb_close(&db); +``` + +## API Reference Overview + + + +
duckdb_state duckdb_create_config(duckdb_config *out_config);
+size_t duckdb_config_count();
+duckdb_state duckdb_get_config_flag(size_t index, const char **out_name, const char **out_description);
+duckdb_state duckdb_set_config(duckdb_config config, const char *name, const char *option);
+void duckdb_destroy_config(duckdb_config *config);
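+
+// A usage sketch (not part of the generated signature list above), assuming duckdb.h and
+// stdio.h are included; the function name is illustrative only. It lists every available
+// configuration flag. duckdb_config_count is called once up front, since it should not be
+// evaluated inside a loop condition.
+static void print_config_flags_sketch(void) {
+  size_t count = duckdb_config_count();
+  for (size_t i = 0; i != count; i++) {
+    const char *name = NULL;
+    const char *description = NULL;
+    if (duckdb_get_config_flag(i, &name, &description) == DuckDBSuccess) {
+      printf("%s: %s\n", name, description);
+    }
+  }
+}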
+
+ +### `duckdb_create_config` + +Initializes an empty configuration object that can be used to provide start-up options for the DuckDB instance +through `duckdb_open_ext`. +The duckdb_config must be destroyed using 'duckdb_destroy_config' + +This will always succeed unless there is a malloc failure. + +#### Syntax + +
duckdb_state duckdb_create_config(
+  duckdb_config *out_config
+);
+
+ +#### Parameters + +* `out_config` + +The result configuration object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_config_count` + +This returns the total amount of configuration options available for usage with `duckdb_get_config_flag`. + +This should not be called in a loop as it internally loops over all the options. + +#### Syntax + +
size_t duckdb_config_count(
+  
+);
+
+ +#### Parameters + +* `returns` + +The amount of config options available. + +
+ +### `duckdb_get_config_flag` + +Obtains a human-readable name and description of a specific configuration option. This can be used to e.g. +display configuration options. This will succeed unless `index` is out of range (i.e., `>= duckdb_config_count`). + +The result name or description MUST NOT be freed. + +#### Syntax + +
duckdb_state duckdb_get_config_flag(
+  size_t index,
+  const char **out_name,
+  const char **out_description
+);
+
+ +#### Parameters + +* `index` + +The index of the configuration option (between 0 and `duckdb_config_count`) +* `out_name` + +A name of the configuration flag. +* `out_description` + +A description of the configuration flag. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_set_config` + +Sets the specified option for the specified configuration. The configuration option is indicated by name. +To obtain a list of config options, see `duckdb_get_config_flag`. + +In the source code, configuration options are defined in `config.cpp`. + +This can fail if either the name is invalid, or if the value provided for the option is invalid. + +#### Syntax + +
duckdb_state duckdb_set_config(
+  duckdb_config config,
+  const char *name,
+  const char *option
+);
+
+ +#### Parameters + +* `duckdb_config` + +The configuration object to set the option on. +* `name` + +The name of the configuration flag to set. +* `option` + +The value to set the configuration flag to. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_destroy_config` + +Destroys the specified configuration object and de-allocates all memory allocated for the object. + +#### Syntax + +
void duckdb_destroy_config(
+  duckdb_config *config
+);
+
+ +#### Parameters + +* `config` + +The configuration object to destroy. + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/connect.md b/docs/archive/1.0/api/c/connect.md new file mode 100644 index 00000000000..44ee692ab16 --- /dev/null +++ b/docs/archive/1.0/api/c/connect.md @@ -0,0 +1,232 @@ +--- +layout: docu +title: Startup & Shutdown +--- + +To use DuckDB, you must first initialize a `duckdb_database` handle using `duckdb_open()`. `duckdb_open()` takes as parameter the database file to read and write from. The special value `NULL` (`nullptr`) can be used to create an **in-memory database**. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the process). + +With the `duckdb_database` handle, you can create one or many `duckdb_connection` using `duckdb_connect()`. While individual connections are thread-safe, they will be locked during querying. It is therefore recommended that each thread uses its own connection to allow for the best parallel performance. + +All `duckdb_connection`s have to explicitly be disconnected with `duckdb_disconnect()` and the `duckdb_database` has to be explicitly closed with `duckdb_close()` to avoid memory and file handle leaking. + +## Example + +```c +duckdb_database db; +duckdb_connection con; + +if (duckdb_open(NULL, &db) == DuckDBError) { + // handle error +} +if (duckdb_connect(db, &con) == DuckDBError) { + // handle error +} + +// run queries... + +// cleanup +duckdb_disconnect(&con); +duckdb_close(&db); +``` + +## API Reference Overview + + + +
duckdb_state duckdb_open(const char *path, duckdb_database *out_database);
+duckdb_state duckdb_open_ext(const char *path, duckdb_database *out_database, duckdb_config config, char **out_error);
+void duckdb_close(duckdb_database *database);
+duckdb_state duckdb_connect(duckdb_database database, duckdb_connection *out_connection);
+void duckdb_interrupt(duckdb_connection connection);
+duckdb_query_progress_type duckdb_query_progress(duckdb_connection connection);
+void duckdb_disconnect(duckdb_connection *connection);
+const char *duckdb_library_version();
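+
+// A usage sketch (not part of the generated signature list above), assuming duckdb.h and
+// stdio.h are included; the function name is illustrative only. It opens a database with
+// duckdb_open_ext and reports a start-up failure; the error string must be freed with
+// duckdb_free.
+static duckdb_state open_with_config_sketch(const char *path, duckdb_config config, duckdb_database *out_db) {
+  char *error = NULL;
+  duckdb_state state = duckdb_open_ext(path, out_db, config, &error);
+  if (state == DuckDBError) {
+    fprintf(stderr, "failed to open database: %s\n", error ? error : "unknown error");
+    if (error) {
+      duckdb_free(error);
+    }
+  }
+  return state;
+}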
+
+ +### `duckdb_open` + +Creates a new database or opens an existing database file stored at the given path. +If no path is given a new in-memory database is created instead. +The instantiated database should be closed with 'duckdb_close'. + +#### Syntax + +
duckdb_state duckdb_open(
+  const char *path,
+  duckdb_database *out_database
+);
+
+ +#### Parameters + +* `path` + +Path to the database file on disk, or `nullptr` or `:memory:` to open an in-memory database. +* `out_database` + +The result database object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_open_ext` + +Extended version of duckdb_open. Creates a new database or opens an existing database file stored at the given path. +The instantiated database should be closed with 'duckdb_close'. + +#### Syntax + +
duckdb_state duckdb_open_ext(
+  const char *path,
+  duckdb_database *out_database,
+  duckdb_config config,
+  char **out_error
+);
+
+ +#### Parameters + +* `path` + +Path to the database file on disk, or `nullptr` or `:memory:` to open an in-memory database. +* `out_database` + +The result database object. +* `config` + +(Optional) configuration used to start up the database system. +* `out_error` + +If set and the function returns DuckDBError, this will contain the reason why the start-up failed. +Note that the error must be freed using `duckdb_free`. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_close` + +Closes the specified database and de-allocates all memory allocated for that database. +This should be called after you are done with any database allocated through `duckdb_open` or `duckdb_open_ext`. +Note that failing to call `duckdb_close` (in case of e.g., a program crash) will not cause data corruption. +Still, it is recommended to always correctly close a database object after you are done with it. + +#### Syntax + +
void duckdb_close(
+  duckdb_database *database
+);
+
+ +#### Parameters + +* `database` + +The database object to shut down. + +
+ +### `duckdb_connect` + +Opens a connection to a database. Connections are required to query the database, and store transactional state +associated with the connection. +The instantiated connection should be closed using 'duckdb_disconnect'. + +#### Syntax + +
duckdb_state duckdb_connect(
+  duckdb_database database,
+  duckdb_connection *out_connection
+);
+
+ +#### Parameters + +* `database` + +The database file to connect to. +* `out_connection` + +The result connection object. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_interrupt` + +Interrupt running query + +#### Syntax + +
void duckdb_interrupt(
+  duckdb_connection connection
+);
+
+ +#### Parameters + +* `connection` + +The connection to interrupt + +
+ +### `duckdb_query_progress` + +Get progress of the running query + +#### Syntax + +
duckdb_query_progress_type duckdb_query_progress(
+  duckdb_connection connection
+);
+
+ +#### Parameters + +* `connection` + +The working connection +* `returns` + +-1 if no progress or a percentage of the progress + +
+ +### `duckdb_disconnect` + +Closes the specified connection and de-allocates all memory allocated for that connection. + +#### Syntax + +
void duckdb_disconnect(
+  duckdb_connection *connection
+);
+
+ +#### Parameters + +* `connection` + +The connection to close. + +
+ +### `duckdb_library_version` + +Returns the version of the linked DuckDB, with a version postfix for dev versions + +Usually used for developing C extensions that must return this for a compatibility check. + +#### Syntax + +
const char *duckdb_library_version(
+  
+);
+
+
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/data_chunk.md b/docs/archive/1.0/api/c/data_chunk.md new file mode 100644 index 00000000000..961c7d284e5 --- /dev/null +++ b/docs/archive/1.0/api/c/data_chunk.md @@ -0,0 +1,188 @@ +--- +layout: docu +title: Data Chunks +--- + +Data chunks represent a horizontal slice of a table. They hold a number of [vectors]({% link docs/archive/1.0/api/c/vector.md %}), that can each hold up to the `VECTOR_SIZE` rows. The vector size can be obtained through the `duckdb_vector_size` function and is configurable, but is usually set to `2048`. + +Data chunks and vectors are what DuckDB uses natively to store and represent data. For this reason, the data chunk interface is the most efficient way of interfacing with DuckDB. Be aware, however, that correctly interfacing with DuckDB using the data chunk API does require knowledge of DuckDB's internal vector format. + +Data chunks can be used in two manners: + +* **Reading Data**: Data chunks can be obtained from query results using the `duckdb_fetch_chunk` method, or as input to a user-defined function. In this case, the [vector methods]({% link docs/archive/1.0/api/c/vector.md %}) can be used to read individual values. +* **Writing Data**: Data chunks can be created using `duckdb_create_data_chunk`. The data chunk can then be filled with values and used in `duckdb_append_data_chunk` to write data to the database. + +The primary manner of interfacing with data chunks is by obtaining the internal vectors of the data chunk using the `duckdb_data_chunk_get_vector` method. Afterwards, the [vector methods]({% link docs/archive/1.0/api/c/vector.md %}) can be used to read from or write to the individual vectors. + + +## API Reference Overview + + + +
duckdb_data_chunk duckdb_create_data_chunk(duckdb_logical_type *types, idx_t column_count);
+void duckdb_destroy_data_chunk(duckdb_data_chunk *chunk);
+void duckdb_data_chunk_reset(duckdb_data_chunk chunk);
+idx_t duckdb_data_chunk_get_column_count(duckdb_data_chunk chunk);
+duckdb_vector duckdb_data_chunk_get_vector(duckdb_data_chunk chunk, idx_t col_idx);
+idx_t duckdb_data_chunk_get_size(duckdb_data_chunk chunk);
+void duckdb_data_chunk_set_size(duckdb_data_chunk chunk, idx_t size);
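+
+// A usage sketch (not part of the generated signature list above), assuming duckdb.h is
+// included and an appender for a table with a single INTEGER column already exists;
+// the function name is illustrative only. It fills a one-tuple chunk through its vector
+// and hands it to duckdb_append_data_chunk.
+static duckdb_state append_one_row_sketch(duckdb_appender appender) {
+  duckdb_logical_type int_type = duckdb_create_logical_type(DUCKDB_TYPE_INTEGER);
+  duckdb_data_chunk chunk = duckdb_create_data_chunk(&int_type, 1);
+  // write the value for row 0 directly into the chunk's first vector
+  int32_t *data = (int32_t *) duckdb_vector_get_data(duckdb_data_chunk_get_vector(chunk, 0));
+  data[0] = 42;
+  duckdb_data_chunk_set_size(chunk, 1);
+  duckdb_state state = duckdb_append_data_chunk(appender, chunk);
+  duckdb_destroy_data_chunk(&chunk);
+  duckdb_destroy_logical_type(&int_type);
+  return state;
+}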
+
+ +### `duckdb_create_data_chunk` + +Creates an empty DataChunk with the specified set of types. + +Note that the result must be destroyed with `duckdb_destroy_data_chunk`. + +#### Syntax + +
duckdb_data_chunk duckdb_create_data_chunk(
+  duckdb_logical_type *types,
+  idx_t column_count
+);
+
+ +#### Parameters + +* `types` + +An array of types of the data chunk. +* `column_count` + +The number of columns. +* `returns` + +The data chunk. + +
+ +### `duckdb_destroy_data_chunk` + +Destroys the data chunk and de-allocates all memory allocated for that chunk. + +#### Syntax + +
void duckdb_destroy_data_chunk(
+  duckdb_data_chunk *chunk
+);
+
+ +#### Parameters + +* `chunk` + +The data chunk to destroy. + +
+ +### `duckdb_data_chunk_reset` + +Resets a data chunk, clearing the validity masks and setting the cardinality of the data chunk to 0. + +#### Syntax + +
void duckdb_data_chunk_reset(
+  duckdb_data_chunk chunk
+);
+
+ +#### Parameters + +* `chunk` + +The data chunk to reset. + +
+ +### `duckdb_data_chunk_get_column_count` + +Retrieves the number of columns in a data chunk. + +#### Syntax + +
idx_t duckdb_data_chunk_get_column_count(
+  duckdb_data_chunk chunk
+);
+
+ +#### Parameters + +* `chunk` + +The data chunk to get the data from +* `returns` + +The number of columns in the data chunk + +
+ +### `duckdb_data_chunk_get_vector` + +Retrieves the vector at the specified column index in the data chunk. + +The pointer to the vector is valid for as long as the chunk is alive. +It does NOT need to be destroyed. + +#### Syntax + +
duckdb_vector duckdb_data_chunk_get_vector(
+  duckdb_data_chunk chunk,
+  idx_t col_idx
+);
+
+ +#### Parameters + +* `chunk` + +The data chunk to get the data from +* `returns` + +The vector + +
+ +### `duckdb_data_chunk_get_size` + +Retrieves the current number of tuples in a data chunk. + +#### Syntax + +
idx_t duckdb_data_chunk_get_size(
+  duckdb_data_chunk chunk
+);
+
+ +#### Parameters + +* `chunk` + +The data chunk to get the data from +* `returns` + +The number of tuples in the data chunk + +
+ +### `duckdb_data_chunk_set_size` + +Sets the current number of tuples in a data chunk. + +#### Syntax + +
void duckdb_data_chunk_set_size(
+  duckdb_data_chunk chunk,
+  idx_t size
+);
+
+ +#### Parameters + +* `chunk` + +The data chunk to set the size in +* `size` + +The number of tuples in the data chunk + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/overview.md b/docs/archive/1.0/api/c/overview.md new file mode 100644 index 00000000000..9179e555ad3 --- /dev/null +++ b/docs/archive/1.0/api/c/overview.md @@ -0,0 +1,15 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/api/c +- /docs/archive/1.0/api/c/ +title: Overview +--- + +DuckDB implements a custom C API modelled somewhat following the SQLite C API. The API is contained in the `duckdb.h` header. Continue to [Startup & Shutdown]({% link docs/archive/1.0/api/c/connect.md %}) to get started, or check out the [Full API overview]({% link docs/archive/1.0/api/c/api.md %}). + +We also provide a SQLite API wrapper which means that if your applications is programmed against the SQLite C API, you can re-link to DuckDB and it should continue working. See the [`sqlite_api_wrapper`](https://github.com/duckdb/duckdb/tree/main/tools/sqlite3_api_wrapper) folder in our source repository for more information. + +## Installation + +The DuckDB C API can be installed as part of the `libduckdb` packages. Please see the [installation page](../../installation?environment=cplusplus) for details. \ No newline at end of file diff --git a/docs/archive/1.0/api/c/prepared.md b/docs/archive/1.0/api/c/prepared.md new file mode 100644 index 00000000000..afd0548e9c9 --- /dev/null +++ b/docs/archive/1.0/api/c/prepared.md @@ -0,0 +1,246 @@ +--- +layout: docu +title: Prepared Statements +--- + +A prepared statement is a parameterized query. The query is prepared with question marks (`?`) or dollar symbols (`$1`) indicating the parameters of the query. Values can then be bound to these parameters, after which the prepared statement can be executed using those parameters. A single query can be prepared once and executed many times. + +Prepared statements are useful to: +* Easily supply parameters to functions while avoiding string concatenation/SQL injection attacks. +* Speeding up queries that will be executed many times with different parameters. + +DuckDB supports prepared statements in the C API with the `duckdb_prepare` method. The `duckdb_bind` family of functions is used to supply values for subsequent execution of the prepared statement using `duckdb_execute_prepared`. After we are done with the prepared statement it can be cleaned up using the `duckdb_destroy_prepare` method. + +## Example + +```c +duckdb_prepared_statement stmt; +duckdb_result result; +if (duckdb_prepare(con, "INSERT INTO integers VALUES ($1, $2)", &stmt) == DuckDBError) { + // handle error +} + +duckdb_bind_int32(stmt, 1, 42); // the parameter index starts counting at 1! +duckdb_bind_int32(stmt, 2, 43); +// NULL as second parameter means no result set is requested +duckdb_execute_prepared(stmt, NULL); +duckdb_destroy_prepare(&stmt); + +// we can also query result sets using prepared statements +if (duckdb_prepare(con, "SELECT * FROM integers WHERE i = ?", &stmt) == DuckDBError) { + // handle error +} +duckdb_bind_int32(stmt, 1, 42); +duckdb_execute_prepared(stmt, &result); + +// do something with result + +// clean up +duckdb_destroy_result(&result); +duckdb_destroy_prepare(&stmt); +``` + +After calling `duckdb_prepare`, the prepared statement parameters can be inspected using `duckdb_nparams` and `duckdb_param_type`. In case the prepare fails, the error can be obtained through `duckdb_prepare_error`. + +It is not required that the `duckdb_bind` family of functions matches the prepared statement parameter type exactly. 
The values will be auto-cast to the required value as required. For example, calling `duckdb_bind_int8` on a parameter type of `DUCKDB_TYPE_INTEGER` will work as expected. + +> Warning Do **not** use prepared statements to insert large amounts of data into DuckDB. Instead it is recommended to use the [Appender]({% link docs/archive/1.0/api/c/appender.md %}). + +## API Reference Overview + + + +
duckdb_state duckdb_prepare(duckdb_connection connection, const char *query, duckdb_prepared_statement *out_prepared_statement);
+void duckdb_destroy_prepare(duckdb_prepared_statement *prepared_statement);
+const char *duckdb_prepare_error(duckdb_prepared_statement prepared_statement);
+idx_t duckdb_nparams(duckdb_prepared_statement prepared_statement);
+const char *duckdb_parameter_name(duckdb_prepared_statement prepared_statement, idx_t index);
+duckdb_type duckdb_param_type(duckdb_prepared_statement prepared_statement, idx_t param_idx);
+duckdb_state duckdb_clear_bindings(duckdb_prepared_statement prepared_statement);
+duckdb_statement_type duckdb_prepared_statement_type(duckdb_prepared_statement statement);
+
+ +### `duckdb_prepare` + +Create a prepared statement object from a query. + +Note that after calling `duckdb_prepare`, the prepared statement should always be destroyed using +`duckdb_destroy_prepare`, even if the prepare fails. + +If the prepare fails, `duckdb_prepare_error` can be called to obtain the reason why the prepare failed. + +#### Syntax + +
duckdb_state duckdb_prepare(
+  duckdb_connection connection,
+  const char *query,
+  duckdb_prepared_statement *out_prepared_statement
+);
+
+ +#### Parameters + +* `connection` + +The connection object +* `query` + +The SQL query to prepare +* `out_prepared_statement` + +The resulting prepared statement object +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_destroy_prepare` + +Closes the prepared statement and de-allocates all memory allocated for the statement. + +#### Syntax + +
void duckdb_destroy_prepare(
+  duckdb_prepared_statement *prepared_statement
+);
+
+ +#### Parameters + +* `prepared_statement` + +The prepared statement to destroy. + +
+ +### `duckdb_prepare_error` + +Returns the error message associated with the given prepared statement. +If the prepared statement has no error message, this returns `nullptr` instead. + +The error message should not be freed. It will be de-allocated when `duckdb_destroy_prepare` is called. + +#### Syntax + +
const char *duckdb_prepare_error(
+  duckdb_prepared_statement prepared_statement
+);
+
+ +#### Parameters + +* `prepared_statement` + +The prepared statement to obtain the error from. +* `returns` + +The error message, or `nullptr` if there is none. + +
+ +### `duckdb_nparams` + +Returns the number of parameters that can be provided to the given prepared statement. + +Returns 0 if the query was not successfully prepared. + +#### Syntax + +
idx_t duckdb_nparams(
+  duckdb_prepared_statement prepared_statement
+);
+
+ +#### Parameters + +* `prepared_statement` + +The prepared statement to obtain the number of parameters for. + +
+ +### `duckdb_parameter_name` + +Returns the name used to identify the parameter +The returned string should be freed using `duckdb_free`. + +Returns NULL if the index is out of range for the provided prepared statement. + +#### Syntax + +
const char *duckdb_parameter_name(
+  duckdb_prepared_statement prepared_statement,
+  idx_t index
+);
+
+ +#### Parameters + +* `prepared_statement` + +The prepared statement for which to get the parameter name from. + +
+ +### `duckdb_param_type` + +Returns the parameter type for the parameter at the given index. + +Returns `DUCKDB_TYPE_INVALID` if the parameter index is out of range or the statement was not successfully prepared. + +#### Syntax + +
duckdb_type duckdb_param_type(
+  duckdb_prepared_statement prepared_statement,
+  idx_t param_idx
+);
+
+ +#### Parameters + +* `prepared_statement` + +The prepared statement. +* `param_idx` + +The parameter index. +* `returns` + +The parameter type + +
+ +### `duckdb_clear_bindings` + +Clear the params bind to the prepared statement. + +#### Syntax + +
duckdb_state duckdb_clear_bindings(
+  duckdb_prepared_statement prepared_statement
+);
+
+
+ +### `duckdb_prepared_statement_type` + +Returns the statement type of the statement to be executed + +#### Syntax + +
duckdb_statement_type duckdb_prepared_statement_type(
+  duckdb_prepared_statement statement
+);
+
+ +#### Parameters + +* `statement` + +The prepared statement. +* `returns` + +duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/query.md b/docs/archive/1.0/api/c/query.md new file mode 100644 index 00000000000..02b5cdebdf1 --- /dev/null +++ b/docs/archive/1.0/api/c/query.md @@ -0,0 +1,486 @@ +--- +layout: docu +title: Query +--- + +The `duckdb_query` method allows SQL queries to be run in DuckDB from C. This method takes two parameters, a (null-terminated) SQL query string and a `duckdb_result` result pointer. The result pointer may be `NULL` if the application is not interested in the result set or if the query produces no result. After the result is consumed, the `duckdb_destroy_result` method should be used to clean up the result. + +Elements can be extracted from the `duckdb_result` object using a variety of methods. The `duckdb_column_count` can be used to extract the number of columns. `duckdb_column_name` and `duckdb_column_type` can be used to extract the names and types of individual columns. + +## Example + +```c +duckdb_state state; +duckdb_result result; + +// create a table +state = duckdb_query(con, "CREATE TABLE integers (i INTEGER, j INTEGER);", NULL); +if (state == DuckDBError) { + // handle error +} +// insert three rows into the table +state = duckdb_query(con, "INSERT INTO integers VALUES (3, 4), (5, 6), (7, NULL);", NULL); +if (state == DuckDBError) { + // handle error +} +// query rows again +state = duckdb_query(con, "SELECT * FROM integers", &result); +if (state == DuckDBError) { + // handle error +} +// handle the result +// ... + +// destroy the result after we are done with it +duckdb_destroy_result(&result); +``` + +## Value Extraction + +Values can be extracted using either the `duckdb_fetch_chunk` function, or using the `duckdb_value` convenience functions. The `duckdb_fetch_chunk` function directly hands you data chunks in DuckDB's native array format and can therefore be very fast. The `duckdb_value` functions perform bounds- and type-checking, and will automatically cast values to the desired type. This makes them more convenient and easier to use, at the expense of being slower. + +See the [Types]({% link docs/archive/1.0/api/c/types.md %}) page for more information. + +> For optimal performance, use `duckdb_fetch_chunk` to extract data from the query result. +> The `duckdb_value` functions perform internal type-checking, bounds-checking and casting which makes them slower. + +### `duckdb_fetch_chunk` + +Below is an end-to-end example that prints the above result to CSV format using the `duckdb_fetch_chunk` function. +Note that the function is NOT generic: we do need to know exactly what the types of the result columns are. 
+ +```c +duckdb_database db; +duckdb_connection con; +duckdb_open(nullptr, &db); +duckdb_connect(db, &con); + +duckdb_result res; +duckdb_query(con, "CREATE TABLE integers (i INTEGER, j INTEGER);", NULL); +duckdb_query(con, "INSERT INTO integers VALUES (3, 4), (5, 6), (7, NULL);", NULL); +duckdb_query(con, "SELECT * FROM integers;", &res); + +// iterate until result is exhausted +while (true) { + duckdb_data_chunk result = duckdb_fetch_chunk(res); + if (!result) { + // result is exhausted + break; + } + // get the number of rows from the data chunk + idx_t row_count = duckdb_data_chunk_get_size(result); + // get the first column + duckdb_vector col1 = duckdb_data_chunk_get_vector(result, 0); + int32_t *col1_data = (int32_t *) duckdb_vector_get_data(col1); + uint64_t *col1_validity = duckdb_vector_get_validity(col1); + + // get the second column + duckdb_vector col2 = duckdb_data_chunk_get_vector(result, 1); + int32_t *col2_data = (int32_t *) duckdb_vector_get_data(col2); + uint64_t *col2_validity = duckdb_vector_get_validity(col2); + + // iterate over the rows + for (idx_t row = 0; row < row_count; row++) { + if (duckdb_validity_row_is_valid(col1_validity, row)) { + printf("%d", col1_data[row]); + } else { + printf("NULL"); + } + printf(","); + if (duckdb_validity_row_is_valid(col2_validity, row)) { + printf("%d", col2_data[row]); + } else { + printf("NULL"); + } + printf("\n"); + } + duckdb_destroy_data_chunk(&result); +} +// clean-up +duckdb_destroy_result(&res); +duckdb_disconnect(&con); +duckdb_close(&db); +``` + +This prints the following result: + +```csv +3,4 +5,6 +7,NULL +``` + +### `duckdb_value` + +> Deprecated The `duckdb_value` functions are deprecated and are scheduled for removal in a future release. + +Below is an example that prints the above result to CSV format using the `duckdb_value_varchar` function. +Note that the function is generic: we do not need to know about the types of the individual result columns. + +```c +// print the above result to CSV format using `duckdb_value_varchar` +idx_t row_count = duckdb_row_count(&result); +idx_t column_count = duckdb_column_count(&result); +for (idx_t row = 0; row < row_count; row++) { + for (idx_t col = 0; col < column_count; col++) { + if (col > 0) printf(","); + auto str_val = duckdb_value_varchar(&result, col, row); + printf("%s", str_val); + duckdb_free(str_val); + } + printf("\n"); +} +``` + +## API Reference Overview + + + +
duckdb_state duckdb_query(duckdb_connection connection, const char *query, duckdb_result *out_result);
+void duckdb_destroy_result(duckdb_result *result);
+const char *duckdb_column_name(duckdb_result *result, idx_t col);
+duckdb_type duckdb_column_type(duckdb_result *result, idx_t col);
+duckdb_statement_type duckdb_result_statement_type(duckdb_result result);
+duckdb_logical_type duckdb_column_logical_type(duckdb_result *result, idx_t col);
+idx_t duckdb_column_count(duckdb_result *result);
+idx_t duckdb_row_count(duckdb_result *result);
+idx_t duckdb_rows_changed(duckdb_result *result);
+void *duckdb_column_data(duckdb_result *result, idx_t col);
+bool *duckdb_nullmask_data(duckdb_result *result, idx_t col);
+const char *duckdb_result_error(duckdb_result *result);
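+
+// A usage sketch (not part of the generated signature list above), assuming duckdb.h and
+// stdio.h are included; the function name is illustrative only. It shows how to read the
+// error of a failed query; duckdb_destroy_result must be called even when the query fails.
+static void run_query_sketch(duckdb_connection con, const char *sql) {
+  duckdb_result result;
+  if (duckdb_query(con, sql, &result) == DuckDBError) {
+    fprintf(stderr, "query failed: %s\n", duckdb_result_error(&result));
+  }
+  duckdb_destroy_result(&result);
+}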
+
+ +### `duckdb_query` + +Executes a SQL query within a connection and stores the full (materialized) result in the out_result pointer. +If the query fails to execute, DuckDBError is returned and the error message can be retrieved by calling +`duckdb_result_error`. + +Note that after running `duckdb_query`, `duckdb_destroy_result` must be called on the result object even if the +query fails, otherwise the error stored within the result will not be freed correctly. + +#### Syntax + +
duckdb_state duckdb_query(
+  duckdb_connection connection,
+  const char *query,
+  duckdb_result *out_result
+);
+
+ +#### Parameters + +* `connection` + +The connection to perform the query in. +* `query` + +The SQL query to run. +* `out_result` + +The query result. +* `returns` + +`DuckDBSuccess` on success or `DuckDBError` on failure. + +
+ +### `duckdb_destroy_result` + +Closes the result and de-allocates all memory allocated for that connection. + +#### Syntax + +
void duckdb_destroy_result(
+  duckdb_result *result
+);
+
+ +#### Parameters + +* `result` + +The result to destroy. + +
+ +### `duckdb_column_name` + +Returns the column name of the specified column. The result should not need to be freed; the column names will +automatically be destroyed when the result is destroyed. + +Returns `NULL` if the column is out of range. + +#### Syntax + +
const char *duckdb_column_name(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the column name from. +* `col` + +The column index. +* `returns` + +The column name of the specified column. + +
+ +### `duckdb_column_type` + +Returns the column type of the specified column. + +Returns `DUCKDB_TYPE_INVALID` if the column is out of range. + +#### Syntax + +
duckdb_type duckdb_column_type(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the column type from. +* `col` + +The column index. +* `returns` + +The column type of the specified column. + +
+ +### `duckdb_result_statement_type` + +Returns the statement type of the statement that was executed + +#### Syntax + +
duckdb_statement_type duckdb_result_statement_type(
+  duckdb_result result
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the statement type from. +* `returns` + +duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID + +
+ +### `duckdb_column_logical_type` + +Returns the logical column type of the specified column. + +The return type of this call should be destroyed with `duckdb_destroy_logical_type`. + +Returns `NULL` if the column is out of range. + +#### Syntax + +
duckdb_logical_type duckdb_column_logical_type(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the column type from. +* `col` + +The column index. +* `returns` + +The logical column type of the specified column. + +
+
+### `duckdb_column_count`
+
+Returns the number of columns present in the result object.
+
+#### Syntax
+
idx_t duckdb_column_count(
+  duckdb_result *result
+);
+
+ +#### Parameters + +* `result` + +The result object. +* `returns` + +The number of columns present in the result object. + +
+ +### `duckdb_row_count` + +**DEPRECATION NOTICE**: This method is scheduled for removal in a future release. + +Returns the number of rows present in the result object. + +#### Syntax + +
idx_t duckdb_row_count(
+  duckdb_result *result
+);
+
+ +#### Parameters + +* `result` + +The result object. +* `returns` + +The number of rows present in the result object. + +
+ +### `duckdb_rows_changed` + +Returns the number of rows changed by the query stored in the result. This is relevant only for INSERT/UPDATE/DELETE +queries. For other queries the rows_changed will be 0. + +#### Syntax + +
idx_t duckdb_rows_changed(
+  duckdb_result *result
+);
+
+ +#### Parameters + +* `result` + +The result object. +* `returns` + +The number of rows changed. + +
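+
+For example, the number of rows affected by an `UPDATE` might be printed as follows. This is a sketch that reuses the `integers` table from the example at the top of this page:
+
+```c
+duckdb_result res;
+if (duckdb_query(con, "UPDATE integers SET j = 0 WHERE j IS NULL;", &res) == DuckDBSuccess) {
+    printf("updated %llu rows\n", (unsigned long long) duckdb_rows_changed(&res));
+}
+duckdb_destroy_result(&res);
+```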
+ +### `duckdb_column_data` + +**DEPRECATED**: Prefer using `duckdb_result_get_chunk` instead. + +Returns the data of a specific column of a result in columnar format. + +The function returns a dense array which contains the result data. The exact type stored in the array depends on the +corresponding duckdb_type (as provided by `duckdb_column_type`). For the exact type by which the data should be +accessed, see the comments in the [Types section]({% link docs/archive/1.0/api/c/types.md %}) or the `DUCKDB_TYPE` enum. + +For example, for a column of type `DUCKDB_TYPE_INTEGER`, rows can be accessed in the following manner: + +```c +int32_t *data = (int32_t *) duckdb_column_data(&result, 0); +printf("Data for row %d: %d\n", row, data[row]); +``` + +#### Syntax + +
void *duckdb_column_data(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the column data from. +* `col` + +The column index. +* `returns` + +The column data of the specified column. + +
+
+### `duckdb_nullmask_data`
+
+**DEPRECATED**: Prefer using `duckdb_result_get_chunk` instead.
+
+Returns the nullmask of a specific column of a result in columnar format. The nullmask indicates for every row
+whether or not the corresponding row is `NULL`. If a row is `NULL`, the values present in the array provided
+by `duckdb_column_data` are undefined.
+
+```c
+int32_t *data = (int32_t *) duckdb_column_data(&result, 0);
+bool *nullmask = duckdb_nullmask_data(&result, 0);
+if (nullmask[row]) {
+    printf("Data for row %d: NULL\n", row);
+} else {
+    printf("Data for row %d: %d\n", row, data[row]);
+}
+```
+
+#### Syntax
+
bool *duckdb_nullmask_data(
+  duckdb_result *result,
+  idx_t col
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the nullmask from. +* `col` + +The column index. +* `returns` + +The nullmask of the specified column. + +
+ +### `duckdb_result_error` + +Returns the error message contained within the result. The error is only set if `duckdb_query` returns `DuckDBError`. + +The result of this function must not be freed. It will be cleaned up when `duckdb_destroy_result` is called. + +#### Syntax + +
const char *duckdb_result_error(
+  duckdb_result *result
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the error from. +* `returns` + +The error of the result. + +
\ No newline at end of file
diff --git a/docs/archive/1.0/api/c/replacement_scans.md b/docs/archive/1.0/api/c/replacement_scans.md
new file mode 100644
index 00000000000..1a58be200dc
--- /dev/null
+++ b/docs/archive/1.0/api/c/replacement_scans.md
@@ -0,0 +1,117 @@
+---
+layout: docu
+title: Replacement Scans
+---
+
+The replacement scan API can be used to register a callback that is called when a query reads a table that does not exist in the catalog. For example, when a query such as `SELECT * FROM my_table` is executed and `my_table` does not exist, the replacement scan callback will be called with `my_table` as its parameter. The replacement scan can then insert a table function with a specific parameter to replace the read of the table.
+
+## API Reference Overview
+
void duckdb_add_replacement_scan(duckdb_database db, duckdb_replacement_callback_t replacement, void *extra_data, duckdb_delete_callback_t delete_callback);
+void duckdb_replacement_scan_set_function_name(duckdb_replacement_scan_info info, const char *function_name);
+void duckdb_replacement_scan_add_parameter(duckdb_replacement_scan_info info, duckdb_value parameter);
+void duckdb_replacement_scan_set_error(duckdb_replacement_scan_info info, const char *error);
+
+ +### `duckdb_add_replacement_scan` + +Add a replacement scan definition to the specified database. + +#### Syntax + +
void duckdb_add_replacement_scan(
+  duckdb_database db,
+  duckdb_replacement_callback_t replacement,
+  void *extra_data,
+  duckdb_delete_callback_t delete_callback
+);
+
+ +#### Parameters + +* `db` + +The database object to add the replacement scan to +* `replacement` + +The replacement scan callback +* `extra_data` + +Extra data that is passed back into the specified callback +* `delete_callback` + +The delete callback to call on the extra data, if any + +
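+
+As an illustrative sketch, the callback below redirects reads of the (hypothetical) missing table `numbers` to the built-in `range` table function; the callback name, the table name, and the redirect target are inventions for this example:
+
+```c
+#include "duckdb.h"
+#include <string.h>
+
+// hypothetical replacement callback
+void my_replacement(duckdb_replacement_scan_info info, const char *table_name, void *data) {
+    if (strcmp(table_name, "numbers") != 0) {
+        // not calling duckdb_replacement_scan_set_function_name leaves the scan untouched,
+        // so DuckDB reports the usual "table does not exist" error
+        return;
+    }
+    duckdb_replacement_scan_set_function_name(info, "range");
+    duckdb_value parameter = duckdb_create_int64(100);
+    duckdb_replacement_scan_add_parameter(info, parameter);
+    // the parameter is copied into the scan, so our handle can be destroyed
+    duckdb_destroy_value(&parameter);
+}
+```
+
+The callback is then registered, e.g., right after opening the database, with `duckdb_add_replacement_scan(db, my_replacement, NULL, NULL);`, after which `SELECT * FROM numbers;` behaves like `SELECT * FROM range(100);`.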
+
+### `duckdb_replacement_scan_set_function_name`
+
+Sets the replacement function name. If this function is called in the replacement callback,
+the replacement scan is performed. If it is not called, no replacement scan is performed.
+
+#### Syntax
+
void duckdb_replacement_scan_set_function_name(
+  duckdb_replacement_scan_info info,
+  const char *function_name
+);
+
+ +#### Parameters + +* `info` + +The info object +* `function_name` + +The function name to substitute. + +
+ +### `duckdb_replacement_scan_add_parameter` + +Adds a parameter to the replacement scan function. + +#### Syntax + +
void duckdb_replacement_scan_add_parameter(
+  duckdb_replacement_scan_info info,
+  duckdb_value parameter
+);
+
+ +#### Parameters + +* `info` + +The info object +* `parameter` + +The parameter to add. + +
+ +### `duckdb_replacement_scan_set_error` + +Report that an error has occurred while executing the replacement scan. + +#### Syntax + +
void duckdb_replacement_scan_set_error(
+  duckdb_replacement_scan_info info,
+  const char *error
+);
+
+ +#### Parameters + +* `info` + +The info object +* `error` + +The error message + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/table_functions.md b/docs/archive/1.0/api/c/table_functions.md new file mode 100644 index 00000000000..41d66d375e5 --- /dev/null +++ b/docs/archive/1.0/api/c/table_functions.md @@ -0,0 +1,832 @@ +--- +layout: docu +title: Table Functions +--- + +The table function API can be used to define a table function that can then be called from within DuckDB in the `FROM` clause of a query. + +## API Reference Overview + + + +
duckdb_table_function duckdb_create_table_function();
+void duckdb_destroy_table_function(duckdb_table_function *table_function);
+void duckdb_table_function_set_name(duckdb_table_function table_function, const char *name);
+void duckdb_table_function_add_parameter(duckdb_table_function table_function, duckdb_logical_type type);
+void duckdb_table_function_add_named_parameter(duckdb_table_function table_function, const char *name, duckdb_logical_type type);
+void duckdb_table_function_set_extra_info(duckdb_table_function table_function, void *extra_info, duckdb_delete_callback_t destroy);
+void duckdb_table_function_set_bind(duckdb_table_function table_function, duckdb_table_function_bind_t bind);
+void duckdb_table_function_set_init(duckdb_table_function table_function, duckdb_table_function_init_t init);
+void duckdb_table_function_set_local_init(duckdb_table_function table_function, duckdb_table_function_init_t init);
+void duckdb_table_function_set_function(duckdb_table_function table_function, duckdb_table_function_t function);
+void duckdb_table_function_supports_projection_pushdown(duckdb_table_function table_function, bool pushdown);
+duckdb_state duckdb_register_table_function(duckdb_connection con, duckdb_table_function function);
+
+ +### Table Function Bind + +
void *duckdb_bind_get_extra_info(duckdb_bind_info info);
+void duckdb_bind_add_result_column(duckdb_bind_info info, const char *name, duckdb_logical_type type);
+idx_t duckdb_bind_get_parameter_count(duckdb_bind_info info);
+duckdb_value duckdb_bind_get_parameter(duckdb_bind_info info, idx_t index);
+duckdb_value duckdb_bind_get_named_parameter(duckdb_bind_info info, const char *name);
+void duckdb_bind_set_bind_data(duckdb_bind_info info, void *bind_data, duckdb_delete_callback_t destroy);
+void duckdb_bind_set_cardinality(duckdb_bind_info info, idx_t cardinality, bool is_exact);
+void duckdb_bind_set_error(duckdb_bind_info info, const char *error);
+
+ +### Table Function Init + +
void *duckdb_init_get_extra_info(duckdb_init_info info);
+void *duckdb_init_get_bind_data(duckdb_init_info info);
+void duckdb_init_set_init_data(duckdb_init_info info, void *init_data, duckdb_delete_callback_t destroy);
+idx_t duckdb_init_get_column_count(duckdb_init_info info);
+idx_t duckdb_init_get_column_index(duckdb_init_info info, idx_t column_index);
+void duckdb_init_set_max_threads(duckdb_init_info info, idx_t max_threads);
+void duckdb_init_set_error(duckdb_init_info info, const char *error);
+
+ +### Table Function + +
void *duckdb_function_get_extra_info(duckdb_function_info info);
+void *duckdb_function_get_bind_data(duckdb_function_info info);
+void *duckdb_function_get_init_data(duckdb_function_info info);
+void *duckdb_function_get_local_init_data(duckdb_function_info info);
+void duckdb_function_set_error(duckdb_function_info info, const char *error);
+
+ +### `duckdb_create_table_function` + +Creates a new empty table function. + +The return value should be destroyed with `duckdb_destroy_table_function`. + +#### Syntax + +
duckdb_table_function duckdb_create_table_function(
+  
+);
+
+ +#### Parameters + +* `returns` + +The table function object. + +
+ +### `duckdb_destroy_table_function` + +Destroys the given table function object. + +#### Syntax + +
void duckdb_destroy_table_function(
+  duckdb_table_function *table_function
+);
+
+ +#### Parameters + +* `table_function` + +The table function to destroy + +
+ +### `duckdb_table_function_set_name` + +Sets the name of the given table function. + +#### Syntax + +
void duckdb_table_function_set_name(
+  duckdb_table_function table_function,
+  const char *name
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `name` + +The name of the table function + +
+ +### `duckdb_table_function_add_parameter` + +Adds a parameter to the table function. + +#### Syntax + +
void duckdb_table_function_add_parameter(
+  duckdb_table_function table_function,
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `type` + +The type of the parameter to add. + +
+ +### `duckdb_table_function_add_named_parameter` + +Adds a named parameter to the table function. + +#### Syntax + +
void duckdb_table_function_add_named_parameter(
+  duckdb_table_function table_function,
+  const char *name,
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `name` + +The name of the parameter +* `type` + +The type of the parameter to add. + +
+ +### `duckdb_table_function_set_extra_info` + +Assigns extra information to the table function that can be fetched during binding, etc. + +#### Syntax + +
void duckdb_table_function_set_extra_info(
+  duckdb_table_function table_function,
+  void *extra_info,
+  duckdb_delete_callback_t destroy
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `extra_info` + +The extra information +* `destroy` + +The callback that will be called to destroy the bind data (if any) + +
+ +### `duckdb_table_function_set_bind` + +Sets the bind function of the table function. + +#### Syntax + +
void duckdb_table_function_set_bind(
+  duckdb_table_function table_function,
+  duckdb_table_function_bind_t bind
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `bind` + +The bind function + +
+ +### `duckdb_table_function_set_init` + +Sets the init function of the table function. + +#### Syntax + +
void duckdb_table_function_set_init(
+  duckdb_table_function table_function,
+  duckdb_table_function_init_t init
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `init` + +The init function + +
+ +### `duckdb_table_function_set_local_init` + +Sets the thread-local init function of the table function. + +#### Syntax + +
void duckdb_table_function_set_local_init(
+  duckdb_table_function table_function,
+  duckdb_table_function_init_t init
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `init` + +The init function + +
+ +### `duckdb_table_function_set_function` + +Sets the main function of the table function. + +#### Syntax + +
void duckdb_table_function_set_function(
+  duckdb_table_function table_function,
+  duckdb_table_function_t function
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `function` + +The function + +
+ +### `duckdb_table_function_supports_projection_pushdown` + +Sets whether or not the given table function supports projection pushdown. + +If this is set to true, the system will provide a list of all required columns in the `init` stage through +the `duckdb_init_get_column_count` and `duckdb_init_get_column_index` functions. +If this is set to false (the default), the system will expect all columns to be projected. + +#### Syntax + +
void duckdb_table_function_supports_projection_pushdown(
+  duckdb_table_function table_function,
+  bool pushdown
+);
+
+ +#### Parameters + +* `table_function` + +The table function +* `pushdown` + +True if the table function supports projection pushdown, false otherwise. + +
+
+### `duckdb_register_table_function`
+
+Registers the table function object within the given connection.
+
+The function requires at least a name, a bind function, an init function and a main function.
+
+If the function is incomplete or a function with this name already exists, `DuckDBError` is returned.
+
+#### Syntax
+
duckdb_state duckdb_register_table_function(
+  duckdb_connection con,
+  duckdb_table_function function
+);
+
+ +#### Parameters + +* `con` + +The connection to register it in. +* `function` + +The function pointer +* `returns` + +Whether or not the registration was successful. + +
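+
+The following sketch shows how these functions might fit together. It registers a hypothetical `my_range(n)` table function that returns the values `0` through `n - 1` in a single `BIGINT` column; the `my_*` names and the two helper structs are inventions for this example:
+
+```c
+#include "duckdb.h"
+#include <stdlib.h>
+
+// bind data: how many rows were requested (hypothetical helper struct)
+typedef struct {
+    int64_t rows_requested;
+} my_bind_data;
+
+// init data: how many rows have been produced so far (hypothetical helper struct)
+typedef struct {
+    int64_t rows_emitted;
+} my_init_data;
+
+void my_bind(duckdb_bind_info info) {
+    // declare the single result column
+    duckdb_logical_type type = duckdb_create_logical_type(DUCKDB_TYPE_BIGINT);
+    duckdb_bind_add_result_column(info, "value", type);
+    duckdb_destroy_logical_type(&type);
+
+    // read the first (and only) positional parameter
+    duckdb_value param = duckdb_bind_get_parameter(info, 0);
+    my_bind_data *bind_data = malloc(sizeof(my_bind_data));
+    bind_data->rows_requested = duckdb_get_int64(param);
+    duckdb_destroy_value(&param);
+    duckdb_bind_set_bind_data(info, bind_data, free);
+}
+
+void my_init(duckdb_init_info info) {
+    my_init_data *init_data = malloc(sizeof(my_init_data));
+    init_data->rows_emitted = 0;
+    duckdb_init_set_init_data(info, init_data, free);
+}
+
+void my_function(duckdb_function_info info, duckdb_data_chunk output) {
+    my_bind_data *bind_data = (my_bind_data *) duckdb_function_get_bind_data(info);
+    my_init_data *init_data = (my_init_data *) duckdb_function_get_init_data(info);
+    int64_t *result_data = (int64_t *) duckdb_vector_get_data(duckdb_data_chunk_get_vector(output, 0));
+
+    idx_t count = 0;
+    while (count < duckdb_vector_size() && init_data->rows_emitted < bind_data->rows_requested) {
+        result_data[count++] = init_data->rows_emitted++;
+    }
+    // a chunk of size 0 signals that the function is exhausted
+    duckdb_data_chunk_set_size(output, count);
+}
+
+void register_my_range(duckdb_connection con) {
+    duckdb_table_function tf = duckdb_create_table_function();
+    duckdb_table_function_set_name(tf, "my_range");
+
+    duckdb_logical_type param_type = duckdb_create_logical_type(DUCKDB_TYPE_BIGINT);
+    duckdb_table_function_add_parameter(tf, param_type);
+    duckdb_destroy_logical_type(&param_type);
+
+    duckdb_table_function_set_bind(tf, my_bind);
+    duckdb_table_function_set_init(tf, my_init);
+    duckdb_table_function_set_function(tf, my_function);
+    duckdb_register_table_function(con, tf);
+    duckdb_destroy_table_function(&tf);
+}
+```
+
+Once registered, the function can be queried like any built-in table function, e.g., `SELECT * FROM my_range(5);`.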
+ +### `duckdb_bind_get_extra_info` + +Retrieves the extra info of the function as set in `duckdb_table_function_set_extra_info`. + +#### Syntax + +
void *duckdb_bind_get_extra_info(
+  duckdb_bind_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The extra info + +
+ +### `duckdb_bind_add_result_column` + +Adds a result column to the output of the table function. + +#### Syntax + +
void duckdb_bind_add_result_column(
+  duckdb_bind_info info,
+  const char *name,
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `info` + +The info object +* `name` + +The name of the column +* `type` + +The logical type of the column + +
+ +### `duckdb_bind_get_parameter_count` + +Retrieves the number of regular (non-named) parameters to the function. + +#### Syntax + +
idx_t duckdb_bind_get_parameter_count(
+  duckdb_bind_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The number of parameters + +
+ +### `duckdb_bind_get_parameter` + +Retrieves the parameter at the given index. + +The result must be destroyed with `duckdb_destroy_value`. + +#### Syntax + +
duckdb_value duckdb_bind_get_parameter(
+  duckdb_bind_info info,
+  idx_t index
+);
+
+ +#### Parameters + +* `info` + +The info object +* `index` + +The index of the parameter to get +* `returns` + +The value of the parameter. Must be destroyed with `duckdb_destroy_value`. + +
+ +### `duckdb_bind_get_named_parameter` + +Retrieves a named parameter with the given name. + +The result must be destroyed with `duckdb_destroy_value`. + +#### Syntax + +
duckdb_value duckdb_bind_get_named_parameter(
+  duckdb_bind_info info,
+  const char *name
+);
+
+ +#### Parameters + +* `info` + +The info object +* `name` + +The name of the parameter +* `returns` + +The value of the parameter. Must be destroyed with `duckdb_destroy_value`. + +
+ +### `duckdb_bind_set_bind_data` + +Sets the user-provided bind data in the bind object. This object can be retrieved again during execution. + +#### Syntax + +
void duckdb_bind_set_bind_data(
+  duckdb_bind_info info,
+  void *bind_data,
+  duckdb_delete_callback_t destroy
+);
+
+
+#### Parameters
+
+* `info`
+
+The info object
+* `bind_data`
+
+The bind data object.
+* `destroy`
+
+The callback that will be called to destroy the bind data (if any)
+
+ +### `duckdb_bind_set_cardinality` + +Sets the cardinality estimate for the table function, used for optimization. + +#### Syntax + +
void duckdb_bind_set_cardinality(
+  duckdb_bind_info info,
+  idx_t cardinality,
+  bool is_exact
+);
+
+
+#### Parameters
+
+* `info`
+
+The info object
+* `cardinality`
+
+The cardinality estimate, i.e., the expected number of rows.
+* `is_exact`
+
+Whether or not the cardinality estimate is exact, or an approximation
+
+ +### `duckdb_bind_set_error` + +Report that an error has occurred while calling bind. + +#### Syntax + +
void duckdb_bind_set_error(
+  duckdb_bind_info info,
+  const char *error
+);
+
+ +#### Parameters + +* `info` + +The info object +* `error` + +The error message + +
+ +### `duckdb_init_get_extra_info` + +Retrieves the extra info of the function as set in `duckdb_table_function_set_extra_info`. + +#### Syntax + +
void *duckdb_init_get_extra_info(
+  duckdb_init_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The extra info + +
+ +### `duckdb_init_get_bind_data` + +Gets the bind data set by `duckdb_bind_set_bind_data` during the bind. + +Note that the bind data should be considered as read-only. +For tracking state, use the init data instead. + +#### Syntax + +
void *duckdb_init_get_bind_data(
+  duckdb_init_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The bind data object + +
+ +### `duckdb_init_set_init_data` + +Sets the user-provided init data in the init object. This object can be retrieved again during execution. + +#### Syntax + +
void duckdb_init_set_init_data(
+  duckdb_init_info info,
+  void *init_data,
+  duckdb_delete_callback_t destroy
+);
+
+
+#### Parameters
+
+* `info`
+
+The info object
+* `init_data`
+
+The init data object.
+* `destroy`
+
+The callback that will be called to destroy the init data (if any)
+
+ +### `duckdb_init_get_column_count` + +Returns the number of projected columns. + +This function must be used if projection pushdown is enabled to figure out which columns to emit. + +#### Syntax + +
idx_t duckdb_init_get_column_count(
+  duckdb_init_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The number of projected columns. + +
+ +### `duckdb_init_get_column_index` + +Returns the column index of the projected column at the specified position. + +This function must be used if projection pushdown is enabled to figure out which columns to emit. + +#### Syntax + +
idx_t duckdb_init_get_column_index(
+  duckdb_init_info info,
+  idx_t column_index
+);
+
+ +#### Parameters + +* `info` + +The info object +* `column_index` + +The index at which to get the projected column index, from 0..duckdb_init_get_column_count(info) +* `returns` + +The column index of the projected column. + +
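+
+For example, the init callback of a table function that enabled projection pushdown might build a mapping from output columns to the columns declared during bind. This is a sketch; the callback name and the choice to store the mapping as init data are assumptions of the example:
+
+```c
+#include "duckdb.h"
+#include <stdlib.h>
+
+void my_projected_init(duckdb_init_info info) {
+    idx_t projected_count = duckdb_init_get_column_count(info);
+    idx_t *column_map = malloc(projected_count * sizeof(idx_t));
+    for (idx_t i = 0; i < projected_count; i++) {
+        // output column i of every chunk must be filled from this bound column
+        column_map[i] = duckdb_init_get_column_index(info, i);
+    }
+    duckdb_init_set_init_data(info, column_map, free);
+}
+```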
+ +### `duckdb_init_set_max_threads` + +Sets how many threads can process this table function in parallel (default: 1) + +#### Syntax + +
void duckdb_init_set_max_threads(
+  duckdb_init_info info,
+  idx_t max_threads
+);
+
+ +#### Parameters + +* `info` + +The info object +* `max_threads` + +The maximum amount of threads that can process this table function + +
+ +### `duckdb_init_set_error` + +Report that an error has occurred while calling init. + +#### Syntax + +
void duckdb_init_set_error(
+  duckdb_init_info info,
+  const char *error
+);
+
+ +#### Parameters + +* `info` + +The info object +* `error` + +The error message + +
+ +### `duckdb_function_get_extra_info` + +Retrieves the extra info of the function as set in `duckdb_table_function_set_extra_info`. + +#### Syntax + +
void *duckdb_function_get_extra_info(
+  duckdb_function_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The extra info + +
+ +### `duckdb_function_get_bind_data` + +Gets the bind data set by `duckdb_bind_set_bind_data` during the bind. + +Note that the bind data should be considered as read-only. +For tracking state, use the init data instead. + +#### Syntax + +
void *duckdb_function_get_bind_data(
+  duckdb_function_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The bind data object + +
+ +### `duckdb_function_get_init_data` + +Gets the init data set by `duckdb_init_set_init_data` during the init. + +#### Syntax + +
void *duckdb_function_get_init_data(
+  duckdb_function_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The init data object + +
+ +### `duckdb_function_get_local_init_data` + +Gets the thread-local init data set by `duckdb_init_set_init_data` during the local_init. + +#### Syntax + +
void *duckdb_function_get_local_init_data(
+  duckdb_function_info info
+);
+
+ +#### Parameters + +* `info` + +The info object +* `returns` + +The init data object + +
+ +### `duckdb_function_set_error` + +Report that an error has occurred while executing the function. + +#### Syntax + +
void duckdb_function_set_error(
+  duckdb_function_info info,
+  const char *error
+);
+
+ +#### Parameters + +* `info` + +The info object +* `error` + +The error message + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/types.md b/docs/archive/1.0/api/c/types.md new file mode 100644 index 00000000000..0ffb3cd54f4 --- /dev/null +++ b/docs/archive/1.0/api/c/types.md @@ -0,0 +1,1263 @@ +--- +layout: docu +title: Types +--- + +DuckDB is a strongly typed database system. As such, every column has a single type specified. This type is constant +over the entire column. That is to say, a column that is labeled as an `INTEGER` column will only contain `INTEGER` +values. + +DuckDB also supports columns of composite types. For example, it is possible to define an array of integers (`INTEGER[]`). It is also possible to define types as arbitrary structs (`ROW(i INTEGER, j VARCHAR)`). For that reason, native DuckDB type objects are not mere enums, but a class that can potentially be nested. + +Types in the C API are modeled using an enum (`duckdb_type`) and a complex class (`duckdb_logical_type`). For most primitive types, e.g., integers or varchars, the enum is sufficient. For more complex types, such as lists, structs or decimals, the logical type must be used. + +```c +typedef enum DUCKDB_TYPE { + DUCKDB_TYPE_INVALID = 0, + DUCKDB_TYPE_BOOLEAN = 1, + DUCKDB_TYPE_TINYINT = 2, + DUCKDB_TYPE_SMALLINT = 3, + DUCKDB_TYPE_INTEGER = 4, + DUCKDB_TYPE_BIGINT = 5, + DUCKDB_TYPE_UTINYINT = 6, + DUCKDB_TYPE_USMALLINT = 7, + DUCKDB_TYPE_UINTEGER = 8, + DUCKDB_TYPE_UBIGINT = 9, + DUCKDB_TYPE_FLOAT = 10, + DUCKDB_TYPE_DOUBLE = 11, + DUCKDB_TYPE_TIMESTAMP = 12, + DUCKDB_TYPE_DATE = 13, + DUCKDB_TYPE_TIME = 14, + DUCKDB_TYPE_INTERVAL = 15, + DUCKDB_TYPE_HUGEINT = 16, + DUCKDB_TYPE_UHUGEINT = 32, + DUCKDB_TYPE_VARCHAR = 17, + DUCKDB_TYPE_BLOB = 18, + DUCKDB_TYPE_DECIMAL = 19, + DUCKDB_TYPE_TIMESTAMP_S = 20, + DUCKDB_TYPE_TIMESTAMP_MS = 21, + DUCKDB_TYPE_TIMESTAMP_NS = 22, + DUCKDB_TYPE_ENUM = 23, + DUCKDB_TYPE_LIST = 24, + DUCKDB_TYPE_STRUCT = 25, + DUCKDB_TYPE_MAP = 26, + DUCKDB_TYPE_ARRAY = 33, + DUCKDB_TYPE_UUID = 27, + DUCKDB_TYPE_UNION = 28, + DUCKDB_TYPE_BIT = 29, + DUCKDB_TYPE_TIME_TZ = 30, + DUCKDB_TYPE_TIMESTAMP_TZ = 31, +} duckdb_type; +``` + +## Functions + +The enum type of a column in the result can be obtained using the `duckdb_column_type` function. The logical type of a column can be obtained using the `duckdb_column_logical_type` function. + +### `duckdb_value` + +The `duckdb_value` functions will auto-cast values as required. For example, it is no problem to use +`duckdb_value_double` on a column of type `duckdb_value_int32`. The value will be auto-cast and returned as a double. +Note that in certain cases the cast may fail. For example, this can happen if we request a `duckdb_value_int8` and the value does not fit within an `int8` value. In this case, a default value will be returned (usually `0` or `nullptr`). The same default value will also be returned if the corresponding value is `NULL`. + +The `duckdb_value_is_null` function can be used to check if a specific value is `NULL` or not. + +The exception to the auto-cast rule is the `duckdb_value_varchar_internal` function. This function does not auto-cast and only works for `VARCHAR` columns. The reason this function exists is that the result does not need to be freed. + +> `duckdb_value_varchar` and `duckdb_value_blob` require the result to be de-allocated using `duckdb_free`. + +### `duckdb_fetch_chunk` + +The `duckdb_fetch_chunk` function can be used to read data chunks from a DuckDB result set, and is the most efficient way of reading data from a DuckDB result using the C API. 
It is also the only way of reading data of certain types from a DuckDB result. For example, the `duckdb_value` functions do not support structural reading of composite types (lists or structs) or more complex types like enums and decimals. + +For more information about data chunks, see the [documentation on data chunks]({% link docs/archive/1.0/api/c/data_chunk.md %}). + +## API Reference Overview + + + +
duckdb_data_chunk duckdb_result_get_chunk(duckdb_result result, idx_t chunk_index);
+bool duckdb_result_is_streaming(duckdb_result result);
+idx_t duckdb_result_chunk_count(duckdb_result result);
+duckdb_result_type duckdb_result_return_type(duckdb_result result);
+
+ +### Date/Time/Timestamp Helpers + +
duckdb_date_struct duckdb_from_date(duckdb_date date);
+duckdb_date duckdb_to_date(duckdb_date_struct date);
+bool duckdb_is_finite_date(duckdb_date date);
+duckdb_time_struct duckdb_from_time(duckdb_time time);
+duckdb_time_tz duckdb_create_time_tz(int64_t micros, int32_t offset);
+duckdb_time_tz_struct duckdb_from_time_tz(duckdb_time_tz micros);
+duckdb_time duckdb_to_time(duckdb_time_struct time);
+duckdb_timestamp_struct duckdb_from_timestamp(duckdb_timestamp ts);
+duckdb_timestamp duckdb_to_timestamp(duckdb_timestamp_struct ts);
+bool duckdb_is_finite_timestamp(duckdb_timestamp ts);
+
+ +### Hugeint Helpers + +
double duckdb_hugeint_to_double(duckdb_hugeint val);
+duckdb_hugeint duckdb_double_to_hugeint(double val);
+
+ +### Decimal Helpers + +
duckdb_decimal duckdb_double_to_decimal(double val, uint8_t width, uint8_t scale);
+double duckdb_decimal_to_double(duckdb_decimal val);
+
+ +### Logical Type Interface + +
duckdb_logical_type duckdb_create_logical_type(duckdb_type type);
+char *duckdb_logical_type_get_alias(duckdb_logical_type type);
+duckdb_logical_type duckdb_create_list_type(duckdb_logical_type type);
+duckdb_logical_type duckdb_create_array_type(duckdb_logical_type type, idx_t array_size);
+duckdb_logical_type duckdb_create_map_type(duckdb_logical_type key_type, duckdb_logical_type value_type);
+duckdb_logical_type duckdb_create_union_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
+duckdb_logical_type duckdb_create_struct_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
+duckdb_logical_type duckdb_create_enum_type(const char **member_names, idx_t member_count);
+duckdb_logical_type duckdb_create_decimal_type(uint8_t width, uint8_t scale);
+duckdb_type duckdb_get_type_id(duckdb_logical_type type);
+uint8_t duckdb_decimal_width(duckdb_logical_type type);
+uint8_t duckdb_decimal_scale(duckdb_logical_type type);
+duckdb_type duckdb_decimal_internal_type(duckdb_logical_type type);
+duckdb_type duckdb_enum_internal_type(duckdb_logical_type type);
+uint32_t duckdb_enum_dictionary_size(duckdb_logical_type type);
+char *duckdb_enum_dictionary_value(duckdb_logical_type type, idx_t index);
+duckdb_logical_type duckdb_list_type_child_type(duckdb_logical_type type);
+duckdb_logical_type duckdb_array_type_child_type(duckdb_logical_type type);
+idx_t duckdb_array_type_array_size(duckdb_logical_type type);
+duckdb_logical_type duckdb_map_type_key_type(duckdb_logical_type type);
+duckdb_logical_type duckdb_map_type_value_type(duckdb_logical_type type);
+idx_t duckdb_struct_type_child_count(duckdb_logical_type type);
+char *duckdb_struct_type_child_name(duckdb_logical_type type, idx_t index);
+duckdb_logical_type duckdb_struct_type_child_type(duckdb_logical_type type, idx_t index);
+idx_t duckdb_union_type_member_count(duckdb_logical_type type);
+char *duckdb_union_type_member_name(duckdb_logical_type type, idx_t index);
+duckdb_logical_type duckdb_union_type_member_type(duckdb_logical_type type, idx_t index);
+void duckdb_destroy_logical_type(duckdb_logical_type *type);
+
+ +### `duckdb_result_get_chunk` + +> Deprecated This method is scheduled for removal in a future release. + +Fetches a data chunk from the duckdb_result. This function should be called repeatedly until the result is exhausted. + +The result must be destroyed with `duckdb_destroy_data_chunk`. + +This function supersedes all `duckdb_value` functions, as well as the `duckdb_column_data` and `duckdb_nullmask_data` +functions. It results in significantly better performance, and should be preferred in newer code-bases. + +If this function is used, none of the other result functions can be used and vice versa (i.e., this function cannot be +mixed with the legacy result functions). + +Use `duckdb_result_chunk_count` to figure out how many chunks there are in the result. + +#### Syntax + +
duckdb_data_chunk duckdb_result_get_chunk(
+  duckdb_result result,
+  idx_t chunk_index
+);
+
+ +#### Parameters + +* `result` + +The result object to fetch the data chunk from. +* `chunk_index` + +The chunk index to fetch from. +* `returns` + +The resulting data chunk. Returns `NULL` if the chunk index is out of bounds. + +
+ +### `duckdb_result_is_streaming` + +> Deprecated This method is scheduled for removal in a future release. + +Checks if the type of the internal result is StreamQueryResult. + +#### Syntax + +
bool duckdb_result_is_streaming(
+  duckdb_result result
+);
+
+ +#### Parameters + +* `result` + +The result object to check. +* `returns` + +Whether or not the result object is of the type StreamQueryResult + +
+ +### `duckdb_result_chunk_count` + +> Deprecated This method is scheduled for removal in a future release. + +Returns the number of data chunks present in the result. + +#### Syntax + +
idx_t duckdb_result_chunk_count(
+  duckdb_result result
+);
+
+ +#### Parameters + +* `result` + +The result object +* `returns` + +Number of data chunks present in the result. + +
+ +### `duckdb_result_return_type` + +Returns the return_type of the given result, or DUCKDB_RETURN_TYPE_INVALID on error + +#### Syntax + +
duckdb_result_type duckdb_result_return_type(
+  duckdb_result result
+);
+
+ +#### Parameters + +* `result` + +The result object +* `returns` + +The return_type + +
+ +### `duckdb_from_date` + +Decompose a `duckdb_date` object into year, month and date (stored as `duckdb_date_struct`). + +#### Syntax + +
duckdb_date_struct duckdb_from_date(
+  duckdb_date date
+);
+
+ +#### Parameters + +* `date` + +The date object, as obtained from a `DUCKDB_TYPE_DATE` column. +* `returns` + +The `duckdb_date_struct` with the decomposed elements. + +
+ +### `duckdb_to_date` + +Re-compose a `duckdb_date` from year, month and date (`duckdb_date_struct`). + +#### Syntax + +
duckdb_date duckdb_to_date(
+  duckdb_date_struct date
+);
+
+ +#### Parameters + +* `date` + +The year, month and date stored in a `duckdb_date_struct`. +* `returns` + +The `duckdb_date` element. + +
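+
+For example, a calendar date can be round-tripped through its encoded form as follows (a small sketch):
+
+```c
+duckdb_date_struct date_parts;
+date_parts.year = 1992;
+date_parts.month = 9;
+date_parts.day = 20;
+
+duckdb_date encoded = duckdb_to_date(date_parts);
+duckdb_date_struct decoded = duckdb_from_date(encoded);
+printf("%d-%d-%d\n", decoded.year, decoded.month, decoded.day); // prints 1992-9-20
+```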
+ +### `duckdb_is_finite_date` + +Test a `duckdb_date` to see if it is a finite value. + +#### Syntax + +
bool duckdb_is_finite_date(
+  duckdb_date date
+);
+
+ +#### Parameters + +* `date` + +The date object, as obtained from a `DUCKDB_TYPE_DATE` column. +* `returns` + +True if the date is finite, false if it is ±infinity. + +
+ +### `duckdb_from_time` + +Decompose a `duckdb_time` object into hour, minute, second and microsecond (stored as `duckdb_time_struct`). + +#### Syntax + +
duckdb_time_struct duckdb_from_time(
+  duckdb_time time
+);
+
+ +#### Parameters + +* `time` + +The time object, as obtained from a `DUCKDB_TYPE_TIME` column. +* `returns` + +The `duckdb_time_struct` with the decomposed elements. + +
+ +### `duckdb_create_time_tz` + +Create a `duckdb_time_tz` object from micros and a timezone offset. + +#### Syntax + +
duckdb_time_tz duckdb_create_time_tz(
+  int64_t micros,
+  int32_t offset
+);
+
+ +#### Parameters + +* `micros` + +The microsecond component of the time. +* `offset` + +The timezone offset component of the time. +* `returns` + +The `duckdb_time_tz` element. + +
+
+### `duckdb_from_time_tz`
+
+Decompose a `TIME_TZ` object into micros and a timezone offset.
+
+Use `duckdb_from_time` to further decompose the micros into hour, minute, second and microsecond.
+
+#### Syntax
+
duckdb_time_tz_struct duckdb_from_time_tz(
+  duckdb_time_tz micros
+);
+
+
+#### Parameters
+
+* `micros`
+
+The time object, as obtained from a `DUCKDB_TYPE_TIME_TZ` column.
+* `returns`
+
+The `duckdb_time_tz_struct` with the decomposed micros and timezone offset.
+
+ +### `duckdb_to_time` + +Re-compose a `duckdb_time` from hour, minute, second and microsecond (`duckdb_time_struct`). + +#### Syntax + +
duckdb_time duckdb_to_time(
+  duckdb_time_struct time
+);
+
+ +#### Parameters + +* `time` + +The hour, minute, second and microsecond in a `duckdb_time_struct`. +* `returns` + +The `duckdb_time` element. + +
+ +### `duckdb_from_timestamp` + +Decompose a `duckdb_timestamp` object into a `duckdb_timestamp_struct`. + +#### Syntax + +
duckdb_timestamp_struct duckdb_from_timestamp(
+  duckdb_timestamp ts
+);
+
+ +#### Parameters + +* `ts` + +The ts object, as obtained from a `DUCKDB_TYPE_TIMESTAMP` column. +* `returns` + +The `duckdb_timestamp_struct` with the decomposed elements. + +
+ +### `duckdb_to_timestamp` + +Re-compose a `duckdb_timestamp` from a duckdb_timestamp_struct. + +#### Syntax + +
duckdb_timestamp duckdb_to_timestamp(
+  duckdb_timestamp_struct ts
+);
+
+ +#### Parameters + +* `ts` + +The de-composed elements in a `duckdb_timestamp_struct`. +* `returns` + +The `duckdb_timestamp` element. + +
+ +### `duckdb_is_finite_timestamp` + +Test a `duckdb_timestamp` to see if it is a finite value. + +#### Syntax + +
bool duckdb_is_finite_timestamp(
+  duckdb_timestamp ts
+);
+
+ +#### Parameters + +* `ts` + +The timestamp object, as obtained from a `DUCKDB_TYPE_TIMESTAMP` column. +* `returns` + +True if the timestamp is finite, false if it is ±infinity. + +
+ +### `duckdb_hugeint_to_double` + +Converts a duckdb_hugeint object (as obtained from a `DUCKDB_TYPE_HUGEINT` column) into a double. + +#### Syntax + +
double duckdb_hugeint_to_double(
+  duckdb_hugeint val
+);
+
+ +#### Parameters + +* `val` + +The hugeint value. +* `returns` + +The converted `double` element. + +
+ +### `duckdb_double_to_hugeint` + +Converts a double value to a duckdb_hugeint object. + +If the conversion fails because the double value is too big the result will be 0. + +#### Syntax + +
duckdb_hugeint duckdb_double_to_hugeint(
+  double val
+);
+
+ +#### Parameters + +* `val` + +The double value. +* `returns` + +The converted `duckdb_hugeint` element. + +
+
+### `duckdb_double_to_decimal`
+
+Converts a double value to a duckdb_decimal object.
+
+If the conversion fails because the double value is too big, or the width/scale are invalid, the result will be 0.
+
+#### Syntax
+
duckdb_decimal duckdb_double_to_decimal(
+  double val,
+  uint8_t width,
+  uint8_t scale
+);
+
+ +#### Parameters + +* `val` + +The double value. +* `returns` + +The converted `duckdb_decimal` element. + +
+ +### `duckdb_decimal_to_double` + +Converts a duckdb_decimal object (as obtained from a `DUCKDB_TYPE_DECIMAL` column) into a double. + +#### Syntax + +
double duckdb_decimal_to_double(
+  duckdb_decimal val
+);
+
+ +#### Parameters + +* `val` + +The decimal value. +* `returns` + +The converted `double` element. + +
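+
+For example, converting between a `double` and a `DECIMAL(8, 3)` value might look like this (a sketch):
+
+```c
+duckdb_decimal dec = duckdb_double_to_decimal(10.5, 8, 3);
+// dec now holds the scaled integer 10500 together with width 8 and scale 3
+double roundtrip = duckdb_decimal_to_double(dec);
+printf("%f\n", roundtrip); // prints 10.500000
+```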
+ +### `duckdb_create_logical_type` + +Creates a `duckdb_logical_type` from a standard primitive type. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +This should not be used with `DUCKDB_TYPE_DECIMAL`. + +#### Syntax + +
duckdb_logical_type duckdb_create_logical_type(
+  duckdb_type type
+);
+
+ +#### Parameters + +* `type` + +The primitive type to create. +* `returns` + +The logical type. + +
+ +### `duckdb_logical_type_get_alias` + +Returns the alias of a duckdb_logical_type, if one is set, else `NULL`. +The result must be destroyed with `duckdb_free`. + +#### Syntax + +
char *duckdb_logical_type_get_alias(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type to return the alias of +* `returns` + +The alias or `NULL` + +
+ +### `duckdb_create_list_type` + +Creates a list type from its child type. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_create_list_type(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The child type of list type to create. +* `returns` + +The logical type. + +
+
+### `duckdb_create_array_type`
+
+Creates an array type from its child type.
+The resulting type should be destroyed with `duckdb_destroy_logical_type`.
+
+#### Syntax
+
duckdb_logical_type duckdb_create_array_type(
+  duckdb_logical_type type,
+  idx_t array_size
+);
+
+ +#### Parameters + +* `type` + +The child type of array type to create. +* `array_size` + +The number of elements in the array. +* `returns` + +The logical type. + +
+ +### `duckdb_create_map_type` + +Creates a map type from its key type and value type. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_create_map_type(
+  duckdb_logical_type key_type,
+  duckdb_logical_type value_type
+);
+
+
+#### Parameters
+
+* `key_type`
+
+The key type of the map type to create.
+* `value_type`
+
+The value type of the map type to create.
+* `returns`
+
+The logical type.
+
+ +### `duckdb_create_union_type` + +Creates a UNION type from the passed types array. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_create_union_type(
+  duckdb_logical_type *member_types,
+  const char **member_names,
+  idx_t member_count
+);
+
+
+#### Parameters
+
+* `member_types`
+
+The array of types that the union should consist of.
+* `member_names`
+
+The array of names of the union members.
+* `member_count`
+
+The number of members that were specified for both arrays.
+* `returns`
+
+The logical type.
+
+ +### `duckdb_create_struct_type` + +Creates a STRUCT type from the passed member name and type arrays. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_create_struct_type(
+  duckdb_logical_type *member_types,
+  const char **member_names,
+  idx_t member_count
+);
+
+ +#### Parameters + +* `member_types` + +The array of types that the struct should consist of. +* `member_names` + +The array of names that the struct should consist of. +* `member_count` + +The number of members that were specified for both arrays. +* `returns` + +The logical type. + +
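+
+For example, the logical type `STRUCT(i INTEGER, j VARCHAR)` might be constructed as follows (a sketch):
+
+```c
+duckdb_logical_type member_types[2];
+member_types[0] = duckdb_create_logical_type(DUCKDB_TYPE_INTEGER);
+member_types[1] = duckdb_create_logical_type(DUCKDB_TYPE_VARCHAR);
+const char *member_names[] = {"i", "j"};
+
+duckdb_logical_type struct_type = duckdb_create_struct_type(member_types, member_names, 2);
+
+// the member types and the struct type must all be destroyed
+duckdb_destroy_logical_type(&member_types[0]);
+duckdb_destroy_logical_type(&member_types[1]);
+duckdb_destroy_logical_type(&struct_type);
+```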
+ +### `duckdb_create_enum_type` + +Creates an ENUM type from the passed member name array. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_create_enum_type(
+  const char **member_names,
+  idx_t member_count
+);
+
+
+#### Parameters
+
+* `member_names`
+
+The array of names that the enum should consist of.
+* `member_count`
+
+The number of elements that were specified in the array.
+* `returns`
+
+The logical type.
+
+ +### `duckdb_create_decimal_type` + +Creates a `duckdb_logical_type` of type decimal with the specified width and scale. +The resulting type should be destroyed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_create_decimal_type(
+  uint8_t width,
+  uint8_t scale
+);
+
+ +#### Parameters + +* `width` + +The width of the decimal type +* `scale` + +The scale of the decimal type +* `returns` + +The logical type. + +
+ +### `duckdb_get_type_id` + +Retrieves the enum type class of a `duckdb_logical_type`. + +#### Syntax + +
duckdb_type duckdb_get_type_id(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The type id + +
+ +### `duckdb_decimal_width` + +Retrieves the width of a decimal type. + +#### Syntax + +
uint8_t duckdb_decimal_width(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The width of the decimal type + +
+ +### `duckdb_decimal_scale` + +Retrieves the scale of a decimal type. + +#### Syntax + +
uint8_t duckdb_decimal_scale(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The scale of the decimal type + +
+ +### `duckdb_decimal_internal_type` + +Retrieves the internal storage type of a decimal type. + +#### Syntax + +
duckdb_type duckdb_decimal_internal_type(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The internal type of the decimal type + +
+ +### `duckdb_enum_internal_type` + +Retrieves the internal storage type of an enum type. + +#### Syntax + +
duckdb_type duckdb_enum_internal_type(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The internal type of the enum type + +
+ +### `duckdb_enum_dictionary_size` + +Retrieves the dictionary size of the enum type. + +#### Syntax + +
uint32_t duckdb_enum_dictionary_size(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The dictionary size of the enum type + +
+ +### `duckdb_enum_dictionary_value` + +Retrieves the dictionary value at the specified position from the enum. + +The result must be freed with `duckdb_free`. + +#### Syntax + +
char *duckdb_enum_dictionary_value(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `index` + +The index in the dictionary +* `returns` + +The string value of the enum type. Must be freed with `duckdb_free`. + +
+ +### `duckdb_list_type_child_type` + +Retrieves the child type of the given list type. + +The result must be freed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_list_type_child_type(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The child type of the list type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +### `duckdb_array_type_child_type` + +Retrieves the child type of the given array type. + +The result must be freed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_array_type_child_type(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The child type of the array type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +### `duckdb_array_type_array_size` + +Retrieves the array size of the given array type. + +#### Syntax + +
idx_t duckdb_array_type_array_size(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The fixed number of elements the values of this array type can store. + +
+ +### `duckdb_map_type_key_type` + +Retrieves the key type of the given map type. + +The result must be freed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_map_type_key_type(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The key type of the map type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +### `duckdb_map_type_value_type` + +Retrieves the value type of the given map type. + +The result must be freed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_map_type_value_type(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The value type of the map type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +### `duckdb_struct_type_child_count` + +Returns the number of children of a struct type. + +#### Syntax + +
idx_t duckdb_struct_type_child_count(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `returns` + +The number of children of a struct type. + +
+ +### `duckdb_struct_type_child_name` + +Retrieves the name of the struct child. + +The result must be freed with `duckdb_free`. + +#### Syntax + +
char *duckdb_struct_type_child_name(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The name of the struct type. Must be freed with `duckdb_free`. + +
+ +### `duckdb_struct_type_child_type` + +Retrieves the child type of the given struct type at the specified index. + +The result must be freed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_struct_type_child_type(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The child type of the struct type. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +### `duckdb_union_type_member_count` + +Returns the number of members that the union type has. + +#### Syntax + +
idx_t duckdb_union_type_member_count(
+  duckdb_logical_type type
+);
+
+ +#### Parameters + +* `type` + +The logical type (union) object +* `returns` + +The number of members of a union type. + +
+ +### `duckdb_union_type_member_name` + +Retrieves the name of the union member. + +The result must be freed with `duckdb_free`. + +#### Syntax + +
char *duckdb_union_type_member_name(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The name of the union member. Must be freed with `duckdb_free`. + +
+ +### `duckdb_union_type_member_type` + +Retrieves the child type of the given union member at the specified index. + +The result must be freed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_union_type_member_type(
+  duckdb_logical_type type,
+  idx_t index
+);
+
+ +#### Parameters + +* `type` + +The logical type object +* `index` + +The child index +* `returns` + +The child type of the union member. Must be destroyed with `duckdb_destroy_logical_type`. + +
+ +### `duckdb_destroy_logical_type` + +Destroys the logical type and de-allocates all memory allocated for that type. + +#### Syntax + +
void duckdb_destroy_logical_type(
+  duckdb_logical_type *type
+);
+
+ +#### Parameters + +* `type` + +The logical type to destroy. + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/value.md b/docs/archive/1.0/api/c/value.md new file mode 100644 index 00000000000..680c431cc31 --- /dev/null +++ b/docs/archive/1.0/api/c/value.md @@ -0,0 +1,241 @@ +--- +layout: docu +title: Values +--- + +The value class represents a single value of any type. + +## API Reference Overview + + + +
void duckdb_destroy_value(duckdb_value *value);
+duckdb_value duckdb_create_varchar(const char *text);
+duckdb_value duckdb_create_varchar_length(const char *text, idx_t length);
+duckdb_value duckdb_create_int64(int64_t val);
+duckdb_value duckdb_create_struct_value(duckdb_logical_type type, duckdb_value *values);
+duckdb_value duckdb_create_list_value(duckdb_logical_type type, duckdb_value *values, idx_t value_count);
+duckdb_value duckdb_create_array_value(duckdb_logical_type type, duckdb_value *values, idx_t value_count);
+char *duckdb_get_varchar(duckdb_value value);
+int64_t duckdb_get_int64(duckdb_value value);
+
+ +### `duckdb_destroy_value` + +Destroys the value and de-allocates all memory allocated for that type. + +#### Syntax + +
void duckdb_destroy_value(
+  duckdb_value *value
+);
+
+ +#### Parameters + +* `value` + +The value to destroy. + +
+ +### `duckdb_create_varchar` + +Creates a value from a null-terminated string + +#### Syntax + +
duckdb_value duckdb_create_varchar(
+  const char *text
+);
+
+
+#### Parameters
+
+* `text`
+
+The null-terminated string
+* `returns`
+
+The value. This must be destroyed with `duckdb_destroy_value`.
+
+ +### `duckdb_create_varchar_length` + +Creates a value from a string + +#### Syntax + +
duckdb_value duckdb_create_varchar_length(
+  const char *text,
+  idx_t length
+);
+
+
+#### Parameters
+
+* `text`
+
+The text
+* `length`
+
+The length of the text
+* `returns`
+
+The value. This must be destroyed with `duckdb_destroy_value`.
+
+ +### `duckdb_create_int64` + +Creates a value from an int64 + +#### Syntax + +
duckdb_value duckdb_create_int64(
+  int64_t val
+);
+
+
+#### Parameters
+
+* `val`
+
+The bigint value
+* `returns`
+
+The value. This must be destroyed with `duckdb_destroy_value`.
+
+ +### `duckdb_create_struct_value` + +Creates a struct value from a type and an array of values + +#### Syntax + +
duckdb_value duckdb_create_struct_value(
+  duckdb_logical_type type,
+  duckdb_value *values
+);
+
+ +#### Parameters + +* `type` + +The type of the struct +* `values` + +The values for the struct fields +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +### `duckdb_create_list_value` + +Creates a list value from a type and an array of values of length `value_count` + +#### Syntax + +
duckdb_value duckdb_create_list_value(
+  duckdb_logical_type type,
+  duckdb_value *values,
+  idx_t value_count
+);
+
+ +#### Parameters + +* `type` + +The type of the list +* `values` + +The values for the list +* `value_count` + +The number of values in the list +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
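+
+For example, a `BIGINT` list value holding `[1, 2, 3]` might be created and printed as follows (a sketch):
+
+```c
+duckdb_logical_type child_type = duckdb_create_logical_type(DUCKDB_TYPE_BIGINT);
+duckdb_value elements[3];
+elements[0] = duckdb_create_int64(1);
+elements[1] = duckdb_create_int64(2);
+elements[2] = duckdb_create_int64(3);
+
+duckdb_value list_value = duckdb_create_list_value(child_type, elements, 3);
+char *str = duckdb_get_varchar(list_value);
+printf("%s\n", str); // prints [1, 2, 3]
+duckdb_free(str);
+
+// clean up the element values, the list value and the type
+for (int i = 0; i < 3; i++) {
+    duckdb_destroy_value(&elements[i]);
+}
+duckdb_destroy_value(&list_value);
+duckdb_destroy_logical_type(&child_type);
+```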
+
+### `duckdb_create_array_value`
+
+Creates an array value from a type and an array of values of length `value_count`
+
+#### Syntax
+
duckdb_value duckdb_create_array_value(
+  duckdb_logical_type type,
+  duckdb_value *values,
+  idx_t value_count
+);
+
+ +#### Parameters + +* `type` + +The type of the array +* `values` + +The values for the array +* `value_count` + +The number of values in the array +* `returns` + +The value. This must be destroyed with `duckdb_destroy_value`. + +
+ +### `duckdb_get_varchar` + +Obtains a string representation of the given value. +The result must be destroyed with `duckdb_free`. + +#### Syntax + +
char *duckdb_get_varchar(
+  duckdb_value value
+);
+
+ +#### Parameters + +* `value` + +The value +* `returns` + +The string value. This must be destroyed with `duckdb_free`. + +
+ +### `duckdb_get_int64` + +Obtains an int64 of the given value. + +#### Syntax + +
int64_t duckdb_get_int64(
+  duckdb_value value
+);
+
+ +#### Parameters + +* `value` + +The value +* `returns` + +The int64 value, or 0 if no conversion is possible + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/c/vector.md b/docs/archive/1.0/api/c/vector.md new file mode 100644 index 00000000000..10ba711b548 --- /dev/null +++ b/docs/archive/1.0/api/c/vector.md @@ -0,0 +1,803 @@ +--- +layout: docu +title: Vectors +--- + +Vectors represent a horizontal slice of a column. They hold a number of values of a specific type, similar to an array. Vectors are the core data representation used in DuckDB. Vectors are typically stored within [data chunks]({% link docs/archive/1.0/api/c/data_chunk.md %}). + +The vector and data chunk interfaces are the most efficient way of interacting with DuckDB, allowing for the highest performance. However, the interfaces are also difficult to use and care must be taken when using them. + +## Vector Format + +Vectors are arrays of a specific data type. The logical type of a vector can be obtained using `duckdb_vector_get_column_type`. The type id of the logical type can then be obtained using `duckdb_get_type_id`. + +Vectors themselves do not have sizes. Instead, the parent data chunk has a size (that can be obtained through `duckdb_data_chunk_get_size`). All vectors that belong to a data chunk have the same size. + +### Primitive Types + +For primitive types, the underlying array can be obtained using the `duckdb_vector_get_data` method. The array can then be accessed using the correct native type. Below is a table that contains a mapping of the `duckdb_type` to the native type of the array. + +
+
+| duckdb_type              | NativeType       |
+|--------------------------|------------------|
+| DUCKDB_TYPE_BOOLEAN      | bool             |
+| DUCKDB_TYPE_TINYINT      | int8_t           |
+| DUCKDB_TYPE_SMALLINT     | int16_t          |
+| DUCKDB_TYPE_INTEGER      | int32_t          |
+| DUCKDB_TYPE_BIGINT       | int64_t          |
+| DUCKDB_TYPE_UTINYINT     | uint8_t          |
+| DUCKDB_TYPE_USMALLINT    | uint16_t         |
+| DUCKDB_TYPE_UINTEGER     | uint32_t         |
+| DUCKDB_TYPE_UBIGINT      | uint64_t         |
+| DUCKDB_TYPE_FLOAT        | float            |
+| DUCKDB_TYPE_DOUBLE       | double           |
+| DUCKDB_TYPE_TIMESTAMP    | duckdb_timestamp |
+| DUCKDB_TYPE_DATE         | duckdb_date      |
+| DUCKDB_TYPE_TIME         | duckdb_time      |
+| DUCKDB_TYPE_INTERVAL     | duckdb_interval  |
+| DUCKDB_TYPE_HUGEINT      | duckdb_hugeint   |
+| DUCKDB_TYPE_UHUGEINT     | duckdb_uhugeint  |
+| DUCKDB_TYPE_VARCHAR      | duckdb_string_t  |
+| DUCKDB_TYPE_BLOB         | duckdb_string_t  |
+| DUCKDB_TYPE_TIMESTAMP_S  | duckdb_timestamp |
+| DUCKDB_TYPE_TIMESTAMP_MS | duckdb_timestamp |
+| DUCKDB_TYPE_TIMESTAMP_NS | duckdb_timestamp |
+| DUCKDB_TYPE_UUID         | duckdb_hugeint   |
+| DUCKDB_TYPE_TIME_TZ      | duckdb_time_tz   |
+| DUCKDB_TYPE_TIMESTAMP_TZ | duckdb_timestamp |
+
+### Null Values
+
+Any value in a vector can be `NULL`. When a value is `NULL`, the value contained within the primary array at that index is undefined (and can be uninitialized). The validity mask is a bitmask consisting of `uint64_t` elements. For every `64` values in the vector, one `uint64_t` element exists (rounded up). The validity mask has its bit set to 1 if the value is valid, or set to 0 if the value is invalid (i.e., `NULL`).
+
+The bits of the bitmask can be read directly, or the slower helper method `duckdb_validity_row_is_valid` can be used to check whether or not a value is `NULL`.
+
+The `duckdb_vector_get_validity` function returns a pointer to the validity mask. Note that if all values in a vector are valid, this function **might** return `nullptr`, in which case the validity mask does not need to be checked.
+
+### Strings
+
+String values are stored as a `duckdb_string_t`. This is a special struct that stores the string inline (if it is short, i.e., `<= 12 bytes`) or a pointer to the string data if it is longer than `12` bytes.
+
+```c
+typedef struct {
+    union {
+        struct {
+            uint32_t length;
+            char prefix[4];
+            char *ptr;
+        } pointer;
+        struct {
+            uint32_t length;
+            char inlined[12];
+        } inlined;
+    } value;
+} duckdb_string_t;
+```
+
+The length can either be accessed directly, or the `duckdb_string_is_inlined` function can be used to check if a string is inlined.
+
+### Decimals
+
+Decimals are stored as integer values internally. The exact native type depends on the `width` of the decimal type, as shown in the following table:
+
+ +| Width | NativeType | +|-------|----------------| +| <= 4 | int16_t | +| <= 9 | int32_t | +| <= 18 | int64_t | +| <= 38 | duckdb_hugeint | + +The `duckdb_decimal_internal_type` can be used to obtain the internal type of the decimal. + +Decimals are stored as integer values multiplied by `10^scale`. The scale of a decimal can be obtained using `duckdb_decimal_scale`. For example, a decimal value of `10.5` with type `DECIMAL(8, 3)` is stored internally as an `int32_t` value of `10500`. In order to obtain the correct decimal value, the value should be divided by the appropriate power-of-ten. + +### Enums + +Enums are stored as unsigned integer values internally. The exact native type depends on the size of the enum dictionary, as shown in the following table: + +
| Dictionary Size | NativeType |
|-----------------|------------|
| <= 255          | uint8_t    |
| <= 65535        | uint16_t   |
| <= 4294967295   | uint32_t   |

The `duckdb_enum_internal_type` function can be used to obtain the internal type of the enum.

In order to obtain the actual string value of the enum, the `duckdb_enum_dictionary_value` function must be used to obtain the enum value that corresponds to the given dictionary entry. Note that the enum dictionary is the same for the entire column, so it only needs to be constructed once.

### Structs

Structs are nested types that contain any number of child types. Think of them like a `struct` in C. The way to access struct data using vectors is to access the child vectors recursively using the `duckdb_struct_vector_get_child` method.

The struct vector itself does not have any data (i.e., you should not use the `duckdb_vector_get_data` method on the struct vector). **However**, the struct vector itself **does** have a validity mask. The reason for this is that the child elements of a struct can be `NULL`, but the struct **itself** can also be `NULL`.

### Lists

Lists are nested types that contain a single child type, repeated `x` times per row. Think of them like a variable-length array in C. The way to access list data using vectors is to access the child vector using the `duckdb_list_vector_get_child` method.

The `duckdb_vector_get_data` function must be used to get the offsets and lengths of the lists, stored as `duckdb_list_entry` values, which can then be applied to the child vector.

```c
typedef struct {
    uint64_t offset;
    uint64_t length;
} duckdb_list_entry;
```

Note that both the list entries themselves **and** any children stored in the lists can be `NULL`. This must be checked using the validity mask again.

### Arrays

Arrays are nested types that contain a single child type, repeated exactly `array_size` times per row. Think of them like a fixed-size array in C. Arrays work exactly the same as lists, **except** that the length and offset of each entry are fixed. The fixed array size can be obtained by using `duckdb_array_type_array_size`. The data for entry `n` then resides at `offset = n * array_size`, and always has `length = array_size`.

Note that much like lists, arrays can still be `NULL`, which must be checked using the validity mask.

## Examples

Below are several full end-to-end examples of how to interact with vectors.
+ +### Example: Reading an int64 Vector with `NULL` Values + +```c +duckdb_database db; +duckdb_connection con; +duckdb_open(nullptr, &db); +duckdb_connect(db, &con); + +duckdb_result res; +duckdb_query(con, "SELECT CASE WHEN i%2=0 THEN NULL ELSE i END res_col FROM range(10) t(i)", &res); + +// iterate until result is exhausted +while (true) { + duckdb_data_chunk result = duckdb_fetch_chunk(res); + if (!result) { + // result is exhausted + break; + } + // get the number of rows from the data chunk + idx_t row_count = duckdb_data_chunk_get_size(result); + // get the first column + duckdb_vector res_col = duckdb_data_chunk_get_vector(result, 0); + // get the native array and the validity mask of the vector + int64_t *vector_data = (int64_t *) duckdb_vector_get_data(res_col); + uint64_t *vector_validity = duckdb_vector_get_validity(res_col); + // iterate over the rows + for (idx_t row = 0; row < row_count; row++) { + if (duckdb_validity_row_is_valid(vector_validity, row)) { + printf("%lld\n", vector_data[row]); + } else { + printf("NULL\n"); + } + } + duckdb_destroy_data_chunk(&result); +} +// clean-up +duckdb_destroy_result(&res); +duckdb_disconnect(&con); +duckdb_close(&db); +``` + +### Example: Reading a String Vector + +```c +duckdb_database db; +duckdb_connection con; +duckdb_open(nullptr, &db); +duckdb_connect(db, &con); + +duckdb_result res; +duckdb_query(con, "SELECT CASE WHEN i%2=0 THEN CONCAT('short_', i) ELSE CONCAT('longstringprefix', i) END FROM range(10) t(i)", &res); + +// iterate until result is exhausted +while (true) { + duckdb_data_chunk result = duckdb_fetch_chunk(res); + if (!result) { + // result is exhausted + break; + } + // get the number of rows from the data chunk + idx_t row_count = duckdb_data_chunk_get_size(result); + // get the first column + duckdb_vector res_col = duckdb_data_chunk_get_vector(result, 0); + // get the native array and the validity mask of the vector + duckdb_string_t *vector_data = (duckdb_string_t *) duckdb_vector_get_data(res_col); + uint64_t *vector_validity = duckdb_vector_get_validity(res_col); + // iterate over the rows + for (idx_t row = 0; row < row_count; row++) { + if (duckdb_validity_row_is_valid(vector_validity, row)) { + duckdb_string_t str = vector_data[row]; + if (duckdb_string_is_inlined(str)) { + // use inlined string + printf("%.*s\n", str.value.inlined.length, str.value.inlined.inlined); + } else { + // follow string pointer + printf("%.*s\n", str.value.pointer.length, str.value.pointer.ptr); + } + } else { + printf("NULL\n"); + } + } + duckdb_destroy_data_chunk(&result); +} +// clean-up +duckdb_destroy_result(&res); +duckdb_disconnect(&con); +duckdb_close(&db); +``` + +### Example: Reading a Struct Vector + +```c +duckdb_database db; +duckdb_connection con; +duckdb_open(nullptr, &db); +duckdb_connect(db, &con); + +duckdb_result res; +duckdb_query(con, "SELECT CASE WHEN i%5=0 THEN NULL ELSE {'col1': i, 'col2': CASE WHEN i%2=0 THEN NULL ELSE 100 + i * 42 END} END FROM range(10) t(i)", &res); + +// iterate until result is exhausted +while (true) { + duckdb_data_chunk result = duckdb_fetch_chunk(res); + if (!result) { + // result is exhausted + break; + } + // get the number of rows from the data chunk + idx_t row_count = duckdb_data_chunk_get_size(result); + // get the struct column + duckdb_vector struct_col = duckdb_data_chunk_get_vector(result, 0); + uint64_t *struct_validity = duckdb_vector_get_validity(struct_col); + // get the child columns of the struct + duckdb_vector col1_vector = 
duckdb_struct_vector_get_child(struct_col, 0); + int64_t *col1_data = (int64_t *) duckdb_vector_get_data(col1_vector); + uint64_t *col1_validity = duckdb_vector_get_validity(col1_vector); + + duckdb_vector col2_vector = duckdb_struct_vector_get_child(struct_col, 1); + int64_t *col2_data = (int64_t *) duckdb_vector_get_data(col2_vector); + uint64_t *col2_validity = duckdb_vector_get_validity(col2_vector); + + // iterate over the rows + for (idx_t row = 0; row < row_count; row++) { + if (!duckdb_validity_row_is_valid(struct_validity, row)) { + // entire struct is NULL + printf("NULL\n"); + continue; + } + // read col1 + printf("{'col1': "); + if (!duckdb_validity_row_is_valid(col1_validity, row)) { + // col1 is NULL + printf("NULL"); + } else { + printf("%lld", col1_data[row]); + } + printf(", 'col2': "); + if (!duckdb_validity_row_is_valid(col2_validity, row)) { + // col2 is NULL + printf("NULL"); + } else { + printf("%lld", col2_data[row]); + } + printf("}\n"); + } + duckdb_destroy_data_chunk(&result); +} +// clean-up +duckdb_destroy_result(&res); +duckdb_disconnect(&con); +duckdb_close(&db); +``` + +### Example: Reading a List Vector + +```c +duckdb_database db; +duckdb_connection con; +duckdb_open(nullptr, &db); +duckdb_connect(db, &con); + +duckdb_result res; +duckdb_query(con, "SELECT CASE WHEN i % 5 = 0 THEN NULL WHEN i % 2 = 0 THEN [i, i + 1] ELSE [i * 42, NULL, i * 84] END FROM range(10) t(i)", &res); + +// iterate until result is exhausted +while (true) { + duckdb_data_chunk result = duckdb_fetch_chunk(res); + if (!result) { + // result is exhausted + break; + } + // get the number of rows from the data chunk + idx_t row_count = duckdb_data_chunk_get_size(result); + // get the list column + duckdb_vector list_col = duckdb_data_chunk_get_vector(result, 0); + duckdb_list_entry *list_data = (duckdb_list_entry *) duckdb_vector_get_data(list_col); + uint64_t *list_validity = duckdb_vector_get_validity(list_col); + // get the child column of the list + duckdb_vector list_child = duckdb_list_vector_get_child(list_col); + int64_t *child_data = (int64_t *) duckdb_vector_get_data(list_child); + uint64_t *child_validity = duckdb_vector_get_validity(list_child); + + // iterate over the rows + for (idx_t row = 0; row < row_count; row++) { + if (!duckdb_validity_row_is_valid(list_validity, row)) { + // entire list is NULL + printf("NULL\n"); + continue; + } + // read the list offsets for this row + duckdb_list_entry list = list_data[row]; + printf("["); + for (idx_t child_idx = list.offset; child_idx < list.offset + list.length; child_idx++) { + if (child_idx > list.offset) { + printf(", "); + } + if (!duckdb_validity_row_is_valid(child_validity, child_idx)) { + // col1 is NULL + printf("NULL"); + } else { + printf("%lld", child_data[child_idx]); + } + } + printf("]\n"); + } + duckdb_destroy_data_chunk(&result); +} +// clean-up +duckdb_destroy_result(&res); +duckdb_disconnect(&con); +duckdb_close(&db); +``` + +## API Reference Overview + + + +
duckdb_logical_type duckdb_vector_get_column_type(duckdb_vector vector);
+void *duckdb_vector_get_data(duckdb_vector vector);
+uint64_t *duckdb_vector_get_validity(duckdb_vector vector);
+void duckdb_vector_ensure_validity_writable(duckdb_vector vector);
+void duckdb_vector_assign_string_element(duckdb_vector vector, idx_t index, const char *str);
+void duckdb_vector_assign_string_element_len(duckdb_vector vector, idx_t index, const char *str, idx_t str_len);
+duckdb_vector duckdb_list_vector_get_child(duckdb_vector vector);
+idx_t duckdb_list_vector_get_size(duckdb_vector vector);
+duckdb_state duckdb_list_vector_set_size(duckdb_vector vector, idx_t size);
+duckdb_state duckdb_list_vector_reserve(duckdb_vector vector, idx_t required_capacity);
+duckdb_vector duckdb_struct_vector_get_child(duckdb_vector vector, idx_t index);
+duckdb_vector duckdb_array_vector_get_child(duckdb_vector vector);
+
+ +### Validity Mask Functions + +
bool duckdb_validity_row_is_valid(uint64_t *validity, idx_t row);
+void duckdb_validity_set_row_validity(uint64_t *validity, idx_t row, bool valid);
+void duckdb_validity_set_row_invalid(uint64_t *validity, idx_t row);
+void duckdb_validity_set_row_valid(uint64_t *validity, idx_t row);
+
+ +### `duckdb_vector_get_column_type` + +Retrieves the column type of the specified vector. + +The result must be destroyed with `duckdb_destroy_logical_type`. + +#### Syntax + +
duckdb_logical_type duckdb_vector_get_column_type(
+  duckdb_vector vector
+);
+
#### Parameters

* `vector`

The vector to get the data from
* `returns`

The type of the vector
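
The following is a small illustrative sketch (not part of the generated reference) of how the column type is typically inspected before casting a vector's data pointer. The helper name `print_vector_type` is made up for this example.

```c
#include <stdio.h>
#include "duckdb.h"

// Illustrative helper: print which native type the vector's data array uses.
// The logical type returned by duckdb_vector_get_column_type must be destroyed again.
void print_vector_type(duckdb_vector vector) {
    duckdb_logical_type type = duckdb_vector_get_column_type(vector);
    duckdb_type type_id = duckdb_get_type_id(type);
    if (type_id == DUCKDB_TYPE_BIGINT) {
        printf("vector holds int64_t values\n");
    } else if (type_id == DUCKDB_TYPE_VARCHAR) {
        printf("vector holds duckdb_string_t values\n");
    } else {
        printf("vector holds some other type (id %d)\n", (int) type_id);
    }
    duckdb_destroy_logical_type(&type);
}
```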
+ +### `duckdb_vector_get_data` + +Retrieves the data pointer of the vector. + +The data pointer can be used to read or write values from the vector. +How to read or write values depends on the type of the vector. + +#### Syntax + +
void *duckdb_vector_get_data(
+  duckdb_vector vector
+);
+
+ +#### Parameters + +* `vector` + +The vector to get the data from +* `returns` + +The data pointer + +
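
As a minimal sketch of the writing direction (reading is covered by the examples above), the following hypothetical helper fills a `BIGINT` vector. It assumes the vector belongs to a data chunk that the caller created and is allowed to modify.

```c
#include "duckdb.h"

// Fill the first `count` rows of a writable BIGINT vector with 0, 1, 2, ...
// Assumes the caller owns the data chunk this vector belongs to.
void fill_bigint_vector(duckdb_vector vector, idx_t count) {
    int64_t *data = (int64_t *) duckdb_vector_get_data(vector);
    for (idx_t row = 0; row < count; row++) {
        data[row] = (int64_t) row;
    }
}
```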
+ +### `duckdb_vector_get_validity` + +Retrieves the validity mask pointer of the specified vector. + +If all values are valid, this function MIGHT return NULL! + +The validity mask is a bitset that signifies null-ness within the data chunk. +It is a series of uint64_t values, where each uint64_t value contains validity for 64 tuples. +The bit is set to 1 if the value is valid (i.e., not NULL) or 0 if the value is invalid (i.e., NULL). + +Validity of a specific value can be obtained like this: + +idx_t entry_idx = row_idx / 64; +idx_t idx_in_entry = row_idx % 64; +bool is_valid = validity_mask[entry_idx] & (1 << idx_in_entry); + +Alternatively, the (slower) duckdb_validity_row_is_valid function can be used. + +#### Syntax + +
uint64_t *duckdb_vector_get_validity(
+  duckdb_vector vector
+);
+
+ +#### Parameters + +* `vector` + +The vector to get the data from +* `returns` + +The pointer to the validity mask, or NULL if no validity mask is present + +
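
The bit manipulation described above can be written out as follows. This is an illustrative helper equivalent to `duckdb_validity_row_is_valid`, assuming `row_idx` is smaller than the size of the vector's data chunk.

```c
#include <stdbool.h>
#include <stdint.h>
#include "duckdb.h"

// Check whether a row is valid (i.e., not NULL) by reading the validity bitmask directly,
// following the entry_idx / idx_in_entry scheme described above.
bool row_is_valid_manual(duckdb_vector vector, idx_t row_idx) {
    uint64_t *validity = duckdb_vector_get_validity(vector);
    if (!validity) {
        // no validity mask is present: every value in the vector is valid
        return true;
    }
    idx_t entry_idx = row_idx / 64;
    idx_t idx_in_entry = row_idx % 64;
    return (validity[entry_idx] & ((uint64_t) 1 << idx_in_entry)) != 0;
}
```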
+ +### `duckdb_vector_ensure_validity_writable` + +Ensures the validity mask is writable by allocating it. + +After this function is called, `duckdb_vector_get_validity` will ALWAYS return non-NULL. +This allows null values to be written to the vector, regardless of whether a validity mask was present before. + +#### Syntax + +
void duckdb_vector_ensure_validity_writable(
+  duckdb_vector vector
+);
+
+ +#### Parameters + +* `vector` + +The vector to alter + +
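
A minimal sketch of how this function is combined with the validity setters documented below, assuming the vector belongs to a data chunk the caller owns and may modify:

```c
#include <stdint.h>
#include "duckdb.h"

// Write NULL to a given row of a writable vector.
// Assumes the caller owns the data chunk this vector belongs to.
void set_row_to_null(duckdb_vector vector, idx_t row) {
    // make sure a validity mask is allocated before writing to it
    duckdb_vector_ensure_validity_writable(vector);
    uint64_t *validity = duckdb_vector_get_validity(vector);
    duckdb_validity_set_row_invalid(validity, row);
}
```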
+ +### `duckdb_vector_assign_string_element` + +Assigns a string element in the vector at the specified location. + +#### Syntax + +
void duckdb_vector_assign_string_element(
+  duckdb_vector vector,
+  idx_t index,
+  const char *str
+);
+
+ +#### Parameters + +* `vector` + +The vector to alter +* `index` + +The row position in the vector to assign the string to +* `str` + +The null-terminated string + +
+ +### `duckdb_vector_assign_string_element_len` + +Assigns a string element in the vector at the specified location. You may also use this function to assign BLOBs. + +#### Syntax + +
void duckdb_vector_assign_string_element_len(
+  duckdb_vector vector,
+  idx_t index,
+  const char *str,
+  idx_t str_len
+);
+
+ +#### Parameters + +* `vector` + +The vector to alter +* `index` + +The row position in the vector to assign the string to +* `str` + +The string +* `str_len` + +The length of the string (in bytes) + +
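
A short illustrative sketch that uses both string assignment functions on a writable `VARCHAR` vector. The helper name, the example values, and the assumption that the vector is writable are made up for this example.

```c
#include "duckdb.h"

// Fill the first `count` rows of a writable VARCHAR vector with example values.
void fill_string_vector(duckdb_vector vector, idx_t count) {
    for (idx_t row = 0; row < count; row++) {
        if (row % 2 == 0) {
            // null-terminated string
            duckdb_vector_assign_string_element(vector, row, "hello");
        } else {
            // explicit length, e.g., for data that is not null-terminated (or for BLOBs)
            const char data[] = {'w', 'o', 'r', 'l', 'd'};
            duckdb_vector_assign_string_element_len(vector, row, data, sizeof(data));
        }
    }
}
```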
+ +### `duckdb_list_vector_get_child` + +Retrieves the child vector of a list vector. + +The resulting vector is valid as long as the parent vector is valid. + +#### Syntax + +
duckdb_vector duckdb_list_vector_get_child(
+  duckdb_vector vector
+);
+
+ +#### Parameters + +* `vector` + +The vector +* `returns` + +The child vector + +
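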
+ +### `duckdb_list_vector_get_size` + +Returns the size of the child vector of the list. + +#### Syntax + +
idx_t duckdb_list_vector_get_size(
+  duckdb_vector vector
+);
+
+ +#### Parameters + +* `vector` + +The vector +* `returns` + +The size of the child list + +
+ +### `duckdb_list_vector_set_size` + +Sets the total size of the underlying child-vector of a list vector. + +#### Syntax + +
duckdb_state duckdb_list_vector_set_size(
+  duckdb_vector vector,
+  idx_t size
+);
+
+ +#### Parameters + +* `vector` + +The list vector. +* `size` + +The size of the child list. +* `returns` + +The duckdb state. Returns DuckDBError if the vector is nullptr. + +
+ +### `duckdb_list_vector_reserve` + +Sets the total capacity of the underlying child-vector of a list. + +#### Syntax + +
duckdb_state duckdb_list_vector_reserve(
+  duckdb_vector vector,
+  idx_t required_capacity
+);
+
#### Parameters

* `vector`

The list vector.
* `required_capacity`

The total capacity to reserve.
* `returns`

The duckdb state. Returns DuckDBError if the vector is nullptr.
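
The following sketch shows how `duckdb_list_vector_reserve`, `duckdb_list_vector_get_child`, and `duckdb_list_vector_set_size` are typically combined to fill a list vector. It assumes a writable `LIST(BIGINT)` vector (for example, the output vector of a table function); the helper name and the generated values are illustrative.

```c
#include "duckdb.h"

// Write `row_count` lists of the form [0, ..., row - 1] into a writable LIST(BIGINT) vector.
duckdb_state write_list_vector(duckdb_vector list_vector, idx_t row_count) {
    // total number of child elements: 0 + 1 + ... + (row_count - 1)
    idx_t total_children = row_count * (row_count - 1) / 2;
    // reserve capacity in the child vector before obtaining its data pointer
    if (duckdb_list_vector_reserve(list_vector, total_children) == DuckDBError) {
        return DuckDBError;
    }
    duckdb_list_entry *entries = (duckdb_list_entry *) duckdb_vector_get_data(list_vector);
    duckdb_vector child = duckdb_list_vector_get_child(list_vector);
    int64_t *child_data = (int64_t *) duckdb_vector_get_data(child);

    idx_t offset = 0;
    for (idx_t row = 0; row < row_count; row++) {
        entries[row].offset = offset;
        entries[row].length = row;
        for (idx_t i = 0; i < row; i++) {
            child_data[offset + i] = (int64_t) i;
        }
        offset += row;
    }
    // the child vector's size must be set to the number of elements written
    return duckdb_list_vector_set_size(list_vector, total_children);
}
```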
+ +### `duckdb_struct_vector_get_child` + +Retrieves the child vector of a struct vector. + +The resulting vector is valid as long as the parent vector is valid. + +#### Syntax + +
duckdb_vector duckdb_struct_vector_get_child(
+  duckdb_vector vector,
+  idx_t index
+);
+
+ +#### Parameters + +* `vector` + +The vector +* `index` + +The child index +* `returns` + +The child vector + +
+ +### `duckdb_array_vector_get_child` + +Retrieves the child vector of a array vector. + +The resulting vector is valid as long as the parent vector is valid. +The resulting vector has the size of the parent vector multiplied by the array size. + +#### Syntax + +
duckdb_vector duckdb_array_vector_get_child(
+  duckdb_vector vector
+);
+
+ +#### Parameters + +* `vector` + +The vector +* `returns` + +The child vector + +
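
A minimal sketch for reading a fixed-size array column, assuming a `BIGINT` array type. The helper name is made up, and the row count would normally come from `duckdb_data_chunk_get_size`.

```c
#include <stdio.h>
#include <stdint.h>
#include "duckdb.h"

// Print every row of an ARRAY(BIGINT, N) vector.
void print_bigint_array_vector(duckdb_vector vector, idx_t row_count) {
    // the fixed array size is a property of the column's logical type
    duckdb_logical_type type = duckdb_vector_get_column_type(vector);
    idx_t array_size = duckdb_array_type_array_size(type);
    duckdb_destroy_logical_type(&type);

    duckdb_vector child = duckdb_array_vector_get_child(vector);
    int64_t *child_data = (int64_t *) duckdb_vector_get_data(child);
    uint64_t *validity = duckdb_vector_get_validity(vector);

    for (idx_t row = 0; row < row_count; row++) {
        if (validity && !duckdb_validity_row_is_valid(validity, row)) {
            printf("NULL\n");
            continue;
        }
        printf("[");
        for (idx_t i = 0; i < array_size; i++) {
            if (i > 0) {
                printf(", ");
            }
            // data for entry `row` starts at offset row * array_size
            printf("%lld", (long long) child_data[row * array_size + i]);
        }
        printf("]\n");
    }
}
```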
+ +### `duckdb_validity_row_is_valid` + +Returns whether or not a row is valid (i.e., not NULL) in the given validity mask. + +#### Syntax + +
bool duckdb_validity_row_is_valid(
+  uint64_t *validity,
+  idx_t row
+);
+
+ +#### Parameters + +* `validity` + +The validity mask, as obtained through `duckdb_vector_get_validity` +* `row` + +The row index +* `returns` + +true if the row is valid, false otherwise + +
+ +### `duckdb_validity_set_row_validity` + +In a validity mask, sets a specific row to either valid or invalid. + +Note that `duckdb_vector_ensure_validity_writable` should be called before calling `duckdb_vector_get_validity`, +to ensure that there is a validity mask to write to. + +#### Syntax + +
void duckdb_validity_set_row_validity(
+  uint64_t *validity,
+  idx_t row,
+  bool valid
+);
+
+ +#### Parameters + +* `validity` + +The validity mask, as obtained through `duckdb_vector_get_validity`. +* `row` + +The row index +* `valid` + +Whether or not to set the row to valid, or invalid + +
+ +### `duckdb_validity_set_row_invalid` + +In a validity mask, sets a specific row to invalid. + +Equivalent to `duckdb_validity_set_row_validity` with valid set to false. + +#### Syntax + +
void duckdb_validity_set_row_invalid(
+  uint64_t *validity,
+  idx_t row
+);
+
+ +#### Parameters + +* `validity` + +The validity mask +* `row` + +The row index + +
+ +### `duckdb_validity_set_row_valid` + +In a validity mask, sets a specific row to valid. + +Equivalent to `duckdb_validity_set_row_validity` with valid set to true. + +#### Syntax + +
void duckdb_validity_set_row_valid(
+  uint64_t *validity,
+  idx_t row
+);
+
+ +#### Parameters + +* `validity` + +The validity mask +* `row` + +The row index + +
\ No newline at end of file diff --git a/docs/archive/1.0/api/cli/arguments.md b/docs/archive/1.0/api/cli/arguments.md new file mode 100644 index 00000000000..dffe164901e --- /dev/null +++ b/docs/archive/1.0/api/cli/arguments.md @@ -0,0 +1,54 @@ +--- +layout: docu +title: Command Line Arguments +--- + +The table below summarizes DuckDB's command line options. +To list all command line options, use the command: + +```bash +duckdb -help +``` + +For a list of dot commands available in the CLI shell, see the [Dot Commands page]({% link docs/archive/1.0/api/cli/dot_commands.md %}). + +
+ + + +| Argument | Description | +|---|-------| +| `-append` | Append the database to the end of the file | +| `-ascii` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `ascii` | +| `-bail` | Stop after hitting an error | +| `-batch` | Force batch I/O | +| `-box` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `box` | +| `-column` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `column` | +| `-cmd COMMAND` | Run `COMMAND` before reading `stdin` | +| `-c COMMAND` | Run `COMMAND` and exit | +| `-csv` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `csv` | +| `-echo` | Print commands before execution | +| `-init FILENAME` | Run the script in `FILENAME` upon startup (instead of `~/.duckdbrc`) | +| `-header` | Turn headers on | +| `-help` | Show this message | +| `-html` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to HTML | +| `-interactive` | Force interactive I/O | +| `-json` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `json` | +| `-line` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `line` | +| `-list` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `list` | +| `-markdown` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `markdown` | +| `-newline SEP` | Set output row separator. Default: `\n` | +| `-nofollow` | Refuse to open symbolic links to database files | +| `-noheader` | Turn headers off | +| `-no-stdin` | Exit after processing options instead of reading stdin | +| `-nullvalue TEXT` | Set text string for `NULL` values. Default: empty string | +| `-quote` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `quote` | +| `-readonly` | Open the database read-only | +| `-s COMMAND` | Run `COMMAND` and exit | +| `-separator SEP` | Set output column separator to `SEP`. Default: `|` | +| `-stats` | Print memory stats before each finalize | +| `-table` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) to `table` | +| `-unsigned` | Allow loading of [unsigned extensions]({% link docs/archive/1.0/extensions/overview.md %}#unsigned-extensions) | +| `-version` | Show DuckDB version | + + \ No newline at end of file diff --git a/docs/archive/1.0/api/cli/autocomplete.md b/docs/archive/1.0/api/cli/autocomplete.md new file mode 100644 index 00000000000..5d0ad429ec0 --- /dev/null +++ b/docs/archive/1.0/api/cli/autocomplete.md @@ -0,0 +1,57 @@ +--- +layout: docu +title: Autocomplete +--- + +The shell offers context-aware autocomplete of SQL queries through the [`autocomplete` extension]({% link docs/archive/1.0/extensions/autocomplete.md %}). autocomplete is triggered by pressing `Tab`. + +Multiple autocomplete suggestions can be present. You can cycle forwards through the suggestions by repeatedly pressing `Tab`, or `Shift+Tab` to cycle backwards. autocompletion can be reverted by pressing `ESC` twice. + +The shell autocompletes four different groups: + +* Keywords +* Table names and table functions +* Column names and scalar functions +* File names + +The shell looks at the position in the SQL statement to determine which of these autocompletions to trigger. 
For example: + +```sql +SELECT s +``` + +```text +student_id +``` + +```sql +SELECT student_id F +``` + +```text +FROM +``` + +```sql +SELECT student_id FROM g +``` + +```text +grades +``` + +```sql +SELECT student_id FROM 'd +``` + +```text +'data/ +``` + +```sql +SELECT student_id FROM 'data/ +``` + +```text +'data/grades.csv +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/cli/dot_commands.md b/docs/archive/1.0/api/cli/dot_commands.md new file mode 100644 index 00000000000..469f1eda37d --- /dev/null +++ b/docs/archive/1.0/api/cli/dot_commands.md @@ -0,0 +1,243 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/api/cli/dot-commands +title: Dot Commands +--- + +Dot commands are available in the DuckDB CLI client. To use one of these commands, begin the line with a period (`.`) immediately followed by the name of the command you wish to execute. Additional arguments to the command are entered, space separated, after the command. If an argument must contain a space, either single or double quotes may be used to wrap that parameter. Dot commands must be entered on a single line, and no whitespace may occur before the period. No semicolon is required at the end of the line. To see available commands, use the `.help` command. + +## Dot Commands + +
+ + + +| Command | Description | +|---|------| +| `.bail on|off` | Stop after hitting an error. Default: `off` | +| `.binary on|off` | Turn binary output on or off. Default: `off` | +| `.cd DIRECTORY` | Change the working directory to `DIRECTORY` | +| `.changes on|off` | Show number of rows changed by SQL | +| `.check GLOB` | Fail if output since .testcase does not match | +| `.columns` | Column-wise rendering of query results | +| `.constant ?COLOR?` | Sets the syntax highlighting color used for constant values | +| `.constantcode ?CODE?` | Sets the syntax highlighting terminal code used for constant values | +| `.databases` | List names and files of attached databases | +| `.echo on|off` | Turn command echo on or `off` | +| `.excel` | Display the output of next command in spreadsheet | +| `.exit ?CODE?` | Exit this program with return-code `CODE` | +| `.explain ?on|off|auto?` | Change the `EXPLAIN` formatting mode. Default: `auto` | +| `.fullschema ?--indent?` | Show schema and the content of `sqlite_stat` tables | +| `.headers on|off` | Turn display of headers on or `off` | +| `.help ?-all? ?PATTERN?` | Show help text for `PATTERN` | +| `.highlight [on|off]` | Toggle syntax highlighting in the shell `on`/`off` | +| `.import FILE TABLE` | Import data from `FILE` into `TABLE` | +| `.indexes ?TABLE?` | Show names of indexes | +| `.keyword ?COLOR?` | Sets the syntax highlighting color used for keywords | +| `.keywordcode ?CODE?` | Sets the syntax highlighting terminal code used for keywords | +| `.lint OPTIONS` | Report potential schema issues. | +| `.log FILE|off` | Turn logging on or off. `FILE` can be `stderr`/`stdout` | +| `.maxrows COUNT` | Sets the maximum number of rows for display. Only for [duckbox mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) | +| `.maxwidth COUNT` | Sets the maximum width in characters. 0 defaults to terminal width. Only for [duckbox mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) | +| `.mode MODE ?TABLE?` | Set [output mode]({% link docs/archive/1.0/api/cli/output_formats.md %}) | +| `.multiline` | Set multi-line mode (default) | +| `.nullvalue STRING` | Use `STRING` in place of `NULL` values | +| `.once ?OPTIONS? ?FILE?` | Output for the next SQL command only to `FILE` | +| `.open ?OPTIONS? 
?FILE?` | Close existing database and reopen `FILE` | +| `.output ?FILE?` | Send output to `FILE` or `stdout` if `FILE` is omitted | +| `.parameter CMD ...` | Manage SQL parameter bindings | +| `.print STRING...` | Print literal `STRING` | +| `.prompt MAIN CONTINUE` | Replace the standard prompts | +| `.quit` | Exit this program | +| `.read FILE` | Read input from `FILE` | +| `.rows` | Row-wise rendering of query results (default) | +| `.schema ?PATTERN?` | Show the `CREATE` statements matching `PATTERN` | +| `.separator COL ?ROW?` | Change the column and row separators | +| `.sha3sum ...` | Compute a SHA3 hash of database content | +| `.shell CMD ARGS...` | Run `CMD ARGS...` in a system shell | +| `.show` | Show the current values for various settings | +| `.singleline` | Set single-line mode | +| `.system CMD ARGS...` | Run `CMD ARGS...` in a system shell | +| `.tables ?TABLE?` | List names of tables [matching LIKE pattern]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}) `TABLE` | +| `.testcase NAME` | Begin redirecting output to `NAME` | +| `.timer on|off` | Turn SQL timer on or off | +| `.width NUM1 NUM2 ...` | Set minimum column widths for columnar output | + +## Using the `.help` Command + +The `.help` text may be filtered by passing in a text string as the second argument. + +```text +.help m +``` + +```text +.maxrows COUNT Sets the maximum number of rows for display (default: 40). Only for duckbox mode. +.maxwidth COUNT Sets the maximum width in characters. 0 defaults to terminal width. Only for duckbox mode. +.mode MODE ?TABLE? Set output mode +``` + +### `.output`: Writing Results to a File + +By default, the DuckDB CLI sends results to the terminal's standard output. However, this can be modified using either the `.output` or `.once` commands. Pass in the desired output file location as a parameter. The `.once` command will only output the next set of results and then revert to standard out, but `.output` will redirect all subsequent output to that file location. Note that each result will overwrite the entire file at that destination. To revert back to standard output, enter `.output` with no file parameter. + +In this example, the output format is changed to `markdown`, the destination is identified as a Markdown file, and then DuckDB will write the output of the SQL statement to that file. Output is then reverted to standard output using `.output` with no parameter. + +```sql +.mode markdown +.output my_results.md +SELECT 'taking flight' AS output_column; +.output +SELECT 'back to the terminal' AS displayed_column; +``` + +The file `my_results.md` will then contain: + +```text +| output_column | +|---------------| +| taking flight | +``` + +The terminal will then display: + +```text +| displayed_column | +|----------------------| +| back to the terminal | +``` + +A common output format is CSV, or comma separated values. DuckDB supports [SQL syntax to export data as CSV or Parquet]({% link docs/archive/1.0/sql/statements/copy.md %}#copy-to), but the CLI-specific commands may be used to write a CSV instead if desired. + +```sql +.mode csv +.once my_output_file.csv +SELECT 1 AS col_1, 2 AS col_2 +UNION ALL +SELECT 10 AS col1, 20 AS col_2; +``` + +The file `my_output_file.csv` will then contain: + +```csv +col_1,col_2 +1,2 +10,20 +``` + +By passing special options (flags) to the `.once` command, query results can also be sent to a temporary file and automatically opened in the user's default program. 
Use either the `-e` flag for a text file (opened in the default text editor), or the `-x` flag for a CSV file (opened in the default spreadsheet editor). This is useful for more detailed inspection of query results, especially if there is a relatively large result set. The `.excel` command is equivalent to `.once -x`.

```sql
.once -e
SELECT 'quack' AS hello;
```

The results then open in the default text file editor of the system, for example:

cli_docs_output_to_text_editor

## Querying the Database Schema

All DuckDB clients support [querying the database schema with SQL]({% link docs/archive/1.0/sql/meta/information_schema.md %}), but the CLI has additional [dot commands]({% link docs/archive/1.0/api/cli/dot_commands.md %}) that can make it easier to understand the contents of a database.
The `.tables` command will return a list of tables in the database. It has an optional argument that will filter the results according to a [`LIKE` pattern]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#like).

```sql
CREATE TABLE swimmers AS SELECT 'duck' AS animal;
CREATE TABLE fliers AS SELECT 'duck' AS animal;
CREATE TABLE walkers AS SELECT 'duck' AS animal;
.tables
```

```text
fliers    swimmers  walkers
```

For example, to filter to only tables that contain an `l`, use the `LIKE` pattern `%l%`.

```sql
.tables %l%
```

```text
fliers   walkers
```

The `.schema` command will show all of the SQL statements used to define the schema of the database.

```text
.schema
```

```sql
CREATE TABLE fliers (animal VARCHAR);
CREATE TABLE swimmers (animal VARCHAR);
CREATE TABLE walkers (animal VARCHAR);
```

## Configuring the Syntax Highlighter

By default, the shell includes support for syntax highlighting.
The CLI's syntax highlighter can be configured using the following commands.

To turn on the highlighter:

```text
.highlight on
```

To turn off the highlighter:

```text
.highlight off
```

To configure the color used to highlight constants:

```text
.constant [red|green|yellow|blue|magenta|cyan|white|brightblack|brightred|brightgreen|brightyellow|brightblue|brightmagenta|brightcyan|brightwhite]
```

```text
.constantcode [terminal_code]
```

To configure the color used to highlight keywords:

```text
.keyword [red|green|yellow|blue|magenta|cyan|white|brightblack|brightred|brightgreen|brightyellow|brightblue|brightmagenta|brightcyan|brightwhite]
```

```text
.keywordcode [terminal_code]
```

## Importing Data from CSV

> Deprecated This feature is only included for compatibility reasons and may be removed in the future.
> Use the [`read_csv` function or the `COPY` statement]({% link docs/archive/1.0/data/csv/overview.md %}) to load CSV files.

DuckDB supports [SQL syntax to directly query or import CSV files]({% link docs/archive/1.0/data/csv/overview.md %}), but the CLI-specific commands may be used to import a CSV instead if desired. The `.import` command takes two arguments and also supports several options. The first argument is the path to the CSV file, and the second is the name of the DuckDB table to create. Since DuckDB requires stricter typing than SQLite (upon which the DuckDB CLI is based), the destination table must be created before using the `.import` command. To automatically detect the schema and create a table from a CSV, see the [`read_csv` examples in the import docs]({% link docs/archive/1.0/data/csv/overview.md %}).
+ +In this example, a CSV file is generated by changing to CSV mode and setting an output file location: + +```sql +.mode csv +.output import_example.csv +SELECT 1 AS col_1, 2 AS col_2 UNION ALL SELECT 10 AS col1, 20 AS col_2; +``` + +Now that the CSV has been written, a table can be created with the desired schema and the CSV can be imported. The output is reset to the terminal to avoid continuing to edit the output file specified above. The `--skip N` option is used to ignore the first row of data since it is a header row and the table has already been created with the correct column names. + +```text +.mode csv +.output +CREATE TABLE test_table (col_1 INTEGER, col_2 INTEGER); +.import import_example.csv test_table --skip 1 +``` + +Note that the `.import` command utilizes the current `.mode` and `.separator` settings when identifying the structure of the data to import. The `--csv` option can be used to override that behavior. + +```text +.import import_example.csv test_table --skip 1 --csv +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/cli/editing.md b/docs/archive/1.0/api/cli/editing.md new file mode 100644 index 00000000000..aa35847afac --- /dev/null +++ b/docs/archive/1.0/api/cli/editing.md @@ -0,0 +1,100 @@ +--- +layout: docu +title: Editing +--- + +> The linenoise-based CLI editor is currently only available for macOS and Linux. + +DuckDB's CLI uses a line-editing library based on [linenoise](https://github.com/antirez/linenoise), which has shortcuts that are based on [Emacs mode of readline](https://readline.kablamo.org/emacs.html). Below is a list of available commands. + +## Moving + +
+ +| Key | Action | +|-----------------|------------------------------------------------------------------------| +| `Left` | Move back a character | +| `Right` | Move forward a character | +| `Up` | Move up a line. When on the first line, move to previous history entry | +| `Down` | Move down a line. When on last line, move to next history entry | +| `Home` | Move to beginning of buffer | +| `End` | Move to end of buffer | +| `Ctrl`+`Left` | Move back a word | +| `Ctrl`+`Right` | Move forward a word | +| `Ctrl`+`A` | Move to beginning of buffer | +| `Ctrl`+`B` | Move back a character | +| `Ctrl`+`E` | Move to end of buffer | +| `Ctrl`+`F` | Move forward a character | +| `Alt`+`Left` | Move back a word | +| `Alt`+`Right` | Move forward a word | + +## History + +
+ +| Key | Action | +|------------|--------------------------------| +| `Ctrl`+`P` | Move to previous history entry | +| `Ctrl`+`N` | Move to next history entry | +| `Ctrl`+`R` | Search the history | +| `Ctrl`+`S` | Search the history | +| `Alt`+`<` | Move to first history entry | +| `Alt`+`>` | Move to last history entry | +| `Alt`+`N` | Search the history | +| `Alt`+`P` | Search the history | + +## Changing Text + +
+ +| Key | Action | +|-------------------|----------------------------------------------------------| +| `Backspace` | Delete previous character | +| `Delete` | Delete next character | +| `Ctrl`+`D` | Delete next character. When buffer is empty, end editing | +| `Ctrl`+`H` | Delete previous character | +| `Ctrl`+`K` | Delete everything after the cursor | +| `Ctrl`+`T` | Swap current and next character | +| `Ctrl`+`U` | Delete all text | +| `Ctrl`+`W` | Delete previous word | +| `Alt`+`C` | Convert next word to titlecase | +| `Alt`+`D` | Delete next word | +| `Alt`+`L` | Convert next word to lowercase | +| `Alt`+`R` | Delete all text | +| `Alt`+`T` | Swap current and next word | +| `Alt`+`U` | Convert next word to uppercase | +| `Alt`+`Backspace` | Delete previous word | +| `Alt`+`\` | Delete spaces around cursor | + +## Completing + +
+ +| Key | Action | +|---------------|--------------------------------------------------------| +| `Tab` | Autocomplete. When autocompleting, cycle to next entry | +| `Shift`+`Tab` | When autocompleting, cycle to previous entry | +| `Esc`+`Esc` | When autocompleting, revert autocompletion | + +## Miscellaneous + +
+ +| Key | Action | +|------------|------------------------------------------------------------------------------------| +| `Enter` | Execute query. If query is not complete, insert a newline at the end of the buffer | +| `Ctrl`+`J` | Execute query. If query is not complete, insert a newline at the end of the buffer | +| `Ctrl`+`C` | Cancel editing of current query | +| `Ctrl`+`G` | Cancel editing of current query | +| `Ctrl`+`L` | Clear screen | +| `Ctrl`+`O` | Cancel editing of current query | +| `Ctrl`+`X` | Insert a newline after the cursor | +| `Ctrl`+`Z` | Suspend CLI and return to shell, use `fg` to re-open | + +## Using Read-Line + +If you prefer, you can use [`rlwrap`](https://github.com/hanslub42/rlwrap) to use read-line directly with the shell. Then, use `Shift`+`Enter` to insert a newline and `Enter` to execute the query: + +```bash +rlwrap --substitute-prompt="D " duckdb -batch +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/cli/output_formats.md b/docs/archive/1.0/api/cli/output_formats.md new file mode 100644 index 00000000000..d11a5223cc5 --- /dev/null +++ b/docs/archive/1.0/api/cli/output_formats.md @@ -0,0 +1,81 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/api/cli/output-formats +title: Output Formats +--- + +The `.mode` [dot command]({% link docs/archive/1.0/api/cli/dot_commands.md %}) may be used to change the appearance of the tables returned in the terminal output. In addition to customizing the appearance, these modes have additional benefits. This can be useful for presenting DuckDB output elsewhere by redirecting the terminal [output to a file]({% link docs/archive/1.0/api/cli/dot_commands.md %}#output-writing-results-to-a-file). Using the `insert` mode will build a series of SQL statements that can be used to insert the data at a later point. +The `markdown` mode is particularly useful for building documentation and the `latex` mode is useful for writing academic papers. + +
+ +| Mode | Description | +|--------------|----------------------------------------------| +| `ascii` | Columns/rows delimited by 0x1F and 0x1E | +| `box` | Tables using unicode box-drawing characters | +| `csv` | Comma-separated values | +| `column` | Output in columns. (See .width) | +| `duckbox` | Tables with extensive features (default) | +| `html` | HTML `` code | +| `insert` | SQL insert statements for TABLE | +| `json` | Results in a JSON array | +| `jsonlines` | Results in a NDJSON | +| `latex` | LaTeX tabular environment code | +| `line` | One value per line | +| `list` | Values delimited by "\|" | +| `markdown` | Markdown table format | +| `quote` | Escape answers as for SQL | +| `table` | ASCII-art table | +| `tabs` | Tab-separated values | +| `tcl` | TCL list elements | +| `trash` | No output | + +Use `.mode` directly to query the appearance currently in use. + +```sql +.mode +``` + +```text +current output mode: duckbox +``` + +```sql +.mode markdown +SELECT 'quacking intensifies' AS incoming_ducks; +``` + +```text +| incoming_ducks | +|----------------------| +| quacking intensifies | +``` + +The output appearance can also be adjusted with the `.separator` command. If using an export mode that relies on a separator (`csv` or `tabs` for example), the separator will be reset when the mode is changed. For example, `.mode csv` will set the separator to a comma (`,`). Using `.separator "|"` will then convert the output to be pipe-separated. + +```sql +.mode csv +SELECT 1 AS col_1, 2 AS col_2 +UNION ALL +SELECT 10 AS col1, 20 AS col_2; +``` + +```csv +col_1,col_2 +1,2 +10,20 +``` + +```sql +.separator "|" +SELECT 1 AS col_1, 2 AS col_2 +UNION ALL +SELECT 10 AS col1, 20 AS col_2; +``` + +```csv +col_1|col_2 +1|2 +10|20 +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/cli/overview.md b/docs/archive/1.0/api/cli/overview.md new file mode 100644 index 00000000000..9f9e7a39094 --- /dev/null +++ b/docs/archive/1.0/api/cli/overview.md @@ -0,0 +1,313 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/api/cli +- /docs/archive/1.0/api/cli/ +title: CLI API +--- + +## Installation + +The DuckDB CLI (Command Line Interface) is a single, dependency-free executable. It is precompiled for Windows, Mac, and Linux for both the stable version and for nightly builds produced by GitHub Actions. Please see the [installation page]({% link docs/archive/1.0/installation/index.html %}) under the CLI tab for download links. + +The DuckDB CLI is based on the SQLite command line shell, so CLI-client-specific functionality is similar to what is described in the [SQLite documentation](https://www.sqlite.org/cli.html) (although DuckDB's SQL syntax follows PostgreSQL conventions with a [few exceptions]({% link docs/archive/1.0/sql/dialect/postgresql_compatibility.md %})). + +> DuckDB has a [tldr page](https://tldr.inbrowser.app/pages/common/duckdb), which summarizes the most common uses of the CLI client. +> If you have [tldr](https://github.com/tldr-pages/tldr) installed, you can display it by running `tldr duckdb`. + +## Getting Started + +Once the CLI executable has been downloaded, unzip it and save it to any directory. +Navigate to that directory in a terminal and enter the command `duckdb` to run the executable. +If in a PowerShell or POSIX shell environment, use the command `./duckdb` instead. 
+ +## Usage + +The typical usage of the `duckdb` command is the following: + +```bash +duckdb [OPTIONS] [FILENAME] +``` + +### Options + +The `[OPTIONS]` part encodes [arguments for the CLI client]({% link docs/archive/1.0/api/cli/arguments.md %}). Common options include: + +* `-csv`: sets the output mode to CSV +* `-json`: sets the output mode to JSON +* `-readonly`: open the database in read-only mode (see [concurrency in DuckDB]({% link docs/archive/1.0/connect/concurrency.md %}#handling-concurrency)) + +For a full list of options, see the [command line arguments page]({% link docs/archive/1.0/api/cli/arguments.md %}). + +### In-Memory vs. Persistent Database + +When no `[FILENAME]` argument is provided, the DuckDB CLI will open a temporary [in-memory database]({% link docs/archive/1.0/connect/overview.md %}#in-memory-database). +You will see DuckDB's version number, the information on the connection and a prompt starting with a `D`. + +```bash +duckdb +``` + +```text +v{{ site.currentduckdbversion }} {{ site.currentduckdbhash }} +Enter ".help" for usage hints. +Connected to a transient in-memory database. +Use ".open FILENAME" to reopen on a persistent database. +D +``` + +To open or create a [persistent database]({% link docs/archive/1.0/connect/overview.md %}#persistent-database), simply include a path as a command line argument: + +```bash +duckdb my_database.duckdb +``` + +### Running SQL Statements in the CLI + +Once the CLI has been opened, enter a SQL statement followed by a semicolon, then hit enter and it will be executed. Results will be displayed in a table in the terminal. If a semicolon is omitted, hitting enter will allow for multi-line SQL statements to be entered. + +```sql +SELECT 'quack' AS my_column; +``` + +| my_column | +|-----------| +| quack | + +The CLI supports all of DuckDB's rich [SQL syntax]({% link docs/archive/1.0/sql/introduction.md %}) including `SELECT`, `CREATE`, and `ALTER` statements. + +### Editor Features + +The CLI supports [autocompletion]({% link docs/archive/1.0/api/cli/autocomplete.md %}), and has sophisticated [editor features]({% link docs/archive/1.0/api/cli/editing.md %}) and [syntax highlighting]({% link docs/archive/1.0/api/cli/syntax_highlighting.md %}) on certain platforms. + +### Exiting the CLI + +To exit the CLI, press `Ctrl`+`D` if your platform supports it. Otherwise, press `Ctrl`+`C` or use the `.exit` command. If used a persistent database, DuckDB will automatically checkpoint (save the latest edits to disk) and close. This will remove the `.wal` file (the [write-ahead log](https://en.wikipedia.org/wiki/Write-ahead_logging)) and consolidate all of your data into the single-file database. + +### Dot Commands + +In addition to SQL syntax, special [dot commands]({% link docs/archive/1.0/api/cli/dot_commands.md %}) may be entered into the CLI client. To use one of these commands, begin the line with a period (`.`) immediately followed by the name of the command you wish to execute. Additional arguments to the command are entered, space separated, after the command. If an argument must contain a space, either single or double quotes may be used to wrap that parameter. Dot commands must be entered on a single line, and no whitespace may occur before the period. No semicolon is required at the end of the line. + +Frequently-used configurations can be stored in the file `~/.duckdbrc`, which will be loaded when starting the CLI client. 
See the [Configuring the CLI](#configuring-the-cli) section below for further information on these options. + +Below, we summarize a few important dot commands. To see all available commands, see the [dot commands page]({% link docs/archive/1.0/api/cli/dot_commands.md %}) or use the `.help` command. + +#### Opening Database Files + +In addition to connecting to a database when opening the CLI, a new database connection can be made by using the `.open` command. If no additional parameters are supplied, a new in-memory database connection is created. This database will not be persisted when the CLI connection is closed. + +```text +.open +``` + +The `.open` command optionally accepts several options, but the final parameter can be used to indicate a path to a persistent database (or where one should be created). The special string `:memory:` can also be used to open a temporary in-memory database. + +```text +.open persistent.duckdb +``` + +> Warning `.open` closes the current database. +> To keep the current database, while adding a new database, use the [`ATTACH` statement]({% link docs/archive/1.0/sql/statements/attach.md %}). + +One important option accepted by `.open` is the `--readonly` flag. This disallows any editing of the database. To open in read only mode, the database must already exist. This also means that a new in-memory database can't be opened in read only mode since in-memory databases are created upon connection. + +```text +.open --readonly preexisting.duckdb +``` + +#### Output Formats + +The `.mode` [dot command]({% link docs/archive/1.0/api/cli/dot_commands.md %}#mode) may be used to change the appearance of the tables returned in the terminal output. +These include the default `duckbox` mode, `csv` and `json` mode for ingestion by other tools, `markdown` and `latex` for documents, and `insert` mode for generating SQL statements. + +#### Writing Results to a File + +By default, the DuckDB CLI sends results to the terminal's standard output. However, this can be modified using either the `.output` or `.once` commands. +For details, see the documentation for the [output dot command]({% link docs/archive/1.0/api/cli/dot_commands.md %}#output-writing-results-to-a-file). + +#### Reading SQL from a File + +The DuckDB CLI can read both SQL commands and dot commands from an external file instead of the terminal using the `.read` command. This allows for a number of commands to be run in sequence and allows command sequences to be saved and reused. + +The `.read` command requires only one argument: the path to the file containing the SQL and/or commands to execute. After running the commands in the file, control will revert back to the terminal. Output from the execution of that file is governed by the same `.output` and `.once` commands that have been discussed previously. This allows the output to be displayed back to the terminal, as in the first example below, or out to another file, as in the second example. + +In this example, the file `select_example.sql` is located in the same directory as duckdb.exe and contains the following SQL statement: + +```sql +SELECT * +FROM generate_series(5); +``` + +To execute it from the CLI, the `.read` command is used. + +```text +.read select_example.sql +``` + +The output below is returned to the terminal by default. The formatting of the table can be adjusted using the `.output` or `.once` commands. 
+ +```text +| generate_series | +|----------------:| +| 0 | +| 1 | +| 2 | +| 3 | +| 4 | +| 5 | +``` + +Multiple commands, including both SQL and dot commands, can also be run in a single `.read` command. In this example, the file `write_markdown_to_file.sql` is located in the same directory as duckdb.exe and contains the following commands: + +```sql +.mode markdown +.output series.md +SELECT * +FROM generate_series(5); +``` + +To execute it from the CLI, the `.read` command is used as before. + +```text +.read write_markdown_to_file.sql +``` + +In this case, no output is returned to the terminal. Instead, the file `series.md` is created (or replaced if it already existed) with the markdown-formatted results shown here: + +```text +| generate_series | +|----------------:| +| 0 | +| 1 | +| 2 | +| 3 | +| 4 | +| 5 | +``` + + + +## Configuring the CLI + +Several dot commands can be used to configure the CLI. +On startup, the CLI reads and executes all commands in the file `~/.duckdbrc`, including dot commands and SQL statements. +This allows you to store the configuration state of the CLI. +You may also point to a different initialization file using the `-init`. + +### Setting a Custom Prompt + +As an example, a file in the same directory as the DuckDB CLI named `prompt.sql` will change the DuckDB prompt to be a duck head and run a SQL statement. +Note that the duck head is built with Unicode characters and does not work in all terminal environments (e.g., in Windows, unless running with WSL and using the Windows Terminal). + +```text +.prompt '⚫◗ ' +``` + +To invoke that file on initialization, use this command: + +```bash +duckdb -init prompt.sql +``` + +This outputs: + +```text +-- Loading resources from prompt.sql +v⟨version⟩ ⟨git hash⟩ +Enter ".help" for usage hints. +Connected to a transient in-memory database. +Use ".open FILENAME" to reopen on a persistent database. +⚫◗ +``` + +## Non-Interactive Usage + +To read/process a file and exit immediately, pipe the file contents in to `duckdb`: + +```bash +duckdb < select_example.sql +``` + +To execute a command with SQL text passed in directly from the command line, call `duckdb` with two arguments: the database location (or `:memory:`), and a string with the SQL statement to execute. + +```bash +duckdb :memory: "SELECT 42 AS the_answer" +``` + +## Loading Extensions + +To load extensions, use DuckDB's SQL `INSTALL` and `LOAD` commands as you would other SQL statements. + +```sql +INSTALL fts; +LOAD fts; +``` + +For details, see the [Extension docs]({% link docs/archive/1.0/extensions/overview.md %}). + +## Reading from stdin and Writing to stdout + +When in a Unix environment, it can be useful to pipe data between multiple commands. +DuckDB is able to read data from stdin as well as write to stdout using the file location of stdin (`/dev/stdin`) and stdout (`/dev/stdout`) within SQL commands, as pipes act very similarly to file handles. + +This command will create an example CSV: + +```sql +COPY (SELECT 42 AS woot UNION ALL SELECT 43 AS woot) TO 'test.csv' (HEADER); +``` + +First, read a file and pipe it to the `duckdb` CLI executable. As arguments to the DuckDB CLI, pass in the location of the database to open, in this case, an in-memory database, and a SQL command that utilizes `/dev/stdin` as a file location. + +```bash +cat test.csv | duckdb "SELECT * FROM read_csv('/dev/stdin')" +``` + +| woot | +|-----:| +| 42 | +| 43 | + +To write back to stdout, the copy command can be used with the `/dev/stdout` file location. 
+ +```bash +cat test.csv | \ + duckdb "COPY (SELECT * FROM read_csv('/dev/stdin')) TO '/dev/stdout' WITH (FORMAT 'csv', HEADER)" +``` + +```csv +woot +42 +43 +``` + +## Reading Environment Variables + +The `getenv` function can read environment variables. + +### Examples + +To retrieve the home directory's path from the `HOME` environment variable, use: + +```sql +SELECT getenv('HOME') AS home; +``` + +| home | +|------------------| +| /Users/user_name | + +The output of the `getenv` function can be used to set [configuration options]({% link docs/archive/1.0/configuration/overview.md %}). For example, to set the `NULL` order based on the environment variable `DEFAULT_NULL_ORDER`, use: + +```sql +SET default_null_order = getenv('DEFAULT_NULL_ORDER'); +``` + +### Restrictions for Reading Environment Variables + +The `getenv` function can only be run when the [`enable_external_access`]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference) is set to `true` (the default setting). +It is only available in the CLI client and is not supported in other DuckDB clients. + +## Prepared Statements + +The DuckDB CLI supports executing [prepared statements]({% link docs/archive/1.0/sql/query_syntax/prepared_statements.md %}) in addition to regular `SELECT` statements. +To create and execute a prepared statement in the CLI client, use the `PREPARE` clause and the `EXECUTE` statement. \ No newline at end of file diff --git a/docs/archive/1.0/api/cli/syntax_highlighting.md b/docs/archive/1.0/api/cli/syntax_highlighting.md new file mode 100644 index 00000000000..1cef30903d3 --- /dev/null +++ b/docs/archive/1.0/api/cli/syntax_highlighting.md @@ -0,0 +1,65 @@ +--- +layout: docu +title: Syntax Highlighting +--- + +> Syntax highlighting in the CLI is currently only available for macOS and Linux. + +SQL queries that are written in the shell are automatically highlighted using syntax highlighting. + +![Image showing syntax highlighting in the shell](/images/syntax_highlighting_screenshot.png) + +There are several components of a query that are highlighted in different colors. The colors can be configured using [dot commands]({% link docs/archive/1.0/api/cli/dot_commands.md %}). +Syntax highlighting can also be disabled entirely using the `.highlight off` command. + +Below is a list of components that can be configured. + +
+ +| Type | Command | Default Color | +|-------------------------|-------------|-----------------| +| Keywords | `.keyword` | `green` | +| Constants ad literals | `.constant` | `yellow` | +| Comments | `.comment` | `brightblack` | +| Errors | `.error` | `red` | +| Continuation | `.cont` | `brightblack` | +| Continuation (Selected) | `.cont_sel` | `green` | + +The components can be configured using either a supported color name (e.g., `.keyword red`), or by directly providing a terminal code to use for rendering (e.g., `.keywordcode \033[31m`). Below is a list of supported color names and their corresponding terminal codes. + +
+ +| Color | Terminal Code | +|---------------|---------------| +| red | `\033[31m` | +| green | `\033[32m` | +| yellow | `\033[33m` | +| blue | `\033[34m` | +| magenta | `\033[35m` | +| cyan | `\033[36m` | +| white | `\033[37m` | +| brightblack | `\033[90m` | +| brightred | `\033[91m` | +| brightgreen | `\033[92m` | +| brightyellow | `\033[93m` | +| brightblue | `\033[94m` | +| brightmagenta | `\033[95m` | +| brightcyan | `\033[96m` | +| brightwhite | `\033[97m` | + +For example, here is an alternative set of syntax highlighting colors: + +```text +.keyword brightred +.constant brightwhite +.comment cyan +.error yellow +.cont blue +.cont_sel brightblue +``` + +If you wish to start up the CLI with a different set of colors every time, you can place these commands in the `~/.duckdbrc` file that is loaded on start-up of the CLI. + +## Error Highlighting + +The shell has support for highlighting certain errors. In particular, mismatched brackets and unclosed quotes are highlighted in red (or another color if specified). This highlighting is automatically disabled for large queries. In addition, it can be disabled manually using the `.render_errors off` command. \ No newline at end of file diff --git a/docs/archive/1.0/api/cpp.md b/docs/archive/1.0/api/cpp.md new file mode 100644 index 00000000000..e219db1344e --- /dev/null +++ b/docs/archive/1.0/api/cpp.md @@ -0,0 +1,236 @@ +--- +layout: docu +title: C++ API +--- + +## Installation + +The DuckDB C++ API can be installed as part of the `libduckdb` packages. Please see the [installation page]({% link docs/archive/1.0/installation/index.html %}?environment=cplusplus) for details. + +## Basic API Usage + +DuckDB implements a custom C++ API. This is built around the abstractions of a database instance (`DuckDB` class), multiple `Connection`s to the database instance and `QueryResult` instances as the result of queries. The header file for the C++ API is `duckdb.hpp`. + +### Startup & Shutdown + +To use DuckDB, you must first initialize a `DuckDB` instance using its constructor. `DuckDB()` takes as parameter the database file to read and write from. The special value `nullptr` can be used to create an **in-memory database**. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the process). The second parameter to the `DuckDB` constructor is an optional `DBConfig` object. In `DBConfig`, you can set various database parameters, for example the read/write mode or memory limits. The `DuckDB` constructor may throw exceptions, for example if the database file is not usable. + +With the `DuckDB` instance, you can create one or many `Connection` instances using the `Connection()` constructor. While connections should be thread-safe, they will be locked during querying. It is therefore recommended that each thread uses its own connection if you are in a multithreaded environment. + +```cpp +DuckDB db(nullptr); +Connection con(db); +``` + +### Querying + +Connections expose the `Query()` method to send a SQL query string to DuckDB from C++. `Query()` fully materializes the query result as a `MaterializedQueryResult` in memory before returning at which point the query result can be consumed. There is also a streaming API for queries, see further below. 
+
+```cpp
+// create a table
+con.Query("CREATE TABLE integers (i INTEGER, j INTEGER)");
+
+// insert three rows into the table
+con.Query("INSERT INTO integers VALUES (3, 4), (5, 6), (7, NULL)");
+
+auto result = con.Query("SELECT * FROM integers");
+if (result->HasError()) {
+    cerr << result->GetError() << endl;
+} else {
+    cout << result->ToString() << endl;
+}
+```
+
+The `MaterializedQueryResult` instance first contains two fields that indicate whether the query was successful. `Query` will not throw exceptions under normal circumstances. Instead, invalid queries or other issues will lead to the `success` boolean field in the query result instance being set to `false`. In this case an error message may be available in `error` as a string. If successful, other fields are set: the type of statement that was just executed (e.g., `StatementType::INSERT_STATEMENT`) is contained in `statement_type`. The high-level (“Logical type”/“SQL type”) types of the result set columns are in `types`. The names of the result columns are in the `names` string vector. In case multiple result sets are returned, for example because the result set contained multiple statements, the result sets can be chained using the `next` field.
+
+DuckDB also supports prepared statements in the C++ API with the `Prepare()` method. This returns an instance of `PreparedStatement`. This instance can be used to execute the prepared statement with parameters. Below is an example:
+
+```cpp
+std::unique_ptr<PreparedStatement> prepare = con.Prepare("SELECT count(*) FROM a WHERE i = $1");
+std::unique_ptr<QueryResult> result = prepare->Execute(12);
+```
+
+> Warning Do **not** use prepared statements to insert large amounts of data into DuckDB. See [the data import documentation]({% link docs/archive/1.0/data/overview.md %}) for better options.
+
+### UDF API
+
+The UDF API allows the definition of user-defined functions. It is exposed in `duckdb::Connection` through the methods `CreateScalarFunction()`, `CreateVectorizedFunction()`, and their variants.
+These methods create UDFs in the temporary schema (`TEMP_SCHEMA`) of the owner connection, which is the only connection allowed to use and change them.
+
+#### CreateScalarFunction
+
+The user can code an ordinary scalar function and invoke `CreateScalarFunction()` to register it, and afterward use the UDF in a `SELECT` statement, for instance:
+
+```cpp
+bool bigger_than_four(int value) {
+    return value > 4;
+}
+
+connection.CreateScalarFunction<bool, int>("bigger_than_four", &bigger_than_four);
+
+connection.Query("SELECT bigger_than_four(i) FROM (VALUES(3), (5)) tbl(i)")->Print();
+```
+
+The `CreateScalarFunction()` methods automatically create vectorized scalar UDFs, so they are as efficient as built-in functions. There are two variants of this method interface, as follows:
+
+**1.**
+
+```cpp
+template<typename TR, typename... Args>
+void CreateScalarFunction(string name, TR (*udf_func)(Args...))
+```
+
+- template parameters:
+   - **TR** is the return type of the UDF function;
+   - **Args** are the arguments of the UDF function (this method supports functions with up to three arguments);
+- **name**: is the name to register the UDF function;
+- **udf_func**: is a pointer to the UDF function.
+
+This method automatically discovers from the template typenames the corresponding LogicalTypes:
+
+- `bool` → `LogicalType::BOOLEAN`
+- `int8_t` → `LogicalType::TINYINT`
+- `int16_t` → `LogicalType::SMALLINT`
+- `int32_t` → `LogicalType::INTEGER`
+- `int64_t` → `LogicalType::BIGINT`
+- `float` → `LogicalType::FLOAT`
+- `double` → `LogicalType::DOUBLE`
+- `string_t` → `LogicalType::VARCHAR`
+
+In DuckDB, some primitive types, e.g., `int32_t`, are mapped to the same `LogicalType` (`INTEGER`, `TIME` and `DATE`), so for disambiguation the user can use the following overloaded method.
+
+**2.**
+
+```cpp
+template<typename TR, typename... Args>
+void CreateScalarFunction(string name, vector<LogicalType> args, LogicalType ret_type, TR (*udf_func)(Args...))
+```
+
+An example of use would be:
+
+```cpp
+int32_t udf_date(int32_t a) {
+    return a;
+}
+
+con.Query("CREATE TABLE dates (d DATE)");
+con.Query("INSERT INTO dates VALUES ('1992-01-01')");
+
+con.CreateScalarFunction<int32_t, int32_t>("udf_date", {LogicalType::DATE}, LogicalType::DATE, &udf_date);
+
+con.Query("SELECT udf_date(d) FROM dates")->Print();
+```
+
+- template parameters:
+   - **TR** is the return type of the UDF function;
+   - **Args** are the arguments of the UDF function (this method supports functions with up to three arguments);
+- **name**: is the name to register the UDF function;
+- **args**: are the LogicalType arguments that the function uses, which should match the template Args types;
+- **ret_type**: is the LogicalType returned by the function, which should match the template TR type;
+- **udf_func**: is a pointer to the UDF function.
+
+This function checks the template types against the LogicalTypes passed as arguments, and they must match as follows:
+
+- LogicalTypeId::BOOLEAN → bool
+- LogicalTypeId::TINYINT → int8_t
+- LogicalTypeId::SMALLINT → int16_t
+- LogicalTypeId::DATE, LogicalTypeId::TIME, LogicalTypeId::INTEGER → int32_t
+- LogicalTypeId::BIGINT, LogicalTypeId::TIMESTAMP → int64_t
+- LogicalTypeId::FLOAT, LogicalTypeId::DOUBLE, LogicalTypeId::DECIMAL → double
+- LogicalTypeId::VARCHAR, LogicalTypeId::CHAR, LogicalTypeId::BLOB → string_t
+- LogicalTypeId::VARBINARY → blob_t
+
+#### CreateVectorizedFunction
+
+The `CreateVectorizedFunction()` methods register a vectorized UDF such as:
+
+```cpp
+/*
+* This vectorized function copies the input values to the result vector
+*/
+template<typename TYPE>
+static void udf_vectorized(DataChunk &args, ExpressionState &state, Vector &result) {
+    // set the result vector type
+    result.vector_type = VectorType::FLAT_VECTOR;
+    // get a raw array from the result
+    auto result_data = FlatVector::GetData<TYPE>(result);
+
+    // get the single input vector
+    auto &input = args.data[0];
+    // now get an orrified vector
+    VectorData vdata;
+    input.Orrify(args.size(), vdata);
+
+    // get a raw array from the orrified input
+    auto input_data = (TYPE *)vdata.data;
+
+    // handling the data
+    for (idx_t i = 0; i < args.size(); i++) {
+        auto idx = vdata.sel->get_index(i);
+        if ((*vdata.nullmask)[idx]) {
+            continue;
+        }
+        result_data[i] = input_data[idx];
+    }
+}
+
+con.Query("CREATE TABLE integers (i INTEGER)");
+con.Query("INSERT INTO integers VALUES (1), (2), (3), (999)");
+
+con.CreateVectorizedFunction<int, int>("udf_vectorized_int", &udf_vectorized<int>);
+
+con.Query("SELECT udf_vectorized_int(i) FROM integers")->Print();
+```
+
+The vectorized UDF is a pointer of the type _scalar_function_t_:
+
+```cpp
+typedef std::function<void(DataChunk &args, ExpressionState &expr, Vector &result)> scalar_function_t;
+```
+
+- **args** is a [DataChunk](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/common/types/data_chunk.hpp) that holds a set of input vectors for the UDF, all of which have the same length;
+- **expr** is an [ExpressionState](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/execution/expression_executor_state.hpp) that provides information about the query's expression state;
+- **result**: is a [Vector](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/common/types/vector.hpp) to store the result values.
+
+There are different vector types to handle in a vectorized UDF:
+- ConstantVector;
+- DictionaryVector;
+- FlatVector;
+- ListVector;
+- StringVector;
+- StructVector;
+- SequenceVector.
+
+The general API of the `CreateVectorizedFunction()` method is as follows:
+
+**1.**
+
+```cpp
+template<typename TR, typename... Args>
+void CreateVectorizedFunction(string name, scalar_function_t udf_func, LogicalType varargs = LogicalType::INVALID)
+```
+
+- template parameters:
+   - **TR** is the return type of the UDF function;
+   - **Args** are the arguments of the UDF function (up to three).
+- **name** is the name to register the UDF function;
+- **udf_func** is a _vectorized_ UDF function;
+- **varargs** is the type of varargs to support, or LogicalTypeId::INVALID (the default value) if the function does not accept variable length arguments.
+
+This method automatically discovers from the template typenames the corresponding LogicalTypes:
+
+- bool → LogicalType::BOOLEAN
+- int8_t → LogicalType::TINYINT
+- int16_t → LogicalType::SMALLINT
+- int32_t → LogicalType::INTEGER
+- int64_t → LogicalType::BIGINT
+- float → LogicalType::FLOAT
+- double → LogicalType::DOUBLE
+- string_t → LogicalType::VARCHAR
+
+**2.**
+
+```cpp
+template<typename TR, typename... Args>
+void CreateVectorizedFunction(string name, vector<LogicalType> args, LogicalType ret_type, scalar_function_t udf_func, LogicalType varargs = LogicalType::INVALID)
+```
\ No newline at end of file
diff --git a/docs/archive/1.0/api/go.md b/docs/archive/1.0/api/go.md
new file mode 100644
index 00000000000..26594c8b142
--- /dev/null
+++ b/docs/archive/1.0/api/go.md
@@ -0,0 +1,117 @@
+---
+github_repository: https://github.com/marcboeker/go-duckdb
+layout: docu
+title: Go
+---
+
+The DuckDB Go driver, `go-duckdb`, allows using DuckDB via the `database/sql` interface.
+For examples on how to use this interface, see the [official documentation](https://pkg.go.dev/database/sql) and [tutorial](https://go.dev/doc/tutorial/database-access).
+
+> The Go client is a third-party library and its repository is hosted at [github.com/marcboeker/go-duckdb](https://github.com/marcboeker/go-duckdb).
+
+## Installation
+
+To install the `go-duckdb` client, run:
+
+```bash
+go get github.com/marcboeker/go-duckdb
+```
+
+## Importing
+
+To import the DuckDB Go package, add the following entries to your imports:
+
+```go
+import (
+    "database/sql"
+    _ "github.com/marcboeker/go-duckdb"
+)
+```
+
+## Appender
+
+The DuckDB Go client supports the [DuckDB Appender API]({% link docs/archive/1.0/data/appender.md %}) for bulk inserts. You can obtain a new Appender by supplying a DuckDB connection to `NewAppenderFromConn()`. For example:
+
+```go
+connector, err := duckdb.NewConnector("test.db", nil)
+if err != nil {
+    ...
+}
+conn, err := connector.Connect(context.Background())
+if err != nil {
+    ...
+}
+defer conn.Close()
+
+// Retrieve appender from connection (note that you have to create the table 'test' beforehand).
+appender, err := NewAppenderFromConn(conn, "", "test")
+if err != nil {
+    ...
+}
+defer appender.Close()
+
+err = appender.AppendRow(...)
+if err != nil {
+    ...
+} + +// Optional, if you want to access the appended rows immediately. +err = appender.Flush() +if err != nil { + ... +} +``` + +## Examples + +### Simple Example + +An example for using the Go API is as follows: + +```go +package main + +import ( + "database/sql" + "errors" + "fmt" + "log" + + _ "github.com/marcboeker/go-duckdb" +) + +func main() { + db, err := sql.Open("duckdb", "") + if err != nil { + log.Fatal(err) + } + defer db.Close() + + _, err = db.Exec(`CREATE TABLE people (id INTEGER, name VARCHAR)`) + if err != nil { + log.Fatal(err) + } + _, err = db.Exec(`INSERT INTO people VALUES (42, 'John')`) + if err != nil { + log.Fatal(err) + } + + var ( + id int + name string + ) + row := db.QueryRow(`SELECT id, name FROM people`) + err = row.Scan(&id, &name) + if errors.Is(err, sql.ErrNoRows) { + log.Println("no rows") + } else if err != nil { + log.Fatal(err) + } + + fmt.Printf("id: %d, name: %s\n", id, name) +} +``` + +### More Examples + +For more examples, see the [examples in the `duckdb-go` repository](https://github.com/marcboeker/go-duckdb/tree/master/examples). \ No newline at end of file diff --git a/docs/archive/1.0/api/java.md b/docs/archive/1.0/api/java.md new file mode 100644 index 00000000000..999c26cc342 --- /dev/null +++ b/docs/archive/1.0/api/java.md @@ -0,0 +1,304 @@ +--- +github_repository: https://github.com/duckdb/duckdb-java +layout: docu +redirect_from: +- /docs/archive/1.0/api/scala +title: Java JDBC API +--- + +## Installation + +The DuckDB Java JDBC API can be installed from [Maven Central](https://search.maven.org/artifact/org.duckdb/duckdb_jdbc). Please see the [installation page]({% link docs/archive/1.0/installation/index.html %}?environment=java) for details. + +## Basic API Usage + +DuckDB's JDBC API implements the main parts of the standard Java Database Connectivity (JDBC) API, version 4.1. Describing JDBC is beyond the scope of this page, see the [official documentation](https://docs.oracle.com/javase/tutorial/jdbc/basics/index.html) for details. Below we focus on the DuckDB-specific parts. + +Refer to the externally hosted [API Reference](https://javadoc.io/doc/org.duckdb/duckdb_jdbc) for more information about our extensions to the JDBC specification, or the below [Arrow Methods](#arrow-methods). + +### Startup & Shutdown + +In JDBC, database connections are created through the standard `java.sql.DriverManager` class. +The driver should auto-register in the `DriverManager`, if that does not work for some reason, you can enforce registration using the following statement: + +```java +Class.forName("org.duckdb.DuckDBDriver"); +``` + +To create a DuckDB connection, call `DriverManager` with the `jdbc:duckdb:` JDBC URL prefix, like so: + +```java +import java.sql.Connection; +import java.sql.DriverManager; + +Connection conn = DriverManager.getConnection("jdbc:duckdb:"); +``` + +To use DuckDB-specific features such as the [Appender](#appender), cast the object to a `DuckDBConnection`: + +```java +import java.sql.DriverManager; +import org.duckdb.DuckDBConnection; + +DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:"); +``` + +When using the `jdbc:duckdb:` URL alone, an **in-memory database** is created. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the Java program). If you would like to access or create a persistent database, append its file name after the path. 
For example, if your database is stored in `/tmp/my_database`, use the JDBC URL `jdbc:duckdb:/tmp/my_database` to create a connection to it. + +It is possible to open a DuckDB database file in **read-only** mode. This is for example useful if multiple Java processes want to read the same database file at the same time. To open an existing database file in read-only mode, set the connection property `duckdb.read_only` like so: + +```java +Properties readOnlyProperty = new Properties(); +readOnlyProperty.setProperty("duckdb.read_only", "true"); +Connection conn = DriverManager.getConnection("jdbc:duckdb:/tmp/my_database", readOnlyProperty); +``` + +Additional connections can be created using the `DriverManager`. A more efficient mechanism is to call the `DuckDBConnection#duplicate()` method: + +```java +Connection conn2 = ((DuckDBConnection) conn).duplicate(); +``` + +Multiple connections are allowed, but mixing read-write and read-only connections is unsupported. + +### Configuring Connections + +Configuration options can be provided to change different settings of the database system. Note that many of these +settings can be changed later on using [`PRAGMA` statements]({% link docs/archive/1.0/configuration/pragmas.md %}) as well. + +```java +Properties connectionProperties = new Properties(); +connectionProperties.setProperty("temp_directory", "/path/to/temp/dir/"); +Connection conn = DriverManager.getConnection("jdbc:duckdb:/tmp/my_database", connectionProperties); +``` + +### Querying + +DuckDB supports the standard JDBC methods to send queries and retrieve result sets. First a `Statement` object has to be created from the `Connection`, this object can then be used to send queries using `execute` and `executeQuery`. `execute()` is meant for queries where no results are expected like `CREATE TABLE` or `UPDATE` etc. and `executeQuery()` is meant to be used for queries that produce results (e.g., `SELECT`). Below two examples. See also the JDBC [`Statement`](https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html) and [`ResultSet`](https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html) documentations. + +```java +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Statement; + +Connection conn = DriverManager.getConnection("jdbc:duckdb:"); + +// create a table +Statement stmt = conn.createStatement(); +stmt.execute("CREATE TABLE items (item VARCHAR, value DECIMAL(10, 2), count INTEGER)"); +// insert two items into the table +stmt.execute("INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)"); + +try (ResultSet rs = stmt.executeQuery("SELECT * FROM items")) { + while (rs.next()) { + System.out.println(rs.getString(1)); + System.out.println(rs.getInt(3)); + } +} +stmt.close(); +``` + +```text +jeans +1 +hammer +2 +``` + +DuckDB also supports prepared statements as per the JDBC API: + +```java +import java.sql.PreparedStatement; + +try (PreparedStatement stmt = conn.prepareStatement("INSERT INTO items VALUES (?, ?, ?);")) { + stmt.setString(1, "chainsaw"); + stmt.setDouble(2, 500.0); + stmt.setInt(3, 42); + stmt.execute(); + // more calls to execute() possible +} +``` + +> Warning Do *not* use prepared statements to insert large amounts of data into DuckDB. See [the data import documentation]({% link docs/archive/1.0/data/overview.md %}) for better options. 
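+
+Prepared statements can also be used for parameterized read queries. Below is a minimal sketch (reusing the `conn` connection and the `items` table created above; the threshold value is an arbitrary example):
+
+```java
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+
+try (PreparedStatement stmt = conn.prepareStatement("SELECT item, value FROM items WHERE value > ?")) {
+    // bind the parameter and run the query
+    stmt.setDouble(1, 30.0);
+    try (ResultSet rs = stmt.executeQuery()) {
+        while (rs.next()) {
+            System.out.println(rs.getString(1) + ": " + rs.getDouble(2));
+        }
+    }
+}
+```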
+ +### Arrow Methods + +Refer to the [API Reference](https://javadoc.io/doc/org.duckdb/duckdb_jdbc/latest/org/duckdb/DuckDBResultSet.html#arrowExportStream(java.lang.Object,long)) for type signatures + +#### Arrow Export + +The following demonstrates exporting an arrow stream and consuming it using the java arrow bindings + +```java +import org.apache.arrow.memory.RootAllocator; +import org.apache.arrow.vector.ipc.ArrowReader; +import org.duckdb.DuckDBResultSet; + +try (var conn = DriverManager.getConnection("jdbc:duckdb:"); + var stmt = conn.prepareStatement("SELECT * FROM generate_series(2000)"); + var resultset = (DuckDBResultSet) stmt.executeQuery(); + var allocator = new RootAllocator()) { + try (var reader = (ArrowReader) resultset.arrowExportStream(allocator, 256)) { + while (reader.loadNextBatch()) { + System.out.println(reader.getVectorSchemaRoot().getVector("generate_series")); + } + } + stmt.close(); +} +``` + +#### Arrow Import + +The following demonstrates consuming an Arrow stream from the Java Arrow bindings. + +```java +import org.apache.arrow.memory.RootAllocator; +import org.apache.arrow.vector.ipc.ArrowReader; +import org.duckdb.DuckDBConnection; + +// Arrow binding +try (var allocator = new RootAllocator(); + ArrowStreamReader reader = null; // should not be null of course + var arrow_array_stream = ArrowArrayStream.allocateNew(allocator)) { + Data.exportArrayStream(allocator, reader, arrow_array_stream); + + // DuckDB setup + try (var conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:")) { + conn.registerArrowStream("asdf", arrow_array_stream); + + // run a query + try (var stmt = conn.createStatement(); + var rs = (DuckDBResultSet) stmt.executeQuery("SELECT count(*) FROM asdf")) { + while (rs.next()) { + System.out.println(rs.getInt(1)); + } + } + } +} +``` + +### Streaming Results + +Result streaming is opt-in in the JDBC driver – by setting the `jdbc_stream_results` config to `true` before running a query. The easiest way do that is to pass it in the `Properties` object. + +```java +Properties props = new Properties(); +props.setProperty(DuckDBDriver.JDBC_STREAM_RESULTS, String.valueOf(true)); + +Connection conn = DriverManager.getConnection("jdbc:duckdb:", props); +``` + +### Appender + +The [Appender]({% link docs/archive/1.0/data/appender.md %}) is available in the DuckDB JDBC driver via the `org.duckdb.DuckDBAppender` class. +The constructor of the class requires the schema name and the table name it is applied to. +The Appender is flushed when the `close()` method is called. + +Example: + +```java +import java.sql.DriverManager; +import java.sql.Statement; +import org.duckdb.DuckDBConnection; + +DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:"); +Statement stmt = conn.createStatement(); +stmt.execute("CREATE TABLE tbl (x BIGINT, y FLOAT, s VARCHAR)"); + +// using try-with-resources to automatically close the appender at the end of the scope +try (var appender = conn.createAppender(DuckDBConnection.DEFAULT_SCHEMA, "tbl")) { + appender.beginRow(); + appender.append(10); + appender.append(3.2); + appender.append("hello"); + appender.endRow(); + appender.beginRow(); + appender.append(20); + appender.append(-8.1); + appender.append("world"); + appender.endRow(); +} +stmt.close(); +``` + +### Batch Writer + +The DuckDB JDBC driver offers batch write functionality. +The batch writer supports prepared statements to mitigate the overhead of query parsing. 
+ +> The preferred method for bulk inserts is to use the [Appender](#appender) due to its higher performance. +> However, when using the Appender is not possbile, the batch writer is available as alternative. + +#### Batch Writer with Prepared Statements + +```java +import java.sql.DriverManager; +import java.sql.PreparedStatement; +import org.duckdb.DuckDBConnection; + +DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:"); +PreparedStatement stmt = conn.prepareStatement("INSERT INTO test (x, y, z) VALUES (?, ?, ?);"); + +stmt.setObject(1, 1); +stmt.setObject(2, 2); +stmt.setObject(3, 3); +stmt.addBatch(); + +stmt.setObject(1, 4); +stmt.setObject(2, 5); +stmt.setObject(3, 6); +stmt.addBatch(); + +stmt.executeBatch(); +stmt.close(); +``` + +#### Batch Writer with Vanilla Statements + +The batch writer also supports vanilla SQL statements: + +```java +import java.sql.DriverManager; +import java.sql.Statement; +import org.duckdb.DuckDBConnection; + +DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:"); +Statement stmt = conn.createStatement(); + +stmt.execute("CREATE TABLE test (x INTEGER, y INTEGER, z INTEGER)"); + +stmt.addBatch("INSERT INTO test (x, y, z) VALUES (1, 2, 3);"); +stmt.addBatch("INSERT INTO test (x, y, z) VALUES (4, 5, 6);"); + +stmt.executeBatch(); +stmt.close(); +``` + +## Troubleshooting + +### Driver Class Not Found + +If the Java application is unable to find the DuckDB, it may throw the following error: + +```console +Exception in thread "main" java.sql.SQLException: No suitable driver found for jdbc:duckdb: + at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:706) + at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:252) + ... +``` + +And when trying to load the class manually, it may result in this error: + +```console +Exception in thread "main" java.lang.ClassNotFoundException: org.duckdb.DuckDBDriver + at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641) + at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) + at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520) + at java.base/java.lang.Class.forName0(Native Method) + at java.base/java.lang.Class.forName(Class.java:375) + ... +``` + +These errors stem from the DuckDB Maven/Gradle dependency not being detected. To ensure that it is detected, force refresh the Maven configuration in your IDE. \ No newline at end of file diff --git a/docs/archive/1.0/api/julia.md b/docs/archive/1.0/api/julia.md new file mode 100644 index 00000000000..bf4e59f550e --- /dev/null +++ b/docs/archive/1.0/api/julia.md @@ -0,0 +1,160 @@ +--- +layout: docu +title: Julia Package +--- + +The DuckDB Julia package provides a high-performance front-end for DuckDB. Much like SQLite, DuckDB runs in-process within the Julia client, and provides a DBInterface front-end. + +The package also supports multi-threaded execution. It uses Julia threads/tasks for this purpose. If you wish to run queries in parallel, you must launch Julia with multi-threading support (by e.g., setting the `JULIA_NUM_THREADS` environment variable). 
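+
+For example, either of the following should start Julia with four threads (the count of four is only an illustration; the `--threads` flag requires Julia 1.5 or later):
+
+```bash
+# either set the environment variable
+JULIA_NUM_THREADS=4 julia
+# or use the command-line flag
+julia --threads=4
+```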
+ +## Installation + +Install DuckDB as follows: + +```julia +using Pkg +Pkg.add("DuckDB") +``` + +Alternatively, enter the package manager using the `]` key, and issue the following command: + +```julia +pkg> add DuckDB +``` + +## Basics + +```julia +using DuckDB + +# create a new in-memory database +con = DBInterface.connect(DuckDB.DB, ":memory:") + +# create a table +DBInterface.execute(con, "CREATE TABLE integers (i INTEGER)") + +# insert data by executing a prepared statement +stmt = DBInterface.prepare(con, "INSERT INTO integers VALUES(?)") +DBInterface.execute(stmt, [42]) + +# query the database +results = DBInterface.execute(con, "SELECT 42 a") +print(results) +``` + +Some SQL statements, such as PIVOT and IMPORT DATABASE are executed as multiple prepared statements and will error when using `DuckDB.execute()`. Instead they can be run with `DuckDB.query()` instead of `DuckDB.execute()` and will always return a materialized result. + +## Scanning DataFrames + +The DuckDB Julia package also provides support for querying Julia DataFrames. Note that the DataFrames are directly read by DuckDB - they are not inserted or copied into the database itself. + +If you wish to load data from a DataFrame into a DuckDB table you can run a `CREATE TABLE ... AS` or `INSERT INTO` query. + +```julia +using DuckDB +using DataFrames + +# create a new in-memory dabase +con = DBInterface.connect(DuckDB.DB) + +# create a DataFrame +df = DataFrame(a = [1, 2, 3], b = [42, 84, 42]) + +# register it as a view in the database +DuckDB.register_data_frame(con, df, "my_df") + +# run a SQL query over the DataFrame +results = DBInterface.execute(con, "SELECT * FROM my_df") +print(results) +``` + +## Appender API + +The DuckDB Julia package also supports the [Appender API]({% link docs/archive/1.0/data/appender.md %}), which is much faster than using prepared statements or individual `INSERT INTO` statements. Appends are made in row-wise format. For every column, an `append()` call should be made, after which the row should be finished by calling `flush()`. After all rows have been appended, `close()` should be used to finalize the Appender and clean up the resulting memory. + +```julia +using DuckDB, DataFrames, Dates +db = DuckDB.DB() +# create a table +DBInterface.execute(db, + "CREATE OR REPLACE TABLE data(id INTEGER PRIMARY KEY, value FLOAT, timestamp TIMESTAMP, date DATE)") +# create data to insert +len = 100 +df = DataFrames.DataFrame( + id = collect(1:len), + value = rand(len), + timestamp = Dates.now() + Dates.Second.(1:len), + date = Dates.today() + Dates.Day.(1:len) + ) +# append data by row +appender = DuckDB.Appender(db, "data") +for i in eachrow(df) + for j in i + DuckDB.append(appender, j) + end + DuckDB.end_row(appender) +end +# close the appender after all rows +DuckDB.close(appender) +``` + +## Concurrency + +Within a Julia process, tasks are able to concurrently read and write to the database, as long as each task maintains its own connection to the database. In the example below, a single task is spawned to periodically read the database and many tasks are spawned to write to the database using both [`INSERT` statements]({% link docs/archive/1.0/sql/statements/insert.md %}) as well as the [Appender API]({% link docs/archive/1.0/data/appender.md %}). 
+ +```julia +using Dates, DataFrames, DuckDB +db = DuckDB.DB() +DBInterface.connect(db) +DBInterface.execute(db, "CREATE OR REPLACE TABLE data (date TIMESTAMP, id INTEGER)") + +function run_reader(db) + # create a DuckDB connection specifically for this task + conn = DBInterface.connect(db) + while true + println(DBInterface.execute(conn, + "SELECT id, count(date) AS count, max(date) AS max_date + FROM data GROUP BY id ORDER BY id") |> DataFrames.DataFrame) + Threads.sleep(1) + end + DBInterface.close(conn) +end +# spawn one reader task +Threads.@spawn run_reader(db) + +function run_inserter(db, id) + # create a DuckDB connection specifically for this task + conn = DBInterface.connect(db) + for i in 1:1000 + Threads.sleep(0.01) + DuckDB.execute(conn, "INSERT INTO data VALUES (current_timestamp, ?)"; id); + end + DBInterface.close(conn) +end +# spawn many insert tasks +for i in 1:100 + Threads.@spawn run_inserter(db, 1) +end + +function run_appender(db, id) + # create a DuckDB connection specifically for this task + appender = DuckDB.Appender(db, "data") + for i in 1:1000 + Threads.sleep(0.01) + row = (Dates.now(Dates.UTC), id) + for j in row + DuckDB.append(appender, j); + end + DuckDB.end_row(appender); + end + DuckDB.close(appender); +end +# spawn many appender tasks +for i in 1:100 + Threads.@spawn run_appender(db, 2) +end +``` + +## Original Julia Connector + +Credits to kimmolinna for the [original DuckDB Julia connector](https://github.com/kimmolinna/DuckDB.jl). \ No newline at end of file diff --git a/docs/archive/1.0/api/nodejs/overview.md b/docs/archive/1.0/api/nodejs/overview.md new file mode 100644 index 00000000000..3d1dad54ab8 --- /dev/null +++ b/docs/archive/1.0/api/nodejs/overview.md @@ -0,0 +1,176 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/api/nodejs +- /docs/archive/1.0/api/nodejs/ +title: Node.js API +--- + +This package provides a Node.js API for DuckDB. +The API for this client is somewhat compliant to the SQLite Node.js client for easier transition. + +For TypeScript wrappers, see the [duckdb-async project](https://www.npmjs.com/package/duckdb-async). + +## Initializing + +Load the package and create a database object: + +```js +const duckdb = require('duckdb'); +const db = new duckdb.Database(':memory:'); // or a file name for a persistent DB +``` + +All options as described on [Database configuration]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference) can be (optionally) supplied to the `Database` constructor as second argument. The third argument can be optionally supplied to get feedback on the given options. + +```js +const db = new duckdb.Database(':memory:', { + "access_mode": "READ_WRITE", + "max_memory": "512MB", + "threads": "4" +}, (err) => { + if (err) { + console.error(err); + } +}); +``` + +## Running a Query + +The following code snippet runs a simple query using the `Database.all()` method. + +```js +db.all('SELECT 42 AS fortytwo', function(err, res) { + if (err) { + console.warn(err); + return; + } + console.log(res[0].fortytwo) +}); +``` + +Other available methods are `each`, where the callback is invoked for each row, `run` to execute a single statement without results and `exec`, which can execute several SQL commands at once but also does not return results. All those commands can work with prepared statements, taking the values for the parameters as additional arguments. 
For example like so: + +```js +db.all('SELECT ?::INTEGER AS fortytwo, ?::VARCHAR AS hello', 42, 'Hello, World', function(err, res) { + if (err) { + console.warn(err); + return; + } + console.log(res[0].fortytwo) + console.log(res[0].hello) +}); +``` + +## Connections + +A database can have multiple `Connection`s, those are created using `db.connect()`. + +```js +const con = db.connect(); +``` + +You can create multiple connections, each with their own transaction context. + +`Connection` objects also contain shorthands to directly call `run()`, `all()` and `each()` with parameters and callbacks, respectively, for example: + +```js +con.all('SELECT 42 AS fortytwo', function(err, res) { + if (err) { + console.warn(err); + return; + } + console.log(res[0].fortytwo) +}); +``` + +## Prepared Statements + +From connections, you can create prepared statements (and only that) using `con.prepare()`: + +```js +const stmt = con.prepare('SELECT ?::INTEGER AS fortytwo'); +``` + +To execute this statement, you can call for example `all()` on the `stmt` object: + +```js +stmt.all(42, function(err, res) { + if (err) { + console.warn(err); + } else { + console.log(res[0].fortytwo) + } +}); +``` + +You can also execute the prepared statement multiple times. This is for example useful to fill a table with data: + +```js +con.run('CREATE TABLE a (i INTEGER)'); +const stmt = con.prepare('INSERT INTO a VALUES (?)'); +for (let i = 0; i < 10; i++) { + stmt.run(i); +} +stmt.finalize(); +con.all('SELECT * FROM a', function(err, res) { + if (err) { + console.warn(err); + } else { + console.log(res) + } +}); +``` + +`prepare()` can also take a callback which gets the prepared statement as an argument: + +```js +const stmt = con.prepare('SELECT ?::INTEGER AS fortytwo', function(err, stmt) { + stmt.all(42, function(err, res) { + if (err) { + console.warn(err); + } else { + console.log(res[0].fortytwo) + } + }); +}); +``` + +## Inserting Data via Arrow + +[Apache Arrow]({% link docs/archive/1.0/guides/python/sql_on_arrow.md %}) can be used to insert data into DuckDB without making a copy: + +```js +const arrow = require('apache-arrow'); +const db = new duckdb.Database(':memory:'); + +const jsonData = [ + {"userId":1,"id":1,"title":"delectus aut autem","completed":false}, + {"userId":1,"id":2,"title":"quis ut nam facilis et officia qui","completed":false} +]; + +// note; doesn't work on Windows yet +db.exec(`INSTALL arrow; LOAD arrow;`, (err) => { + if (err) { + console.warn(err); + return; + } + + const arrowTable = arrow.tableFromJSON(jsonData); + db.register_buffer("jsonDataTable", [arrow.tableToIPC(arrowTable)], true, (err, res) => { + if (err) { + console.warn(err); + return; + } + + // `SELECT * FROM jsonDataTable` would return the entries in `jsonData` + }); +}); +``` + +## Loading Unsigned Extensions + +To load [unsigned extensions]({% link docs/archive/1.0/extensions/overview.md %}#ensuring-the-integrity-of-extensions), instantiate the database as follows: + +```js +db = new duckdb.Database(':memory:', {"allow_unsigned_extensions": "true"}); +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/nodejs/reference.md b/docs/archive/1.0/api/nodejs/reference.md new file mode 100644 index 00000000000..c640b19d592 --- /dev/null +++ b/docs/archive/1.0/api/nodejs/reference.md @@ -0,0 +1,890 @@ +--- +layout: docu +title: Node.js API +--- + +## Modules + +
+<dl>
+<dt><a href="#module_duckdb">duckdb</a></dt>
+<dd></dd>
+</dl>
+ +## Typedefs + +
+<dl>
+<dt><a href="#ColumnInfo">ColumnInfo</a> : <code>object</code></dt>
+<dd></dd>
+<dt><a href="#TypeInfo">TypeInfo</a> : <code>object</code></dt>
+<dd></dd>
+<dt><a href="#DuckDbError">DuckDbError</a> : <code>object</code></dt>
+<dd></dd>
+<dt><a href="#HTTPError">HTTPError</a> : <code>object</code></dt>
+<dd></dd>
+</dl>
+ + + +## duckdb + +**Summary**: DuckDB is an embeddable SQL OLAP Database Management System + +* [duckdb](#module_duckdb) + * [~Connection](#module_duckdb..Connection) + * [.run(sql, ...params, callback)](#module_duckdb..Connection+run) ⇒ void + * [.all(sql, ...params, callback)](#module_duckdb..Connection+all) ⇒ void + * [.arrowIPCAll(sql, ...params, callback)](#module_duckdb..Connection+arrowIPCAll) ⇒ void + * [.arrowIPCStream(sql, ...params, callback)](#module_duckdb..Connection+arrowIPCStream) ⇒ + * [.each(sql, ...params, callback)](#module_duckdb..Connection+each) ⇒ void + * [.stream(sql, ...params)](#module_duckdb..Connection+stream) + * [.register_udf(name, return_type, fun)](#module_duckdb..Connection+register_udf) ⇒ void + * [.prepare(sql, ...params, callback)](#module_duckdb..Connection+prepare) ⇒ Statement + * [.exec(sql, ...params, callback)](#module_duckdb..Connection+exec) ⇒ void + * [.register_udf_bulk(name, return_type, callback)](#module_duckdb..Connection+register_udf_bulk) ⇒ void + * [.unregister_udf(name, return_type, callback)](#module_duckdb..Connection+unregister_udf) ⇒ void + * [.register_buffer(name, array, force, callback)](#module_duckdb..Connection+register_buffer) ⇒ void + * [.unregister_buffer(name, callback)](#module_duckdb..Connection+unregister_buffer) ⇒ void + * [.close(callback)](#module_duckdb..Connection+close) ⇒ void + * [~Statement](#module_duckdb..Statement) + * [.sql](#module_duckdb..Statement+sql) ⇒ + * [.get()](#module_duckdb..Statement+get) + * [.run(sql, ...params, callback)](#module_duckdb..Statement+run) ⇒ void + * [.all(sql, ...params, callback)](#module_duckdb..Statement+all) ⇒ void + * [.arrowIPCAll(sql, ...params, callback)](#module_duckdb..Statement+arrowIPCAll) ⇒ void + * [.each(sql, ...params, callback)](#module_duckdb..Statement+each) ⇒ void + * [.finalize(sql, ...params, callback)](#module_duckdb..Statement+finalize) ⇒ void + * [.stream(sql, ...params)](#module_duckdb..Statement+stream) + * [.columns()](#module_duckdb..Statement+columns) ⇒ [Array.<ColumnInfo>](#ColumnInfo) + * [~QueryResult](#module_duckdb..QueryResult) + * [.nextChunk()](#module_duckdb..QueryResult+nextChunk) ⇒ + * [.nextIpcBuffer()](#module_duckdb..QueryResult+nextIpcBuffer) ⇒ + * [.asyncIterator()](#module_duckdb..QueryResult+asyncIterator) + * [~Database](#module_duckdb..Database) + * [.close(callback)](#module_duckdb..Database+close) ⇒ void + * [.close_internal(callback)](#module_duckdb..Database+close_internal) ⇒ void + * [.wait(callback)](#module_duckdb..Database+wait) ⇒ void + * [.serialize(callback)](#module_duckdb..Database+serialize) ⇒ void + * [.parallelize(callback)](#module_duckdb..Database+parallelize) ⇒ void + * [.connect(path)](#module_duckdb..Database+connect) ⇒ Connection + * [.interrupt(callback)](#module_duckdb..Database+interrupt) ⇒ void + * [.prepare(sql)](#module_duckdb..Database+prepare) ⇒ Statement + * [.run(sql, ...params, callback)](#module_duckdb..Database+run) ⇒ void + * [.scanArrowIpc(sql, ...params, callback)](#module_duckdb..Database+scanArrowIpc) ⇒ void + * [.each(sql, ...params, callback)](#module_duckdb..Database+each) ⇒ void + * [.stream(sql, ...params)](#module_duckdb..Database+stream) + * [.all(sql, ...params, callback)](#module_duckdb..Database+all) ⇒ void + * [.arrowIPCAll(sql, ...params, callback)](#module_duckdb..Database+arrowIPCAll) ⇒ void + * [.arrowIPCStream(sql, ...params, callback)](#module_duckdb..Database+arrowIPCStream) ⇒ void + * [.exec(sql, ...params, callback)](#module_duckdb..Database+exec) ⇒ void + * 
[.register_udf(name, return_type, fun)](#module_duckdb..Database+register_udf) ⇒ this + * [.register_buffer(name)](#module_duckdb..Database+register_buffer) ⇒ this + * [.unregister_buffer(name)](#module_duckdb..Database+unregister_buffer) ⇒ this + * [.unregister_udf(name)](#module_duckdb..Database+unregister_udf) ⇒ this + * [.registerReplacementScan(fun)](#module_duckdb..Database+registerReplacementScan) ⇒ this + * [.tokenize(text)](#module_duckdb..Database+tokenize) ⇒ ScriptTokens + * [.get()](#module_duckdb..Database+get) + * [~TokenType](#module_duckdb..TokenType) + * [~ERROR](#module_duckdb..ERROR) : number + * [~OPEN_READONLY](#module_duckdb..OPEN_READONLY) : number + * [~OPEN_READWRITE](#module_duckdb..OPEN_READWRITE) : number + * [~OPEN_CREATE](#module_duckdb..OPEN_CREATE) : number + * [~OPEN_FULLMUTEX](#module_duckdb..OPEN_FULLMUTEX) : number + * [~OPEN_SHAREDCACHE](#module_duckdb..OPEN_SHAREDCACHE) : number + * [~OPEN_PRIVATECACHE](#module_duckdb..OPEN_PRIVATECACHE) : number + + + +### duckdb~Connection + +**Kind**: inner class of [duckdb](#module_duckdb) + +* [~Connection](#module_duckdb..Connection) + * [.run(sql, ...params, callback)](#module_duckdb..Connection+run) ⇒ void + * [.all(sql, ...params, callback)](#module_duckdb..Connection+all) ⇒ void + * [.arrowIPCAll(sql, ...params, callback)](#module_duckdb..Connection+arrowIPCAll) ⇒ void + * [.arrowIPCStream(sql, ...params, callback)](#module_duckdb..Connection+arrowIPCStream) ⇒ + * [.each(sql, ...params, callback)](#module_duckdb..Connection+each) ⇒ void + * [.stream(sql, ...params)](#module_duckdb..Connection+stream) + * [.register_udf(name, return_type, fun)](#module_duckdb..Connection+register_udf) ⇒ void + * [.prepare(sql, ...params, callback)](#module_duckdb..Connection+prepare) ⇒ Statement + * [.exec(sql, ...params, callback)](#module_duckdb..Connection+exec) ⇒ void + * [.register_udf_bulk(name, return_type, callback)](#module_duckdb..Connection+register_udf_bulk) ⇒ void + * [.unregister_udf(name, return_type, callback)](#module_duckdb..Connection+unregister_udf) ⇒ void + * [.register_buffer(name, array, force, callback)](#module_duckdb..Connection+register_buffer) ⇒ void + * [.unregister_buffer(name, callback)](#module_duckdb..Connection+unregister_buffer) ⇒ void + * [.close(callback)](#module_duckdb..Connection+close) ⇒ void + + + +#### connection.run(sql, ...params, callback) ⇒ void + +Run a SQL statement and trigger a callback when done + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### connection.all(sql, ...params, callback) ⇒ void + +Run a SQL query and triggers the callback once for all result rows + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### connection.arrowIPCAll(sql, ...params, callback) ⇒ void + +Run a SQL query and serialize the result into the Apache Arrow IPC format (requires arrow extension to be loaded) + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### connection.arrowIPCStream(sql, ...params, callback) ⇒ + +Run a SQL query, returns a IpcResultStreamIterator that allows streaming the result into the Apache Arrow IPC format +(requires arrow extension to be loaded) + +**Kind**: instance method of [Connection](#module_duckdb..Connection) +**Returns**: Promise 
+ +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### connection.each(sql, ...params, callback) ⇒ void + +Runs a SQL query and triggers the callback for each result row + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### connection.stream(sql, ...params) + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | + + + +#### connection.register\_udf(name, return_type, fun) ⇒ void + +Register a User Defined Function + +**Kind**: instance method of [Connection](#module_duckdb..Connection) +**Note**: this follows the wasm udfs somewhat but is simpler because we can pass data much more cleanly + +| Param | +| --- | +| name | +| return_type | +| fun | + + + +#### connection.prepare(sql, ...params, callback) ⇒ Statement + +Prepare a SQL query for execution + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### connection.exec(sql, ...params, callback) ⇒ void + +Execute a SQL query + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### connection.register\_udf\_bulk(name, return_type, callback) ⇒ void + +Register a User Defined Function + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | +| --- | +| name | +| return_type | +| callback | + + + +#### connection.unregister\_udf(name, return_type, callback) ⇒ void + +Unregister a User Defined Function + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | +| --- | +| name | +| return_type | +| callback | + + + +#### connection.register\_buffer(name, array, force, callback) ⇒ void + +Register a Buffer to be scanned using the Apache Arrow IPC scanner +(requires arrow extension to be loaded) + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | +| --- | +| name | +| array | +| force | +| callback | + + + +#### connection.unregister\_buffer(name, callback) ⇒ void + +Unregister the Buffer + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | +| --- | +| name | +| callback | + + + +#### connection.close(callback) ⇒ void + +Closes connection + +**Kind**: instance method of [Connection](#module_duckdb..Connection) + +| Param | +| --- | +| callback | + + + +### duckdb~Statement + +**Kind**: inner class of [duckdb](#module_duckdb) + +* [~Statement](#module_duckdb..Statement) + * [.sql](#module_duckdb..Statement+sql) ⇒ + * [.get()](#module_duckdb..Statement+get) + * [.run(sql, ...params, callback)](#module_duckdb..Statement+run) ⇒ void + * [.all(sql, ...params, callback)](#module_duckdb..Statement+all) ⇒ void + * [.arrowIPCAll(sql, ...params, callback)](#module_duckdb..Statement+arrowIPCAll) ⇒ void + * [.each(sql, ...params, callback)](#module_duckdb..Statement+each) ⇒ void + * [.finalize(sql, ...params, callback)](#module_duckdb..Statement+finalize) ⇒ void + * [.stream(sql, ...params)](#module_duckdb..Statement+stream) + * [.columns()](#module_duckdb..Statement+columns) ⇒ [Array.<ColumnInfo>](#ColumnInfo) + + + +#### statement.sql ⇒ + +**Kind**: instance property of [Statement](#module_duckdb..Statement) +**Returns**: sql contained in statement +**Field**: + + +#### statement.get() + 
+Not implemented + +**Kind**: instance method of [Statement](#module_duckdb..Statement) + + +#### statement.run(sql, ...params, callback) ⇒ void + +**Kind**: instance method of [Statement](#module_duckdb..Statement) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### statement.all(sql, ...params, callback) ⇒ void + +**Kind**: instance method of [Statement](#module_duckdb..Statement) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### statement.arrowIPCAll(sql, ...params, callback) ⇒ void + +**Kind**: instance method of [Statement](#module_duckdb..Statement) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### statement.each(sql, ...params, callback) ⇒ void + +**Kind**: instance method of [Statement](#module_duckdb..Statement) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### statement.finalize(sql, ...params, callback) ⇒ void + +**Kind**: instance method of [Statement](#module_duckdb..Statement) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### statement.stream(sql, ...params) + +**Kind**: instance method of [Statement](#module_duckdb..Statement) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | + + + +#### statement.columns() ⇒ [Array.<ColumnInfo>](#ColumnInfo) + +**Kind**: instance method of [Statement](#module_duckdb..Statement) +**Returns**: [Array.<ColumnInfo>](#ColumnInfo) - - Array of column names and types + + +### duckdb~QueryResult + +**Kind**: inner class of [duckdb](#module_duckdb) + +* [~QueryResult](#module_duckdb..QueryResult) + * [.nextChunk()](#module_duckdb..QueryResult+nextChunk) ⇒ + * [.nextIpcBuffer()](#module_duckdb..QueryResult+nextIpcBuffer) ⇒ + * [.asyncIterator()](#module_duckdb..QueryResult+asyncIterator) + + + +#### queryResult.nextChunk() ⇒ + +**Kind**: instance method of [QueryResult](#module_duckdb..QueryResult) +**Returns**: data chunk + + +#### queryResult.nextIpcBuffer() ⇒ + +Function to fetch the next result blob of an Arrow IPC Stream in a zero-copy way. 
+(requires arrow extension to be loaded) + +**Kind**: instance method of [QueryResult](#module_duckdb..QueryResult) +**Returns**: data chunk + + +#### queryResult.asyncIterator() + +**Kind**: instance method of [QueryResult](#module_duckdb..QueryResult) + + +### duckdb~Database + +Main database interface + +**Kind**: inner property of [duckdb](#module_duckdb) + +| Param | Description | +| --- | --- | +| path | path to database file or :memory: for in-memory database | +| access_mode | access mode | +| config | the configuration object | +| callback | callback function | + + +* [~Database](#module_duckdb..Database) + * [.close(callback)](#module_duckdb..Database+close) ⇒ void + * [.close_internal(callback)](#module_duckdb..Database+close_internal) ⇒ void + * [.wait(callback)](#module_duckdb..Database+wait) ⇒ void + * [.serialize(callback)](#module_duckdb..Database+serialize) ⇒ void + * [.parallelize(callback)](#module_duckdb..Database+parallelize) ⇒ void + * [.connect(path)](#module_duckdb..Database+connect) ⇒ Connection + * [.interrupt(callback)](#module_duckdb..Database+interrupt) ⇒ void + * [.prepare(sql)](#module_duckdb..Database+prepare) ⇒ Statement + * [.run(sql, ...params, callback)](#module_duckdb..Database+run) ⇒ void + * [.scanArrowIpc(sql, ...params, callback)](#module_duckdb..Database+scanArrowIpc) ⇒ void + * [.each(sql, ...params, callback)](#module_duckdb..Database+each) ⇒ void + * [.stream(sql, ...params)](#module_duckdb..Database+stream) + * [.all(sql, ...params, callback)](#module_duckdb..Database+all) ⇒ void + * [.arrowIPCAll(sql, ...params, callback)](#module_duckdb..Database+arrowIPCAll) ⇒ void + * [.arrowIPCStream(sql, ...params, callback)](#module_duckdb..Database+arrowIPCStream) ⇒ void + * [.exec(sql, ...params, callback)](#module_duckdb..Database+exec) ⇒ void + * [.register_udf(name, return_type, fun)](#module_duckdb..Database+register_udf) ⇒ this + * [.register_buffer(name)](#module_duckdb..Database+register_buffer) ⇒ this + * [.unregister_buffer(name)](#module_duckdb..Database+unregister_buffer) ⇒ this + * [.unregister_udf(name)](#module_duckdb..Database+unregister_udf) ⇒ this + * [.registerReplacementScan(fun)](#module_duckdb..Database+registerReplacementScan) ⇒ this + * [.tokenize(text)](#module_duckdb..Database+tokenize) ⇒ ScriptTokens + * [.get()](#module_duckdb..Database+get) + + + +#### database.close(callback) ⇒ void + +Closes database instance + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| callback | + + + +#### database.close\_internal(callback) ⇒ void + +Internal method. Do not use, call Connection#close instead + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| callback | + + + +#### database.wait(callback) ⇒ void + +Triggers callback when all scheduled database tasks have completed. + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| callback | + + + +#### database.serialize(callback) ⇒ void + +Currently a no-op. Provided for SQLite compatibility + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| callback | + + + +#### database.parallelize(callback) ⇒ void + +Currently a no-op. 
Provided for SQLite compatibility + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| callback | + + + +#### database.connect(path) ⇒ Connection + +Create a new database connection + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Description | +| --- | --- | +| path | the database to connect to, either a file path, or `:memory:` | + + + +#### database.interrupt(callback) ⇒ void + +Supposedly interrupt queries, but currently does not do anything. + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| callback | + + + +#### database.prepare(sql) ⇒ Statement + +Prepare a SQL query for execution + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| sql | + + + +#### database.run(sql, ...params, callback) ⇒ void + +Convenience method for Connection#run using a built-in default connection + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### database.scanArrowIpc(sql, ...params, callback) ⇒ void + +Convenience method for Connection#scanArrowIpc using a built-in default connection + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### database.each(sql, ...params, callback) ⇒ void + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### database.stream(sql, ...params) + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | + + + +#### database.all(sql, ...params, callback) ⇒ void + +Convenience method for Connection#apply using a built-in default connection + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### database.arrowIPCAll(sql, ...params, callback) ⇒ void + +Convenience method for Connection#arrowIPCAll using a built-in default connection + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### database.arrowIPCStream(sql, ...params, callback) ⇒ void + +Convenience method for Connection#arrowIPCStream using a built-in default connection + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### database.exec(sql, ...params, callback) ⇒ void + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Type | +| --- | --- | +| sql | | +| ...params | \* | +| callback | | + + + +#### database.register\_udf(name, return_type, fun) ⇒ this + +Register a User Defined Function + +Convenience method for Connection#register_udf + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| name | +| return_type | +| fun | + + + +#### database.register\_buffer(name) ⇒ this + +Register a buffer containing serialized data to be scanned from DuckDB. 
+ +Convenience method for Connection#unregister_buffer + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| name | + + + +#### database.unregister\_buffer(name) ⇒ this + +Unregister a Buffer + +Convenience method for Connection#unregister_buffer + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| name | + + + +#### database.unregister\_udf(name) ⇒ this + +Unregister a UDF + +Convenience method for Connection#unregister_udf + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| name | + + + +#### database.registerReplacementScan(fun) ⇒ this + +Register a table replace scan function + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | Description | +| --- | --- | +| fun | Replacement scan function | + + + +#### database.tokenize(text) ⇒ ScriptTokens + +Return positions and types of tokens in given text + +**Kind**: instance method of [Database](#module_duckdb..Database) + +| Param | +| --- | +| text | + + + +#### database.get() + +Not implemented + +**Kind**: instance method of [Database](#module_duckdb..Database) + + +### duckdb~TokenType + +Types of tokens return by `tokenize`. + +**Kind**: inner property of [duckdb](#module_duckdb) + + +### duckdb~ERROR : number + +Check that errno attribute equals this to check for a duckdb error + +**Kind**: inner constant of [duckdb](#module_duckdb) + + +### duckdb~OPEN\_READONLY : number + +Open database in readonly mode + +**Kind**: inner constant of [duckdb](#module_duckdb) + + +### duckdb~OPEN\_READWRITE : number + +Currently ignored + +**Kind**: inner constant of [duckdb](#module_duckdb) + + +### duckdb~OPEN\_CREATE : number + +Currently ignored + +**Kind**: inner constant of [duckdb](#module_duckdb) + + +### duckdb~OPEN\_FULLMUTEX : number + +Currently ignored + +**Kind**: inner constant of [duckdb](#module_duckdb) + + +### duckdb~OPEN\_SHAREDCACHE : number + +Currently ignored + +**Kind**: inner constant of [duckdb](#module_duckdb) + + +### duckdb~OPEN\_PRIVATECACHE : number + +Currently ignored + +**Kind**: inner constant of [duckdb](#module_duckdb) + + +## ColumnInfo : object + +**Kind**: global typedef +**Properties** + +| Name | Type | Description | +| --- | --- | --- | +| name | string | Column name | +| type | [TypeInfo](#TypeInfo) | Column type | + + + +## TypeInfo : object + +**Kind**: global typedef +**Properties** + +| Name | Type | Description | +| --- | --- | --- | +| id | string | Type ID | +| [alias] | string | SQL type alias | +| sql_type | string | SQL type name | + + + +## DuckDbError : object + +**Kind**: global typedef +**Properties** + +| Name | Type | Description | +| --- | --- | --- | +| errno | number | -1 for DuckDB errors | +| message | string | Error message | +| code | string | 'DUCKDB_NODEJS_ERROR' for DuckDB errors | +| errorType | string | DuckDB error type code (eg, HTTP, IO, Catalog) | + + + +## HTTPError : object + +**Kind**: global typedef +**Extends**: [DuckDbError](#DuckDbError) +**Properties** + +| Name | Type | Description | +| --- | --- | --- | +| statusCode | number | HTTP response status code | +| reason | string | HTTP response reason | +| response | string | HTTP response body | +| headers | object | HTTP headers | \ No newline at end of file diff --git a/docs/archive/1.0/api/odbc/configuration.md b/docs/archive/1.0/api/odbc/configuration.md new file mode 100644 index 00000000000..be746d24f95 --- /dev/null +++ b/docs/archive/1.0/api/odbc/configuration.md 
@@ -0,0 +1,54 @@ +--- +github_repository: https://github.com/duckdb/duckdb-odbc +layout: docu +title: ODBC Configuration +--- + +This page documents the files using the ODBC configuration, [`odbc.ini`](#odbcini-and-odbcini) and [`odbcinst.ini`](#odbcinstini-and-odbcinstini). +These are either placed in the home directory as dotfiles (`.odbc.ini` and `.odbcinst.ini`, respectively) or in a system directory. +For platform-specific details, see the pages for [Linux]({% link docs/archive/1.0/api/odbc/linux.md %}), [macOS]({% link docs/archive/1.0/api/odbc/macos.md %}), and [Windows]({% link docs/archive/1.0/api/odbc/windows.md %}). + +## `odbc.ini` and `.odbc.ini` + +The `odbc.ini` file contains the DSNs for the drivers, which can have specific knobs. +An example of `odbc.ini` with DuckDB: + +```ini +[DuckDB] +Driver = DuckDB Driver +Database = :memory: +access_mode = read_only +allow_unsigned_extensions = true +``` + +The lines correspond to the following parameters: + +* `[DuckDB]`: between the brackets is a DSN for the DuckDB. +* `Driver`: Describes the driver's name, as well as where to find the configurations in the `odbcinst.ini`. +* `Database`: Describes the database name used by DuckDB, can also be a file path to a `.db` in the system. +* `access_mode`: The mode in which to connect to the database. +* `allow_unsigned_extensions`: Allow the use of [unsigned extensions]({% link docs/archive/1.0/extensions/overview.md %}#unsigned-extensions). + +## `odbcinst.ini` and `.odbcinst.ini` + +The `odbcinst.ini` file contains general configurations for the ODBC installed drivers in the system. +A driver section starts with the driver name between brackets, and then it follows specific configuration knobs belonging to that driver. + +Example of `odbcinst.ini` with the DuckDB: + +```ini +[ODBC] +Trace = yes +TraceFile = /tmp/odbctrace + +[DuckDB Driver] +Driver = /path/to/libduckdb_odbc.dylib +``` + +The lines correspond to the following parameters: + +* `[ODBC]`: The DM configuration section. +* `Trace`: Enables the ODBC trace file using the option `yes`. +* `TraceFile`: The absolute system file path for the ODBC trace file. +* `[DuckDB Driver]`: The section of the DuckDB installed driver. +* `Driver`: The absolute system file path of the DuckDB driver. Change to match your configuration. \ No newline at end of file diff --git a/docs/archive/1.0/api/odbc/linux.md b/docs/archive/1.0/api/odbc/linux.md new file mode 100644 index 00000000000..ae3deef779b --- /dev/null +++ b/docs/archive/1.0/api/odbc/linux.md @@ -0,0 +1,88 @@ +--- +github_repository: https://github.com/duckdb/duckdb-odbc +layout: docu +title: ODBC API on Linux +--- + +## Driver Manager + +A driver manager is required to manage communication between applications and the ODBC driver. +We tested and support `unixODBC` that is a complete ODBC driver manager for Linux. +Users can install it from the command line: + +On Debian-based distributions (Ubuntu, Mint, etc.), run: + +```bash +sudo apt-get install unixodbc odbcinst +``` + +On Fedora-based distributions (Amazon Linux, RHEL, CentOS, etc.), run: + +```bash +sudo yum install unixODBC +``` + +## Setting Up the Driver + +1. Download the ODBC Linux Asset corresponding to your architecture: + + + + * [x86_64 (AMD64)](https://github.com/duckdb/duckdb/releases/download/v{{ site.currentduckdbversion }}/duckdb_odbc-linux-amd64.zip) + * [arm64](https://github.com/duckdb/duckdb/releases/download/v{{ site.currentduckdbversion }}/duckdb_odbc-linux-aarch64.zip) + + + +2. 
The package contains the following files: + + * `libduckdb_odbc.so`: the DuckDB driver. + * `unixodbc_setup.sh`: a setup script to aid the configuration on Linux. + + To extract them, run: + + ```bash + mkdir duckdb_odbc && unzip duckdb_odbc-linux-amd64.zip -d duckdb_odbc + ``` + +3. The `unixodbc_setup.sh` script performs the configuration of the DuckDB ODBC Driver. It relies on the unixODBC package, which provides commands such as `odbcinst` and `isql` to handle and test the ODBC setup. + + Run the script with either the `-u` or the `-s` option to configure DuckDB ODBC. + + The `-u` option sets up the ODBC init files in the user's home directory. + + ```bash + ./unixodbc_setup.sh -u + ``` + + The `-s` option changes the system-level files, which are visible to all users; because of that, it requires root privileges. + + ```bash + sudo ./unixodbc_setup.sh -s + ``` + + The `--help` option prints the usage of `unixodbc_setup.sh`. + + ```bash + ./unixodbc_setup.sh --help + ``` + + ```text + Usage: ./unixodbc_setup.sh [options] + + Example: ./unixodbc_setup.sh -u -db ~/database_path -D ~/driver_path/libduckdb_odbc.so + + Level: + -s: System-level, using 'sudo' to configure DuckDB ODBC at the system-level, changing the files: /etc/odbc[inst].ini + -u: User-level, configuring the DuckDB ODBC at the user-level, changing the files: ~/.odbc[inst].ini. + + Options: + -db database_path>: the DuckDB database file path, the default is ':memory:' if not provided. + -D driver_path: the driver file path (i.e., the path for libduckdb_odbc.so), the default is using the base script directory + ``` + +4. The ODBC setup on Linux is based on the `.odbc.ini` and `.odbcinst.ini` files. + + These files can be placed in the user home directory `/home/⟨username⟩` or in the system `/etc` directory. + The Driver Manager prioritizes the user configuration files over the system files. + + For the details of the configuration parameters, see the [ODBC configuration page]({% link docs/archive/1.0/api/odbc/configuration.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/api/odbc/macos.md b/docs/archive/1.0/api/odbc/macos.md new file mode 100644 index 00000000000..2c846987abc --- /dev/null +++ b/docs/archive/1.0/api/odbc/macos.md @@ -0,0 +1,70 @@ +--- +github_repository: https://github.com/duckdb/duckdb-odbc +layout: docu +title: ODBC API on macOS +--- + +1. A driver manager is required to manage communication between applications and the ODBC driver. DuckDB supports `unixODBC`, which is a complete ODBC driver manager for macOS and Linux. Users can install it from the command line via [Homebrew](https://brew.sh/): + + ```bash + brew install unixodbc + ``` + +2. DuckDB releases a universal [ODBC driver for macOS](https://github.com/duckdb/duckdb/releases/download/v{{ site.currentduckdbversion }}/duckdb_odbc-osx-universal.zip) (supporting both Intel and Apple Silicon CPUs). To download it, run: + + ```bash + wget https://github.com/duckdb/duckdb/releases/download/v{{ site.currentduckdbversion }}/duckdb_odbc-osx-universal.zip + ``` + + + +3. The archive contains the `libduckdb_odbc.dylib` artifact. To extract it to a directory, run: + + ```bash + mkdir duckdb_odbc && unzip duckdb_odbc-osx-universal.zip -d duckdb_odbc + ``` + +4. There are two ways to configure the ODBC driver, either by initializing it via the configuration files, or by connecting with [`SQLDriverConnect`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqldriverconnect-function?view=sql-server-ver16).
+ A combination of the two is also possible. + + Furthermore, the ODBC driver supports all the [configuration options]({% link docs/archive/1.0/configuration/overview.md %}) included in DuckDB. + + > If a configuration is set in both the connection string passed to `SQLDriverConnect` and in the `odbc.ini` file, + > the one passed to `SQLDriverConnect` will take precedence. + + For the details of the configuration parameters, see the [ODBC configuration page]({% link docs/archive/1.0/api/odbc/configuration.md %}). + +5. After the configuration, to validate the installation, it is possible to use an ODBC client. unixODBC uses a command line tool called `isql`. + + Use the DSN defined in `odbc.ini` as a parameter of `isql`. + + ```bash + isql DuckDB + ``` + + ```text + +---------------------------------------+ + | Connected! | + | | + | sql-statement | + | help [tablename] | + | echo [string] | + | quit | + | | + +---------------------------------------+ + ``` + + ```sql + SQL> SELECT 42; + ``` + + ```text + +------------+ + | 42 | + +------------+ + | 42 | + +------------+ + + SQLRowCount returns -1 + 1 rows fetched + ``` \ No newline at end of file diff --git a/docs/archive/1.0/api/odbc/overview.md b/docs/archive/1.0/api/odbc/overview.md new file mode 100644 index 00000000000..34efd887025 --- /dev/null +++ b/docs/archive/1.0/api/odbc/overview.md @@ -0,0 +1,24 @@ +--- +github_repository: https://github.com/duckdb/duckdb-odbc +layout: docu +redirect_from: +- /docs/archive/1.0/api/odbc +- /docs/archive/1.0/api/odbc/ +title: ODBC API Overview +--- + +The ODBC (Open Database Connectivity) is a C-style API that provides access to different flavors of Database Management Systems (DBMSs). +The ODBC API consists of the Driver Manager (DM) and the ODBC drivers. + +The Driver Manager is part of the system library, e.g., unixODBC, which manages the communications between the user applications and the ODBC drivers. +Typically, applications are linked against the DM, which uses Data Source Name (DSN) to look up the correct ODBC driver. + +The ODBC driver is a DBMS implementation of the ODBC API, which handles all the internals of that DBMS. + +The DM maps user application calls of ODBC functions to the correct ODBC driver that performs the specified function and returns the proper values. + +## DuckDB ODBC Driver + +DuckDB supports the ODBC version 3.0 according to the [Core Interface Conformance](https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/core-interface-conformance?view=sql-server-ver15). + +The ODBC driver is available for all operating systems. Visit the [installation page]({% link docs/archive/1.0/installation/index.html %}) for direct links. \ No newline at end of file diff --git a/docs/archive/1.0/api/odbc/windows.md b/docs/archive/1.0/api/odbc/windows.md new file mode 100644 index 00000000000..a1179072fa9 --- /dev/null +++ b/docs/archive/1.0/api/odbc/windows.md @@ -0,0 +1,93 @@ +--- +github_repository: https://github.com/duckdb/duckdb-odbc +layout: docu +title: ODBC API on Windows +--- + +Using the DuckDB ODBC API on Windows requires the following steps: + +1. The Microsoft Windows requires an ODBC Driver Manager to manage communication between applications and the ODBC drivers. + The Driver Manager on Windows is provided in a DLL file `odbccp32.dll`, and other files and tools. + For detailed information check out the [Common ODBC Component Files](https://docs.microsoft.com/en-us/previous-versions/windows/desktop/odbc/dn170563(v=vs.85)). + +2. 
DuckDB releases the ODBC driver as an asset. For Windows, download it from the [Windows ODBC asset (x86_64/AMD64)](https://github.com/duckdb/duckdb/releases/download/v{{ site.currentduckdbversion }}/duckdb_odbc-windows-amd64.zip). + +3. The archive contains the following artifacts: + + * `duckdb_odbc.dll`: the DuckDB driver compiled for Windows. + * `duckdb_odbc_setup.dll`: a setup DLL used by the Windows ODBC Data Source Administrator tool. + * `odbc_install.exe`: an installation script to aid the configuration on Windows. + + Decompress the archive to a directory (e.g., `duckdb_odbc`). For example, run: + + ```bash + mkdir duckdb_odbc && unzip duckdb_odbc-windows-amd64.zip -d duckdb_odbc + ``` + +4. The `odbc_install.exe` binary performs the configuration of the DuckDB ODBC Driver on Windows. It depends on `Odbccp32.dll`, which provides functions to configure the ODBC registry entries. + + Inside the permanent directory (e.g., `duckdb_odbc`), double-click on `odbc_install.exe`. + + Windows administrator privileges are required; if you are not an administrator, a User Account Control prompt will appear. + +5. `odbc_install.exe` adds a default DSN configuration into the ODBC registries with a default database `:memory:`. + +### DSN Windows Setup + +After the installation, it is possible to change the default DSN configuration or add a new one using the Windows ODBC Data Source Administrator tool `odbcad32.exe`. + +It can also be launched through the Windows Start menu: + + + +### Default DuckDB DSN + +The newly installed DSN is visible on the ***System DSN*** tab of the Windows ODBC Data Source Administrator tool: + +![Windows ODBC Config Tool](/images/blog/odbc/odbcad32_exe.png) + +### Changing DuckDB DSN + +When selecting the default DSN (i.e., `DuckDB`) or adding a new configuration, the following setup window is displayed: + +![DuckDB Windows DSN Setup](/images/blog/odbc/duckdb_DSN_setup.png) + +This window allows you to set the DSN and the database file path associated with that DSN. + +## More Detailed Windows Setup + +There are two ways to configure the ODBC driver, either by altering the registry keys as detailed below, +or by connecting with [`SQLDriverConnect`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqldriverconnect-function?view=sql-server-ver16). +A combination of the two is also possible. + +Furthermore, the ODBC driver supports all the [configuration options]({% link docs/archive/1.0/configuration/overview.md %}) +included in DuckDB. + +> If a configuration is set in both the connection string passed to `SQLDriverConnect` and in the `odbc.ini` file, +> the one passed to `SQLDriverConnect` will take precedence. + +For the details of the configuration parameters, see the [ODBC configuration page]({% link docs/archive/1.0/api/odbc/configuration.md %}). + +### Registry Keys + +The ODBC setup on Windows is based on registry keys (see [Registry Entries for ODBC Components](https://docs.microsoft.com/en-us/sql/odbc/reference/install/registry-entries-for-odbc-components?view=sql-server-ver15)). +The ODBC entries can be placed at the current user registry key (`HKCU`) or the system registry key (`HKLM`). + +We have tested and used the system entries based on `HKLM->SOFTWARE->ODBC`. +The `odbc_install.exe` changes this entry, which has two subkeys: `ODBC.INI` and `ODBCINST.INI`. + +The `ODBC.INI` is where users usually insert DSN registry entries for the drivers.
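+
+As an illustration, the entry can also be inspected from Python with the built-in `winreg` module. The snippet below is a minimal sketch, not part of the official tooling: it assumes the DSN is named `DuckDB` and that the database path is stored in a `Database` value mirroring the `odbc.ini` parameter of the same name; the example that follows shows the same entry in the registry.
+
+```python
+import winreg
+
+# Minimal sketch: read the DuckDB DSN entry created by odbc_install.exe.
+# Assumes a "DuckDB" DSN with a "Database" value under HKLM\SOFTWARE\ODBC\ODBC.INI.
+with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, r"SOFTWARE\ODBC\ODBC.INI\DuckDB") as key:
+    database, _ = winreg.QueryValueEx(key, "Database")
+    print(database)
+```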
+ +For example, the DSN registry for DuckDB would look like this: + +![`HKLM->SOFTWARE->ODBC->ODBC.INI->DuckDB`](/images/blog/odbc/odbc_ini-registry-entry.png) + +The `ODBCINST.INI` contains one entry for each ODBC driver and other keys predefined for [Windows ODBC configuration](https://docs.microsoft.com/en-us/sql/odbc/reference/install/registry-entries-for-odbc-components?view=sql-server-ver15). + +### Updating the ODBC Driver + +When a new version of the ODBC driver is released, installing the new version will overwrite the existing one. +However, the installer doesn't always update the version number in the registry. +To ensure the correct version is used, +check that `HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBCINST.INI\DuckDB Driver` has the most recent version, +and `HKEY_LOCAL_MACHINE\SOFTWARE\ODBC\ODBC.INI\DuckDB\Driver` has the correct path to the new driver. \ No newline at end of file diff --git a/docs/archive/1.0/api/overview.md b/docs/archive/1.0/api/overview.md new file mode 100644 index 00000000000..cb8dfe8cf72 --- /dev/null +++ b/docs/archive/1.0/api/overview.md @@ -0,0 +1,32 @@ +--- +layout: docu +title: Client APIs Overview +--- + +DuckDB is an in-process database system and offers client APIs for several languages. These clients support the same DuckDB file format and SQL syntax. We strived to make their APIs follow their host language's conventions. + +Client APIs: + +* Standalone [Command Line Interface (CLI)]({% link docs/archive/1.0/api/cli/overview.md %}) client +* [ADBC API]({% link docs/archive/1.0/api/adbc.md %}) +* [C]({% link docs/archive/1.0/api/c/overview.md %}) +* [C++]({% link docs/archive/1.0/api/cpp.md %}) +* [Go]({% link docs/archive/1.0/api/go.md %}) by [marcboeker](https://github.com/marcboeker) +* [Java]({% link docs/archive/1.0/api/java.md %}) +* [Julia]({% link docs/archive/1.0/api/julia.md %}) +* [Node.js]({% link docs/archive/1.0/api/nodejs/overview.md %}) +* [ODBC API]({% link docs/archive/1.0/api/odbc/overview.md %}) +* [Python]({% link docs/archive/1.0/api/python/overview.md %}) +* [R]({% link docs/archive/1.0/api/r.md %}) +* [Rust]({% link docs/archive/1.0/api/rust.md %}) +* [Swift]({% link docs/archive/1.0/api/swift.md %}) +* [WebAssembly (Wasm)]({% link docs/archive/1.0/api/wasm/overview.md %}) + +There are also contributed third-party DuckDB wrappers, which currently do not have an official documentation page: + +* [C#](https://github.com/Giorgi/DuckDB.NET) by [Giorgi](https://github.com/Giorgi) +* [Common Lisp](https://github.com/ak-coram/cl-duckdb) by [ak-coram](https://github.com/ak-coram) +* [Crystal](https://github.com/amauryt/crystal-duckdb) by [amauryt](https://github.com/amauryt) +* [Elixir](https://github.com/AlexR2D2/duckdbex) by [AlexR2D2](https://github.com/AlexR2D2/duckdbex) +* [Ruby](https://github.com/suketa/ruby-duckdb) by [suketa](https://github.com/suketa) +* [Zig](https://github.com/karlseguin/zuckdb.zig) by [karlseguin](https://github.com/karlseguin) \ No newline at end of file diff --git a/docs/archive/1.0/api/python/conversion.md b/docs/archive/1.0/api/python/conversion.md new file mode 100644 index 00000000000..68a930376e4 --- /dev/null +++ b/docs/archive/1.0/api/python/conversion.md @@ -0,0 +1,212 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/api/python/result_conversion +title: Conversion between DuckDB and Python +--- + +This page documents the rules for converting [Python objects to DuckDB](#object-conversion-python-object-to-duckdb) and [DuckDB results to 
Python](#result-conversion-duckdb-results-to-python). + +## Object Conversion: Python Object to DuckDB + +This is a mapping of Python object types to DuckDB [Logical Types]({% link docs/archive/1.0/sql/data_types/overview.md %}): + +* `None` → `NULL` +* `bool` → `BOOLEAN` +* `datetime.timedelta` → `INTERVAL` +* `str` → `VARCHAR` +* `bytearray` → `BLOB` +* `memoryview` → `BLOB` +* `decimal.Decimal` → `DECIMAL` / `DOUBLE` +* `uuid.UUID` → `UUID` + +The rest of the conversion rules are as follows. + +### `int` + +Since integers can be of arbitrary size in Python, there is not a one-to-one conversion possible for ints. +Instead we perform these casts in order until one succeeds: + +* `BIGINT` +* `INTEGER` +* `UBIGINT` +* `UINTEGER` +* `DOUBLE` + +When using the DuckDB Value class, it's possible to set a target type, which will influence the conversion. + +### `float` + +These casts are tried in order until one succeeds: + +* `DOUBLE` +* `FLOAT` + +### `datetime.datetime` + +For `datetime` we will check `pandas.isnull` if it's available and return `NULL` if it returns `true`. +We check against `datetime.datetime.min` and `datetime.datetime.max` to convert to `-inf` and `+inf` respectively. + +If the `datetime` has tzinfo, we will use `TIMESTAMPTZ`, otherwise it becomes `TIMESTAMP`. + +### `datetime.time` + +If the `time` has tzinfo, we will use `TIMETZ`, otherwise it becomes `TIME`. + +### `datetime.date` + +`date` converts to the `DATE` type. +We check against `datetime.date.min` and `datetime.date.max` to convert to `-inf` and `+inf` respectively. + +### `bytes` + +`bytes` converts to `BLOB` by default, when it's used to construct a Value object of type `BITSTRING`, it maps to `BITSTRING` instead. + +### `list` + +`list` becomes a `LIST` type of the “most permissive” type of its children, for example: + +```python +my_list_value = [ + 12345, + "test" +] +``` + +Will become `VARCHAR[]` because 12345 can convert to `VARCHAR` but `test` can not convert to `INTEGER`. + +```sql +[12345, test] +``` + +### `dict` + +The `dict` object can convert to either `STRUCT(...)` or `MAP(..., ...)` depending on its structure. +If the dict has a structure similar to: + +```python +my_map_dict = { + "key": [ + 1, 2, 3 + ], + "value": [ + "one", "two", "three" + ] +} +``` + +Then we'll convert it to a `MAP` of key-value pairs of the two lists zipped together. +The example above becomes a `MAP(INTEGER, VARCHAR)`: + +```sql +{1=one, 2=two, 3=three} +``` + +> The names of the fields matter and the two lists need to have the same size. + +Otherwise we'll try to convert it to a `STRUCT`. + +```python +my_struct_dict = { + 1: "one", + "2": 2, + "three": [1, 2, 3], + False: True +} +``` + +Becomes: + +```sql +{'1': one, '2': 2, 'three': [1, 2, 3], 'False': true} +``` + +> Every `key` of the dictionary is converted to string. + +### `tuple` + +`tuple` converts to `LIST` by default, when it's used to construct a Value object of type `STRUCT` it will convert to `STRUCT` instead. + +### `numpy.ndarray` and `numpy.datetime64` + +`ndarray` and `datetime64` are converted by calling `tolist()` and converting the result of that. + +## Result Conversion: DuckDB Results to Python + +DuckDB's Python client provides multiple additional methods that can be used to efficiently retrieve data. 
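+
+As a quick illustration of the methods listed below, the same query can be materialized in several formats; this is a minimal sketch that assumes the `pandas` and `pyarrow` packages are installed:
+
+```python
+import duckdb
+
+rel = duckdb.sql("SELECT 42 AS answer")
+
+print(rel.fetchall())    # plain Python objects: [(42,)]
+print(rel.df())          # Pandas DataFrame
+print(rel.arrow())       # Arrow table
+```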
+ +### NumPy + +* `fetchnumpy()` fetches the data as a dictionary of NumPy arrays + +### Pandas + +* `df()` fetches the data as a Pandas DataFrame +* `fetchdf()` is an alias of `df()` +* `fetch_df()` is an alias of `df()` +* `fetch_df_chunk(vector_multiple)` fetches a portion of the results into a DataFrame. The number of rows returned in each chunk is the vector size (2048 by default) * vector_multiple (1 by default). + +### Apache Arrow + +* `arrow()` fetches the data as an [Arrow table](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html) +* `fetch_arrow_table()` is an alias of `arrow()` +* `fetch_record_batch(chunk_size)` returns an [Arrow record batch reader](https://arrow.apache.org/docs/python/generated/pyarrow.ipc.RecordBatchStreamReader.html) with `chunk_size` rows per batch + +### Polars + +* `pl()` fetches the data as a Polars DataFrame + +### Examples + +Below are some examples using this functionality. See the [Python guides]({% link docs/archive/1.0/guides/overview.md %}#python-client) for more examples. + +Fetch as Pandas DataFrame: + +```python +df = con.execute("SELECT * FROM items").fetchdf() +print(df) +``` + +```text + item value count +0 jeans 20.0 1 +1 hammer 42.2 2 +2 laptop 2000.0 1 +3 chainsaw 500.0 10 +4 iphone 300.0 2 +``` + +Fetch as dictionary of NumPy arrays: + +```python +arr = con.execute("SELECT * FROM items").fetchnumpy() +print(arr) +``` + +```text +{'item': masked_array(data=['jeans', 'hammer', 'laptop', 'chainsaw', 'iphone'], + mask=[False, False, False, False, False], + fill_value='?', + dtype=object), 'value': masked_array(data=[20.0, 42.2, 2000.0, 500.0, 300.0], + mask=[False, False, False, False, False], + fill_value=1e+20), 'count': masked_array(data=[1, 2, 1, 10, 2], + mask=[False, False, False, False, False], + fill_value=999999, + dtype=int32)} +``` + +Fetch as an Arrow table. Converting to Pandas afterwards just for pretty printing: + +```python +tbl = con.execute("SELECT * FROM items").fetch_arrow_table() +print(tbl.to_pandas()) +``` + +```text + item value count +0 jeans 20.00 1 +1 hammer 42.20 2 +2 laptop 2000.00 1 +3 chainsaw 500.00 10 +4 iphone 300.00 2 +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/python/data_ingestion.md b/docs/archive/1.0/api/python/data_ingestion.md new file mode 100644 index 00000000000..6399def4260 --- /dev/null +++ b/docs/archive/1.0/api/python/data_ingestion.md @@ -0,0 +1,195 @@ +--- +layout: docu +title: Data Ingestion +--- + +This page contains examples for data ingestion to Python using DuckDB. First, import the DuckDB page: + +```python +import duckdb +``` + +Then, proceed with any of the following sections. + +## CSV Files + +CSV files can be read using the `read_csv` function, called either from within Python or directly from within SQL. By default, the `read_csv` function attempts to auto-detect the CSV settings by sampling from the provided file. 
+ +Read from a file using fully auto-detected settings: + +```python +duckdb.read_csv("example.csv") +``` + +Read multiple CSV files from a folder: + +```python +duckdb.read_csv("folder/*.csv") +``` + +Specify options on how the CSV is formatted internally: + +```python +duckdb.read_csv("example.csv", header = False, sep = ",") +``` + +Override types of the first two columns: + +```python +duckdb.read_csv("example.csv", dtype = ["int", "varchar"]) +``` + +Directly read a CSV file from within SQL: + +```python +duckdb.sql("SELECT * FROM 'example.csv'") +``` + +Call `read_csv` from within SQL: + +```python +duckdb.sql("SELECT * FROM read_csv('example.csv')") +``` + +See the [CSV Import]({% link docs/archive/1.0/data/csv/overview.md %}) page for more information. + +## Parquet Files + +Parquet files can be read using the `read_parquet` function, called either from within Python or directly from within SQL. + +Read from a single Parquet file: + +```python +duckdb.read_parquet("example.parquet") +``` + +Read multiple Parquet files from a folder: + +```python +duckdb.read_parquet("folder/*.parquet") +``` + +Read a Parquet file over [https]({% link docs/archive/1.0/extensions/httpfs/overview.md %}): + +```python +duckdb.read_parquet("https://some.url/some_file.parquet") +``` + +Read a list of Parquet files: + +```python +duckdb.read_parquet(["file1.parquet", "file2.parquet", "file3.parquet"]) +``` + +Directly read a Parquet file from within SQL: + +```python +duckdb.sql("SELECT * FROM 'example.parquet'") +``` + +Call `read_parquet` from within SQL: + +```python +duckdb.sql("SELECT * FROM read_parquet('example.parquet')") +``` + +See the [Parquet Loading]({% link docs/archive/1.0/data/parquet/overview.md %}) page for more information. + +## JSON Files + +JSON files can be read using the `read_json` function, called either from within Python or directly from within SQL. By default, the `read_json` function will automatically detect if a file contains newline-delimited JSON or regular JSON, and will detect the schema of the objects stored within the JSON file. + +Read from a single JSON file: + +```python +duckdb.read_json("example.json") +``` + +Read multiple JSON files from a folder: + +```python +duckdb.read_json("folder/*.json") +``` + +Directly read a JSON file from within SQL: + +```python +duckdb.sql("SELECT * FROM 'example.json'") +``` + +Call `read_json` from within SQL: + +```python +duckdb.sql("SELECT * FROM read_json_auto('example.json')") +``` + +## Directly Accessing DataFrames and Arrow Objects + +DuckDB is automatically able to query certain Python variables by referring to their variable name (as if it was a table). +These types include the following: Pandas DataFrame, Polars DataFrame, Polars LazyFrame, NumPy arrays, [relations]({% link docs/archive/1.0/api/python/relational_api.md %}), and Arrow objects. +Accessing these is made possible by [replacement scans]({% link docs/archive/1.0/api/c/replacement_scans.md %}). + +DuckDB supports querying multiple types of Apache Arrow objects including [tables](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html), [datasets](https://arrow.apache.org/docs/python/generated/pyarrow.dataset.Dataset.html), [RecordBatchReaders](https://arrow.apache.org/docs/python/generated/pyarrow.ipc.RecordBatchStreamReader.html), and [scanners](https://arrow.apache.org/docs/python/generated/pyarrow.dataset.Scanner.html). See the Python [guides]({% link docs/archive/1.0/guides/overview.md %}#python-client) for more examples. 
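+
+For example, a PyArrow table can be queried simply by referring to its Python variable name, just like the Pandas example that follows; this is a minimal sketch that assumes `pyarrow` is installed:
+
+```python
+import duckdb
+import pyarrow as pa
+
+# The variable name `arrow_table` is resolved via DuckDB's replacement scans.
+arrow_table = pa.Table.from_pydict({"i": [1, 2, 3, 4]})
+print(duckdb.sql("SELECT sum(i) FROM arrow_table").fetchall())
+# [(10,)]
+```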
+ +```python +import duckdb +import pandas as pd + +test_df = pd.DataFrame.from_dict({"i": [1, 2, 3, 4], "j": ["one", "two", "three", "four"]}) +print(duckdb.sql("SELECT * FROM test_df").fetchall()) +``` + +```text +[(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')] +``` + +DuckDB also supports “registering” a DataFrame or Arrow object as a virtual table, comparable to a SQL `VIEW`. This is useful when querying a DataFrame/Arrow object that is stored in another way (as a class variable, or a value in a dictionary). Below is a Pandas example: + +If your Pandas DataFrame is stored in another location, here is an example of manually registering it: + +```python +import duckdb +import pandas as pd + +my_dictionary = {} +my_dictionary["test_df"] = pd.DataFrame.from_dict({"i": [1, 2, 3, 4], "j": ["one", "two", "three", "four"]}) +duckdb.register("test_df_view", my_dictionary["test_df"]) +print(duckdb.sql("SELECT * FROM test_df_view").fetchall()) +``` + +```text +[(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')] +``` + +You can also create a persistent table in DuckDB from the contents of the DataFrame (or the view): + +```python +# create a new table from the contents of a DataFrame +con.execute("CREATE TABLE test_df_table AS SELECT * FROM test_df") +# insert into an existing table from the contents of a DataFrame +con.execute("INSERT INTO test_df_table SELECT * FROM test_df") +``` + +### Pandas DataFrames – `object` Columns + +`pandas.DataFrame` columns of an `object` dtype require some special care, since this stores values of arbitrary type. +To convert these columns to DuckDB, we first go through an analyze phase before converting the values. +In this analyze phase a sample of all the rows of the column are analyzed to determine the target type. +This sample size is by default set to 1000. +If the type picked during the analyze step is incorrect, this will result in a "Failed to cast value:" error, in which case you will need to increase the sample size. +The sample size can be changed by setting the `pandas_analyze_sample` config option. + +```python +# example setting the sample size to 100k +duckdb.execute("SET GLOBAL pandas_analyze_sample = 100_000") +``` + +### Registering Objects + +You can register Python objects as DuckDB tables using the [`DuckDBPyConnection.register()` function]({% link docs/archive/1.0/api/python/reference/index.md %}#duckdb.DuckDBPyConnection.register). + +The precedence of objects with the same name is as follows: + +* Objects explicitly registered via `DuckDBPyConnection.register()` +* Native DuckDB tables and views +* [Replacement scans]({% link docs/archive/1.0/api/c/replacement_scans.md %}) \ No newline at end of file diff --git a/docs/archive/1.0/api/python/dbapi.md b/docs/archive/1.0/api/python/dbapi.md new file mode 100644 index 00000000000..7f3a180878d --- /dev/null +++ b/docs/archive/1.0/api/python/dbapi.md @@ -0,0 +1,172 @@ +--- +layout: docu +title: Python DB API +--- + +The standard DuckDB Python API provides a SQL interface compliant with the [DB-API 2.0 specification described by PEP 249](https://www.python.org/dev/peps/pep-0249/) similar to the [SQLite Python API](https://docs.python.org/3.7/library/sqlite3.html). + +## Connection + +To use the module, you must first create a `DuckDBPyConnection` object that represents a connection to a database. +This is done through the [`duckdb.connect`]({% link docs/archive/1.0/api/python/reference/index.md %}#duckdb.connect) method. 
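+
+For example, the following minimal sketch (the file name is illustrative) opens a connection to a database file and runs a query through it:
+
+```python
+import duckdb
+
+# Connect to a persistent database file; it is created if it does not yet exist.
+con = duckdb.connect("my_database.duckdb")
+print(con.execute("SELECT 42").fetchall())
+# [(42,)]
+```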
+ +The 'config' keyword argument can be used to provide a `dict` that contains key->value pairs referencing [settings]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference) understood by DuckDB. + +### In-Memory Connection + +The special value `:memory:` can be used to create an **in-memory database**. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the Python process). + +#### Named in-memory Connections + +The special value `:memory:` can also be postfixed with a name, for example: `:memory:conn3`. +When a name is provided, subsequent `duckdb.connect` calls will create a new connection to the same database, sharing the catalogs (views, tables, macros etc..). + +Using `:memory:` without a name will always create a new and separate database instance. + +### Default Connection + +By default we create an (unnamed) **in-memory-database** that lives inside the `duckdb` module. +Every method of `DuckDBPyConnection` is also available on the `duckdb` module, this connection is what's used by these methods. + +The special value `:default:` can be used to get this default connection. + +### File-Based Connection + +If the `database` is a file path, a connection to a persistent database is established. +If the file does not exist the file will be created (the extension of the file is irrelevant and can be `.db`, `.duckdb` or anything else). + +#### `read_only` Connections + +If you would like to connect in read-only mode, you can set the `read_only` flag to `True`. If the file does not exist, it is **not** created when connecting in read-only mode. +Read-only mode is required if multiple Python processes want to access the same database file at the same time. + +```python +import duckdb + +duckdb.execute("CREATE TABLE tbl AS SELECT 42 a") +con = duckdb.connect(":default:") +con.sql("SELECT * FROM tbl") +# or +duckdb.default_connection.sql("SELECT * FROM tbl") +``` + +```text +┌───────┐ +│ a │ +│ int32 │ +├───────┤ +│ 42 │ +└───────┘ +``` + +```python +import duckdb + +# to start an in-memory database +con = duckdb.connect(database = ":memory:") +# to use a database file (not shared between processes) +con = duckdb.connect(database = "my-db.duckdb", read_only = False) +# to use a database file (shared between processes) +con = duckdb.connect(database = "my-db.duckdb", read_only = True) +# to explicitly get the default connection +con = duckdb.connect(database = ":default:") +``` + +If you want to create a second connection to an existing database, you can use the `cursor()` method. This might be useful for example to allow parallel threads running queries independently. A single connection is thread-safe but is locked for the duration of the queries, effectively serializing database access in this case. + +Connections are closed implicitly when they go out of scope or if they are explicitly closed using `close()`. Once the last connection to a database instance is closed, the database instance is closed as well. + +## Querying + +SQL queries can be sent to DuckDB using the `execute()` method of connections. Once a query has been executed, results can be retrieved using the `fetchone` and `fetchall` methods on the connection. `fetchall` will retrieve all results and complete the transaction. `fetchone` will retrieve a single row of results each time that it is invoked until no more results are available. 
The transaction will only close once `fetchone` is called and there are no more results remaining (the return value will be `None`). As an example, in the case of a query only returning a single row, `fetchone` should be called once to retrieve the results and a second time to close the transaction. Below are some short examples: + +```python +# create a table +con.execute("CREATE TABLE items (item VARCHAR, value DECIMAL(10, 2), count INTEGER)") +# insert two items into the table +con.execute("INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)") + +# retrieve the items again +con.execute("SELECT * FROM items") +print(con.fetchall()) +# [('jeans', Decimal('20.00'), 1), ('hammer', Decimal('42.20'), 2)] + +# retrieve the items one at a time +con.execute("SELECT * FROM items") +print(con.fetchone()) +# ('jeans', Decimal('20.00'), 1) +print(con.fetchone()) +# ('hammer', Decimal('42.20'), 2) +print(con.fetchone()) # This closes the transaction. Any subsequent calls to .fetchone will return None +# None +``` + +The `description` property of the connection object contains the column names as per the standard. + +### Prepared Statements + +DuckDB also supports [prepared statements]({% link docs/archive/1.0/sql/query_syntax/prepared_statements.md %}) in the API with the `execute` and `executemany` methods. The values may be passed as an additional parameter after a query that contains `?` or `$1` (dollar symbol and a number) placeholders. Using the `?` notation adds the values in the same sequence as passed within the Python parameter. Using the `$` notation allows for values to be reused within the SQL statement based on the number and index of the value found within the Python parameter. Values are converted according to the [conversion rules]({% link docs/archive/1.0/api/python/conversion.md %}#object-conversion-python-object-to-duckdb). + +Here are some examples. First, insert a row using a [prepared statement]({% link docs/archive/1.0/sql/query_syntax/prepared_statements.md %}): + +```python +con.execute("INSERT INTO items VALUES (?, ?, ?)", ["laptop", 2000, 1]) +``` + +Second, insert several rows using a [prepared statement]({% link docs/archive/1.0/sql/query_syntax/prepared_statements.md %}): + +```python +con.executemany("INSERT INTO items VALUES (?, ?, ?)", [["chainsaw", 500, 10], ["iphone", 300, 2]] ) +``` + +Query the database using a [prepared statement]({% link docs/archive/1.0/sql/query_syntax/prepared_statements.md %}): + +```python +con.execute("SELECT item FROM items WHERE value > ?", [400]) +print(con.fetchall()) +``` + +```text +[('laptop',), ('chainsaw',)] +``` + +Query using the `$` notation for a [prepared statement]({% link docs/archive/1.0/sql/query_syntax/prepared_statements.md %}) and reused values: + +```python +con.execute("SELECT $1, $1, $2", ["duck", "goose"]) +print(con.fetchall()) +``` + +```text +[('duck', 'duck', 'goose')] +``` + +> Warning Do *not* use `executemany` to insert large amounts of data into DuckDB. See the [data ingestion page]({% link docs/archive/1.0/api/python/data_ingestion.md %}) for better options. + +## Named Parameters + +Besides the standard unnamed parameters, like `$1`, `$2` etc., it's also possible to supply named parameters, like `$my_parameter`. +When using named parameters, you have to provide a dictionary mapping of `str` to value in the `parameters` argument. 
+An example use is the following: + +```python +import duckdb + +res = duckdb.execute(""" + SELECT + $my_param, + $other_param, + $also_param + """, + { + "my_param": 5, + "other_param": "DuckDB", + "also_param": [42] + } +).fetchall() +print(res) +``` + +```text +[(5, 'DuckDB', [42])] +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/python/expression.md b/docs/archive/1.0/api/python/expression.md new file mode 100644 index 00000000000..b62f1bc9916 --- /dev/null +++ b/docs/archive/1.0/api/python/expression.md @@ -0,0 +1,172 @@ +--- +layout: docu +title: Expression API +--- + +The `Expression` class represents an instance of an [expression]({% link docs/archive/1.0/sql/expressions/overview.md %}). + +## Why Would I Use the Expression API? + +Using this API makes it possible to dynamically build up expressions, which are typically created by the parser from the query string. +This allows you to skip that and have more fine-grained control over the used expressions. + +Below is a list of currently supported expressions that can be created through the API. + +## Column Expression + +This expression references a column by name. + +```python +import duckdb +import pandas as pd + +df = pd.DataFrame({ + 'a': [1, 2, 3, 4], + 'b': [True, None, False, True], + 'c': [42, 21, 13, 14] +}) + +# selecting a single column +col = duckdb.ColumnExpression('a') +res = duckdb.df(df).select(col).fetchall() +print(res) +# [(1,), (2,), (3,), (4,)] + +# selecting multiple columns +col_list = [ + duckdb.ColumnExpression('a') * 10, + duckdb.ColumnExpression('b').isnull(), + duckdb.ColumnExpression('c') + 5 + ] +res = duckdb.df(df).select(*col_list).fetchall() +print(res) +# [(10, False, 47), (20, True, 26), (30, False, 18), (40, False, 19)] +``` + +## Star Expression + +This expression selects all columns of the input source. + +Optionally it's possible to provide an `exclude` list to filter out columns of the table. +This `exclude` list can contain either strings or Expressions. + +```python +import duckdb +import pandas as pd + +df = pd.DataFrame({ + 'a': [1, 2, 3, 4], + 'b': [True, None, False, True], + 'c': [42, 21, 13, 14] +}) + +star = duckdb.StarExpression(exclude = ['b']) +res = duckdb.df(df).select(star).fetchall() +print(res) +# [(1, 42), (2, 21), (3, 13), (4, 14)] +``` + +## Constant Expression + +This expression contains a single value. + +```python +import duckdb +import pandas as pd + +df = pd.DataFrame({ + 'a': [1, 2, 3, 4], + 'b': [True, None, False, True], + 'c': [42, 21, 13, 14] +}) + +const = duckdb.ConstantExpression('hello') +res = duckdb.df(df).select(const).fetchall() +print(res) +# [('hello',), ('hello',), ('hello',), ('hello',)] +``` + +## Case Expression + +This expression contains a `CASE WHEN (...) THEN (...) ELSE (...) END` expression. +By default `ELSE` is `NULL` and it can be set using `.else(value = ...)`. +Additional `WHEN (...) THEN (...)` blocks can be added with `.when(condition = ..., value = ...)`. 
+ +```python +import duckdb +import pandas as pd +from duckdb import ( + ConstantExpression, + ColumnExpression, + CaseExpression +) + +df = pd.DataFrame({ + 'a': [1, 2, 3, 4], + 'b': [True, None, False, True], + 'c': [42, 21, 13, 14] +}) + +hello = ConstantExpression('hello') +world = ConstantExpression('world') + +case = \ + CaseExpression(condition = ColumnExpression('b') == False, value = world) \ + .otherwise(hello) +res = duckdb.df(df).select(case).fetchall() +print(res) +# [('hello',), ('hello',), ('world',), ('hello',)] +``` + +## Function Expression + +This expression contains a function call. +It can be constructed by providing the function name and an arbitrary amount of Expressions as arguments. + +```python +import duckdb +import pandas as pd +from duckdb import ( + ConstantExpression, + ColumnExpression, + FunctionExpression +) + +df = pd.DataFrame({ + 'a': [ + 'test', + 'pest', + 'text', + 'rest', + ] +}) + +ends_with = FunctionExpression('ends_with', ColumnExpression('a'), ConstantExpression('est')) +res = duckdb.df(df).select(ends_with).fetchall() +print(res) +# [(True,), (True,), (False,), (True,)] +``` + +## Common Operations + +The Expression class also contains many operations that can be applied to any Expression type. + +| Operation | Description | +|--------------------------------|----------------------------------------------------------------------------------------------------------------| +| `.alias(name: str)` | Applies an alias to the expression. | +| `.cast(type: DuckDBPyType)` | Applies a cast to the provided type on the expression. | +| `.isin(*exprs: Expression)` | Creates an [`IN` expression]({% link docs/archive/1.0/sql/expressions/in.md %}#in) against the provided expressions as the list. | +| `.isnotin(*exprs: Expression)` | Creates a [`NOT IN` expression]({% link docs/archive/1.0/sql/expressions/in.md %}#not-in) against the provided expressions as the list. | +| `.isnotnull()` | Checks whether the expression is not `NULL`. | +| `.isnull()` | Checks whether the expression is `NULL`. | + +### Order Operations + +When expressions are provided to `DuckDBPyRelation.order()`, the following order operations can be applied. + +| Operation | Description | +|--------------------------------|----------------------------------------------------------------------------------------------------------------| +| `.asc()` | Indicates that this expression should be sorted in ascending order. | +| `.desc()` | Indicates that this expression should be sorted in descending order. | +| `.nulls_first()` | Indicates that the nulls in this expression should precede the non-null values. | +| `.nulls_last()` | Indicates that the nulls in this expression should come after the non-null values. | \ No newline at end of file diff --git a/docs/archive/1.0/api/python/function.md b/docs/archive/1.0/api/python/function.md new file mode 100644 index 00000000000..1716c2b190b --- /dev/null +++ b/docs/archive/1.0/api/python/function.md @@ -0,0 +1,232 @@ +--- +layout: docu +title: Python Function API +--- + +You can create a DuckDB user-defined function (UDF) from a Python function so it can be used in SQL queries. +Similarly to regular [functions]({% link docs/archive/1.0/sql/functions/overview.md %}), they need to have a name, a return type and parameter types. + +Here is an example using a Python function that calls a third-party library. 
+ +```python +import duckdb +from duckdb.typing import * +from faker import Faker + +def generate_random_name(): + fake = Faker() + return fake.name() + +duckdb.create_function("random_name", generate_random_name, [], VARCHAR) +res = duckdb.sql("SELECT random_name()").fetchall() +print(res) +``` + +```text +[('Gerald Ashley',)] +``` + +## Creating Functions + +To register a Python UDF, use the `create_function` method from a DuckDB connection. Here is the syntax: + +```python +import duckdb +con = duckdb.connect() +con.create_function(name, function, parameters, return_type) +``` + +The `create_function` method takes the following parameters: + +1. **name**: A string representing the unique name of the UDF within the connection catalog. +2. **function**: The Python function you wish to register as a UDF. +3. **parameters**: Scalar functions can operate on one or more columns. This parameter takes a list of column types used as input. +4. **return_type**: Scalar functions return one element per row. This parameter specifies the return type of the function. +5. **type** (Optional): DuckDB supports both built-in Python types and PyArrow Tables. By default, built-in types are assumed, but you can specify `type = 'arrow'` to use PyArrow Tables. +6. **null_handling** (Optional): By default, null values are automatically handled as Null-In Null-Out. Users can specify a desired behavior for null values by setting `null_handling = 'special'`. +7. **exception_handling** (Optional): By default, when an exception is thrown from the Python function, it will be re-thrown in Python. Users can disable this behavior, and instead return `null`, by setting this parameter to `'return_null'` +8. **side_effects** (Optional): By default, functions are expected to produce the same result for the same input. If the result of a function is impacted by any type of randomness, `side_effects` must be set to `True`. + +To unregister a UDF, you can call the `remove_function` method with the UDF name: + +```python +con.remove_function(name) +``` + +## Type Annotation + +When the function has type annotation it's often possible to leave out all of the optional parameters. +Using `DuckDBPyType` we can implicitly convert many known types to DuckDBs type system. +For example: + +```python +import duckdb + +def my_function(x: int) -> str: + return x + +duckdb.create_function("my_func", my_function) +print(duckdb.sql("SELECT my_func(42)")) +``` + +```text +┌─────────────┐ +│ my_func(42) │ +│ varchar │ +├─────────────┤ +│ 42 │ +└─────────────┘ +``` + +If only the parameter list types can be inferred, you'll need to pass in `None` as `parameters`. + +## Null Handling + +By default when functions receive a `NULL` value, this instantly returns `NULL`, as part of the default `NULL`-handling. +When this is not desired, you need to explicitly set this parameter to `"special"`. 
+ +```python +import duckdb +from duckdb.typing import * + +def dont_intercept_null(x): + return 5 + +duckdb.create_function("dont_intercept", dont_intercept_null, [BIGINT], BIGINT) +res = duckdb.sql("SELECT dont_intercept(NULL)").fetchall() +print(res) +``` + +```text +[(None,)] +``` + +With `null_handling="special"`: + +```python +import duckdb +from duckdb.typing import * + +def dont_intercept_null(x): + return 5 + +duckdb.create_function("dont_intercept", dont_intercept_null, [BIGINT], BIGINT, null_handling="special") +res = duckdb.sql("SELECT dont_intercept(NULL)").fetchall() +print(res) +``` + +```text +[(5,)] +``` + +## Exception Handling + +By default, when an exception is thrown from the Python function, we'll forward (re-throw) the exception. +If you want to disable this behavior, and instead return null, you'll need to set this parameter to `"return_null"` + +```python +import duckdb +from duckdb.typing import * + +def will_throw(): + raise ValueError("ERROR") + +duckdb.create_function("throws", will_throw, [], BIGINT) +try: + res = duckdb.sql("SELECT throws()").fetchall() +except duckdb.InvalidInputException as e: + print(e) + +duckdb.create_function("doesnt_throw", will_throw, [], BIGINT, exception_handling="return_null") +res = duckdb.sql("SELECT doesnt_throw()").fetchall() +print(res) +``` + +```console +Invalid Input Error: Python exception occurred while executing the UDF: ValueError: ERROR + +At: + ...(5): will_throw + ...(9): +``` + +```text +[(None,)] +``` + +## Side Effects + +By default DuckDB will assume the created function is a *pure* function, meaning it will produce the same output when given the same input. +If your function does not follow that rule, for example when your function makes use of randomness, then you will need to mark this function as having `side_effects`. + +For example, this function will produce a new count for every invocation + +```python +def count() -> int: + old = count.counter; + count.counter += 1 + return old + +count.counter = 0 +``` + +If we create this function without marking it as having side effects, the result will be the following: + +```python +con = duckdb.connect() +con.create_function("my_counter", count, side_effects = False) +res = con.sql("SELECT my_counter() FROM range(10)").fetchall() +print(res) +``` + +```text +[(0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,)] +``` + +Which is obviously not the desired result, when we add `side_effects = True`, the result is as we would expect: + +```python +con.remove_function("my_counter") +count.counter = 0 +con.create_function("my_counter", count, side_effects = True) +res = con.sql("SELECT my_counter() FROM range(10)").fetchall() +print(res) +``` + +```text +[(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,), (9,)] +``` + +## Python Function Types + +Currently, two function types are supported, `native` (default) and `arrow`. + +### Arrow + +If the function is expected to receive arrow arrays, set the `type` parameter to `'arrow'`. + +This will let the system know to provide arrow arrays of up to `STANDARD_VECTOR_SIZE` tuples to the function, and also expect an array of the same amount of tuples to be returned from the function. + +### Native + +When the function type is set to `native` the function will be provided with a single tuple at a time, and expect only a single value to be returned. 
+This can be useful to interact with Python libraries that don't operate on Arrow, such as `faker`: + +```python +import duckdb + +from duckdb.typing import * +from faker import Faker + +def random_date(): + fake = Faker() + return fake.date_between() + +duckdb.create_function("random_date", random_date, [], DATE, type="native") +res = duckdb.sql("SELECT random_date()").fetchall() +print(res) +``` + +```text +[(datetime.date(2019, 5, 15),)] +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/python/known_issues.md b/docs/archive/1.0/api/python/known_issues.md new file mode 100644 index 00000000000..39ca60982b8 --- /dev/null +++ b/docs/archive/1.0/api/python/known_issues.md @@ -0,0 +1,104 @@ +--- +layout: docu +title: Known Python Issues +--- + +Unfortunately there are some issues that are either beyond our control or are very elusive / hard to track down. +Below is a list of these issues that you might have to be aware of, depending on your workflow. + +## Numpy Import Multithreading + +When making use of multi threading and fetching results either directly as Numpy arrays or indirectly through a Pandas DataFrame, it might be necessary to ensure that `numpy.core.multiarray` is imported. +If this module has not been imported from the main thread, and a different thread during execution attempts to import it this causes either a deadlock or a crash. + +To avoid this, it's recommended to `import numpy.core.multiarray` before starting up threads. + +## `DESCRIBE` and `SUMMARIZE` Return Empty Tables in Jupyter + +The `DESCRIBE` and `SUMMARIZE` statements return an empty table: + +```python +%sql +CREATE OR REPLACE TABLE tbl AS (SELECT 42 AS x); +DESCRIBE tbl; +``` + +To work around this, wrap them into a subquery: + +```python +%sql +CREATE OR REPLACE TABLE tbl AS (SELECT 42 AS x); +FROM (DESCRIBE tbl); +``` + +## Protobuf Error for JupySQL in IPython + +Loading the JupySQL extension in IPython fails: + +```python +In [1]: %load_ext sql +``` + +```console +ImportError: cannot import name 'builder' from 'google.protobuf.internal' (unknown location) +``` + +The solution is to fix the `protobuf` package. 
This may require uninstalling conflicting packages, e.g.: + +```python +%pip uninstall tensorflow +%pip install protobuf +``` + +## Running `EXPLAIN` Renders Newlines + +In Python, the output of the [`EXPLAIN` statement]({% link docs/archive/1.0/guides/meta/explain.md %}) contains hard line breaks (`\n`): + +```python +In [1]: import duckdb + ...: duckdb.sql("EXPLAIN SELECT 42 AS x") +``` + +```text +Out[1]: +┌───────────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ +│ explain_key │ explain_value │ +│ varchar │ varchar │ +├───────────────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ +│ physical_plan │ ┌───────────────────────────┐\n│ PROJECTION │\n│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │\n│ x … │ +└───────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ +``` + +To work around this, `print` the output of the `explain()` function: + +```python +In [2]: print(duckdb.sql("SELECT 42 AS x").explain()) +``` + +```text +Out[2]: +┌───────────────────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ x │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ DUMMY_SCAN │ +└───────────────────────────┘ +``` + +Please also check out the [Jupyter guide]({% link docs/archive/1.0/guides/python/jupyter.md %}) for tips on using Jupyter with JupySQL. + +## Error When Importing the DuckDB Python Package on Windows + +When importing DuckDB on Windows, the Python runtime may return the following error: + +```python +import duckdb +``` + +```console +ImportError: DLL load failed while importing duckdb: The specified module could not be found. +``` + +The solution is to install the [Microsoft Visual C++ Redistributable package](https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist). \ No newline at end of file diff --git a/docs/archive/1.0/api/python/overview.md b/docs/archive/1.0/api/python/overview.md new file mode 100644 index 00000000000..fc68ea21e92 --- /dev/null +++ b/docs/archive/1.0/api/python/overview.md @@ -0,0 +1,227 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/api/python +- /docs/archive/1.0/api/python/ +title: Python API +--- + +## Installation + +The DuckDB Python API can be installed using [pip](https://pip.pypa.io): `pip install duckdb`. Please see the [installation page]({% link docs/archive/1.0/installation/index.html %}?environment=python) for details. It is also possible to install DuckDB using [conda](https://docs.conda.io): `conda install python-duckdb -c conda-forge`. + +**Python version:** +DuckDB requires Python 3.7 or newer. + +## Basic API Usage + +The most straight-forward manner of running SQL queries using DuckDB is using the `duckdb.sql` command. + +```python +import duckdb + +duckdb.sql("SELECT 42").show() +``` + +This will run queries using an **in-memory database** that is stored globally inside the Python module. The result of the query is returned as a **Relation**. A relation is a symbolic representation of the query. The query is not executed until the result is fetched or requested to be printed to the screen. + +Relations can be referenced in subsequent queries by storing them inside variables, and using them as tables. This way queries can be constructed incrementally. 
+ +```python +import duckdb + +r1 = duckdb.sql("SELECT 42 AS i") +duckdb.sql("SELECT i * 2 AS k FROM r1").show() +``` + +## Data Input + +DuckDB can ingest data from a wide variety of formats – both on-disk and in-memory. See the [data ingestion page]({% link docs/archive/1.0/api/python/data_ingestion.md %}) for more information. + +```python +import duckdb + +duckdb.read_csv("example.csv") # read a CSV file into a Relation +duckdb.read_parquet("example.parquet") # read a Parquet file into a Relation +duckdb.read_json("example.json") # read a JSON file into a Relation + +duckdb.sql("SELECT * FROM 'example.csv'") # directly query a CSV file +duckdb.sql("SELECT * FROM 'example.parquet'") # directly query a Parquet file +duckdb.sql("SELECT * FROM 'example.json'") # directly query a JSON file +``` + +### DataFrames + +DuckDB can directly query Pandas DataFrames, Polars DataFrames and Arrow tables. +Note that these are read-only, i.e., editing these tables via [`INSERT`]({% link docs/archive/1.0/sql/statements/insert.md %}) or [`UPDATE` statements]({% link docs/archive/1.0/sql/statements/update.md %}) is not possible. + +#### Pandas + +To directly query a Pandas DataFrame, run: + +```python +import duckdb +import pandas as pd + +pandas_df = pd.DataFrame({"a": [42]}) +duckdb.sql("SELECT * FROM pandas_df") +``` + +```text +┌───────┐ +│ a │ +│ int64 │ +├───────┤ +│ 42 │ +└───────┘ +``` + +#### Polars + +To directly query a Polars DataFrame, run: + +```python +import duckdb +import polars as pl + +polars_df = pl.DataFrame({"a": [42]}) +duckdb.sql("SELECT * FROM polars_df") +``` + +```text +┌───────┐ +│ a │ +│ int64 │ +├───────┤ +│ 42 │ +└───────┘ +``` + +#### PyArrow + +To directly query a PyArrow table, run: + +```python +import duckdb +import pyarrow as pa + +arrow_table = pa.Table.from_pydict({"a": [42]}) +duckdb.sql("SELECT * FROM arrow_table") +``` + +```text +┌───────┐ +│ a │ +│ int64 │ +├───────┤ +│ 42 │ +└───────┘ +``` + +## Result Conversion + +DuckDB supports converting query results efficiently to a variety of formats. See the [result conversion page]({% link docs/archive/1.0/api/python/conversion.md %}) for more information. + +```python +import duckdb + +duckdb.sql("SELECT 42").fetchall() # Python objects +duckdb.sql("SELECT 42").df() # Pandas DataFrame +duckdb.sql("SELECT 42").pl() # Polars DataFrame +duckdb.sql("SELECT 42").arrow() # Arrow Table +duckdb.sql("SELECT 42").fetchnumpy() # NumPy Arrays +``` + +## Writing Data to Disk + +DuckDB supports writing Relation objects directly to disk in a variety of formats. The [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}) can be used to write data to disk using SQL as an alternative. + +```python +import duckdb + +duckdb.sql("SELECT 42").write_parquet("out.parquet") # Write to a Parquet file +duckdb.sql("SELECT 42").write_csv("out.csv") # Write to a CSV file +duckdb.sql("COPY (SELECT 42) TO 'out.parquet'") # Copy to a Parquet file +``` + +## Connection Options + +Applications can open a new DuckDB connection via the `duckdb.connect()` method. + +### Using an In-Memory Database + +When using DuckDB through `duckdb.sql()`, it operates on an **in-memory** database, i.e., no tables are persisted on disk. +Invoking the `duckdb.connect()` method without arguments returns a connection, which also uses an in-memory database: + +```python +import duckdb + +con = duckdb.connect() +con.sql("SELECT 42 AS x").show() +``` + +### Persistent Storage + +The `duckdb.connect(dbname)` creates a connection to a **persistent** database. 
+Any data written to that connection will be persisted, and can be reloaded by reconnecting to the same file, both from Python and from other DuckDB clients. + +```python +import duckdb + +# create a connection to a file called 'file.db' +con = duckdb.connect("file.db") +# create a table and load data into it +con.sql("CREATE TABLE test (i INTEGER)") +con.sql("INSERT INTO test VALUES (42)") +# query the table +con.table("test").show() +# explicitly close the connection +con.close() +# Note: connections also closed implicitly when they go out of scope +``` + +You can also use a context manager to ensure that the connection is closed: + +```python +import duckdb + +with duckdb.connect("file.db") as con: + con.sql("CREATE TABLE test (i INTEGER)") + con.sql("INSERT INTO test VALUES (42)") + con.table("test").show() + # the context manager closes the connection automatically +``` + +### Configuration + +The `duckdb.connect()` accepts a `config` dictionary, where [configuration options]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference) can be specified. For example: + +```python +import duckdb + +con = duckdb.connect(config = {'threads': 1}) +``` + +### Connection Object and Module + +The connection object and the `duckdb` module can be used interchangeably – they support the same methods. The only difference is that when using the `duckdb` module a global in-memory database is used. + +> If you are developing a package designed for others to use, and use DuckDB in the package, it is recommend that you create connection objects instead of using the methods on the `duckdb` module. That is because the `duckdb` module uses a shared global database – which can cause hard to debug issues if used from within multiple different packages. + +### Using Connections in Parallel Python Programs + +The `DuckDBPyConnection` object is not thread-safe. If you would like to write to the same database from multiple threads, create a cursor for each thread with the [`DuckDBPyConnection.cursor()` method]({% link docs/archive/1.0/api/python/reference/index.md %}#duckdb.DuckDBPyConnection.cursor). + +## Loading and Installing Extensions + +DuckDB's Python API provides functions for installing and loading [extensions]({% link docs/archive/1.0/extensions/overview.md %}), which perform the equivalent operations to running the `INSTALL` and `LOAD` SQL commands, respectively. An example that installs and loads the [`spatial` extension]({% link docs/archive/1.0/extensions/spatial.md %}) looks like follows: + +```python +import duckdb + +con = duckdb.connect() +con.install_extension("spatial") +con.load_extension("spatial") +``` + +To load [unsigned extensions]({% link docs/archive/1.0/extensions/overview.md %}#unsigned-extensions), use the `config = {"allow_unsigned_extensions": "true"}` argument to the `duckdb.connect()` method. \ No newline at end of file diff --git a/docs/archive/1.0/api/python/reference/.gitignore b/docs/archive/1.0/api/python/reference/.gitignore new file mode 100644 index 00000000000..8d12109205f --- /dev/null +++ b/docs/archive/1.0/api/python/reference/.gitignore @@ -0,0 +1,7 @@ +environment.pickle +searchindex.js +objects.inv +.buildinfo +*.html +_sources +index.doctree diff --git a/docs/archive/1.0/api/python/reference/index.md b/docs/archive/1.0/api/python/reference/index.md new file mode 100644 index 00000000000..f3a1a01161c --- /dev/null +++ b/docs/archive/1.0/api/python/reference/index.md @@ -0,0 +1,3206 @@ +--- +layout: docu +title: Python Client API +--- + +
+
+
+ +
+
+duckdb.threadsafety bool +
+
+

Indicates that this package is threadsafe

+
+
+ +
+
+duckdb.apilevel int +
+
+

Indicates which Python DBAPI version this package implements

+
+
+ +
+
+duckdb.paramstyle str +
+
+

Indicates which parameter style duckdb supports

+
+
+ +
+
+duckdb.default_connection duckdb.DuckDBPyConnection +
+
+

The connection that is used by default if you don’t explicitly pass one to the root methods in this module

+
+
+ +
+
+class duckdb.BinaryValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.BinderException +
+
+

Bases: ProgrammingError

+
+
+ +
+
+class duckdb.BitValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.BlobValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.BooleanValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+duckdb.CaseExpression(condition: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+
+ +
+
+exception duckdb.CatalogException +
+
+

Bases: ProgrammingError

+
+
+ +
+
+duckdb.CoalesceOperator(*args) duckdb.duckdb.Expression +
+
+
+ +
+
+duckdb.ColumnExpression(name: str) duckdb.duckdb.Expression +
+
+

Create a column reference from the provided column name

+
+
+ +
+
+exception duckdb.ConnectionException +
+
+

Bases: OperationalError

+
+
+ +
+
+duckdb.ConstantExpression(value: object) duckdb.duckdb.Expression +
+
+

Create a constant expression from the provided value

+
+
+ +
+
+exception duckdb.ConstraintException +
+
+

Bases: IntegrityError

+
+
+ +
+
+exception duckdb.ConversionException +
+
+

Bases: DataError

+
+
+ +
+
+exception duckdb.DataError +
+
+

Bases: DatabaseError

+
+
+ +
+
+class duckdb.DateValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.DecimalValue(object: Any, width: int, scale: int) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.DoubleValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.DuckDBPyConnection +
+
+

Bases: pybind11_object

+
+
+append(self: duckdb.duckdb.DuckDBPyConnection, table_name: str, df: pandas.DataFrame, *, by_name: bool = False) duckdb.duckdb.DuckDBPyConnection +
+
+

Append the passed DataFrame to the named table

+
+
+ +
+
+array_type(self: duckdb.duckdb.DuckDBPyConnection, type: duckdb.duckdb.typing.DuckDBPyType, size: int) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create an array type object of ‘type’

+
+
+ +
+
+arrow(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.Table +
+
+

Fetch a result as Arrow table following execute()

+
+
+ +
+
+begin(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection +
+
+

Start a new transaction

+
+
+ +
+
+checkpoint(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection +
+
+

Synchronizes data in the write-ahead log (WAL) to the database data file (no-op for in-memory connections)

+
+
+ +
+
+close(self: duckdb.duckdb.DuckDBPyConnection) None +
+
+

Close the connection

+
+
+ +
+
+commit(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection +
+
+

Commit changes performed within a transaction

+
+
+ +
+
+create_function(self: duckdb.duckdb.DuckDBPyConnection, name: str, function: Callable, parameters: object = None, return_type: duckdb.duckdb.typing.DuckDBPyType = None, *, type: duckdb.duckdb.functional.PythonUDFType = <PythonUDFType.NATIVE: 0>, null_handling: duckdb.duckdb.functional.FunctionNullHandling = <FunctionNullHandling.DEFAULT: 0>, exception_handling: duckdb.duckdb.PythonExceptionHandling = <PythonExceptionHandling.DEFAULT: 0>, side_effects: bool = False) duckdb.duckdb.DuckDBPyConnection +
+
+

Create a DuckDB function from the passed-in Python function so that it can be used in queries

+
+
+ +
+
+cursor(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection +
+
+

Create a duplicate of the current connection

+
+
+ +
+
+decimal_type(self: duckdb.duckdb.DuckDBPyConnection, width: int, scale: int) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a decimal type with ‘width’ and ‘scale’

+
+
+ +
+
+property description +
+
+

Get result set attributes, mainly column names

+
+
+ +
+
+df(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Fetch a result as DataFrame following execute()

+
+
+ +
+
+dtype(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a type object by parsing the ‘type_str’ string

+
+
+ +
+
+duplicate(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection +
+
+

Create a duplicate of the current connection

+
+
+ +
+
+enum_type(self: duckdb.duckdb.DuckDBPyConnection, name: str, type: duckdb.duckdb.typing.DuckDBPyType, values: list) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create an enum type of underlying ‘type’, consisting of the list of ‘values’

+
+
+ +
+
+execute(self: duckdb.duckdb.DuckDBPyConnection, query: object, parameters: object = None, multiple_parameter_sets: bool = False) duckdb.duckdb.DuckDBPyConnection +
+
+

Execute the given SQL query, optionally using prepared statements with parameters set

+
+
+ +
+
+executemany(self: duckdb.duckdb.DuckDBPyConnection, query: object, parameters: object = None) duckdb.duckdb.DuckDBPyConnection +
+
+

Execute the given prepared statement multiple times using the list of parameter sets in parameters

+
+
+ +
+
+extract_statements(self: duckdb.duckdb.DuckDBPyConnection, query: str) list +
+
+

Parse the query string and extract the Statement object(s) produced

+
+
+ +
+
+fetch_arrow_table(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.Table +
+
+

Fetch a result as Arrow table following execute()

+
+
+ +
+
+fetch_df(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Fetch a result as DataFrame following execute()

+
+
+ +
+
+fetch_df_chunk(self: duckdb.duckdb.DuckDBPyConnection, vectors_per_chunk: int = 1, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Fetch a chunk of the result as DataFrame following execute()

+
+
+ +
+
+fetch_record_batch(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) pyarrow.lib.RecordBatchReader +
+
+

Fetch an Arrow RecordBatchReader following execute()

+
+
+ +
+
+fetchall(self: duckdb.duckdb.DuckDBPyConnection) list +
+
+

Fetch all rows from a result following execute

+
+
+ +
+
+fetchdf(self: duckdb.duckdb.DuckDBPyConnection, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Fetch a result as DataFrame following execute()

+
+
+ +
+
+fetchmany(self: duckdb.duckdb.DuckDBPyConnection, size: int = 1) list +
+
+

Fetch the next set of rows from a result following execute

+
+
+ +
+
+fetchnumpy(self: duckdb.duckdb.DuckDBPyConnection) dict +
+
+

Fetch a result as list of NumPy arrays following execute

+
+
+ +
+
+fetchone(self: duckdb.duckdb.DuckDBPyConnection) Optional[tuple] +
+
+

Fetch a single row from a result following execute

+
+
+ +
+
+filesystem_is_registered(self: duckdb.duckdb.DuckDBPyConnection, name: str) bool +
+
+

Check if a filesystem with the provided name is currently registered

+
+
+ +
+
+from_arrow(self: duckdb.duckdb.DuckDBPyConnection, arrow_object: object) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object from an Arrow object

+
+
+ +
+
+from_csv_auto(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, *, header: object = None, compression: object = None, sep: object = None, delimiter: object = None, dtype: object = None, na_values: object = None, skiprows: object = None, quotechar: object = None, escapechar: object = None, encoding: object = None, parallel: object = None, date_format: object = None, timestamp_format: object = None, sample_size: object = None, all_varchar: object = None, normalize_names: object = None, filename: object = None, null_padding: object = None, names: object = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object from the CSV file in ‘name’

+
+
+ +
+
+from_df(self: duckdb.duckdb.DuckDBPyConnection, df: pandas.DataFrame) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object from the DataFrame in df

+
+
+ +
+
+from_parquet(*args, **kwargs) +
+
+

Overloaded function.

+
    +
  1. from_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

  2. +
+

Create a relation object from the Parquet files in file_glob

+
    +
  1. from_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

  2. +
+

Create a relation object from the Parquet files in file_globs

+
+
+ +
+
+from_query(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

+
+
+ +
+
+from_substrait(self: duckdb.duckdb.DuckDBPyConnection, proto: bytes) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a query object from protobuf plan

+
+
+ +
+
+from_substrait_json(self: duckdb.duckdb.DuckDBPyConnection, json: str) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a query object from a JSON protobuf plan

+
+
+ +
+
+get_substrait(self: duckdb.duckdb.DuckDBPyConnection, query: str, *, enable_optimizer: bool = True) duckdb.duckdb.DuckDBPyRelation +
+
+

Serialize a query to protobuf

+
+
+ +
+
+get_substrait_json(self: duckdb.duckdb.DuckDBPyConnection, query: str, *, enable_optimizer: bool = True) duckdb.duckdb.DuckDBPyRelation +
+
+

Serialize a query to protobuf on the JSON format

+
+
+ +
+
+get_table_names(self: duckdb.duckdb.DuckDBPyConnection, query: str) set[str] +
+
+

Extract the required table names from a query

+
+
+ +
+
+install_extension(self: duckdb.duckdb.DuckDBPyConnection, extension: str, *, force_install: bool = False) None +
+
+

Install an extension by name

+
+
+ +
+
+interrupt(self: duckdb.duckdb.DuckDBPyConnection) None +
+
+

Interrupt pending operations

+
+
+ +
+
+list_filesystems(self: duckdb.duckdb.DuckDBPyConnection) list +
+
+

List registered filesystems, including builtin ones

+
+
+ +
+
+list_type(self: duckdb.duckdb.DuckDBPyConnection, type: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a list type object of ‘type’

+
+
+ +
+
+load_extension(self: duckdb.duckdb.DuckDBPyConnection, extension: str) None +
+
+

Load an installed extension

+
+
+ +
+
+map_type(self: duckdb.duckdb.DuckDBPyConnection, key: duckdb.duckdb.typing.DuckDBPyType, value: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a map type object from ‘key_type’ and ‘value_type’

+
+
+ +
+
+pl(self: duckdb.duckdb.DuckDBPyConnection, rows_per_batch: int = 1000000) duckdb::PolarsDataFrame +
+
+

Fetch a result as Polars DataFrame following execute()

+
+
+ +
+
+query(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

+
+
+ +
+
+read_csv(self: duckdb.duckdb.DuckDBPyConnection, path_or_buffer: object, *, header: object = None, compression: object = None, sep: object = None, delimiter: object = None, dtype: object = None, na_values: object = None, skiprows: object = None, quotechar: object = None, escapechar: object = None, encoding: object = None, parallel: object = None, date_format: object = None, timestamp_format: object = None, sample_size: object = None, all_varchar: object = None, normalize_names: object = None, filename: object = None, null_padding: object = None, names: object = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object from the CSV file in ‘name’

+
+
+ +
+
+read_json(self: duckdb.duckdb.DuckDBPyConnection, name: str, *, columns: Optional[object] = None, sample_size: Optional[object] = None, maximum_depth: Optional[object] = None, records: Optional[str] = None, format: Optional[str] = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object from the JSON file in ‘name’

+
+
+ +
+
+read_parquet(*args, **kwargs) +
+
+

Overloaded function.

+
    +
  1. read_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_glob: str, binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

  2. +
+

Create a relation object from the Parquet files in file_glob

+
    +
  1. read_parquet(self: duckdb.duckdb.DuckDBPyConnection, file_globs: list[str], binary_as_string: bool = False, *, file_row_number: bool = False, filename: bool = False, hive_partitioning: bool = False, union_by_name: bool = False, compression: object = None) -> duckdb.duckdb.DuckDBPyRelation

  2. +
+

Create a relation object from the Parquet files in file_globs

+
+
+ +
+
+register(self: duckdb.duckdb.DuckDBPyConnection, view_name: str, python_object: object) duckdb.duckdb.DuckDBPyConnection +
+
+

Register the passed Python Object value for querying with a view

+
+
+ +
+
+register_filesystem(self: duckdb.duckdb.DuckDBPyConnection, filesystem: fsspec.AbstractFileSystem) None +
+
+

Register a fsspec compliant filesystem

+
+
+ +
+
+remove_function(self: duckdb.duckdb.DuckDBPyConnection, name: str) duckdb.duckdb.DuckDBPyConnection +
+
+

Remove a previously created function

+
+
+ +
+
+rollback(self: duckdb.duckdb.DuckDBPyConnection) duckdb.duckdb.DuckDBPyConnection +
+
+

Roll back changes performed within a transaction

+
+
+ +
+
+row_type(self: duckdb.duckdb.DuckDBPyConnection, fields: object) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a struct type object from ‘fields’

+
+
+ +
+
+property rowcount +
+
+

Get result set row count

+
+
+ +
+
+sql(self: duckdb.duckdb.DuckDBPyConnection, query: object, *, alias: str = '', params: object = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Run a SQL query. If it is a SELECT statement, create a relation object from the given SQL query, otherwise run the query as-is.

+
+
+ +
+
+sqltype(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a type object by parsing the ‘type_str’ string

+
+
+ +
+
+string_type(self: duckdb.duckdb.DuckDBPyConnection, collation: str = '') duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a string type with an optional collation

+
+
+ +
+
+struct_type(self: duckdb.duckdb.DuckDBPyConnection, fields: object) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a struct type object from ‘fields’

+
+
+ +
+
+table(self: duckdb.duckdb.DuckDBPyConnection, table_name: str) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object for the named table

+
+
+ +
+
+table_function(self: duckdb.duckdb.DuckDBPyConnection, name: str, parameters: object = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object from the named table function with given parameters

+
+
+ +
+
+tf(self: duckdb.duckdb.DuckDBPyConnection) dict +
+
+

Fetch a result as dict of TensorFlow Tensors following execute()

+
+
+ +
+
+torch(self: duckdb.duckdb.DuckDBPyConnection) dict +
+
+

Fetch a result as dict of PyTorch Tensors following execute()

+
+
+ +
+
+type(self: duckdb.duckdb.DuckDBPyConnection, type_str: str) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a type object by parsing the ‘type_str’ string

+
+
+ +
+
+union_type(self: duckdb.duckdb.DuckDBPyConnection, members: object) duckdb.duckdb.typing.DuckDBPyType +
+
+

Create a union type object from ‘members’

+
+
+ +
+
+unregister(self: duckdb.duckdb.DuckDBPyConnection, view_name: str) duckdb.duckdb.DuckDBPyConnection +
+
+

Unregister the view name

+
+
+ +
+
+unregister_filesystem(self: duckdb.duckdb.DuckDBPyConnection, name: str) None +
+
+

Unregister a filesystem

+
+
+ +
+
+values(self: duckdb.duckdb.DuckDBPyConnection, values: object) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object from the passed values

+
+
+ +
+
+view(self: duckdb.duckdb.DuckDBPyConnection, view_name: str) duckdb.duckdb.DuckDBPyRelation +
+
+

Create a relation object for the named view

+
+
+ +
+
+ +
+
+class duckdb.DuckDBPyRelation +
+
+

Bases: pybind11_object

+
+
+aggregate(self: duckdb.duckdb.DuckDBPyRelation, aggr_expr: str, group_expr: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Compute the aggregate aggr_expr by the optional groups group_expr on the relation

+
+
+ +
+
+property alias +
+
+

Get the name of the current alias

+
+
+ +
+
+any_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Returns the first non-null value from a given column

+
+
+ +
+
+apply(self: duckdb.duckdb.DuckDBPyRelation, function_name: str, function_aggr: str, group_expr: str = '', function_parameter: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Compute the function of a single column or a list of columns by the optional groups on the relation

+
+
+ +
+
+arg_max(self: duckdb.duckdb.DuckDBPyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Finds the row with the maximum value for a value column and returns the value of that row for an argument column

+
+
+ +
+
+arg_min(self: duckdb.duckdb.DuckDBPyRelation, arg_column: str, value_column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Finds the row with the minimum value for a value column and returns the value of that row for an argument column

+
+
+ +
+
+arrow(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table +
+
+

Execute and fetch all rows as an Arrow Table

+
+
+ +
+
+avg(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the average on a given column

+
+
+ +
+
+bit_and(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the bitwise AND of all bits present in a given column

+
+
+ +
+
+bit_or(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the bitwise OR of all bits present in a given column

+
+
+ +
+
+bit_xor(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the bitwise XOR of all bits present in a given column

+
+
+ +
+
+bitstring_agg(self: duckdb.duckdb.DuckDBPyRelation, column: str, min: Optional[object] = None, max: Optional[object] = None, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes a bitstring with bits set for each distinct value in a given column

+
+
+ +
+
+bool_and(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the logical AND of all values present in a given column

+
+
+ +
+
+bool_or(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the logical OR of all values present in a given column

+
+
+ +
+
+close(self: duckdb.duckdb.DuckDBPyRelation) None +
+
+

Closes the result

+
+
+ +
+
+property columns +
+
+

Return a list containing the names of the columns of the relation.

+
+
+ +
+
+count(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the number of elements present in a given column

+
+
+ +
+
+create(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None +
+
+

Creates a new table named table_name with the contents of the relation object

+
+
+ +
+
+create_view(self: duckdb.duckdb.DuckDBPyRelation, view_name: str, replace: bool = True) duckdb.duckdb.DuckDBPyRelation +
+
+

Creates a view named view_name that refers to the relation object

+
+
+ +
+
+cume_dist(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the cumulative distribution within the partition

+
+
+ +
+
+dense_rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the dense rank within the partition

+
+
+ +
+
+describe(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation +
+
+

Gives basic statistics (e.g., min, max) and whether NULL values exist for each column of the relation.

+
+
+ +
+
+property description +
+
+

Return the description of the result

+
+
+ +
+
+df(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Execute and fetch all rows as a pandas DataFrame

+
+
+ +
+
+distinct(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation +
+
+

Retrieve distinct rows from this relation object

+
+
+ +
+
+property dtypes +
+
+

Return a list containing the types of the columns of the relation.

+
+
+ +
+
+except_(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation +
+
+

Create the set except of this relation object with another relation object in other_rel

+
+
+ +
+
+execute(self: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation +
+
+

Transform the relation into a result set

+
+
+ +
+
+explain(self: duckdb.duckdb.DuckDBPyRelation, type: duckdb.duckdb.ExplainType = 'standard') str +
+
+
+ +
+
+favg(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the average of all values present in a given column using a more accurate floating point summation (Kahan Sum)

+
+
+ +
+
+fetch_arrow_reader(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.RecordBatchReader +
+
+

Execute and return an Arrow Record Batch Reader that yields all rows

+
+
+ +
+
+fetch_arrow_table(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table +
+
+

Execute and fetch all rows as an Arrow Table

+
+
+ +
+
+fetch_df_chunk(self: duckdb.duckdb.DuckDBPyRelation, vectors_per_chunk: int = 1, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Execute and fetch a chunk of the rows

+
+
+ +
+
+fetchall(self: duckdb.duckdb.DuckDBPyRelation) list +
+
+

Execute and fetch all rows as a list of tuples

+
+
+ +
+
+fetchdf(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Execute and fetch all rows as a pandas DataFrame

+
+
+ +
+
+fetchmany(self: duckdb.duckdb.DuckDBPyRelation, size: int = 1) list +
+
+

Execute and fetch the next set of rows as a list of tuples

+
+
+ +
+
+fetchnumpy(self: duckdb.duckdb.DuckDBPyRelation) dict +
+
+

Execute and fetch all rows as a Python dict mapping each column name to a NumPy array

+
+
+ +
+
+fetchone(self: duckdb.duckdb.DuckDBPyRelation) Optional[tuple] +
+
+

Execute and fetch a single row as a tuple

+
+
+ +
+
+filter(self: duckdb.duckdb.DuckDBPyRelation, filter_expr: object) duckdb.duckdb.DuckDBPyRelation +
+
+

Filter the relation object by the filter in filter_expr

+
+
+ +
+
+first(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Returns the first value of a given column

+
+
+ +
+
+first_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the first value within the group or partition

+
+
+ +
+
+fsum(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sum of all values present in a given column using a more accurate floating point summation (Kahan Sum)

+
+
+ +
+
+geomean(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the geometric mean over all values present in a given column

+
+
+ +
+
+histogram(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the histogram over all values present in a given column

+
+
+ +
+
+insert(self: duckdb.duckdb.DuckDBPyRelation, values: object) None +
+
+

Inserts the given values into the relation

+
+
+ +
+
+insert_into(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None +
+
+

Inserts the relation object into an existing table named table_name

+
+
+ +
+
+intersect(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation +
+
+

Create the set intersection of this relation object with another relation object in other_rel

+
+
+ +
+
+join(self: duckdb.duckdb.DuckDBPyRelation, other_rel: duckdb.duckdb.DuckDBPyRelation, condition: object, how: str = 'inner') duckdb.duckdb.DuckDBPyRelation +
+
+

Join the relation object with another relation object in other_rel using the join condition expression in join_condition. Types supported are ‘inner’ and ‘left’

+
+
+ +
+
+lag(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the lag within the partition

+
+
+ +
+
+last(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Returns the last value of a given column

+
+
+ +
+
+last_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the last value within the group or partition

+
+
+ +
+
+lead(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int = 1, default_value: str = 'NULL', ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the lead within the partition

+
+
+ +
+
+limit(self: duckdb.duckdb.DuckDBPyRelation, n: int, offset: int = 0) duckdb.duckdb.DuckDBPyRelation +
+
+

Only retrieve the first n rows from this relation object, starting at offset

+
+
+ +
+
+list(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Returns a list containing all values present in a given column

+
+
+ +
+
+map(self: duckdb.duckdb.DuckDBPyRelation, map_function: Callable, *, schema: Optional[object] = None) duckdb.duckdb.DuckDBPyRelation +
+
+

Calls the passed function on the relation

+
+
+ +
+
+max(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Returns the maximum value present in a given column

+
+
+ +
+
+mean(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the average on a given column

+
+
+ +
+
+median(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the median over all values present in a given column

+
+
+ +
+
+min(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Returns the minimum value present in a given column

+
+
+ +
+
+mode(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the mode over all values present in a given column

+
+
+ +
+
+n_tile(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, num_buckets: int, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Divides the partition as equally as possible into num_buckets

+
+
+ +
+
+nth_value(self: duckdb.duckdb.DuckDBPyRelation, column: str, window_spec: str, offset: int, ignore_nulls: bool = False, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the nth value within the partition

+
+
+ +
+
+order(self: duckdb.duckdb.DuckDBPyRelation, order_expr: str) duckdb.duckdb.DuckDBPyRelation +
+
+

Reorder the relation object by order_expr

+
+
+ +
+
+percent_rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the relative rank within the partition

+
+
+ +
+
+pl(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) duckdb::PolarsDataFrame +
+
+

Execute and fetch all rows as a Polars DataFrame

+
+
+ +
+
+product(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Returns the product of all values present in a given column

+
+
+ +
+
+project(self: duckdb.duckdb.DuckDBPyRelation, *args, **kwargs) duckdb.duckdb.DuckDBPyRelation +
+
+

Project the relation object by the projection in project_expr

+
+
+ +
+
+quantile(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the exact quantile value for a given column

+
+
+ +
+
+quantile_cont(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the interpolated quantile value for a given column

+
+
+ +
+
+quantile_disc(self: duckdb.duckdb.DuckDBPyRelation, column: str, q: object = 0.5, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the exact quantile value for a given column

+
+
+ +
+
+query(self: duckdb.duckdb.DuckDBPyRelation, virtual_table_name: str, sql_query: str) duckdb.duckdb.DuckDBPyRelation +
+
+

Run the given SQL query in sql_query on the view named virtual_table_name that refers to the relation object

+
+
+ +
+
+rank(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the rank within the partition

+
+
+ +
+
+rank_dense(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the dense rank within the partition

+
+
+ +
+
+record_batch(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.RecordBatchReader +
+
+

Execute and return an Arrow Record Batch Reader that yields all rows

+
+
+ +
+
+row_number(self: duckdb.duckdb.DuckDBPyRelation, window_spec: str, projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the row number within the partition

+
+
+ +
+
+select(self: duckdb.duckdb.DuckDBPyRelation, *args, **kwargs) duckdb.duckdb.DuckDBPyRelation +
+
+

Project the relation object by the projection in project_expr

+
+
+ +
+
+select_dtypes(self: duckdb.duckdb.DuckDBPyRelation, types: object) duckdb.duckdb.DuckDBPyRelation +
+
+

Select columns from the relation, by filtering based on type(s)

+
+
+ +
+
+select_types(self: duckdb.duckdb.DuckDBPyRelation, types: object) duckdb.duckdb.DuckDBPyRelation +
+
+

Select columns from the relation, by filtering based on type(s)

+
+
+ +
+
+set_alias(self: duckdb.duckdb.DuckDBPyRelation, alias: str) duckdb.duckdb.DuckDBPyRelation +
+
+

Rename the relation object to new alias

+
+
+ +
+
+property shape +
+
+

Tuple of # of rows, # of columns in relation.

+
+
+ +
+
+show(self: duckdb.duckdb.DuckDBPyRelation, *, max_width: Optional[int] = None, max_rows: Optional[int] = None, max_col_width: Optional[int] = None, null_value: Optional[str] = None, render_mode: object = None) None +
+
+

Display a summary of the data

+
+
+ +
+
+sort(self: duckdb.duckdb.DuckDBPyRelation, *args) duckdb.duckdb.DuckDBPyRelation +
+
+

Reorder the relation object by the provided expressions

+
+
+ +
+
+sql_query(self: duckdb.duckdb.DuckDBPyRelation) str +
+
+

Get the SQL query that is equivalent to the relation

+
+
+ +
+
+std(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sample standard deviation for a given column

+
+
+ +
+
+stddev(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sample standard deviation for a given column

+
+
+ +
+
+stddev_pop(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the population standard deviation for a given column

+
+
+ +
+
+stddev_samp(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sample standard deviation for a given column

+
+
+ +
+
+string_agg(self: duckdb.duckdb.DuckDBPyRelation, column: str, sep: str = ',', groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Concatenates the values present in a given column with a separator

+
+
+ +
+
+sum(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sum of all values present in a given column

+
+
+ +
+
+tf(self: duckdb.duckdb.DuckDBPyRelation) dict +
+
+

Fetch a result as dict of TensorFlow Tensors

+
+
+ +
+
+to_arrow_table(self: duckdb.duckdb.DuckDBPyRelation, batch_size: int = 1000000) pyarrow.lib.Table +
+
+

Execute and fetch all rows as an Arrow Table

+
+
+ +
+
+to_csv(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None) None +
+
+

Write the relation object to a CSV file in ‘file_name’

+
+
+ +
+
+to_df(self: duckdb.duckdb.DuckDBPyRelation, *, date_as_object: bool = False) pandas.DataFrame +
+
+

Execute and fetch all rows as a pandas DataFrame

+
+
+ +
+
+to_parquet(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None) None +
+
+

Write the relation object to a Parquet file in ‘file_name’

+
+
+ +
+
+to_table(self: duckdb.duckdb.DuckDBPyRelation, table_name: str) None +
+
+

Creates a new table named table_name with the contents of the relation object

+
+
+ +
+
+to_view(self: duckdb.duckdb.DuckDBPyRelation, view_name: str, replace: bool = True) duckdb.duckdb.DuckDBPyRelation +
+
+

Creates a view named view_name that refers to the relation object

+
+
+ +
+
+torch(self: duckdb.duckdb.DuckDBPyRelation) dict +
+
+

Fetch a result as dict of PyTorch Tensors

+
+
+ +
+
+property type +
+
+

Get the type of the relation.

+
+
+ +
+
+property types +
+
+

Return a list containing the types of the columns of the relation.

+
+
+ +
+
+union(self: duckdb.duckdb.DuckDBPyRelation, union_rel: duckdb.duckdb.DuckDBPyRelation) duckdb.duckdb.DuckDBPyRelation +
+
+

Create the set union of this relation object with another relation object in other_rel

+
+
+ +
+
+unique(self: duckdb.duckdb.DuckDBPyRelation, unique_aggr: str) duckdb.duckdb.DuckDBPyRelation +
+
+

Number of distinct values in a column.

+
+
+ +
+
+value_counts(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the number of elements present in a given column, also projecting the original column

+
+
+ +
+
+var(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sample variance for a given column

+
+
+ +
+
+var_pop(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the population variance for a given column

+
+
+ +
+
+var_samp(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sample variance for a given column

+
+
+ +
+
+variance(self: duckdb.duckdb.DuckDBPyRelation, column: str, groups: str = '', window_spec: str = '', projected_columns: str = '') duckdb.duckdb.DuckDBPyRelation +
+
+

Computes the sample variance for a given column

+
+
+ +
+
+write_csv(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, sep: object = None, na_rep: object = None, header: object = None, quotechar: object = None, escapechar: object = None, date_format: object = None, timestamp_format: object = None, quoting: object = None, encoding: object = None, compression: object = None, overwrite: object = None, per_thread_output: object = None, use_tmp_file: object = None, partition_by: object = None) None +
+
+

Write the relation object to a CSV file in ‘file_name’

+
+
+ +
+
+write_parquet(self: duckdb.duckdb.DuckDBPyRelation, file_name: str, *, compression: object = None, field_ids: object = None, row_group_size_bytes: object = None, row_group_size: object = None) None +
+
+

Write the relation object to a Parquet file in ‘file_name’

+
+
+ +
+
+ +
+
+exception duckdb.Error +
+
+

Bases: Exception

+
+
+ +
+
+class duckdb.ExplainType +
+
+

Bases: pybind11_object

+

Members:

+

STANDARD

+

ANALYZE

+
+
+ANALYZE = <ExplainType.ANALYZE: 1> +
+
+
+ +
+
+STANDARD = <ExplainType.STANDARD: 0> +
+
+
+ +
+
+property name +
+
+
+ +
+
+property value +
+
+
+ +
+
+ +
+
+class duckdb.Expression +
+
+

Bases: pybind11_object

+
+
+alias(self: duckdb.duckdb.Expression, arg0: str) duckdb.duckdb.Expression +
+
+

Create a copy of this expression with the given alias.

+
+
Parameters:
+
+

name: The alias to use for the expression, this will affect how it can be referenced.

+
+
Returns:
+
+

Expression: self with an alias.

+
+
+
+
+ +
+
+asc(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Set the order by modifier to ASCENDING.

+
+
+ +
+
+cast(self: duckdb.duckdb.Expression, type: duckdb.duckdb.typing.DuckDBPyType) duckdb.duckdb.Expression +
+
+

Create a CastExpression to type from self

+
+
Parameters:
+
+

type: The type to cast to

+
+
Returns:
+
+

CastExpression: self::type

+
+
+
+
+ +
+
+desc(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Set the order by modifier to DESCENDING.

+
+
+ +
+
+isin(self: duckdb.duckdb.Expression, *args) duckdb.duckdb.Expression +
+
+

Return an IN expression comparing self to the input arguments.

+
+
Returns:
+
+

DuckDBPyExpression: The compare IN expression

+
+
+
+
+ +
+
+isnotin(self: duckdb.duckdb.Expression, *args) duckdb.duckdb.Expression +
+
+

Return a NOT IN expression comparing self to the input arguments.

+
+
Returns:
+
+

DuckDBPyExpression: The compare NOT IN expression

+
+
+
+
+ +
+
+isnotnull(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Create a binary IS NOT NULL expression from self

+
+
Returns:
+
+

DuckDBPyExpression: self IS NOT NULL

+
+
+
+
+ +
+
+isnull(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Create a binary IS NULL expression from self

+
+
Returns:
+
+

DuckDBPyExpression: self IS NULL

+
+
+
+
+ +
+
+nulls_first(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Set the NULL order by modifier to NULLS FIRST.

+
+
+ +
+
+nulls_last(self: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Set the NULL order by modifier to NULLS LAST.

+
+
+ +
+
+otherwise(self: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Add an ELSE <value> clause to the CaseExpression.

+
+
Parameters:
+
+

value: The value to use if none of the WHEN conditions are met.

+
+
Returns:
+
+

CaseExpression: self with an ELSE clause.

+
+
+
+
+ +
+
+show(self: duckdb.duckdb.Expression) None +
+
+

Print the stringified version of the expression.

+
+
+ +
+
+when(self: duckdb.duckdb.Expression, condition: duckdb.duckdb.Expression, value: duckdb.duckdb.Expression) duckdb.duckdb.Expression +
+
+

Add an additional WHEN <condition> THEN <value> clause to the CaseExpression.

+
+
Parameters:
+
+

condition: The condition that must be met.
value: The value to use if the condition is met.

+
+
Returns:
+
+

CaseExpression: self with an additional WHEN clause.

+
+
+
+
+ +
+
+ +
+
+exception duckdb.FatalException +
+
+

Bases: DatabaseError

+
+
+ +
+
+class duckdb.FloatValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+duckdb.FunctionExpression(function_name: str, *args) duckdb.duckdb.Expression +
+
+
+ +
+
+exception duckdb.HTTPException +
+
+

Bases: IOException

+

Thrown when an error occurs in the httpfs extension, or whilst downloading an extension.

+
+
+body: str +
+
+
+ +
+
+headers: Dict[str, str] +
+
+
+ +
+
+reason: str +
+
+
+ +
+
+status_code: int +
+
+
+ +
+
+ +
+
+class duckdb.HugeIntegerValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.IOException +
+
+

Bases: OperationalError

+
+
+ +
+
+class duckdb.IntegerValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.IntegrityError +
+
+

Bases: DatabaseError

+
+
+ +
+
+exception duckdb.InternalError +
+
+

Bases: DatabaseError

+
+
+ +
+
+exception duckdb.InternalException +
+
+

Bases: InternalError

+
+
+ +
+
+exception duckdb.InterruptException +
+
+

Bases: DatabaseError

+
+
+ +
+
+class duckdb.IntervalValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.InvalidInputException +
+
+

Bases: ProgrammingError

+
+
+ +
+
+exception duckdb.InvalidTypeException +
+
+

Bases: ProgrammingError

+
+
+ +
+
+class duckdb.LongValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.NotImplementedException +
+
+

Bases: NotSupportedError

+
+
+ +
+
+exception duckdb.NotSupportedError +
+
+

Bases: DatabaseError

+
+
+ +
+
+class duckdb.NullValue +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.OperationalError +
+
+

Bases: DatabaseError

+
+
+ +
+
+exception duckdb.OutOfMemoryException +
+
+

Bases: OperationalError

+
+
+ +
+
+exception duckdb.OutOfRangeException +
+
+

Bases: DataError

+
+
+ +
+
+exception duckdb.ParserException +
+
+

Bases: ProgrammingError

+
+
+ +
+
+exception duckdb.PermissionException +
+
+

Bases: DatabaseError

+
+
+ +
+
+exception duckdb.ProgrammingError +
+
+

Bases: DatabaseError

+
+
+ +
+
+class duckdb.PythonExceptionHandling +
+
+

Bases: pybind11_object

+

Members:

+

DEFAULT

+

RETURN_NULL

+
+
+DEFAULT = <PythonExceptionHandling.DEFAULT: 0> +
+
+
+ +
+
+RETURN_NULL = <PythonExceptionHandling.RETURN_NULL: 1> +
+
+
+ +
+
+property name +
+
+
+ +
+
+property value +
+
+
+ +
+
+ +
+
+exception duckdb.SequenceException +
+
+

Bases: DatabaseError

+
+
+ +
+
+exception duckdb.SerializationException +
+
+

Bases: OperationalError

+
+
+ +
+
+class duckdb.ShortValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+duckdb.StarExpression(*args, **kwargs) +
+
+

Overloaded function.

+
    +
  1. StarExpression(*, exclude: list = []) -> duckdb.duckdb.Expression

  2. +
  3. StarExpression() -> duckdb.duckdb.Expression

  4. +
+
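A short sketch that selects every column except one via the `exclude` keyword; the column names are made up:

```python
import duckdb
from duckdb import StarExpression

rel = duckdb.sql("SELECT 1 AS a, 2 AS b, 3 AS c")
rel.select(StarExpression(exclude=["b"])).show()  # keeps columns a and c
```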
+
+ +
+
+class duckdb.StringValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.SyntaxException +
+
+

Bases: ProgrammingError

+
+
+ +
+
+class duckdb.TimeTimeZoneValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.TimeValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.TimestampMilisecondValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.TimestampNanosecondValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.TimestampSecondValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.TimestampTimeZoneValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.TimestampValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+exception duckdb.TransactionException +
+
+

Bases: OperationalError

+
+
+ +
+
+exception duckdb.TypeMismatchException +
+
+

Bases: DataError

+
+
+ +
+
+class duckdb.UUIDValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.UnsignedBinaryValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.UnsignedIntegerValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.UnsignedLongValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.UnsignedShortValue(object: Any) +
+
+

Bases: Value

+
+
+ +
+
+class duckdb.Value(object: Any, type: DuckDBPyType) +
+
+

Bases: object

+
+
+ +
+
+exception duckdb.Warning +
+
+

Bases: Exception

+
+
+ +
+
+duckdb.aggregate(df, aggr_expr, group_expr='', **kwargs) +
+
+
+ +
+
+duckdb.alias(df, alias, **kwargs) +
+
+
+ +
+
+duckdb.append(table_name, df, **kwargs) +
+
+
+ +
+
+duckdb.array_type(type, size, **kwargs) +
+
+
+ +
+
+duckdb.arrow(*args, **kwargs) +
+
+

Overloaded function.

+
    +
  1. arrow(rows_per_batch: int = 1000000, *, connection: duckdb.DuckDBPyConnection = None) -> pyarrow.lib.Table

  2. +
+

Fetch a result as Arrow table following execute()

+
    +
  1. arrow(arrow_object: object, *, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

  2. +
+

Create a relation object from an Arrow object

+
+
+ +
+
+duckdb.begin(**kwargs) +
+
+
+ +
+
+duckdb.checkpoint(**kwargs) +
+
+
+ +
+
+duckdb.close(**kwargs) +
+
+
+ +
+
+duckdb.commit(**kwargs) +
+
+
+ +
+
+duckdb.connect(database: str = ':memory:', read_only: bool = False, config: dict = None) duckdb.DuckDBPyConnection +
+
+

Create a DuckDB database instance. Can take a database file name to read/write persistent data and a read_only flag if no changes are desired

+
+
+ +
+
+duckdb.create_function(name, function, parameters=None, return_type=None, **kwargs) +
+
+
+ +
+
+duckdb.cursor(**kwargs) +
+
+
+ +
+
+duckdb.decimal_type(width, scale, **kwargs) +
+
+
+ +
+
+duckdb.description(**kwargs) +
+
+
+ +
+
+duckdb.df(*args, **kwargs) +
+
+

Overloaded function.

+
    +
  1. df(*, date_as_object: bool = False, connection: duckdb.DuckDBPyConnection = None) -> pandas.DataFrame

  2. +
+

Fetch a result as DataFrame following execute()

+
    +
  1. df(df: pandas.DataFrame, *, connection: duckdb.DuckDBPyConnection = None) -> duckdb.duckdb.DuckDBPyRelation

  2. +
+

Create a relation object from the DataFrame df

+
+
+ +
+
+duckdb.distinct(df, **kwargs) +
+
+
+ +
+
+duckdb.dtype(type_str, **kwargs) +
+
+
+ +
+
+duckdb.duplicate(**kwargs) +
+
+
+ +
+
+duckdb.enum_type(name, type, values, **kwargs) +
+
+
+ +
+
+duckdb.execute(query, parameters=None, multiple_parameter_sets=False, **kwargs) +
+
+
+ +
+
+duckdb.executemany(query, parameters=None, **kwargs) +
+
+
+ +
+
+duckdb.extract_statements(query, **kwargs) +
+
+
+ +
+
+duckdb.fetch_arrow_table(rows_per_batch=1000000, **kwargs) +
+
+
+ +
+
+duckdb.fetch_df(**kwargs) +
+
+
+ +
+
+duckdb.fetch_df_chunk(vectors_per_chunk=1, **kwargs) +
+
+
+ +
+
+duckdb.fetch_record_batch(rows_per_batch=1000000, **kwargs) +
+
+
+ +
+
+duckdb.fetchall(**kwargs) +
+
+
+ +
+
+duckdb.fetchdf(**kwargs) +
+
+
+ +
+
+duckdb.fetchmany(size=1, **kwargs) +
+
+
+ +
+
+duckdb.fetchnumpy(**kwargs) +
+
+
+ +
+
+duckdb.fetchone(**kwargs) +
+
+
+ +
+
+duckdb.filesystem_is_registered(name, **kwargs) +
+
+
+ +
+
+duckdb.filter(df, filter_expr, **kwargs) +
+
+
+ +
+
+duckdb.from_arrow(arrow_object, **kwargs) +
+
+
+ +
+
+duckdb.from_csv_auto(path_or_buffer, **kwargs) +
+
+
+ +
+
+duckdb.from_df(df, **kwargs) +
+
+
+ +
+
+duckdb.from_parquet(file_glob, binary_as_string=False, **kwargs) +
+
+
+ +
+
+duckdb.from_query(query, **kwargs) +
+
+
+ +
+
+duckdb.from_substrait(proto, **kwargs) +
+
+
+ +
+
+duckdb.from_substrait_json(json, **kwargs) +
+
+
+ +
+
+duckdb.get_substrait(query, **kwargs) +
+
+
+ +
+
+duckdb.get_substrait_json(query, **kwargs) +
+
+
+ +
+
+duckdb.get_table_names(query, **kwargs) +
+
+
+ +
+
+duckdb.install_extension(extension, **kwargs) +
+
+
+ +
+
+duckdb.interrupt(**kwargs) +
+
+
+ +
+
+duckdb.limit(df, n, offset=0, **kwargs) +
+
+
+ +
+
+duckdb.list_filesystems(**kwargs) +
+
+
+ +
+
+duckdb.list_type(type, **kwargs) +
+
+
+ +
+
+duckdb.load_extension(extension, **kwargs) +
+
+
+ +
+
+duckdb.map_type(key, value, **kwargs) +
+
+
+ +
+
+duckdb.order(df, order_expr, **kwargs) +
+
+
+ +
+
+duckdb.pl(rows_per_batch=1000000, **kwargs) +
+
+
+ +
+
+duckdb.project(df, project_expr, **kwargs) +
+
+
+ +
+
+duckdb.query(query, **kwargs) +
+
+
+ +
+
+duckdb.query_df(df, virtual_table_name, sql_query, **kwargs) +
+
+
+ +
+
+duckdb.read_csv(path_or_buffer, **kwargs) +
+
+
+ +
+
+duckdb.read_json(name, **kwargs) +
+
+
+ +
+
+duckdb.read_parquet(file_glob, binary_as_string=False, **kwargs) +
+
+
+ +
+
+duckdb.register(view_name, python_object, **kwargs) +
+
+
+ +
+
+duckdb.register_filesystem(filesystem, **kwargs) +
+
+
+ +
+
+duckdb.remove_function(name, **kwargs) +
+
+
+ +
+
+duckdb.rollback(**kwargs) +
+
+
+ +
+
+duckdb.row_type(fields, **kwargs) +
+
+
+ +
+
+duckdb.rowcount(**kwargs) +
+
+
+ +
+
+duckdb.sql(query, **kwargs) +
+
+
+ +
+
+duckdb.sqltype(type_str, **kwargs) +
+
+
+ +
+
+duckdb.string_type(collation='', **kwargs) +
+
+
+ +
+
+duckdb.struct_type(fields, **kwargs) +
+
+
+ +
+
+duckdb.table(table_name, **kwargs) +
+
+
+ +
+
+duckdb.table_function(name, parameters=None, **kwargs) +
+
+
+ +
+
+duckdb.tf(**kwargs) +
+
+
+ +
+
+class duckdb.token_type +
+
+

Bases: pybind11_object

+

Members:

+

identifier

+

numeric_const

+

string_const

+

operator

+

keyword

+

comment

+
+
+comment = <token_type.comment: 5> +
+
+
+ +
+
+identifier = <token_type.identifier: 0> +
+
+
+ +
+
+keyword = <token_type.keyword: 4> +
+
+
+ +
+
+property name +
+
+
+ +
+
+numeric_const = <token_type.numeric_const: 1> +
+
+
+ +
+
+operator = <token_type.operator: 3> +
+
+
+ +
+
+string_const = <token_type.string_const: 2> +
+
+
+ +
+
+property value +
+
+
+ +
+
+ +
+
+duckdb.tokenize(query: str) list +
+
+

Tokenizes a SQL string, returning a list of (position, type) tuples that can be used for e.g. syntax highlighting

+
+
+ +
+
+duckdb.torch(**kwargs) +
+
+
+ +
+
+duckdb.type(type_str, **kwargs) +
+
+
+ +
+
+duckdb.union_type(members, **kwargs) +
+
+
+ +
+
+duckdb.unregister(view_name, **kwargs) +
+
+
+ +
+
+duckdb.unregister_filesystem(name, **kwargs) +
+
+
+ +
+
+duckdb.values(values, **kwargs) +
+
+
+ +
+
+duckdb.view(view_name, **kwargs) +
+
+
+ +
+
+duckdb.write_csv(df, *args, **kwargs) +
+
+
+ + + +
+
+
+
\ No newline at end of file diff --git a/docs/archive/1.0/api/python/reference/templates/index.rst b/docs/archive/1.0/api/python/reference/templates/index.rst new file mode 100644 index 00000000000..0119c36cd70 --- /dev/null +++ b/docs/archive/1.0/api/python/reference/templates/index.rst @@ -0,0 +1,24 @@ +.. automodule:: duckdb + :members: + :undoc-members: + :show-inheritance: + + .. data:: threadsafety + :annotation: bool + + Indicates that this package is threadsafe + + .. data:: apilevel + :annotation: int + + Indicates which Python DBAPI version this package implements + + .. data:: paramstyle + :annotation: str + + Indicates which parameter style duckdb supports + + .. data:: default_connection + :annotation: duckdb.DuckDBPyConnection + + The connection that is used by default if you don't explicitly pass one to the root methods in this module diff --git a/docs/archive/1.0/api/python/relational_api.md b/docs/archive/1.0/api/python/relational_api.md new file mode 100644 index 00000000000..5c432c0da75 --- /dev/null +++ b/docs/archive/1.0/api/python/relational_api.md @@ -0,0 +1,314 @@ +--- +layout: docu +title: Relational API +--- + +The Relational API is an alternative API that can be used to incrementally construct queries. The API is centered around `DuckDBPyRelation` nodes. The relations can be seen as symbolic representations of SQL queries. They do not hold any data – and nothing is executed – until a method that triggers execution is called. + +## Constructing Relations + +Relations can be created from SQL queries using the `duckdb.sql` method. Alternatively, they can be created from the various data ingestion methods (`read_parquet`, `read_csv`, `read_json`). + +For example, here we create a relation from a SQL query: + +```python +import duckdb + +rel = duckdb.sql("SELECT * FROM range(10_000_000_000) tbl(id)") +rel.show() +``` + +```text +┌────────────────────────┐ +│ id │ +│ int64 │ +├────────────────────────┤ +│ 0 │ +│ 1 │ +│ 2 │ +│ 3 │ +│ 4 │ +│ 5 │ +│ 6 │ +│ 7 │ +│ 8 │ +│ 9 │ +│ · │ +│ · │ +│ · │ +│ 9990 │ +│ 9991 │ +│ 9992 │ +│ 9993 │ +│ 9994 │ +│ 9995 │ +│ 9996 │ +│ 9997 │ +│ 9998 │ +│ 9999 │ +├────────────────────────┤ +│ ? rows │ +│ (>9999 rows, 20 shown) │ +└────────────────────────┘ +``` + +Note how we are constructing a relation that computes an immense amount of data (10B rows or 74 GB of data). The relation is constructed instantly – and we can even print the relation instantly. + +When printing a relation using `show` or displaying it in the terminal, the first `10K` rows are fetched. If there are more than `10K` rows, the output window will show `>9999 rows` (as the amount of rows in the relation is unknown). + +## Data Ingestion + +Outside of SQL queries, the following methods are provided to construct relation objects from external data. + +* `from_arrow` +* `from_df` +* `read_csv` +* `read_json` +* `read_parquet` + +## SQL Queries + +Relation objects can be queried through SQL through [replacement scans]({% link docs/archive/1.0/api/c/replacement_scans.md %}). If you have a relation object stored in a variable, you can refer to that variable as if it was a SQL table (in the `FROM` clause). This allows you to incrementally build queries using relation objects. 
+ +```python +import duckdb + +rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") +duckdb.sql("SELECT sum(id) FROM rel").show() +``` + +```text +┌──────────────┐ +│ sum(id) │ +│ int128 │ +├──────────────┤ +│ 499999500000 │ +└──────────────┘ +``` + +## Operations + +There are a number of operations that can be performed on relations. These are all short-hand for running the SQL queries – and will return relations again themselves. + +### `aggregate(expr, groups = {})` + +Apply an (optionally grouped) aggregate over the relation. The system will automatically group by any columns that are not aggregates. + +```python +import duckdb + +rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") +rel.aggregate("id % 2 AS g, sum(id), min(id), max(id)") +``` + +```text +┌───────┬──────────────┬─────────┬─────────┐ +│ g │ sum(id) │ min(id) │ max(id) │ +│ int64 │ int128 │ int64 │ int64 │ +├───────┼──────────────┼─────────┼─────────┤ +│ 0 │ 249999500000 │ 0 │ 999998 │ +│ 1 │ 250000000000 │ 1 │ 999999 │ +└───────┴──────────────┴─────────┴─────────┘ +``` + +### `except_(rel)` + +Select all rows in the first relation, that do not occur in the second relation. The relations must have the same number of columns. + +```python +import duckdb + +r1 = duckdb.sql("SELECT * FROM range(10) tbl(id)") +r2 = duckdb.sql("SELECT * FROM range(5) tbl(id)") +r1.except_(r2).show() +``` + +```text +┌───────┐ +│ id │ +│ int64 │ +├───────┤ +│ 5 │ +│ 6 │ +│ 7 │ +│ 8 │ +│ 9 │ +└───────┘ +``` + +### `filter(condition)` + +Apply the given condition to the relation, filtering any rows that do not satisfy the condition. + +```python +import duckdb + +rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") +rel.filter("id > 5").limit(3).show() +``` + +```text +┌───────┐ +│ id │ +│ int64 │ +├───────┤ +│ 6 │ +│ 7 │ +│ 8 │ +└───────┘ +``` + +### `intersect(rel)` + +Select the intersection of two relations – returning all rows that occur in both relations. The relations must have the same number of columns. + +```python +import duckdb + +r1 = duckdb.sql("SELECT * FROM range(10) tbl(id)") +r2 = duckdb.sql("SELECT * FROM range(5) tbl(id)") +r1.intersect(r2).show() +``` + +```text +┌───────┐ +│ id │ +│ int64 │ +├───────┤ +│ 0 │ +│ 1 │ +│ 2 │ +│ 3 │ +│ 4 │ +└───────┘ +``` + +### `join(rel, condition, type = "inner")` + +Combine two relations, joining them based on the provided condition. + +```python +import duckdb + +r1 = duckdb.sql("SELECT * FROM range(5) tbl(id)").set_alias("r1") +r2 = duckdb.sql("SELECT * FROM range(10, 15) tbl(id)").set_alias("r2") +r1.join(r2, "r1.id + 10 = r2.id").show() +``` + +```text +┌───────┬───────┐ +│ id │ id │ +│ int64 │ int64 │ +├───────┼───────┤ +│ 0 │ 10 │ +│ 1 │ 11 │ +│ 2 │ 12 │ +│ 3 │ 13 │ +│ 4 │ 14 │ +└───────┴───────┘ +``` + +### `limit(n, offset = 0)` + +Select the first *n* rows, optionally offset by *offset*. + +```python +import duckdb + +rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") +rel.limit(3).show() +``` + +```text +┌───────┐ +│ id │ +│ int64 │ +├───────┤ +│ 0 │ +│ 1 │ +│ 2 │ +└───────┘ +``` + +### `order(expr)` + +Sort the relation by the given set of expressions. + +```python +import duckdb + +rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") +rel.order("id DESC").limit(3).show() +``` + +```text +┌────────┐ +│ id │ +│ int64 │ +├────────┤ +│ 999999 │ +│ 999998 │ +│ 999997 │ +└────────┘ +``` + +### `project(expr)` + +Apply the given expression to each row in the relation. 
+ +```python +import duckdb + +rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)") +rel.project("id + 10 AS id_plus_ten").limit(3).show() +``` + +```text +┌─────────────┐ +│ id_plus_ten │ +│ int64 │ +├─────────────┤ +│ 10 │ +│ 11 │ +│ 12 │ +└─────────────┘ +``` + +### `union(rel)` + +Combine two relations, returning all rows in `r1` followed by all rows in `r2`. The relations must have the same number of columns. + +```python +import duckdb + +r1 = duckdb.sql("SELECT * FROM range(5) tbl(id)") +r2 = duckdb.sql("SELECT * FROM range(10, 15) tbl(id)") +r1.union(r2).show() +``` + +```text +┌───────┐ +│ id │ +│ int64 │ +├───────┤ +│ 0 │ +│ 1 │ +│ 2 │ +│ 3 │ +│ 4 │ +│ 10 │ +│ 11 │ +│ 12 │ +│ 13 │ +│ 14 │ +└───────┘ +``` + +## Result Output + +The result of relations can be converted to various types of Python structures, see the [result conversion page]({% link docs/archive/1.0/api/python/conversion.md %}) for more information. + +The result of relations can also be directly written to files using the below methods. + +* [`write_csv`]({% link docs/archive/1.0/api/python/reference/index.md %}#duckdb.DuckDBPyRelation.write_csv) +* [`write_parquet`]({% link docs/archive/1.0/api/python/reference/index.md %}#duckdb.DuckDBPyRelation.write_parquet) \ No newline at end of file diff --git a/docs/archive/1.0/api/python/spark_api.md b/docs/archive/1.0/api/python/spark_api.md new file mode 100644 index 00000000000..d31e8c283af --- /dev/null +++ b/docs/archive/1.0/api/python/spark_api.md @@ -0,0 +1,52 @@ +--- +layout: docu +title: Spark API +--- + +The DuckDB Spark API implements the [PySpark API](https://spark.apache.org/docs/3.5.0/api/python/reference/index.html), allowing you to use the familiar Spark API to interact with DuckDB. +All statements are translated to DuckDB's internal plans using our [relational API]({% link docs/archive/1.0/api/python/relational_api.md %}) and executed using DuckDB's query engine. + +> Warning The DuckDB Spark API is currently experimental and features are still missing. We are very interested in feedback. Please report any functionality that you are missing, either through [Discord](https://discord.duckdb.org) or on [GitHub](https://github.com/duckdb/duckdb/issues). + +## Example + +```python +from duckdb.experimental.spark.sql import SparkSession as session +from duckdb.experimental.spark.sql.functions import lit, col +import pandas as pd + +spark = session.builder.getOrCreate() + +pandas_df = pd.DataFrame({ + 'age': [34, 45, 23, 56], + 'name': ['Joan', 'Peter', 'John', 'Bob'] +}) + +df = spark.createDataFrame(pandas_df) +df = df.withColumn( + 'location', lit('Seattle') +) +res = df.select( + col('age'), + col('location') +).collect() + +print(res) +``` + +```text +[ + Row(age=34, location='Seattle'), + Row(age=45, location='Seattle'), + Row(age=23, location='Seattle'), + Row(age=56, location='Seattle') +] +``` + +## Contribution Guidelines + +Contributions to the experimental Spark API are welcome. +When making a contribution, please follow these guidelines: + +* Instead of using temporary files, use our `pytest` testing framework. +* When adding new functions, ensure that method signatures comply with those in the [PySpark API](https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/index.html). 
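For example, a test for a DataFrame method could follow these guidelines roughly as sketched below. This is only an illustration (the fixture and test names are hypothetical), and it relies on the `SparkSession`, `createDataFrame`, `withColumn`, `lit`, and `collect` entry points shown in the example above.

```python
import pandas as pd
import pytest

from duckdb.experimental.spark.sql import SparkSession
from duckdb.experimental.spark.sql.functions import lit


@pytest.fixture
def spark():
    # A DuckDB-backed SparkSession, created the same way as in the example above
    return SparkSession.builder.getOrCreate()


def test_with_column_adds_constant(spark):
    df = spark.createDataFrame(pd.DataFrame({"age": [34, 45]}))
    rows = df.withColumn("location", lit("Seattle")).collect()
    # Rows mirror PySpark's Row objects, so fields are accessible as attributes
    assert [row.location for row in rows] == ["Seattle", "Seattle"]
```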
\ No newline at end of file diff --git a/docs/archive/1.0/api/python/types.md b/docs/archive/1.0/api/python/types.md new file mode 100644 index 00000000000..9bac824157b --- /dev/null +++ b/docs/archive/1.0/api/python/types.md @@ -0,0 +1,184 @@ +--- +layout: docu +title: Types API +--- + +The `DuckDBPyType` class represents a type instance of our [data types]({% link docs/archive/1.0/sql/data_types/overview.md %}). + +## Converting from Other Types + +To make the API as easy to use as possible, we have added implicit conversions from existing type objects to a DuckDBPyType instance. +This means that wherever a DuckDBPyType object is expected, it is also possible to provide any of the options listed below. + +### Python Built-ins + +The table below shows the mapping of Python Built-in types to DuckDB type. + +
+ +| Built-in types | DuckDB type | +|:---------------|:------------| +| bool | BOOLEAN | +| bytearray | BLOB | +| bytes | BLOB | +| float | DOUBLE | +| int | BIGINT | +| str | VARCHAR | + +### Numpy DTypes + +The table below shows the mapping of Numpy DType to DuckDB type. + +
+ +| Type | DuckDB type | +|:------------|:------------| +| bool | BOOLEAN | +| float32 | FLOAT | +| float64 | DOUBLE | +| int16 | SMALLINT | +| int32 | INTEGER | +| int64 | BIGINT | +| int8 | TINYINT | +| uint16 | USMALLINT | +| uint32 | UINTEGER | +| uint64 | UBIGINT | +| uint8 | UTINYINT | + +### Nested Types + +#### `list[child_type]` + +`list` type objects map to a `LIST` type of the child type. +Which can also be arbitrarily nested. + +```python +import duckdb +from typing import Union + +duckdb.typing.DuckDBPyType(list[dict[Union[str, int], str]]) +``` + +```text +MAP(UNION(u1 VARCHAR, u2 BIGINT), VARCHAR)[] +``` + +#### `dict[key_type, value_type]` + +`dict` type objects map to a `MAP` type of the key type and the value type. + +```python +import duckdb + +print(duckdb.typing.DuckDBPyType(dict[str, int])) +``` + +```text +MAP(VARCHAR, BIGINT) +``` + +#### `{'a': field_one, 'b': field_two, .., 'n': field_n}` + +`dict` objects map to a `STRUCT` composed of the keys and values of the dict. + +```python +import duckdb + +print(duckdb.typing.DuckDBPyType({'a': str, 'b': int})) +``` + +```text +STRUCT(a VARCHAR, b BIGINT) +``` + +#### `Union[⟨type_1⟩, ... ⟨type_n⟩]` + +`typing.Union` objects map to a `UNION` type of the provided types. + +```python +import duckdb +from typing import Union + +print(duckdb.typing.DuckDBPyType(Union[int, str, bool, bytearray])) +``` + +```text +UNION(u1 BIGINT, u2 VARCHAR, u3 BOOLEAN, u4 BLOB) +``` + +### Creation Functions + +For the built-in types, you can use the constants defined in `duckdb.typing`: + +
+ +| DuckDB type | +|:---------------| +| BIGINT | +| BIT | +| BLOB | +| BOOLEAN | +| DATE | +| DOUBLE | +| FLOAT | +| HUGEINT | +| INTEGER | +| INTERVAL | +| SMALLINT | +| SQLNULL | +| TIME_TZ | +| TIME | +| TIMESTAMP_MS | +| TIMESTAMP_NS | +| TIMESTAMP_S | +| TIMESTAMP_TZ | +| TIMESTAMP | +| TINYINT | +| UBIGINT | +| UHUGEINT | +| UINTEGER | +| USMALLINT | +| UTINYINT | +| UUID | +| VARCHAR | + +For the complex types there are methods available on the `DuckDBPyConnection` object or the `duckdb` module. +Anywhere a `DuckDBPyType` is accepted, we will also accept one of the type objects that can implicitly convert to a `DuckDBPyType`. + +#### `list_type` | `array_type` + +Parameters: + +* `child_type: DuckDBPyType` + +#### `struct_type` | `row_type` + +Parameters: + +* `fields: Union[list[DuckDBPyType], dict[str, DuckDBPyType]]` + +#### `map_type` + +Parameters: + +* `key_type: DuckDBPyType` +* `value_type: DuckDBPyType` + +#### `decimal_type` + +Parameters: + +* `width: int` +* `scale: int` + +#### `union_type` + +Parameters: + +* `members: Union[list[DuckDBPyType], dict[str, DuckDBPyType]]` + +#### `string_type` + +Parameters: + +* `collation: Optional[str]` \ No newline at end of file diff --git a/docs/archive/1.0/api/r.md b/docs/archive/1.0/api/r.md new file mode 100644 index 00000000000..589d1760e95 --- /dev/null +++ b/docs/archive/1.0/api/r.md @@ -0,0 +1,163 @@ +--- +github_repository: https://github.com/duckdb/duckdb-r +layout: docu +title: R API +--- + +## Installation + +### `duckdb`: R API + +The DuckDB R API can be installed using the following command: + +```r +install.packages("duckdb") +``` + +Please see the [installation page]({% link docs/archive/1.0/installation/index.html %}?environment=r) for details. + +### `duckplyr`: dplyr API + +DuckDB offers a [dplyr](https://dplyr.tidyverse.org/)-compatible API via the `duckplyr` package. It can be installed using `install.packages("duckplyr")`. For details, see the [`duckplyr` documentation](https://tidyverse.github.io/duckplyr/). + +## Reference Manual + +The reference manual for the DuckDB R API is available at [R.duckdb.org](https://r.duckdb.org). + +## Basic API Usage + +The standard DuckDB R API implements the [DBI interface](https://CRAN.R-project.org/package=DBI) for R. If you are not familiar with DBI yet, see [here for an introduction](https://solutions.rstudio.com/db/r-packages/DBI/). + +### Startup & Shutdown + +To use DuckDB, you must first create a connection object that represents the database. The connection object takes as parameter the database file to read and write from. If the database file does not exist, it will be created (the file extension may be `.db`, `.duckdb`, or anything else). The special value `:memory:` (the default) can be used to create an **in-memory database**. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the R process). If you would like to connect to an existing database in read-only mode, set the `read_only` flag to `TRUE`. Read-only mode is required if multiple R processes want to access the same database file at the same time. 
+ +```r +library("duckdb") +# to start an in-memory database +con <- dbConnect(duckdb()) +# or +con <- dbConnect(duckdb(), dbdir = ":memory:") +# to use a database file (not shared between processes) +con <- dbConnect(duckdb(), dbdir = "my-db.duckdb", read_only = FALSE) +# to use a database file (shared between processes) +con <- dbConnect(duckdb(), dbdir = "my-db.duckdb", read_only = TRUE) +``` + +Connections are closed implicitly when they go out of scope or if they are explicitly closed using `dbDisconnect()`. To shut down the database instance associated with the connection, use `dbDisconnect(con, shutdown = TRUE)` + +### Querying + +DuckDB supports the standard DBI methods to send queries and retrieve result sets. `dbExecute()` is meant for queries where no results are expected like `CREATE TABLE` or `UPDATE` etc. and `dbGetQuery()` is meant to be used for queries that produce results (e.g., `SELECT`). Below an example. + +```r +# create a table +dbExecute(con, "CREATE TABLE items (item VARCHAR, value DECIMAL(10, 2), count INTEGER)") +# insert two items into the table +dbExecute(con, "INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)") + +# retrieve the items again +res <- dbGetQuery(con, "SELECT * FROM items") +print(res) +# item value count +# 1 jeans 20.0 1 +# 2 hammer 42.2 2 +``` + +DuckDB also supports prepared statements in the R API with the `dbExecute` and `dbGetQuery` methods. Here is an example: + +```r +# prepared statement parameters are given as a list +dbExecute(con, "INSERT INTO items VALUES (?, ?, ?)", list('laptop', 2000, 1)) + +# if you want to reuse a prepared statement multiple times, use dbSendStatement() and dbBind() +stmt <- dbSendStatement(con, "INSERT INTO items VALUES (?, ?, ?)") +dbBind(stmt, list('iphone', 300, 2)) +dbBind(stmt, list('android', 3.5, 1)) +dbClearResult(stmt) + +# query the database using a prepared statement +res <- dbGetQuery(con, "SELECT item FROM items WHERE value > ?", list(400)) +print(res) +# item +# 1 laptop +``` + +> Warning Do **not** use prepared statements to insert large amounts of data into DuckDB. See below for better options. + +## Efficient Transfer + +To write a R data frame into DuckDB, use the standard DBI function `dbWriteTable()`. This creates a table in DuckDB and populates it with the data frame contents. For example: + +```r +dbWriteTable(con, "iris_table", iris) +res <- dbGetQuery(con, "SELECT * FROM iris_table LIMIT 1") +print(res) +# Sepal.Length Sepal.Width Petal.Length Petal.Width Species +# 1 5.1 3.5 1.4 0.2 setosa +``` + +It is also possible to “register” a R data frame as a virtual table, comparable to a SQL `VIEW`. This *does not actually transfer data* into DuckDB yet. Below is an example: + +```r +duckdb_register(con, "iris_view", iris) +res <- dbGetQuery(con, "SELECT * FROM iris_view LIMIT 1") +print(res) +# Sepal.Length Sepal.Width Petal.Length Petal.Width Species +# 1 5.1 3.5 1.4 0.2 setosa +``` + +> DuckDB keeps a reference to the R data frame after registration. This prevents the data frame from being garbage-collected. The reference is cleared when the connection is closed, but can also be cleared manually using the `duckdb_unregister()` method. + +Also refer to [the data import documentation]({% link docs/archive/1.0/data/overview.md %}) for more options of efficiently importing data. + +## dbplyr + +DuckDB also plays well with the [dbplyr](https://CRAN.R-project.org/package=dbplyr) / [dplyr](https://dplyr.tidyverse.org) packages for programmatic query construction from R. 
Here is an example: + +```r +library("duckdb") +library("dplyr") +con <- dbConnect(duckdb()) +duckdb_register(con, "flights", nycflights13::flights) + +tbl(con, "flights") |> + group_by(dest) |> + summarise(delay = mean(dep_time, na.rm = TRUE)) |> + collect() +``` + +When using dbplyr, CSV and Parquet files can be read using the `dplyr::tbl` function. + +```r +# Establish a CSV for the sake of this example +write.csv(mtcars, "mtcars.csv") + +# Summarize the dataset in DuckDB to avoid reading the entire CSV into R's memory +tbl(con, "mtcars.csv") |> + group_by(cyl) |> + summarise(across(disp:wt, .fns = mean)) |> + collect() +``` + +```r +# Establish a set of Parquet files +dbExecute(con, "COPY flights TO 'dataset' (FORMAT PARQUET, PARTITION_BY (year, month))") + +# Summarize the dataset in DuckDB to avoid reading 12 Parquet files into R's memory +tbl(con, "read_parquet('dataset/**/*.parquet', hive_partitioning = true)") |> + filter(month == "3") |> + summarise(delay = mean(dep_time, na.rm = TRUE)) |> + collect() +``` + +## Memory Limit + +You can use the [`memory_limit` configuration option]({% link docs/archive/1.0/configuration/pragmas.md %}) to limit the memory use of DuckDB, e.g.: + +```sql +SET memory_limit = '2GB'; +``` + +Note that this limit is only applied to the memory DuckDB uses and it does not affect the memory use of other R libraries. +Therefore, the total memory used by the R process may be higher than the configured `memory_limit`. \ No newline at end of file diff --git a/docs/archive/1.0/api/rust.md b/docs/archive/1.0/api/rust.md new file mode 100644 index 00000000000..c2816ce6d2c --- /dev/null +++ b/docs/archive/1.0/api/rust.md @@ -0,0 +1,66 @@ +--- +layout: docu +title: Rust API +--- + +## Installation + +The DuckDB Rust API can be installed from [crates.io](https://crates.io/crates/duckdb). Please see the [docs.rs](http://docs.rs/duckdb) for details. + +## Basic API Usage + +duckdb-rs is an ergonomic wrapper based on the [DuckDB C API](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb.h), please refer to the [README](https://github.com/duckdb/duckdb-rs) for details. + +### Startup & Shutdown + +To use duckdb, you must first initialize a `Connection` handle using `Connection::open()`. `Connection::open()` takes as parameter the database file to read and write from. If the database file does not exist, it will be created (the file extension may be `.db`, `.duckdb`, or anything else). You can also use `Connection::open_in_memory()` to create an **in-memory database**. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the process). + +```rust +use duckdb::{params, Connection, Result}; +let conn = Connection::open_in_memory()?; +``` + +You can `conn.close()` the `Connection` manually, or just leave it out of scope, we had implement the `Drop` trait which will automatically close the underlining db connection for you. + +### Querying + +SQL queries can be sent to DuckDB using the `execute()` method of connections, or we can also prepare the statement and then query on that. 
+ +```rust +#[derive(Debug)] +struct Person { + id: i32, + name: String, + data: Option>, +} + +conn.execute( + "INSERT INTO person (name, data) VALUES (?, ?)", + params![me.name, me.data], +)?; + +let mut stmt = conn.prepare("SELECT id, name, data FROM person")?; +let person_iter = stmt.query_map([], |row| { + Ok(Person { + id: row.get(0)?, + name: row.get(1)?, + data: row.get(2)?, + }) +})?; + +for person in person_iter { + println!("Found person {:?}", person.unwrap()); +} +``` + +## Appender + +The Rust client supports the [DuckDB Appender API]({% link docs/archive/1.0/data/appender.md %}) for bulk inserts. For example: + +```rust +fn insert_rows(conn: &Connection) -> Result<()> { + let mut app = conn.appender("foo")?; + app.append_rows([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])?; + Ok(()) +} +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/swift.md b/docs/archive/1.0/api/swift.md new file mode 100644 index 00000000000..4e0a87b841d --- /dev/null +++ b/docs/archive/1.0/api/swift.md @@ -0,0 +1,151 @@ +--- +github_repository: https://github.com/duckdb/duckdb-swift +layout: docu +title: Swift API +--- + +DuckDB offers a Swift API. See the [announcement post]({% post_url 2023-04-21-swift %}) for details. + +## Instantiating DuckDB + +DuckDB supports both in-memory and persistent databases. +To work with an in-memory datatabase, run: + +```swift +let database = try Database(store: .inMemory) +``` + +To work with a persistent database, run: + +```swift +let database = try Database(store: .file(at: "test.db")) +``` + +Queries can be issued through a database connection. + +```swift +let connection = try database.connect() +``` + +DuckDB supports multiple connections per database. + +## Application Example + +The rest of the page is based on the example of our [announcement post]({% post_url 2023-04-21-swift %}), which uses raw data from [NASA's Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu) loaded directly into DuckDB. + +### Creating an Application-Specific Type + +We first create an application-specific type that we'll use to house our database and connection and through which we'll eventually define our app-specific queries. 
+ +```swift +import DuckDB + +final class ExoplanetStore { + + let database: Database + let connection: Connection + + init(database: Database, connection: Connection) { + self.database = database + self.connection = connection + } +} +``` + +### Loading a CSV File + +We load the data from [NASA's Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu): + +```text +wget https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+pl_name+,+disc_year+from+pscomppars&format=csv -O downloaded_exoplanets.csv +``` + +Once we have our CSV downloaded locally, we can use the following SQL command to load it as a new table to DuckDB: + +```sql +CREATE TABLE exoplanets AS + SELECT * FROM read_csv('downloaded_exoplanets.csv'); +``` + +Let's package this up as a new asynchronous factory method on our `ExoplanetStore` type: + +```swift +import DuckDB +import Foundation + +final class ExoplanetStore { + + // Factory method to create and prepare a new ExoplanetStore + static func create() async throws -> ExoplanetStore { + + // Create our database and connection as described above + let database = try Database(store: .inMemory) + let connection = try database.connect() + + // Download the CSV from the exoplanet archive + let (csvFileURL, _) = try await URLSession.shared.download( + from: URL(string: "https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+pl_name+,+disc_year+from+pscomppars&format=csv")!) + + // Issue our first query to DuckDB + try connection.execute(""" + CREATE TABLE exoplanets AS + SELECT * FROM read_csv('\(csvFileURL.path)'); + """) + + // Create our pre-populated ExoplanetStore instance + return ExoplanetStore( + database: database, + connection: connection + ) + } + + // Let's make the initializer we defined previously + // private. This prevents anyone accidentally instantiating + // the store without having pre-loaded our Exoplanet CSV + // into the database + private init(database: Database, connection: Connection) { + ... + } +} +``` + +### Querying the Database + +The following example queires DuckDB from within Swift via an async function. This means the callee won't be blocked while the query is executing. We'll then cast the result columns to Swift native types using DuckDB's `ResultSet` `cast(to:)` family of methods, before finally wrapping them up in a `DataFrame` from the TabularData framework. + +```swift +... + +import TabularData + +extension ExoplanetStore { + + // Retrieves the number of exoplanets discovered by year + func groupedByDiscoveryYear() async throws -> DataFrame { + + // Issue the query we described above + let result = try connection.query(""" + SELECT disc_year, count(disc_year) AS Count + FROM exoplanets + GROUP BY disc_year + ORDER BY disc_year + """) + + // Cast our DuckDB columns to their native Swift + // equivalent types + let discoveryYearColumn = result[0].cast(to: Int.self) + let countColumn = result[1].cast(to: Int.self) + + // Use our DuckDB columns to instantiate TabularData + // columns and populate a TabularData DataFrame + return DataFrame(columns: [ + TabularData.Column(discoveryYearColumn).eraseToAnyColumn(), + TabularData.Column(countColumn).eraseToAnyColumn(), + ]) + } +} +``` + +### Complete Project + +For the complete example project, clone the [DuckDB Swift repository](https://github.com/duckdb/duckdb-swift) and open up the runnable app project located in [`Examples/SwiftUI/ExoplanetExplorer.xcodeproj`](https://github.com/duckdb/duckdb-swift/tree/main/Examples/SwiftUI/ExoplanetExplorer.xcodeproj). 
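As a quick orientation before opening the full project, a minimal call site for the store defined above might look like the following sketch (error handling and UI integration are omitted, and the surrounding `Task` is only one way to enter an async context):

```swift
// Sketch only: uses the ExoplanetStore type defined above.
Task {
    do {
        // Downloads the CSV and loads it into an in-memory DuckDB table
        let store = try await ExoplanetStore.create()

        // Runs the grouping query and returns a TabularData DataFrame
        let discoveriesPerYear = try await store.groupedByDiscoveryYear()
        print(discoveriesPerYear)
    } catch {
        print("Failed to query exoplanets: \(error)")
    }
}
```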
\ No newline at end of file diff --git a/docs/archive/1.0/api/wasm/data_ingestion.md b/docs/archive/1.0/api/wasm/data_ingestion.md new file mode 100644 index 00000000000..fc3169d06f3 --- /dev/null +++ b/docs/archive/1.0/api/wasm/data_ingestion.md @@ -0,0 +1,181 @@ +--- +layout: docu +title: Data Ingestion +--- + +DuckDB-Wasm has multiple ways to import data, depending on the format of the data. + +There are two steps to import data into DuckDB. + +First, the data file is imported into a local file system using register functions ([registerEmptyFileBuffer](https://shell.duckdb.org/docs/classes/index.AsyncDuckDB.html#registerEmptyFileBuffer), [registerFileBuffer](https://shell.duckdb.org/docs/classes/index.AsyncDuckDB.html#registerFileBuffer), [registerFileHandle](https://shell.duckdb.org/docs/classes/index.AsyncDuckDB.html#registerFileHandle), [registerFileText](https://shell.duckdb.org/docs/classes/index.AsyncDuckDB.html#registerFileText), [registerFileURL](https://shell.duckdb.org/docs/classes/index.AsyncDuckDB.html#registerFileURL)). + +Then, the data file is imported into DuckDB using insert functions ([insertArrowFromIPCStream](https://shell.duckdb.org/docs/classes/index.AsyncDuckDBConnection.html#insertArrowFromIPCStream), [insertArrowTable](https://shell.duckdb.org/docs/classes/index.AsyncDuckDBConnection.html#insertArrowTable), [insertCSVFromPath](https://shell.duckdb.org/docs/classes/index.AsyncDuckDBConnection.html#insertCSVFromPath), [insertJSONFromPath](https://shell.duckdb.org/docs/classes/index.AsyncDuckDBConnection.html#insertJSONFromPath)) or directly using FROM SQL query (using extensions like Parquet or [Wasm-flavored httpfs](#httpfs-wasm-flavored)). + +[Insert statements]({% link docs/archive/1.0/data/insert.md %}) can also be used to import data. + +## Data Import + +### Open & Close Connection + +```ts +// Create a new connection +const c = await db.connect(); + +// ... import data + +// Close the connection to release memory +await c.close(); +``` + +### Apache Arrow + +```ts +// Data can be inserted from an existing arrow.Table +// More Example https://arrow.apache.org/docs/js/ +import { tableFromArrays } from 'apache-arrow'; + +// EOS signal according to Arrorw IPC streaming format +// See https://arrow.apache.org/docs/format/Columnar.html#ipc-streaming-format +const EOS = new Uint8Array([255, 255, 255, 255, 0, 0, 0, 0]); + +const arrowTable = tableFromArrays({ + id: [1, 2, 3], + name: ['John', 'Jane', 'Jack'], + age: [20, 21, 22], +}); + +await c.insertArrowTable(arrowTable, { name: 'arrow_table' }); +// Write EOS +await c.insertArrowTable(EOS, { name: 'arrow_table' }); + +// ..., from a raw Arrow IPC stream +const streamResponse = await fetch(`someapi`); +const streamReader = streamResponse.body.getReader(); +const streamInserts = []; +while (true) { + const { value, done } = await streamReader.read(); + if (done) break; + streamInserts.push(c.insertArrowFromIPCStream(value, { name: 'streamed' })); +} + +// Write EOS +streamInserts.push(c.insertArrowFromIPCStream(EOS, { name: 'streamed' })); + +await Promise.all(streamInserts); +``` + +### CSV + +```ts +// ..., from CSV files +// (interchangeable: registerFile{Text,Buffer,URL,Handle}) +const csvContent = '1|foo\n2|bar\n'; +await db.registerFileText(`data.csv`, csvContent); +// ... 
with typed insert options +await c.insertCSVFromPath('data.csv', { + schema: 'main', + name: 'foo', + detect: false, + header: false, + delimiter: '|', + columns: { + col1: new arrow.Int32(), + col2: new arrow.Utf8(), + }, +}); +``` + +### JSON + +```ts +// ..., from JSON documents in row-major format +const jsonRowContent = [ + { "col1": 1, "col2": "foo" }, + { "col1": 2, "col2": "bar" }, +]; +await db.registerFileText( + 'rows.json', + JSON.stringify(jsonRowContent), +); +await c.insertJSONFromPath('rows.json', { name: 'rows' }); + +// ... or column-major format +const jsonColContent = { + "col1": [1, 2], + "col2": ["foo", "bar"] +}; +await db.registerFileText( + 'columns.json', + JSON.stringify(jsonColContent), +); +await c.insertJSONFromPath('columns.json', { name: 'columns' }); + +// From API +const streamResponse = await fetch(`someapi/content.json`); +await db.registerFileBuffer('file.json', new Uint8Array(await streamResponse.arrayBuffer())) +await c.insertJSONFromPath('file.json', { name: 'JSONContent' }); +``` + +### Parquet + +```ts +// from Parquet files +// ...Local +const pickedFile: File = letUserPickFile(); +await db.registerFileHandle('local.parquet', pickedFile, DuckDBDataProtocol.BROWSER_FILEREADER, true); +// ...Remote +await db.registerFileURL('remote.parquet', 'https://origin/remote.parquet', DuckDBDataProtocol.HTTP, false); +// ... Using Fetch +const res = await fetch('https://origin/remote.parquet'); +await db.registerFileBuffer('buffer.parquet', new Uint8Array(await res.arrayBuffer())); + +// ..., by specifying URLs in the SQL text +await c.query(` + CREATE TABLE direct AS + SELECT * FROM 'https://origin/remote.parquet' +`); +// ..., or by executing raw insert statements +await c.query(` + INSERT INTO existing_table + VALUES (1, 'foo'), (2, 'bar')`); +``` + +### httpfs (Wasm-flavored) + +```ts +// ..., by specifying URLs in the SQL text +await c.query(` + CREATE TABLE direct AS + SELECT * FROM 'https://origin/remote.parquet' +`); +``` + +> Tip If you encounter a Network Error (`Failed to execute 'send' on 'XMLHttpRequest'`) when you try to query files from S3, configure the S3 permission CORS header. For example: + +```json +[ + { + "AllowedHeaders": [ + "*" + ], + "AllowedMethods": [ + "GET", + "HEAD" + ], + "AllowedOrigins": [ + "*" + ], + "ExposeHeaders": [], + "MaxAgeSeconds": 3000 + } +] +``` + +### Insert Statement + +```ts +// ..., or by executing raw insert statements +await c.query(` + INSERT INTO existing_table + VALUES (1, 'foo'), (2, 'bar')`); +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/wasm/extensions.md b/docs/archive/1.0/api/wasm/extensions.md new file mode 100644 index 00000000000..a60f73ada3b --- /dev/null +++ b/docs/archive/1.0/api/wasm/extensions.md @@ -0,0 +1,82 @@ +--- +layout: docu +title: Extensions +--- + +DuckDB-Wasm's (dynamic) extension loading is modeled after the regular DuckDB's extension loading, with a few relevant differences due to the difference in platform. + +## Format + +Extensions in DuckDB are binaries to be dynamically loaded via `dlopen`. A cryptographical signature is appended to the binary. +An extension in DuckDB-Wasm is a regular Wasm file to be dynamically loaded via Emscripten's `dlopen`. A cryptographical signature is appended to the Wasm file as a WebAssembly custom section called `duckdb_signature`. +This ensures the file remains a valid WebAssembly file. + +> Currently, we require this custom section to be the last one, but this can be potentially relaxed in the future. 
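As an illustration of this format, the signature section can be inspected with the standard `WebAssembly` JavaScript API. The snippet below is a hypothetical sketch rather than part of the DuckDB-Wasm API, and the URL is a placeholder for whatever extension file you have fetched:

```ts
// Sketch: fetch an extension file (placeholder URL) and look at its custom sections.
const response = await fetch('https://example.org/path/to/icu.duckdb_extension.wasm');
const bytes = await response.arrayBuffer();

// Compilation succeeds because the appended signature lives in a custom section,
// which keeps the file a valid WebAssembly module.
const wasmModule = await WebAssembly.compile(bytes);

// Custom sections named `duckdb_signature` carry the signature payload.
const signatureSections = WebAssembly.Module.customSections(wasmModule, 'duckdb_signature');
console.log(`duckdb_signature sections found: ${signatureSections.length}`);
```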
+ +## `INSTALL` and `LOAD` + +The `INSTALL` semantic in native embeddings of DuckDB is to fetch, decompress from `gzip` and store data in local disk. +The `LOAD` semantic in native embeddings of DuckDB is to (optionally) perform signature checks *and* dynamic load the binary with the main DuckDB binary. + +In DuckDB-Wasm, `INSTALL` is a no-op given there is no durable cross-session storage. The `LOAD` operation will fetch (and decompress on the fly), perform signature checks *and* dynamically load via the Emscripten implementation of `dlopen`. + +## Autoloading + +[Autoloading]({% link docs/archive/1.0/extensions/overview.md %}), i.e., the possibility for DuckDB to add extension functionality on-the-fly, is enabled by default in DuckDB-Wasm. + +## List of Officially Available Extensions + +| Extension name | Description | Aliases | +|---|-----|--| +| [autocomplete]({% link docs/archive/1.0/extensions/autocomplete.md %}) | Adds support for autocomplete in the shell | | +| [excel]({% link docs/archive/1.0/extensions/excel.md %}) | Adds support for Excel-like format strings | | +| [fts]({% link docs/archive/1.0/extensions/full_text_search.md %}) | Adds support for Full-Text Search Indexes | | +| icu | Adds support for time zones and collations using the ICU library | | +| inet | Adds support for IP-related data types and functions | | +| [json]({% link docs/archive/1.0/extensions/json.md %}) | Adds support for JSON operations | | +| [parquet]({% link docs/archive/1.0/data/parquet/overview.md %}) | Adds support for reading and writing Parquet files | | +| [sqlite]({% link docs/archive/1.0/extensions/sqlite.md %}) [GitHub](https://github.com/duckdb/sqlite_scanner) | Adds support for reading SQLite database files | sqlite, sqlite3 | +| sqlsmith | | | +| [substrait]({% link docs/archive/1.0/extensions/substrait.md %}) [GitHub](https://github.com/duckdb/substrait) | Adds support for the Substrait integration | | +| [tpcds]({% link docs/archive/1.0/extensions/tpcds.md %}) | Adds TPC-DS data generation and query support | | +| [tpch]({% link docs/archive/1.0/extensions/tpch.md %}) | Adds TPC-H data generation and query support | | + +WebAssembly is basically an additional platform, and there might be platform-specific limitations that make some extensions not able to match their native capabilities or to perform them in a different way. We will document here relevant differences for DuckDB-hosted extensions. + +### HTTPFS + +The HTTPFS extension is, at the moment, not available in DuckDB-Wasm. Https protocol capabilities needs to go through an additional layer, the browser, which adds both differences and some restrictions to what is doable from native. + +Instead, DuckDB-Wasm has a separate implementation that for most purposes is interchangeable, but does not support all use cases (as it must follow security rules imposed by the browser, such as CORS). +Due to this CORS restriction, any requests for data made using the HTTPFS extension must be to websites that allow (using CORS headers) the website hosting the DuckDB-Wasm instance to access that data. +The [MDN website](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS) is a great resource for more information regarding CORS. + +## Extension Signing + +As with regular DuckDB extensions, DuckDB-Wasm extension are by default checked on `LOAD` to verify the signature confirm the extension has not been tampered with. +Extension signature verification can be disabled via a configuration option. 
+Signing is a property of the binary itself, so copying a DuckDB extension (say to serve it from a different location) will still keep a valid signature (e.g., for local development). + +## Fetching DuckDB-Wasm Extensions + +Official DuckDB extensions are served at `extensions.duckdb.org`, and this is also the default value for the `default_extension_repository` option. +When installing extensions, a relevant URL will be built that will look like `extensions.duckdb.org/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.gz`. + +DuckDB-Wasm extension are fetched only on load, and the URL will look like: `extensions.duckdb.org/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm`. + +Note that an additional `duckdb-wasm` is added to the folder structure, and the file is served as a `.wasm` file. + +DuckDB-Wasm extensions are served pre-compressed using Brotli compression. While fetched from a browser, extensions will be transparently uncompressed. If you want to fetch the `duckdb-wasm` extension manually, you can use `curl --compress extensions.duckdb.org/<...>/icu.duckdb_extension.wasm`. + +## Serving Extensions from a Third-Party Repository + +As with regular DuckDB, if you use `SET custom_extension_repository = some.url.com`, subsequent loads will be attempted at `some.url.com/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm`. + +Note that GET requests on the extensions needs to be [CORS enabled](https://www.w3.org/wiki/CORS_Enabled) for a browser to allow the connection. + +## Tooling + +Both DuckDB-Wasm and its extensions have been compiled using latest packaged Emscripten toolchain. + + +{% include iframe.html src="https://shell.duckdb.org" %} \ No newline at end of file diff --git a/docs/archive/1.0/api/wasm/instantiation.md b/docs/archive/1.0/api/wasm/instantiation.md new file mode 100644 index 00000000000..283e1d7a4ec --- /dev/null +++ b/docs/archive/1.0/api/wasm/instantiation.md @@ -0,0 +1,107 @@ +--- +layout: docu +title: Instantiation +--- + +DuckDB-Wasm has multiple ways to be instantiated depending on the use case. 
+ +## `cdn(jsdelivr)` + +```ts +import * as duckdb from '@duckdb/duckdb-wasm'; + +const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles(); + +// Select a bundle based on browser checks +const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES); + +const worker_url = URL.createObjectURL( + new Blob([`importScripts("${bundle.mainWorker!}");`], {type: 'text/javascript'}) +); + +// Instantiate the asynchronus version of DuckDB-Wasm +const worker = new Worker(worker_url); +const logger = new duckdb.ConsoleLogger(); +const db = new duckdb.AsyncDuckDB(logger, worker); +await db.instantiate(bundle.mainModule, bundle.pthreadWorker); +URL.revokeObjectURL(worker_url); +``` + +## `webpack` + +```ts +import * as duckdb from '@duckdb/duckdb-wasm'; +import duckdb_wasm from '@duckdb/duckdb-wasm/dist/duckdb-mvp.wasm'; +import duckdb_wasm_next from '@duckdb/duckdb-wasm/dist/duckdb-eh.wasm'; +const MANUAL_BUNDLES: duckdb.DuckDBBundles = { + mvp: { + mainModule: duckdb_wasm, + mainWorker: new URL('@duckdb/duckdb-wasm/dist/duckdb-browser-mvp.worker.js', import.meta.url).toString(), + }, + eh: { + mainModule: duckdb_wasm_next, + mainWorker: new URL('@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js', import.meta.url).toString(), + }, +}; +// Select a bundle based on browser checks +const bundle = await duckdb.selectBundle(MANUAL_BUNDLES); +// Instantiate the asynchronus version of DuckDB-Wasm +const worker = new Worker(bundle.mainWorker!); +const logger = new duckdb.ConsoleLogger(); +const db = new duckdb.AsyncDuckDB(logger, worker); +await db.instantiate(bundle.mainModule, bundle.pthreadWorker); +``` + +## `vite` + +```ts +import * as duckdb from '@duckdb/duckdb-wasm'; +import duckdb_wasm from '@duckdb/duckdb-wasm/dist/duckdb-mvp.wasm?url'; +import mvp_worker from '@duckdb/duckdb-wasm/dist/duckdb-browser-mvp.worker.js?url'; +import duckdb_wasm_eh from '@duckdb/duckdb-wasm/dist/duckdb-eh.wasm?url'; +import eh_worker from '@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js?url'; + +const MANUAL_BUNDLES: duckdb.DuckDBBundles = { + mvp: { + mainModule: duckdb_wasm, + mainWorker: mvp_worker, + }, + eh: { + mainModule: duckdb_wasm_eh, + mainWorker: eh_worker, + }, +}; +// Select a bundle based on browser checks +const bundle = await duckdb.selectBundle(MANUAL_BUNDLES); +// Instantiate the asynchronus version of DuckDB-wasm +const worker = new Worker(bundle.mainWorker!); +const logger = new duckdb.ConsoleLogger(); +const db = new duckdb.AsyncDuckDB(logger, worker); +await db.instantiate(bundle.mainModule, bundle.pthreadWorker); +``` + +## Statically Served + +It is possible to manually download the files from . 
+ +```ts +import * as duckdb from '@duckdb/duckdb-wasm'; + +const MANUAL_BUNDLES: duckdb.DuckDBBundles = { + mvp: { + mainModule: 'change/me/../duckdb-mvp.wasm', + mainWorker: 'change/me/../duckdb-browser-mvp.worker.js', + }, + eh: { + mainModule: 'change/m/../duckdb-eh.wasm', + mainWorker: 'change/m/../duckdb-browser-eh.worker.js', + }, +}; +// Select a bundle based on browser checks +const bundle = await duckdb.selectBundle(MANUAL_BUNDLES); +// Instantiate the asynchronous version of DuckDB-Wasm +const worker = new Worker(bundle.mainWorker!); +const logger = new duckdb.ConsoleLogger(); +const db = new duckdb.AsyncDuckDB(logger, worker); +await db.instantiate(bundle.mainModule, bundle.pthreadWorker); +``` \ No newline at end of file diff --git a/docs/archive/1.0/api/wasm/overview.md b/docs/archive/1.0/api/wasm/overview.md new file mode 100644 index 00000000000..5912c369351 --- /dev/null +++ b/docs/archive/1.0/api/wasm/overview.md @@ -0,0 +1,28 @@ +--- +github_repository: https://github.com/duckdb/duckdb-wasm +layout: docu +redirect_from: +- /docs/archive/1.0/api/wasm +- /docs/archive/1.0/api/wasm/ +title: DuckDB Wasm +--- + +DuckDB has been compiled to WebAssembly, so it can run inside any browser on any device. + + +{% include iframe.html src="https://shell.duckdb.org" %} + +DuckDB-Wasm offers a layered API, it can be embedded as a [JavaScript + WebAssembly library](https://www.npmjs.com/package/@duckdb/duckdb-wasm), as a [Web shell](https://www.npmjs.com/package/@duckdb/duckdb-wasm-shell), or [built from source](https://github.com/duckdb/duckdb-wasm) according to your needs. + +## Getting Started with DuckDB-Wasm + +A great starting point is to read the [DuckDB-Wasm launch blog post]({% post_url 2021-10-29-duckdb-wasm %})! + +Another great resource is the [GitHub repository](https://github.com/duckdb/duckdb-wasm). + +For details, see the full [DuckDB-Wasm API Documentation](https://shell.duckdb.org/docs/modules/index.html). + +## Limitations + +* By default, the WebAssembly client only uses a single thread. +* The WebAssembly client has a limited amount of memory available. [WebAssembly limits the amount of available memory to 4 GB](https://v8.dev/blog/4gb-wasm-memory) and browsers may impose even stricter limits. \ No newline at end of file diff --git a/docs/archive/1.0/api/wasm/query.md b/docs/archive/1.0/api/wasm/query.md new file mode 100644 index 00000000000..91dc60241c9 --- /dev/null +++ b/docs/archive/1.0/api/wasm/query.md @@ -0,0 +1,83 @@ +--- +layout: docu +title: Query +--- + +DuckDB-Wasm provides functions for querying data. Queries are run sequentially. + +First, a connection need to be created by calling [connect](https://shell.duckdb.org/docs/classes/index.AsyncDuckDB.html#connect). Then, queries can be run by calling [query](https://shell.duckdb.org/docs/classes/index.AsyncDuckDBConnection.html#query) or [send](https://shell.duckdb.org/docs/classes/index.AsyncDuckDBConnection.html#send). + +## Query Execution + +```ts +// Create a new connection +const conn = await db.connect(); + +// Either materialize the query result +await conn.query<{ v: arrow.Int }>(` + SELECT * FROM generate_series(1, 100) t(v) +`); +// ..., or fetch the result chunks lazily +for await (const batch of await conn.send<{ v: arrow.Int }>(` + SELECT * FROM generate_series(1, 100) t(v) +`)) { + // ... 
+} + +// Close the connection to release memory +await conn.close(); +``` + +## Prepared Statements + +```ts +// Create a new connection +const conn = await db.connect(); +// Prepare query +const stmt = await conn.prepare(`SELECT v + ? FROM generate_series(0, 10000) AS t(v);`); +// ... and run the query with materialized results +await stmt.query(234); +// ... or result chunks +for await (const batch of await stmt.send(234)) { + // ... +} +// Close the statement to release memory +await stmt.close(); +// Closing the connection will release statements as well +await conn.close(); +``` + +## Arrow Table to JSON + +```ts +// Create a new connection +const conn = await db.connect(); + +// Query +const arrowResult = await conn.query<{ v: arrow.Int }>(` + SELECT * FROM generate_series(1, 100) t(v) +`); + +// Convert arrow table to json +const result = arrowResult.toArray().map((row) => row.toJSON()); + +// Close the connection to release memory +await conn.close(); +``` + +## Export Parquet + +```ts +// Create a new connection +const conn = await db.connect(); + +// Export Parquet +conn.send(`COPY (SELECT * FROM tbl) TO 'result-snappy.parquet' (FORMAT 'parquet');`); +const parquet_buffer = await this._db.copyFileToBuffer('result-snappy.parquet'); + +// Generate a download link +const link = URL.createObjectURL(new Blob([parquet_buffer])); + +// Close the connection to release memory +await conn.close(); +``` \ No newline at end of file diff --git a/docs/archive/1.0/configuration/overview.md b/docs/archive/1.0/configuration/overview.md new file mode 100644 index 00000000000..ae5061ec818 --- /dev/null +++ b/docs/archive/1.0/configuration/overview.md @@ -0,0 +1,186 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/configuration +- /docs/archive/1.0/sql/configuration +title: Configuration +--- + +DuckDB has a number of configuration options that can be used to change the behavior of the system. + +The configuration options can be set using either the [`SET` statement]({% link docs/archive/1.0/sql/statements/set.md %}) or the [`PRAGMA` statement]({% link docs/archive/1.0/configuration/pragmas.md %}). +They can be reset to their original values using the [`RESET` statement]({% link docs/archive/1.0/sql/statements/set.md %}#reset). + +The values of configuration options can be queried via the [`current_setting()` scalar function]({% link docs/archive/1.0/sql/functions/utility.md %}) or using the [`duckdb_settings()` table function]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_settings). For example: + +```sql +SELECT current_setting('memory_limit') AS memlimit; +SELECT value AS memlimit FROM duckdb_settings() WHERE name = 'memory_limit'; +``` + +## Examples + +Set the memory limit of the system to 10 GB. + +```sql +SET memory_limit = '10GB'; +``` + +Configure the system to use 1 thread. + +```sql +SET threads TO 1; +``` + +Enable printing of a progress bar during long-running queries. + +```sql +SET enable_progress_bar = true; +``` + +Set the default null order to NULLS LAST. + +```sql +SET default_null_order = 'nulls_last'; +``` + +Return the current value of a specific setting. + +```sql +SELECT current_setting('threads') AS threads; +``` + +| threads | +|--------:| +| 10 | + +Query a specific setting. 
+ +```sql +SELECT * +FROM duckdb_settings() +WHERE name = 'threads'; +``` + +| name | value | description | input_type | scope | +|---------|-------|-------------------------------------------------|------------|--------| +| threads | 1 | The number of total threads used by the system. | BIGINT | GLOBAL | + +Show a list of all available settings. + +```sql +SELECT * +FROM duckdb_settings(); +``` + +Reset the memory limit of the system back to the default. + +```sql +RESET memory_limit; +``` + +## Secrets Manager + +DuckDB has a [Secrets manager]({% link docs/archive/1.0/sql/statements/create_secret.md %}), which provides a unified user interface for secrets across all backends (e.g., AWS S3) that use them. + +## Configuration Reference + + + +Configuration options come with different default [scopes]({% link docs/archive/1.0/sql/statements/set.md %}#scopes): `GLOBAL` and `LOCAL`. Below is a list of all available configuration options by scope. + +### Global Configuration Options + +| Name | Description | Type | Default value | +|----|--------|--|---| +| `Calendar` | The current calendar | `VARCHAR` | System (locale) calendar | +| `TimeZone` | The current time zone | `VARCHAR` | System (locale) timezone | +| `access_mode` | Access mode of the database (**AUTOMATIC**, **READ_ONLY** or **READ_WRITE**) | `VARCHAR` | `automatic` | +| `allocator_flush_threshold` | Peak allocation threshold at which to flush the allocator after completing a task. | `VARCHAR` | `128.0 MiB` | +| `allow_community_extensions` | Allow to load community built extensions | `BOOLEAN` | `true` | +| `allow_extensions_metadata_mismatch` | Allow to load extensions with not compatible metadata | `BOOLEAN` | `false` | +| `allow_persistent_secrets` | Allow the creation of persistent secrets, that are stored and loaded on restarts | `BOOLEAN` | `true` | +| `allow_unredacted_secrets` | Allow printing unredacted secrets | `BOOLEAN` | `false` | +| `allow_unsigned_extensions` | Allow to load extensions with invalid or missing signatures | `BOOLEAN` | `false` | +| `arrow_large_buffer_size` | If arrow buffers for strings, blobs, uuids and bits should be exported using large buffers | `BOOLEAN` | `false` | +| `autoinstall_extension_repository` | Overrides the custom endpoint for extension installation on autoloading | `VARCHAR` | | +| `autoinstall_known_extensions` | Whether known extensions are allowed to be automatically installed when a query depends on them | `BOOLEAN` | `true` | +| `autoload_known_extensions` | Whether known extensions are allowed to be automatically loaded when a query depends on them | `BOOLEAN` | `true` | +| `binary_as_string` | In Parquet files, interpret binary data as a string. | `BOOLEAN` | | +| `ca_cert_file` | Path to a custom certificate file for self-signed certificates. 
| `VARCHAR` | | +| `checkpoint_threshold`, `wal_autocheckpoint` | The WAL size threshold at which to automatically trigger a checkpoint (e.g., 1GB) | `VARCHAR` | `16.0 MiB` | +| `custom_extension_repository` | Overrides the custom endpoint for remote extension installation | `VARCHAR` | | +| `custom_user_agent` | Metadata from DuckDB callers | `VARCHAR` | | +| `default_collation` | The collation setting used when none is specified | `VARCHAR` | | +| `default_null_order`, `null_order` | Null ordering used when none is specified (**NULLS_FIRST** or **NULLS_LAST**) | `VARCHAR` | `NULLS_LAST` | +| `default_order` | The order type used when none is specified (**ASC** or **DESC**) | `VARCHAR` | `ASC` | +| `default_secret_storage` | Allows switching the default storage for secrets | `VARCHAR` | `local_file` | +| `disabled_filesystems` | Disable specific file systems preventing access (e.g., LocalFileSystem) | `VARCHAR` | | +| `duckdb_api` | DuckDB API surface | `VARCHAR` | `cli` | +| `enable_external_access` | Allow the database to access external state (through e.g., loading/installing modules, COPY TO/FROM, CSV readers, pandas replacement scans, etc) | `BOOLEAN` | `true` | +| `enable_fsst_vectors` | Allow scans on FSST compressed segments to emit compressed vectors to utilize late decompression | `BOOLEAN` | `false` | +| `enable_http_metadata_cache` | Whether or not the global http metadata is used to cache HTTP metadata | `BOOLEAN` | `false` | +| `enable_macro_dependencies` | Enable created MACROs to create dependencies on the referenced objects (such as tables) | `BOOLEAN` | `false` | +| `enable_object_cache` | Whether or not object cache is used to cache e.g., Parquet metadata | `BOOLEAN` | `false` | +| `enable_server_cert_verification` | Enable server side certificate verification. | `BOOLEAN` | `false` | +| `enable_view_dependencies` | Enable created VIEWs to create dependencies on the referenced objects (such as tables) | `BOOLEAN` | `false` | +| `extension_directory` | Set the directory to store extensions in | `VARCHAR` | | +| `external_threads` | The number of external threads that work on DuckDB tasks. | `BIGINT` | `1` | +| `force_download` | Forces upfront download of file | `BOOLEAN` | `false` | +| `http_keep_alive` | Keep alive connections. Setting this to false can help when running into connection failures | `BOOLEAN` | `true` | +| `http_retries` | HTTP retries on I/O error | `UBIGINT` | `3` | +| `http_retry_backoff` | Backoff factor for exponentially increasing retry wait time | `FLOAT` | `4` | +| `http_retry_wait_ms` | Time between retries | `UBIGINT` | `100` | +| `http_timeout` | HTTP timeout read/write/connection/retry | `UBIGINT` | `30000` | +| `immediate_transaction_mode` | Whether transactions should be started lazily when needed, or immediately when BEGIN TRANSACTION is called | `BOOLEAN` | `false` | +| `lock_configuration` | Whether or not the configuration can be altered | `BOOLEAN` | `false` | +| `max_memory`, `memory_limit` | The maximum memory of the system (e.g., 1GB) | `VARCHAR` | 80% of RAM | +| `max_temp_directory_size` | The maximum amount of data stored inside the `temp_directory` (e.g., 1GB). No limit is applied when value is zero. | `VARCHAR` | `0 bytes` | +| `old_implicit_casting` | Allow implicit casting to/from VARCHAR | `BOOLEAN` | `false` | +| `password` | The password to use. Ignored for legacy compatibility. | `VARCHAR` | `NULL` | +| `preserve_insertion_order` | Whether or not to preserve insertion order. 
If set to false the system is allowed to re-order any results that do not contain ORDER BY clauses. | `BOOLEAN` | `true` | +| `s3_access_key_id` | S3 Access Key ID | `VARCHAR` | | +| `s3_endpoint` | S3 Endpoint | `VARCHAR` | | +| `s3_region` | S3 Region | `VARCHAR` | `us-east-1` | +| `s3_secret_access_key` | S3 Access Key | `VARCHAR` | | +| `s3_session_token` | S3 Session Token | `VARCHAR` | | +| `s3_uploader_max_filesize` | S3 Uploader max filesize (between 50GB and 5TB) | `VARCHAR` | `800GB` | +| `s3_uploader_max_parts_per_file` | S3 Uploader max parts per file (between 1 and 10000) | `UBIGINT` | `10000` | +| `s3_uploader_thread_limit` | S3 Uploader global thread limit | `UBIGINT` | `50` | +| `s3_url_compatibility_mode` | Disable Globs and Query Parameters on S3 URLs | `BOOLEAN` | `false` | +| `s3_url_style` | S3 URL style | `VARCHAR` | `vhost` | +| `s3_use_ssl` | S3 use SSL | `BOOLEAN` | `true` | +| `secret_directory` | Set the directory to which persistent secrets are stored | `VARCHAR` | `~/.duckdb/stored_secrets` | +| `storage_compatibility_version` | Serialize on checkpoint with compatibility for a given duckdb version | `VARCHAR` | `v0.10.2` | +| `temp_directory` | Set the directory to which to write temp files | `VARCHAR` | `⟨database_name⟩.tmp` or `.tmp` (in in-memory mode) | +| `threads`, `worker_threads` | The number of total threads used by the system. | `BIGINT` | # CPU cores | +| `username`, `user` | The username to use. Ignored for legacy compatibility. | `VARCHAR` | `NULL` | + +### Local Configuration Options + +| Name | Description | Type | Default value | +|----|--------|--|---| +| `enable_http_logging` | Enables HTTP logging | `BOOLEAN` | `false` | +| `enable_profiling` | Enables profiling, and sets the output format (**JSON**, **QUERY_TREE**, **QUERY_TREE_OPTIMIZER**) | `VARCHAR` | `NULL` | +| `enable_progress_bar_print` | Controls the printing of the progress bar, when 'enable_progress_bar' is true | `BOOLEAN` | `true` | +| `enable_progress_bar` | Enables the progress bar, printing progress to the terminal for long queries | `BOOLEAN` | `true` | +| `errors_as_json` | Output error messages as structured **JSON** instead of as a raw string | `BOOLEAN` | `false` | +| `explain_output` | Output of EXPLAIN statements (**ALL**, **OPTIMIZED_ONLY**, **PHYSICAL_ONLY**) | `VARCHAR` | `physical_only` | +| `file_search_path` | A comma separated list of directories to search for input files | `VARCHAR` | | +| `home_directory` | Sets the home directory used by the system | `VARCHAR` | | +| `http_logging_output` | The file to which HTTP logging output should be saved, or empty to print to the terminal | `VARCHAR` | | +| `integer_division` | Whether or not the / operator defaults to integer division, or to floating point division | `BOOLEAN` | `false` | +| `log_query_path` | Specifies the path to which queries should be logged (default: NULL, queries are not logged) | `VARCHAR` | `NULL` | +| `max_expression_depth` | The maximum expression depth limit in the parser. WARNING: increasing this setting and using very deep expressions might lead to stack overflow errors. 
| `UBIGINT` | `1000` | +| `ordered_aggregate_threshold` | The number of rows to accumulate before sorting, used for tuning | `UBIGINT` | `262144` | +| `partitioned_write_flush_threshold` | The threshold in number of rows after which we flush a thread state when writing using **PARTITION_BY** | `BIGINT` | `524288` | +| `perfect_ht_threshold` | Threshold in bytes for when to use a perfect hash table | `BIGINT` | `12` | +| `pivot_filter_threshold` | The threshold to switch from using filtered aggregates to LIST with a dedicated pivot operator | `BIGINT` | `10` | +| `pivot_limit` | The maximum number of pivot columns in a pivot statement | `BIGINT` | `100000` | +| `prefer_range_joins` | Force use of range joins with mixed predicates | `BOOLEAN` | `false` | +| `preserve_identifier_case` | Whether or not to preserve the identifier case, instead of always lowercasing all non-quoted identifiers | `BOOLEAN` | `true` | +| `profile_output`, `profiling_output` | The file to which profile output should be saved, or empty to print to the terminal | `VARCHAR` | | +| `profiling_mode` | The profiling mode (**STANDARD** or **DETAILED**) | `VARCHAR` | `NULL` | +| `progress_bar_time` | Sets the time (in milliseconds) how long a query needs to take before we start printing a progress bar | `BIGINT` | `2000` | +| `schema` | Sets the default search schema. Equivalent to setting search_path to a single value. | `VARCHAR` | `main` | +| `search_path` | Sets the default catalog search path as a comma-separated list of values | `VARCHAR` | | \ No newline at end of file diff --git a/docs/archive/1.0/configuration/pragmas.md b/docs/archive/1.0/configuration/pragmas.md new file mode 100644 index 00000000000..6b2517fe779 --- /dev/null +++ b/docs/archive/1.0/configuration/pragmas.md @@ -0,0 +1,534 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/sql/pragmas +- /docs/archive/1.0/sql/pragmas/ +title: Pragmas +--- + + + +The `PRAGMA` statement is a SQL extension adopted by DuckDB from SQLite. `PRAGMA` statements can be issued in a similar manner to regular SQL statements. `PRAGMA` commands may alter the internal state of the database engine, and can influence the subsequent execution or behavior of the engine. + +`PRAGMA` statements that assign a value to an option can also be issued using the [`SET` statement]({% link docs/archive/1.0/sql/statements/set.md %}) and the value of an option can be retrieved using `SELECT current_setting(option_name)`. + +For DuckDB's built in configuration options, see the [Configuration Reference]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference). +DuckDB [extensions]({% link docs/archive/1.0/extensions/overview.md %}) may register additional configuration options. +These are documented in the respective extensions' documentation pages. + +This page contains the supported `PRAGMA` settings. + +## Metadata + +#### Schema Information + +List all databases: + +```sql +PRAGMA database_list; +``` + +List all tables: + +```sql +PRAGMA show_tables; +``` + +List all tables, with extra information, similarly to [`DESCRIBE`]({% link docs/archive/1.0/guides/meta/describe.md %}): + +```sql +PRAGMA show_tables_expanded; +``` + +To list all functions: + +```sql +PRAGMA functions; +``` + +#### Table Information + +Get info for a specific table: + +```sql +PRAGMA table_info('table_name'); +CALL pragma_table_info('table_name'); +``` + +`table_info` returns information about the columns of the table with name `table_name`. 
The exact format of the table returned is given below: + +```sql +cid INTEGER, -- cid of the column +name VARCHAR, -- name of the column +type VARCHAR, -- type of the column +notnull BOOLEAN, -- if the column is marked as NOT NULL +dflt_value VARCHAR, -- default value of the column, or NULL if not specified +pk BOOLEAN -- part of the primary key or not +``` + +To also show table structure, but in a slightly different format (included for compatibility): + +```sql +PRAGMA show('table_name'); +``` + +#### Database Size + +Get the file and memory size of each database: + +```sql +SET database_size; +CALL pragma_database_size(); +``` + +`database_size` returns information about the file and memory size of each database. The column types of the returned results are given below: + +```sql +database_name VARCHAR, -- database name +database_size VARCHAR, -- total block count times the block size +block_size BIGINT, -- database block size +total_blocks BIGINT, -- total blocks in the database +used_blocks BIGINT, -- used blocks in the database +free_blocks BIGINT, -- free blocks in the database +wal_size VARCHAR, -- write ahead log size +memory_usage VARCHAR, -- memory used by the database buffer manager +memory_limit VARCHAR -- maximum memory allowed for the database +``` + +#### Storage Information + +To get storage information: + +```sql +PRAGMA storage_info('table_name'); +CALL pragma_storage_info('table_name'); +``` + +This call returns the following information for the given table: + +
+ +| Name | Type | Description | +|----------------|-----------|-------------------------------------------------------| +| `row_group_id` | `BIGINT` || +| `column_name` | `VARCHAR` || +| `column_id` | `BIGINT` || +| `column_path` | `VARCHAR` || +| `segment_id` | `BIGINT` || +| `segment_type` | `VARCHAR` || +| `start` | `BIGINT` | The start row id of this chunk | +| `count` | `BIGINT` | The amount of entries in this storage chunk | +| `compression` | `VARCHAR` | Compression type used for this column – see the [“Lightweight Compression in DuckDB” blog post]({% post_url 2022-10-28-lightweight-compression %}) | +| `stats` | `VARCHAR` || +| `has_updates` | `BOOLEAN` || +| `persistent` | `BOOLEAN` | `false` if temporary table | +| `block_id` | `BIGINT` | empty unless persistent | +| `block_offset` | `BIGINT` | empty unless persistent | + +See [Storage]({% link docs/archive/1.0/internals/storage.md %}) for more information. + +#### Show Databases + +The following statement is equivalent to the [`SHOW DATABASES` statement]({% link docs/archive/1.0/sql/statements/attach.md %}): + +```sql +PRAGMA show_databases; +``` + +## Resource Management + +#### Memory Limit + +Set the memory limit for the buffer manager: + +```sql +SET memory_limit = '1GB'; +SET max_memory = '1GB'; +``` + +> Warning The specified memory limit is only applied to the buffer manager. +> For most queries, the buffer manager handles the majority of the data processed. +> However, certain in-memory data structures such as [vectors]({% link docs/archive/1.0/internals/vector.md %}) and query results are allocated outside of the buffer manager. +> Additionally, [aggregate functions]({% link docs/archive/1.0/sql/functions/aggregates.md %}) with complex state (e.g., `list`, `mode`, `quantile`, `string_agg`, and `approx` functions) use memory outside of the buffer manager. +> Therefore, the actual memory consumption can be higher than the specified memory limit. + +#### Threads + +Set the amount of threads for parallel query execution: + +```sql +SET threads = 4; +``` + +## Collations + +List all available collations: + +```sql +PRAGMA collations; +``` + +Set the default collation to one of the available ones: + +```sql +SET default_collation = 'nocase'; +``` + +## Default Ordering for NULLs + +Set the default ordering for NULLs to be either `NULLS FIRST` or `NULLS LAST`: + +```sql +SET default_null_order = 'NULLS FIRST'; +SET default_null_order = 'NULLS LAST'; +``` + +Set the default result set ordering direction to `ASCENDING` or `DESCENDING`: + +```sql +SET default_order = 'ASCENDING'; +SET default_order = 'DESCENDING'; +``` + +## Implicit Casting to `VARCHAR` + +Prior to version 0.10.0, DuckDB would automatically allow any type to be implicitly cast to `VARCHAR` during function binding. As a result it was possible to e.g., compute the substring of an integer without using an explicit cast. For version v0.10.0 and later an explicit cast is needed instead. To revert to the old behavior that performs implicit casting, set the `old_implicit_casting` variable to `true`: + +```sql +SET old_implicit_casting = true; +``` + +## Information on DuckDB + +#### Version + +Show DuckDB version: + +```sql +PRAGMA version; +CALL pragma_version(); +``` + +#### Platform + +`platform` returns an identifier for the platform the current DuckDB executable has been compiled for, e.g., `osx_arm64`. 
+The format of this identifier matches the platform name as described [on the extension loading explainer]({% link docs/archive/1.0/extensions/working_with_extensions.md %}#platforms): + +```sql +PRAGMA platform; +CALL pragma_platform(); +``` + +#### User Agent + +The following statement returns the user agent information, e.g., `duckdb/v0.10.0(osx_arm64)`: + +```sql +PRAGMA user_agent; +``` + +#### Metadata Information + +The following statement returns information on the metadata store (`block_id`, `total_blocks`, `free_blocks`, and `free_list`): + +```sql +PRAGMA metadata_info; +``` + +## Progress Bar + +Show progress bar when running queries: + +```sql +PRAGMA enable_progress_bar; +``` + +Or: + +```sql +PRAGMA enable_print_progress_bar; +``` + +Don't show a progress bar for running queries: + +```sql +PRAGMA disable_progress_bar; +``` + +Or: + +```sql +PRAGMA disable_print_progress_bar; +``` + +## Profiling Queries + +#### Explain Plan Output + +The output of [`EXPLAIN`]({% link docs/archive/1.0/sql/statements/profiling.md %}) output can be configured to show only the physical plan. This is the default configuration: + +```sql +SET explain_output = 'physical_only'; +``` + +To only show the optimized query plan: + +```sql +SET explain_output = 'optimized_only'; +``` + +To show all query plans: + +```sql +SET explain_output = 'all'; +``` + +#### Profiling + +##### Enable Profiling + +To enable profiling: + +```sql +PRAGMA enable_profiling; +``` + +Or: + +```sql +PRAGMA enable_profile; +``` + +##### Profiling Format + +The format of the resulting profiling information can be specified as either `json`, `query_tree`, or `query_tree_optimizer`. The default format is `query_tree`, which prints the physical operator tree together with the timings and cardinalities of each operator in the tree to the screen. + +To return the logical query plan as JSON: + +```sql +SET enable_profiling = 'json'; +``` + +To return the logical query plan: + +```sql +SET enable_profiling = 'query_tree'; +``` + +To return the physical query plan: + +```sql +SET enable_profiling = 'query_tree_optimizer'; +``` + +##### Disable Profiling + +To disable profiling: + +```sql +PRAGMA disable_profiling; +``` + +Or: + +```sql +PRAGMA disable_profile; +``` + +##### Profiling Output + +By default, profiling information is printed to the console. However, if you prefer to write the profiling information to a file the `PRAGMA` `profiling_output` can be used to write to a specified file. + +> Warning The file contents will be overwritten for every new query that is issued, hence the file will only contain the profiling information of the last query that is run: + +```sql +SET profiling_output = '/path/to/file.json'; +SET profile_output = '/path/to/file.json'; +``` + +##### Profiling Mode + +By default, a limited amount of profiling information is provided (`standard`). +For more details, use the detailed profiling mode by setting `profiling_mode` to `detailed`. +The output of this mode shows how long it takes to apply certain optimizers on the query tree and how long physical planning takes: + +```sql +SET profiling_mode = 'detailed'; +``` + +## Query Optimization + +#### Optimizer + +To disable the query optimizer: + +```sql +PRAGMA disable_optimizer; +``` + +To enable the query optimizer: + +```sql +PRAGMA enable_optimizer; +``` + +#### Selectively Disabling Optimizers + +The `disabled_optimizers` option allows selectively disabling optimization steps. 
+For example, to disable `filter_pushdown` and `statistics_propagation`, run: + +```sql +SET disabled_optimizers = 'filter_pushdown,statistics_propagation'; +``` + +The available optimizations can be queried using the [`duckdb_optimizers()` table function]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_optimizers). + +> Warning The `disabled_optimizers` option should only be used for debugging performance issues and should be avoided in production. + +## Logging + +Set a path for query logging: + +```sql +SET log_query_path = '/tmp/duckdb_log/'; +``` + +Disable query logging: + +```sql +SET log_query_path = ''; +``` + +## Full-Text Search Indexes + +The `create_fts_index` and `drop_fts_index` options are only available when the [`fts` extension]({% link docs/archive/1.0/extensions/full_text_search.md %}) is loaded. Their usage is documented on the [Full-Text Search extension page]({% link docs/archive/1.0/extensions/full_text_search.md %}). + +## Verification + +#### Verification of External Operators + +Enable verification of external operators: + +```sql +PRAGMA verify_external; +``` + +Disable verification of external operators: + +```sql +PRAGMA disable_verify_external; +``` + +#### Verification of Round-Trip Capabilities + +Enable verification of round-trip capabilities for supported logical plans: + +```sql +PRAGMA verify_serializer; +``` + +Disable verification of round-trip capabilities: + +```sql +PRAGMA disable_verify_serializer; +``` + +## Object Cache + +Enable caching of objects for e.g., Parquet metadata: + +```sql +PRAGMA enable_object_cache; +``` + +Disable caching of objects: + +```sql +PRAGMA disable_object_cache; +``` + +## Checkpointing + +#### Force Checkpoint + +When [`CHECKPOINT`]({% link docs/archive/1.0/sql/statements/checkpoint.md %}) is called when no changes are made, force a checkpoint regardless: + +```sql +PRAGMA force_checkpoint; +``` + +#### Checkpoint on Shutdown + +Run a `CHECKPOINT` on successful shutdown and delete the WAL, to leave only a single database file behind: + +```sql +PRAGMA enable_checkpoint_on_shutdown; +``` + +Don't run a `CHECKPOINT` on shutdown: + +```sql +PRAGMA disable_checkpoint_on_shutdown; +``` + +## Temp Directory for Spilling Data to Disk + +By default, DuckDB uses a temporary directory named `⟨database_file_name⟩.tmp` to spill to disk, located in the same directory as the database file. To change this, use: + +```sql +SET temp_directory = '/path/to/temp_dir.tmp/'; +``` + +## Returning Errors as JSON + +The `errors_as_json` option can be set to obtain error information in raw JSON format. For certain errors, extra information or decomposed information is provided for easier machine processing. For example: + +```sql +SET errors_as_json = true; +``` + +Then, running a query that results in an error produces a JSON output: + +```sql +SELECT * FROM nonexistent_tbl; +``` + +```json +{ + "exception_type":"Catalog", + "exception_message":"Table with name nonexistent_tbl does not exist!\nDid you mean \"temp.information_schema.tables\"?", + "name":"nonexistent_tbl", + "candidates":"temp.information_schema.tables", + "position":"14", + "type":"Table", + "error_subtype":"MISSING_ENTRY" +} +``` + +## Query Verification (for Development) + +The following `PRAGMA`s are mostly used for development and internal testing. 
+ +Enable query verification: + +```sql +PRAGMA enable_verification; +``` + +Disable query verification: + +```sql +PRAGMA disable_verification; +``` + +Enable force parallel query processing: + +```sql +PRAGMA verify_parallelism; +``` + +Disable force parallel query processing: + +```sql +PRAGMA disable_verify_parallelism; +``` \ No newline at end of file diff --git a/docs/archive/1.0/configuration/secrets_manager.md b/docs/archive/1.0/configuration/secrets_manager.md new file mode 100644 index 00000000000..586324d1a66 --- /dev/null +++ b/docs/archive/1.0/configuration/secrets_manager.md @@ -0,0 +1,112 @@ +--- +layout: docu +title: Secrets Manager +--- + +The **Secrets manager** provides a unified user interface for secrets across all backends that use them. Secrets can be scoped, so different storage prefixes can have different secrets, allowing for example to join data across organizations in a single query. Secrets can also be persisted, so that they do not need to be specified every time DuckDB is launched. + +> Warning Persistent secrets are stored in unencrypted binary format on the disk. + +## Secrets + +### Types of Secrets + +Secrets are typed, their type identifies which service they are for. Currently, the following cloud services are available: + +* AWS S3 (`S3`), through the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/s3api.md %}) +* Azure Blob Storage (`AZURE`), through the [`azure` extension]({% link docs/archive/1.0/extensions/azure.md %}) +* Cloudflare R2 (`R2`), through the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/s3api.md %}) +* Google Cloud Storage (`GCS`), through the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/s3api.md %}) +* Hugging Face (`HUGGINGFACE`), through the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/hugging_face.md %}) + +For each type, there are one or more “secret providers” that specify how the secret is created. Secrets can also have an optional scope, which is a file path prefix that the secret applies to. When fetching a secret for a path, the secret scopes are compared to the path, returning the matching secret for the path. In the case of multiple matching secrets, the longest prefix is chosen. + +### Creating a Secret + +Secrets can be created using the [`CREATE SECRET` SQL statement]({% link docs/archive/1.0/sql/statements/create_secret.md %}). +Secrets can be **temporary** or **persistent**. Temporary secrets are used by default – and are stored in-memory for the life span of the DuckDB instance similar to how settings worked previously. Persistent secrets are stored in **unencrypted binary format** in the `~/.duckdb/stored_secrets` directory. On startup of DuckDB, persistent secrets are read from this directory and automatically loaded. + +#### Secret Providers + +To create a secret, a **Secret Provider** needs to be used. A Secret Provider is a mechanism through which a secret is generated. To illustrate this, for the `S3`, `GCS`, `R2`, and `AZURE` secret types, DuckDB currently supports two providers: `CONFIG` and `CREDENTIAL_CHAIN`. The `CONFIG` provider requires the user to pass all configuration information into the `CREATE SECRET`, whereas the `CREDENTIAL_CHAIN` provider will automatically try to fetch credentials. When no Secret Provider is specified, the `CONFIG` provider is used. 
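+
+As a minimal sketch, assuming the `httpfs` extension is available and credentials can be resolved through the standard AWS credential chain, an S3 secret using the `CREDENTIAL_CHAIN` provider could be created as follows:
+
+```sql
+CREATE SECRET my_aws_secret (
+    TYPE S3,
+    PROVIDER CREDENTIAL_CHAIN -- credentials are fetched automatically instead of being passed in
+);
+```
+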
For more details on how to create secrets using different providers check out the respective pages on [httpfs]({% link docs/archive/1.0/extensions/httpfs/overview.md %}#configuration-and-authentication-using-secrets) and [azure]({% link docs/archive/1.0/extensions/azure.md %}#authentication-with-secret). + +#### Temporary Secrets + +To create a temporary unscoped secret to access S3, we can now use the following: + +```sql +CREATE SECRET my_secret ( + TYPE S3, + KEY_ID 'my_secret_key', + SECRET 'my_secret_value', + REGION 'my_region' +); +``` + +Note that we implicitly use the default `CONFIG` secret provider here. + +#### Persistent Secrets + +In order to persist secrets between DuckDB database instances, we can now use the `CREATE PERSISTENT SECRET` command, e.g.: + +```sql +CREATE PERSISTENT SECRET my_persistent_secret ( + TYPE S3, + KEY_ID 'my_secret_key', + SECRET 'my_secret_value' +); +``` + +By default, this will write the secret (unencrypted) to the `~/.duckdb/stored_secrets` directory. To change the secrets directory, issue: + +```sql +SET secret_directory = 'path/to/my_secrets_dir'; +``` + +Note that setting the value of the `home_directory` configuration option has no effect on the location of the secrets. + +### Deleting Secrets + +Secrets can be deleted using the [`DROP SECRET` statement]({% link docs/archive/1.0/sql/statements/create_secret.md %}#syntax-for-drop-secret), e.g.: + +```sql +DROP PERSISTENT SECRET my_persistent_secret; +``` + +### Creating Multiple Secrets for the Same Service Type + +If two secrets exist for a service type, the scope can be used to decide which one should be used. For example: + +```sql +CREATE SECRET secret1 ( + TYPE S3, + KEY_ID 'my_secret_key1', + SECRET 'my_secret_value1', + SCOPE 's3://my-bucket' +); +``` + +```sql +CREATE SECRET secret2 ( + TYPE S3, + KEY_ID 'my_secret_key2', + SECRET 'my_secret_value2', + SCOPE 's3://my-other-bucket' +); +``` + +Now, if the user queries something from `s3://my-other-bucket/something`, secret `secret2` will be chosen automatically for that request. To see which secret is being used, the `which_secret` scalar function can be used, which takes a path and a secret type as parameters: + +```sql +FROM which_secret('s3://my-other-bucket/file.parquet', 's3'); +``` + +### Listing Secrets + +Secrets can be listed using the built-in table-producing function, e.g., by using the [`duckdb_secrets()` table function]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_secrets): + +```sql +FROM duckdb_secrets(); +``` + +Sensitive information will be redacted. \ No newline at end of file diff --git a/docs/archive/1.0/connect/concurrency.md b/docs/archive/1.0/connect/concurrency.md new file mode 100644 index 00000000000..d1ce088e2b6 --- /dev/null +++ b/docs/archive/1.0/connect/concurrency.md @@ -0,0 +1,37 @@ +--- +layout: docu +title: Concurrency +--- + +## Handling Concurrency + +DuckDB has two configurable options for concurrency: + +1. One process can both read and write to the database. +2. Multiple processes can read from the database, but no processes can write ([`access_mode = 'READ_ONLY'`]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference)). 
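+
+As a minimal sketch of option 2, a second process could open an existing database file (here, a hypothetical `my_database.duckdb`) in read-only mode via the `ATTACH` statement; any statement that attempts to modify it will then fail:
+
+```sql
+-- attach the existing database file in read-only mode
+ATTACH 'my_database.duckdb' AS readonly_db (READ_ONLY);
+USE readonly_db;
+```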
+ +When using option 1, DuckDB supports multiple writer threads using a combination of [MVCC (Multi-Version Concurrency Control)](https://en.wikipedia.org/wiki/Multiversion_concurrency_control) and optimistic concurrency control (see [Concurrency within a Single Process](#concurrency-within-a-single-process)), but all within that single writer process. The reason for this concurrency model is to allow for the caching of data in RAM for faster analytical queries, rather than going back and forth to disk during each query. It also allows the caching of functions pointers, the database catalog, and other items so that subsequent queries on the same connection are faster. + +> DuckDB is optimized for bulk operations, so executing many small transactions is not a primary design goal. + +## Concurrency within a Single Process + +DuckDB supports concurrency within a single process according to the following rules. As long as there are no write conflicts, multiple concurrent writes will succeed. Appends will never conflict, even on the same table. Multiple threads can also simultaneously update separate tables or separate subsets of the same table. Optimistic concurrency control comes into play when two threads attempt to edit (update or delete) the same row at the same time. In that situation, the second thread to attempt the edit will fail with a conflict error. + +## Writing to DuckDB from Multiple Processes + +Writing to DuckDB from multiple processes is not supported automatically and is not a primary design goal (see [Handling Concurrency](#handling-concurrency)). + +If multiple processes must write to the same file, several design patterns are possible, but would need to be implemented in application logic. For example, each process could acquire a cross-process mutex lock, then open the database in read/write mode and close it when the query is complete. Instead of using a mutex lock, each process could instead retry the connection if another process is already connected to the database (being sure to close the connection upon query completion). Another alternative would be to do multi-process transactions on a MySQL, PostgreSQL, or SQLite database, and use DuckDB's [MySQL]({% link docs/archive/1.0/extensions/mysql.md %}), [PostgreSQL]({% link docs/archive/1.0/extensions/postgres.md %}), or [SQLite]({% link docs/archive/1.0/extensions/sqlite.md %}) extensions to execute analytical queries on that data periodically. + +Additional options include writing data to Parquet files and using DuckDB's ability to [read multiple Parquet files]({% link docs/archive/1.0/data/parquet/overview.md %}), taking a similar approach with [CSV files]({% link docs/archive/1.0/data/csv/overview.md %}), or creating a web server to receive requests and manage reads and writes to DuckDB. + +## Optimistic Concurrency Control + +DuckDB uses [optimistic concurrency control](https://en.wikipedia.org/wiki/Optimistic_concurrency_control), an approach generally considered to be the best fit for read-intensive analytical database systems as it speeds up read query processing. As a result any transactions that modify the same rows at the same time will cause a transaction conflict error: + +```console +Transaction conflict: cannot update a table that has been altered! +``` + +> Tip A common workaround when a transaction conflict is encountered is to rerun the transaction. 
\ No newline at end of file diff --git a/docs/archive/1.0/connect/overview.md b/docs/archive/1.0/connect/overview.md new file mode 100644 index 00000000000..7c6c7e257c9 --- /dev/null +++ b/docs/archive/1.0/connect/overview.md @@ -0,0 +1,31 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/connect +- /docs/archive/1.0/connect.html +title: Connect +--- + +## Connect or Create a Database + +To use DuckDB, you must first create a connection to a database. The exact syntax varies between the [client APIs]({% link docs/archive/1.0/api/overview.md %}) but it typically involves passing an argument to configure persistence. + +## Persistence + +DuckDB can operate in both persistent mode, where the data is saved to disk, and in in-memory mode, where the entire data set is stored in the main memory. + +> Tip Both persistent and in-memory databases use spilling to disk to facilitate larger-than-memory workloads (i.e., out-of-core-processing). + +### Persistent Database + +To create or open a persistent database, set the path of the database file, e.g., `my_database.duckdb`, when creating the connection. +This path can point to an existing database or to a file that does not yet exist and DuckDB will open or create a database at that location as needed. +The file may have an arbitrary extension, but `.db` or `.duckdb` are two common choices with `.ddb` also used sometimes. + +Starting with v0.10, DuckDB's storage format is [backwards-compatible]({% link docs/archive/1.0/internals/storage.md %}#backward-compatibility), i.e., DuckDB is able to read database files produced by an older versions of DuckDB. +For example, DuckDB v0.10 can read and operate on files created by the previous DuckDB version, v0.9. +For more details on DuckDB's storage format, see the [storage page]({% link docs/archive/1.0/internals/storage.md %}). + +### In-Memory Database + +DuckDB can operate in in-memory mode. In most clients, this can be activated by passing the special value `:memory:` as the database file or omitting the database file argument. In in-memory mode, no data is persisted to disk, therefore, all data is lost when the process finishes. \ No newline at end of file diff --git a/docs/archive/1.0/data/appender.md b/docs/archive/1.0/data/appender.md new file mode 100644 index 00000000000..b47f811c103 --- /dev/null +++ b/docs/archive/1.0/data/appender.md @@ -0,0 +1,85 @@ +--- +layout: docu +title: Appender +--- + +The Appender can be used to load bulk data into a DuckDB database. It is currently available in the [C, C++, Go, Java, and Rust APIs](#appender-support-in-other-clients). The Appender is tied to a connection, and will use the transaction context of that connection when appending. An Appender always appends to a single table in the database file. + +In the [C++ API]({% link docs/archive/1.0/api/cpp.md %}), the Appender works as follows: + +```cpp +DuckDB db; +Connection con(db); +// create the table +con.Query("CREATE TABLE people (id INTEGER, name VARCHAR)"); +// initialize the appender +Appender appender(con, "people"); +``` + +The `AppendRow` function is the easiest way of appending data. It uses recursive templates to allow you to put all the values of a single row within one function call, as follows: + +```cpp +appender.AppendRow(1, "Mark"); +``` + +Rows can also be individually constructed using the `BeginRow`, `EndRow` and `Append` methods. This is done internally by `AppendRow`, and hence has the same performance characteristics. 
+ +```cpp +appender.BeginRow(); +appender.Append(2); +appender.Append("Hannes"); +appender.EndRow(); +``` + +Any values added to the Appender are cached prior to being inserted into the database system +for performance reasons. That means that, while appending, the rows might not be immediately visible in the system. The cache is automatically flushed when the Appender goes out of scope or when `appender.Close()` is called. The cache can also be manually flushed using the `appender.Flush()` method. After either `Flush` or `Close` is called, all the data has been written to the database system. + +## Date, Time and Timestamps + +While numbers and strings are rather self-explanatory, dates, times and timestamps require some explanation. They can be directly appended using the methods provided by `duckdb::Date`, `duckdb::Time` or `duckdb::Timestamp`. They can also be appended using the internal `duckdb::Value` type, however, this adds some additional overheads and should be avoided if possible. + +Below is a short example: + +```cpp +con.Query("CREATE TABLE dates (d DATE, t TIME, ts TIMESTAMP)"); +Appender appender(con, "dates"); + +// construct the values using the Date/Time/Timestamp types +// (this is the most efficient approach) +appender.AppendRow( + Date::FromDate(1992, 1, 1), + Time::FromTime(1, 1, 1, 0), + Timestamp::FromDatetime(Date::FromDate(1992, 1, 1), Time::FromTime(1, 1, 1, 0)) +); +// construct duckdb::Value objects +appender.AppendRow( + Value::DATE(1992, 1, 1), + Value::TIME(1, 1, 1, 0), + Value::TIMESTAMP(1992, 1, 1, 1, 1, 1, 0) +); +``` + +## Commit Frequency + +By default, the appender performs a commits every 204,800 rows. +You can change this by explicitly using [transactions]({% link docs/archive/1.0/sql/statements/transactions.md %}) and surrounding your batches of `AppendRow` calls by `BEGIN TRANSACTION` and `COMMIT` statements. + +## Handling Constraint Violations + +If the Appender encounters a `PRIMARY KEY` conflict or a `UNIQUE` constraint violation, it fails and returns the following error: + +```console +Constraint Error: PRIMARY KEY or UNIQUE constraint violated: duplicate key "..." +``` + +In this case, the entire append operation fails and no rows are inserted. + +## Appender Support in Other Clients + +The Appender is also available in the following client APIs: + +* [C]({% link docs/archive/1.0/api/c/appender.md %}) +* [Go]({% link docs/archive/1.0/api/go.md %}#appender) +* [Julia]({% link docs/archive/1.0/api/julia.md %}#appender-api) +* [JDBC (Java)]({% link docs/archive/1.0/api/java.md %}#appender) +* [Rust]({% link docs/archive/1.0/api/rust.md %}#appender) \ No newline at end of file diff --git a/docs/archive/1.0/data/csv/auto_detection.md b/docs/archive/1.0/data/csv/auto_detection.md new file mode 100644 index 00000000000..56bcb0fa60d --- /dev/null +++ b/docs/archive/1.0/data/csv/auto_detection.md @@ -0,0 +1,185 @@ +--- +layout: docu +title: CSV Auto Detection +--- + +When using `read_csv`, the system tries to automatically infer how to read the CSV file using the [CSV sniffer]({% post_url 2023-10-27-csv-sniffer %}). +This step is necessary because CSV files are not self-describing and come in many different dialects. The auto-detection works roughly as follows: + +* Detect the dialect of the CSV file (delimiter, quoting rule, escape) +* Detect the types of each of the columns +* Detect whether or not the file has a header row + +By default the system will try to auto-detect all options. However, options can be individually overridden by the user. 
This can be useful in case the system makes a mistake. For example, if the delimiter is chosen incorrectly, we can override it by calling the `read_csv` with an explicit delimiter (e.g., `read_csv('file.csv', delim = '|')`). + +The detection works by operating on a sample of the file. The size of the sample can be modified by setting the `sample_size` parameter. The default sample size is `20480` rows. Setting the `sample_size` parameter to `-1` means the entire file is read for sampling. The way sampling is performed depends on the type of file. If we are reading from a regular file on disk, we will jump into the file and try to sample from different locations in the file. If we are reading from a file in which we cannot jump – such as a `.gz` compressed CSV file or `stdin` – samples are taken only from the beginning of the file. + +## `sniff_csv` Function + +It is possible to run the CSV sniffer as a separate step using the `sniff_csv(filename)` function, which returns the detected CSV properties as a table with a single row. +The `sniff_csv` function accepts an optional `sample_size` parameter to configure the number of rows sampled. + +```sql +FROM sniff_csv('my_file.csv'); +FROM sniff_csv('my_file.csv', sample_size = 1000); +``` + +| Column name | Description | Example | +|----|-----|-------| +| `Delimiter` | delimiter | `,` | +| `Quote` | quote character | `"` | +| `Escape` | escape | `\` | +| `NewLineDelimiter` | new-line delimiter | `\r\n` | +| `SkipRow` | number of rows skipped | 1 | +| `HasHeader` | whether the CSV has a header | `true` | +| `Columns` | column types encoded as a `LIST` of `STRUCT`s | `({'name': 'VARCHAR', 'age': 'BIGINT'})` | +| `DateFormat` | date Format | `%d/%m/%Y` | +| `TimestampFormat` | timestamp Format | `%Y-%m-%dT%H:%M:%S.%f` | +| `UserArguments` | arguments used to invoke `sniff_csv` | `sample_size = 1000` | +| `Prompt` | prompt ready to be used to read the CSV | `FROM read_csv('my_file.csv', auto_detect=false, delim=',', ...)` | + +### Prompt + +The `Prompt` column contains a SQL command with the configurations detected by the sniffer. + +```sql +-- use line mode in CLI to get the full command +.mode line +SELECT Prompt FROM sniff_csv('my_file.csv'); +``` + +```text +Prompt = FROM read_csv('my_file.csv', auto_detect=false, delim=',', quote='"', escape='"', new_line='\n', skip=0, header=true, columns={...}); +``` + +## Detection Steps + +### Dialect Detection + +Dialect detection works by attempting to parse the samples using the set of considered values. The detected dialect is the dialect that has (1) a consistent number of columns for each row, and (2) the highest number of columns for each row. + +The following dialects are considered for automatic dialect detection. + +
+ + + +| Parameters | Considered values | +|------------|-----------------------| +| `delim` | `,` `|` `;` `\t` | +| `quote` | `"` `'` (empty) | +| `escape` | `"` `'` `\` (empty) | + + + +Consider the example file [`flights.csv`](/data/flights.csv): + +```csv +FlightDate|UniqueCarrier|OriginCityName|DestCityName +1988-01-01|AA|New York, NY|Los Angeles, CA +1988-01-02|AA|New York, NY|Los Angeles, CA +1988-01-03|AA|New York, NY|Los Angeles, CA +``` + +In this file, the dialect detection works as follows: + +* If we split by a `|` every row is split into `4` columns +* If we split by a `,` rows 2-4 are split into `3` columns, while the first row is split into `1` column +* If we split by `;`, every row is split into `1` column +* If we split by `\t`, every row is split into `1` column + +In this example – the system selects the `|` as the delimiter. All rows are split into the same amount of columns, and there is more than one column per row meaning the delimiter was actually found in the CSV file. + +### Type Detection + +After detecting the dialect, the system will attempt to figure out the types of each of the columns. Note that this step is only performed if we are calling `read_csv`. In case of the `COPY` statement the types of the table that we are copying into will be used instead. + +The type detection works by attempting to convert the values in each column to the candidate types. If the conversion is unsuccessful, the candidate type is removed from the set of candidate types for that column. After all samples have been handled – the remaining candidate type with the highest priority is chosen. The default set of candidate types is given below, in order of priority: + +
+ +| Types | +|-----------| +| BOOLEAN | +| BIGINT | +| DOUBLE | +| TIME | +| DATE | +| TIMESTAMP | +| VARCHAR | + +Note everything can be cast to `VARCHAR`. This type has the lowest priority – i.e., columns are converted to `VARCHAR` if they cannot be cast to anything else. In [`flights.csv`](/data/flights.csv) the `FlightDate` column will be cast to a `DATE`, while the other columns will be cast to `VARCHAR`. + +The set of candidate types that should be considered by the CSV reader can be explicitly specified using the [`auto_type_candidates`]({% link docs/archive/1.0/data/csv/overview.md %}#auto_type_candidates-details) option. + +In addition to the default set of candidate types, other types that may be specified using the `auto_type_candidates` options are: + +
+ +| Types | +|-----------| +| DECIMAL | +| FLOAT | +| INTEGER | +| SMALLINT | +| TINYINT | + +Even though the set of data types that can be automatically detected may appear quite limited, the CSV reader can configured to read arbitrarily complex types by using the `types`-option described in the next section. + +Type detection can be entirely disabled by using the `all_varchar` option. If this is set all columns will remain as `VARCHAR` (as they originally occur in the CSV file). + +#### Overriding Type Detection + +The detected types can be individually overridden using the `types` option. This option takes either of two options: + +* A list of type definitions (e.g., `types = ['INTEGER', 'VARCHAR', 'DATE']`). This overrides the types of the columns in-order of occurrence in the CSV file. +* Alternatively, `types` takes a `name` → `type` map which overrides options of individual columns (e.g., `types = {'quarter': 'INTEGER'}`). + +The set of column types that may be specified using the `types` option is not as limited as the types available for the `auto_type_candidates` option: any valid type definition is acceptable to the `types`-option. (To get a valid type definition, use the [`typeof()`]({% link docs/archive/1.0/sql/functions/utility.md %}#typeofexpression) function, or use the `column_type` column of the [`DESCRIBE`]({% link docs/archive/1.0/guides/meta/describe.md %}) result.) + +The `sniff_csv()` function's `Column` field returns a struct with column names and types that can be used as a basis for overriding types. + +## Header Detection + +Header detection works by checking if the candidate header row deviates from the other rows in the file in terms of types. For example, in [`flights.csv`](/data/flights.csv), we can see that the header row consists of only `VARCHAR` columns – whereas the values contain a `DATE` value for the `FlightDate` column. As such – the system defines the first row as the header row and extracts the column names from the header row. + +In files that do not have a header row, the column names are generated as `column0`, `column1`, etc. + +Note that headers cannot be detected correctly if all columns are of type `VARCHAR` – as in this case the system cannot distinguish the header row from the other rows in the file. In this case, the system assumes the file has a header. This can be overridden by setting the `header` option to `false`. + +### Dates and Timestamps + +DuckDB supports the [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601) format by default for timestamps, dates and times. Unfortunately, not all dates and times are formatted using this standard. For that reason, the CSV reader also supports the `dateformat` and `timestampformat` options. Using this format the user can specify a [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}) that specifies how the date or timestamp should be read. + +As part of the auto-detection, the system tries to figure out if dates and times are stored in a different representation. This is not always possible – as there are ambiguities in the representation. For example, the date `01-02-2000` can be parsed as either January 2nd or February 1st. Often these ambiguities can be resolved. For example, if we later encounter the date `21-02-2000` then we know that the format must have been `DD-MM-YYYY`. `MM-DD-YYYY` is no longer possible as there is no 21nd month. + +If the ambiguities cannot be resolved by looking at the data the system has a list of preferences for which date format to use. 
If the system choses incorrectly, the user can specify the `dateformat` and `timestampformat` options manually. + +The system considers the following formats for dates (`dateformat`). Higher entries are chosen over lower entries in case of ambiguities (i.e., ISO 8601 is preferred over `MM-DD-YYYY`). + +
+ +| dateformat | +|------------| +| ISO 8601 | +| %y-%m-%d | +| %Y-%m-%d | +| %d-%m-%y | +| %d-%m-%Y | +| %m-%d-%y | +| %m-%d-%Y | + +The system considers the following formats for timestamps (`timestampformat`). Higher entries are chosen over lower entries in case of ambiguities. + +
+ +| timestampformat | +|----------------------| +| ISO 8601 | +| %y-%m-%d %H:%M:%S | +| %Y-%m-%d %H:%M:%S | +| %d-%m-%y %H:%M:%S | +| %d-%m-%Y %H:%M:%S | +| %m-%d-%y %I:%M:%S %p | +| %m-%d-%Y %I:%M:%S %p | +| %Y-%m-%d %H:%M:%S.%f | \ No newline at end of file diff --git a/docs/archive/1.0/data/csv/overview.md b/docs/archive/1.0/data/csv/overview.md new file mode 100644 index 00000000000..d4a2c01e9ec --- /dev/null +++ b/docs/archive/1.0/data/csv/overview.md @@ -0,0 +1,224 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/data/csv +title: CSV Import +--- + +## Examples + +The following examples use the [`flights.csv`](/data/flights.csv) file. + +Read a CSV file from disk, auto-infer options: + +```sql +SELECT * FROM 'flights.csv'; +``` + +Use the `read_csv` function with custom options: + +```sql +SELECT * +FROM read_csv('flights.csv', + delim = '|', + header = true, + columns = { + 'FlightDate': 'DATE', + 'UniqueCarrier': 'VARCHAR', + 'OriginCityName': 'VARCHAR', + 'DestCityName': 'VARCHAR' + }); +``` + +Read a CSV from stdin, auto-infer options: + +```bash +cat flights.csv | duckdb -c "SELECT * FROM read_csv('/dev/stdin')" +``` + +Read a CSV file into a table: + +```sql +CREATE TABLE ontime ( + FlightDate DATE, + UniqueCarrier VARCHAR, + OriginCityName VARCHAR, + DestCityName VARCHAR +); +COPY ontime FROM 'flights.csv'; +``` + +Alternatively, create a table without specifying the schema manually using a [`CREATE TABLE .. AS SELECT` statement]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas): + +```sql +CREATE TABLE ontime AS + SELECT * FROM 'flights.csv'; +``` + +We can use the [`FROM`-first syntax]({% link docs/archive/1.0/sql/query_syntax/from.md %}#from-first-syntax) to omit `SELECT *`. + +```sql +CREATE TABLE ontime AS + FROM 'flights.csv'; +``` + +Write the result of a query to a CSV file. + +```sql +COPY (SELECT * FROM ontime) TO 'flights.csv' WITH (HEADER, DELIMITER '|'); +``` + +If we serialize the entire table, we can simply refer to it with its name. + +```sql +COPY ontime TO 'flights.csv' WITH (HEADER, DELIMITER '|'); +``` + +## CSV Loading + +CSV loading, i.e., importing CSV files to the database, is a very common, and yet surprisingly tricky, task. While CSVs seem simple on the surface, there are a lot of inconsistencies found within CSV files that can make loading them a challenge. CSV files come in many different varieties, are often corrupt, and do not have a schema. The CSV reader needs to cope with all of these different situations. + +The DuckDB CSV reader can automatically infer which configuration flags to use by analyzing the CSV file using the [CSV sniffer]({% post_url 2023-10-27-csv-sniffer %}). This will work correctly in most situations, and should be the first option attempted. In rare situations where the CSV reader cannot figure out the correct configuration it is possible to manually configure the CSV reader to correctly parse the CSV file. See the [auto detection page]({% link docs/archive/1.0/data/csv/auto_detection.md %}) for more information. + +## Parameters + +Below are parameters that can be passed to the CSV reader. These parameters are accepted by both the [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}#copy-to) and the [`read_csv` function](#csv-functions). + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `all_varchar` | Option to skip type detection for CSV parsing and assume all columns to be of type `VARCHAR`. 
| `BOOL` | `false` | +| `allow_quoted_nulls` | Option to allow the conversion of quoted values to `NULL` values | `BOOL` | `true` | +| `auto_detect` | Enables [auto detection of CSV parameters]({% link docs/archive/1.0/data/csv/auto_detection.md %}). | `BOOL` | `true` | +| `auto_type_candidates` | This option allows you to specify the types that the sniffer will use when detecting CSV column types. The `VARCHAR` type is always included in the detected types (as a fallback option). See [example](#auto_type_candidates-details). | `TYPE[]` | [default types](#auto_type_candidates-details) | +| `columns` | A struct that specifies the column names and column types contained within the CSV file (e.g., `{'col1': 'INTEGER', 'col2': 'VARCHAR'}`). Using this option implies that auto detection is not used. | `STRUCT` | (empty) | +| `compression` | The compression type for the file. By default this will be detected automatically from the file extension (e.g., `t.csv.gz` will use gzip, `t.csv` will use `none`). Options are `none`, `gzip`, `zstd`. | `VARCHAR` | `auto` | +| `dateformat` | Specifies the date format to use when parsing dates. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | `VARCHAR` | (empty) | +| `decimal_separator` | The decimal separator of numbers. | `VARCHAR` | `.` | +| `delim` | Specifies the delimiter character that separates columns within each row (line) of the file. Alias for `sep`. | `VARCHAR` | `,` | +| `escape` | Specifies the string that should appear before a data character sequence that matches the `quote` value. | `VARCHAR` | `"` | +| `filename` | Whether or not an extra `filename` column should be included in the result. | `BOOL` | `false` | +| `force_not_null` | Do not match the specified columns' values against the NULL string. In the default case where the `NULL` string is empty, this means that empty values will be read as zero-length strings rather than `NULL`s. | `VARCHAR[]` | `[]` | +| `header` | Specifies that the file contains a header line with the names of each column in the file. | `BOOL` | `false` | +| `hive_partitioning` | Whether or not to interpret the path as a [Hive partitioned path]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}). | `BOOL` | `false` | +| `ignore_errors` | Option to ignore any parsing errors encountered – and instead ignore rows with errors. | `BOOL` | `false` | +| `max_line_size` | The maximum line size in bytes. | `BIGINT` | 2097152 | +| `names` | The column names as a list, see [example]({% link docs/archive/1.0/data/csv/tips.md %}#provide-names-if-the-file-does-not-contain-a-header). | `VARCHAR[]` | (empty) | +| `new_line` | Set the new line character(s) in the file. Options are `'\r'`,`'\n'`, or `'\r\n'`. | `VARCHAR` | (empty) | +| `normalize_names` | Boolean value that specifies whether or not column names should be normalized, removing any non-alphanumeric characters from them. | `BOOL` | `false` | +| `null_padding` | If this option is enabled, when a row lacks columns, it will pad the remaining columns on the right with null values. | `BOOL` | `false` | +| `nullstr` | Specifies the string that represents a `NULL` value or (since v0.10.2) a list of strings that represent a `NULL` value. | `VARCHAR` or `VARCHAR[]` | (empty) | +| `parallel` | Whether or not the parallel CSV reader is used. | `BOOL` | `true` | +| `quote` | Specifies the quoting string to be used when a data value is quoted. 
| `VARCHAR` | `"` | +| `sample_size` | The number of sample rows for [auto detection of parameters]({% link docs/archive/1.0/data/csv/auto_detection.md %}). | `BIGINT` | 20480 | +| `sep` | Specifies the delimiter character that separates columns within each row (line) of the file. Alias for `delim`. | `VARCHAR` | `,` | +| `skip` | The number of lines at the top of the file to skip. | `BIGINT` | 0 | +| `timestampformat` | Specifies the date format to use when parsing timestamps. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | `VARCHAR` | (empty) | +| `types` or `dtypes` | The column types as either a list (by position) or a struct (by name). [Example here]({% link docs/archive/1.0/data/csv/tips.md %}#override-the-types-of-specific-columns). | `VARCHAR[]` or `STRUCT` | (empty) | +| `union_by_name` | Whether the columns of multiple schemas should be [unified by name]({% link docs/archive/1.0/data/multiple_files/combining_schemas.md %}#union-by-name), rather than by position. Note that using this option increases memory consumption. | `BOOL` | `false` | + +### `auto_type_candidates` Details + +The `auto_type_candidates` option lets you specify the data types that should be considered by the CSV reader for [column data type detection]({% link docs/archive/1.0/data/csv/auto_detection.md %}#type-detection). +Usage example: + +```sql +SELECT * FROM read_csv('csv_file.csv', auto_type_candidates = ['BIGINT', 'DATE']); +``` + +The default value for the `auto_type_candidates` option is `['SQLNULL', 'BOOLEAN', 'BIGINT', 'DOUBLE', 'TIME', 'DATE', 'TIMESTAMP', 'VARCHAR']`. + +## CSV Functions + +The `read_csv` automatically attempts to figure out the correct configuration of the CSV reader using the [CSV sniffer]({% post_url 2023-10-27-csv-sniffer %}). It also automatically deduces types of columns. If the CSV file has a header, it will use the names found in that header to name the columns. Otherwise, the columns will be named `column0, column1, column2, ...`. An example with the [`flights.csv`](/data/flights.csv) file: + +```sql +SELECT * FROM read_csv('flights.csv'); +``` + +
+ +| FlightDate | UniqueCarrier | OriginCityName | DestCityName | +|------------|---------------|----------------|-----------------| +| 1988-01-01 | AA | New York, NY | Los Angeles, CA | +| 1988-01-02 | AA | New York, NY | Los Angeles, CA | +| 1988-01-03 | AA | New York, NY | Los Angeles, CA | + +The path can either be a relative path (relative to the current working directory) or an absolute path. + +We can use `read_csv` to create a persistent table as well: + +```sql +CREATE TABLE ontime AS + SELECT * FROM read_csv('flights.csv'); +DESCRIBE ontime; +``` + +
+ +| column_name | column_type | null | key | default | extra | +|----------------|-------------|------|------|---------|-------| +| FlightDate | DATE | YES | NULL | NULL | NULL | +| UniqueCarrier | VARCHAR | YES | NULL | NULL | NULL | +| OriginCityName | VARCHAR | YES | NULL | NULL | NULL | +| DestCityName | VARCHAR | YES | NULL | NULL | NULL | + +```sql +SELECT * FROM read_csv('flights.csv', sample_size = 20_000); +``` + +If we set `delim`/`sep`, `quote`, `escape`, or `header` explicitly, we can bypass the automatic detection of this particular parameter: + +```sql +SELECT * FROM read_csv('flights.csv', header = true); +``` + +Multiple files can be read at once by providing a glob or a list of files. Refer to the [multiple files section]({% link docs/archive/1.0/data/multiple_files/overview.md %}) for more information. + +## Writing Using the `COPY` Statement + +The [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}#copy-to) can be used to load data from a CSV file into a table. This statement has the same syntax as the one used in PostgreSQL. To load the data using the `COPY` statement, we must first create a table with the correct schema (which matches the order of the columns in the CSV file and uses types that fit the values in the CSV file). `COPY` detects the CSV's configuration options automatically. + +```sql +CREATE TABLE ontime ( + flightdate DATE, + uniquecarrier VARCHAR, + origincityname VARCHAR, + destcityname VARCHAR +); +COPY ontime FROM 'flights.csv'; +SELECT * FROM ontime; +``` + +
+ +| flightdate | uniquecarrier | origincityname | destcityname | +|------------|---------------|----------------|-----------------| +| 1988-01-01 | AA | New York, NY | Los Angeles, CA | +| 1988-01-02 | AA | New York, NY | Los Angeles, CA | +| 1988-01-03 | AA | New York, NY | Los Angeles, CA | + +If we want to manually specify the CSV format, we can do so using the configuration options of `COPY`. + +```sql +CREATE TABLE ontime (flightdate DATE, uniquecarrier VARCHAR, origincityname VARCHAR, destcityname VARCHAR); +COPY ontime FROM 'flights.csv' (DELIMITER '|', HEADER); +SELECT * FROM ontime; +``` + +## Reading Faulty CSV Files + +DuckDB supports reading erroneous CSV files. For details, see the [Reading Faulty CSV Files page]({% link docs/archive/1.0/data/csv/reading_faulty_csv_files.md %}). + +## Limitations + +The CSV reader only supports input files using UTF-8 character encoding. For CSV files using different encodings, use e.g., the [`iconv` command-line tool](https://linux.die.net/man/1/iconv) to convert them to UTF-8. For example: + +```bash +iconv -f ISO-8859-2 -t UTF-8 input.csv > input-utf-8.csv +``` + +## Order Preservation + +The CSV reader respects the `preserve_insertion_order` [configuration option]({% link docs/archive/1.0/configuration/overview.md %}). +When `true` (the default), the order of the rows in the resultset returned by the CSV reader is the same as the order of the corresponding lines read from the file(s). +When `false`, there is no guarantee that the order is preserved. \ No newline at end of file diff --git a/docs/archive/1.0/data/csv/reading_faulty_csv_files.md b/docs/archive/1.0/data/csv/reading_faulty_csv_files.md new file mode 100644 index 00000000000..fd25034df6d --- /dev/null +++ b/docs/archive/1.0/data/csv/reading_faulty_csv_files.md @@ -0,0 +1,235 @@ +--- +layout: docu +redirect_from: null +title: Reading Faulty CSV Files +--- + +CSV files can come in all shapes and forms, with some presenting many errors that make the process of cleanly reading them inherently difficult. To help users read these files, DuckDB supports detailed error messages, the ability to skip faulty lines, and the possibility of storing faulty lines in a temporary table to assist users with a data cleaning step. + +## Structural Errors + +DuckDB supports the detection and skipping of several different structural errors. In this section, we will go over each error with an example. +For the examples, consider the following table: + +```sql +CREATE TABLE people (name VARCHAR, birth_date DATE); +``` + +DuckDB detects the following error types: + +* `CAST`: Casting errors occur when a column in the CSV file cannot be cast to the expected schema value. For example, the line `Pedro,The 90s` would cause an error since the string `The 90s` cannot be cast to a date. +* `MISSING COLUMNS`: This error occurs if a line in the CSV file has fewer columns than expected. In our example, we expect two columns; therefore, a row with just one value, e.g., `Pedro`, would cause this error. +* `TOO MANY COLUMNS`: This error occurs if a line in the CSV has more columns than expected. In our example, any line with more than two columns would cause this error, e.g., `Pedro,01-01-1992,pdet`. +* `UNQUOTED VALUE`: Quoted values in CSV lines must always be unquoted at the end; if a quoted value remains quoted throughout, it will cause an error. For example, assuming our scanner uses `quote='"'`, the line `"pedro"holanda, 01-01-1992` would present an unquoted value error. 
+* `LINE SIZE OVER MAXIMUM`: DuckDB has a parameter that sets the maximum line size a CSV file can have, which by default is set to `2,097,152` bytes. Assuming our scanner is set to `max_line_size = 25`, the line `Pedro Holanda, 01-01-1992` would produce an error, as it exceeds 25 bytes. +* `INVALID UNICODE`: DuckDB only supports UTF-8 strings; thus, lines containing non-UTF-8 characters will produce an error. For example, the line `pedro\xff\xff, 01-01-1992` would be problematic. + +### Anatomy of a CSV Error + +By default, when performing a CSV read, if any structural errors are encountered, the scanner will immediately stop the scanning process and throw the error to the user. +These errors are designed to provide as much information as possible to allow users to evaluate them directly in their CSV file. + +This is an example for a full error message: + +```console +Conversion Error: CSV Error on Line: 5648 +Original Line: Pedro,The 90s +Error when converting column "birth_date". date field value out of range: "The 90s", expected format is (DD-MM-YYYY) + +Column date is being converted as type DATE +This type was auto-detected from the CSV file. +Possible solutions: +* Override the type for this column manually by setting the type explicitly, e.g. types={'birth_date': 'VARCHAR'} +* Set the sample size to a larger value to enable the auto-detection to scan more values, e.g. sample_size=-1 +* Use a COPY statement to automatically derive types from an existing table. + + file= people.csv + delimiter = , (Auto-Detected) + quote = " (Auto-Detected) + escape = " (Auto-Detected) + new_line = \r\n (Auto-Detected) + header = true (Auto-Detected) + skip_rows = 0 (Auto-Detected) + date_format = (DD-MM-YYYY) (Auto-Detected) + timestamp_format = (Auto-Detected) + null_padding=0 + sample_size=20480 + ignore_errors=false + all_varchar=0 +``` + +The first block provides us with information regarding where the error occurred, including the line number, the original CSV line, and which field was problematic: + +```console +Conversion Error: CSV Error on Line: 5648 +Original Line: Pedro,The 90s +Error when converting column "birth_date". date field value out of range: "The 90s", expected format is (DD-MM-YYYY) +``` + +The second block provides us with potential solutions: + +```console +Column date is being converted as type DATE +This type was auto-detected from the CSV file. +Possible solutions: +* Override the type for this column manually by setting the type explicitly, e.g. types={'birth_date': 'VARCHAR'} +* Set the sample size to a larger value to enable the auto-detection to scan more values, e.g. sample_size=-1 +* Use a COPY statement to automatically derive types from an existing table. +``` + +Since the type of this field was auto-detected, it suggests defining the field as a `VARCHAR` or fully utilizing the dataset for type detection. + +Finally, the last block presents some of the options used in the scanner that can cause errors, indicating whether they were auto-detected or manually set by the user. + +## Using the `ignore_errors` Option + +There are cases where CSV files may have multiple structural errors, and users simply wish to skip these and read the correct data. Reading erroneous CSV files is possible by utilizing the `ignore_errors` option. With this option set, rows containing data that would otherwise cause the CSV parser to generate an error will be ignored. 
In our example, we will demonstrate a CAST error, but note that any of the errors described in our Structural Error section would cause the faulty line to be skipped. + +For example, consider the following CSV file, [`faulty.csv`](/data/faulty.csv): + +```csv +Pedro,31 +Oogie Boogie, three +``` + +If you read the CSV file, specifying that the first column is a `VARCHAR` and the second column is an `INTEGER`, loading the file would fail, as the string `three` cannot be converted to an `INTEGER`. + +For example, the following query will throw a casting error. + +```sql +FROM read_csv('faulty.csv', columns = {'name': 'VARCHAR', 'age': 'INTEGER'}); +``` + +However, with `ignore_errors` set, the second row of the file is skipped, outputting only the complete first row. For example: + +```sql +FROM read_csv( + 'faulty.csv', + columns = {'name': 'VARCHAR', 'age': 'INTEGER'}, + ignore_errors = true +); +``` + +Outputs: + +
+
+| name | age |
+|-------|-----|
+| Pedro | 31 |
+
+Note that the CSV parser is affected by the projection pushdown optimization. Hence, if we select only the `name` column, both rows are considered valid, as the casting error on the `age` column never occurs. For example:
+
+```sql
+SELECT name
+FROM read_csv('faulty.csv', columns = {'name': 'VARCHAR', 'age': 'INTEGER'});
+```
+
+Outputs:
+
+
+| name |
+|--------------|
+| Pedro |
+| Oogie Boogie |
+
+## Retrieving Faulty CSV Lines
+
+Being able to read faulty CSV files is important, but for many data cleaning operations, it is also necessary to know exactly which lines are corrupted and what errors the parser discovered on them. For scenarios like these, it is possible to use DuckDB's CSV Rejects Table feature.
+By default, this feature creates two temporary tables:
+
+1. `reject_scans`: Stores information regarding the parameters of the CSV Scanner.
+2. `reject_errors`: Stores information regarding each faulty CSV line and the CSV Scanner in which it occurred.
+
+Note that any of the errors described in our Structural Error section will be stored in the rejects tables. Also, if a line has multiple errors, multiple entries will be stored for the same line, one for each error.
+
+### Reject Scans
+
+The CSV Reject Scans Table returns the following information:
+
+
+| Column name | Description | Type |
+|:--|:-----|:-|
+| `scan_id` | The internal ID used in DuckDB to represent that scanner | `UBIGINT` |
+| `file_id` | A scanner can read multiple files, so the `file_id` represents a unique file within a scanner | `UBIGINT` |
+| `file_path` | The file path | `VARCHAR` |
+| `delimiter` | The delimiter used, e.g., ; | `VARCHAR` |
+| `quote` | The quote used, e.g., " | `VARCHAR` |
+| `escape` | The escape used, e.g., " | `VARCHAR` |
+| `newline_delimiter` | The newline delimiter used, e.g., \r\n | `VARCHAR` |
+| `skip_rows` | If any rows were skipped from the top of the file | `UINTEGER` |
+| `has_header` | If the file has a header | `BOOLEAN` |
+| `columns` | The schema of the file (i.e., all column names and types) | `VARCHAR` |
+| `date_format` | The format used for date types | `VARCHAR` |
+| `timestamp_format` | The format used for timestamp types | `VARCHAR` |
+| `user_arguments` | Any extra scanner parameters manually set by the user | `VARCHAR` |
+
+### Reject Errors
+
+The CSV Reject Errors Table returns the following information:
+
+
+| Column name | Description | Type |
+|:--|:-----|:-|
+| `scan_id` | The internal ID used in DuckDB to represent that scanner, used to join with the `reject_scans` table | `UBIGINT` |
+| `file_id` | The `file_id` represents a unique file in a scanner, used to join with the `reject_scans` table | `UBIGINT` |
+| `line` | Line number, from the CSV file, where the error occurred. | `UBIGINT` |
+| `line_byte_position` | Byte position of the start of the line where the error occurred. | `UBIGINT` |
+| `byte_position` | Byte position where the error occurred. | `UBIGINT` |
+| `column_idx` | If the error happens in a specific column, the index of the column. | `UBIGINT` |
+| `column_name` | If the error happens in a specific column, the name of the column. | `VARCHAR` |
+| `error_type` | The type of the error that happened. | `ENUM` |
+| `csv_line` | The original CSV line. | `VARCHAR` |
+| `error_message` | The error message produced by DuckDB. | `VARCHAR` |
+
+## Parameters
+
+
+The parameters listed below are used in the `read_csv` function to configure the CSV Rejects Table.
+
+| Name | Description | Type | Default |
+|:--|:-----|:-|:-|
+| `store_rejects` | If set to true, any errors in the file will be skipped and stored in the default rejects temporary tables. | `BOOLEAN` | False |
+| `rejects_scan` | Name of a temporary table where information about the scans of faulty CSV files is stored. | `VARCHAR` | reject_scans |
+| `rejects_table` | Name of a temporary table where information about the faulty lines of a CSV file is stored. | `VARCHAR` | reject_errors |
+| `rejects_limit` | Upper limit on the number of faulty records from a CSV file that will be recorded in the rejects table. Set to 0 to apply no limit. | `BIGINT` | 0 |
+
+To store information about the faulty CSV lines in the rejects tables, simply set the `store_rejects` option to true. For example:
+
+```sql
+FROM read_csv(
+    'faulty.csv',
+    columns = {'name': 'VARCHAR', 'age': 'INTEGER'},
+    store_rejects = true
+);
+```
+
+You can then query both the `reject_scans` and `reject_errors` tables to retrieve information about the rejected tuples. For example:
+
+```sql
+FROM reject_scans;
+```
+
+Outputs:
+
+ +| scan_id | file_id | file_path | delimiter | quote | escape | newline_delimiter | skip_rows | has_header | columns | date_format | timestamp_format | user_arguments | +|---------|---------|-----------------------------------|-----------|-------|--------|-------------------|-----------|-----------:|--------------------------------------|-------------|------------------|--------------------| +| 5 | 0 | faulty.csv | , | " | " | \n | 0 | false | {'name': 'VARCHAR','age': 'INTEGER'} | | | store_rejects=true | + +```sql +FROM reject_errors; +``` + +Outputs: + +
+ +| scan_id | file_id | line | line_byte_position | byte_position | column_idx | column_name | error_type | csv_line | error_message | +|---------|---------|------|--------------------|---------------|------------|-------------|------------|---------------------|------------------------------------------------------------------------------------| +| 5 | 0 | 2 | 10 | 23 | 2 | age | CAST | Oogie Boogie, three | Error when converting column "age". Could not convert string " three" to 'INTEGER' | \ No newline at end of file diff --git a/docs/archive/1.0/data/csv/tips.md b/docs/archive/1.0/data/csv/tips.md new file mode 100644 index 00000000000..9c4d8e5bc3c --- /dev/null +++ b/docs/archive/1.0/data/csv/tips.md @@ -0,0 +1,46 @@ +--- +layout: docu +title: CSV Import Tips +--- + +Below is a collection of tips to help when attempting to import complex CSV files. In the examples, we use the [`flights.csv`](/data/flights.csv) file. + +## Override the Header Flag if the Header Is Not Correctly Detected + +If a file contains only string columns the `header` auto-detection might fail. Provide the `header` option to override this behavior. + +```sql +SELECT * FROM read_csv('flights.csv', header = true); +``` + +## Provide Names if the File Does Not Contain a Header + +If the file does not contain a header, names will be auto-generated by default. You can provide your own names with the `names` option. + +```sql +SELECT * FROM read_csv('flights.csv', names = ['DateOfFlight', 'CarrierName']); +``` + +## Override the Types of Specific Columns + +The `types` flag can be used to override types of only certain columns by providing a struct of `name` → `type` mappings. + +```sql +SELECT * FROM read_csv('flights.csv', types = {'FlightDate': 'DATE'}); +``` + +## Use `COPY` When Loading Data into a Table + +The [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}) copies data directly into a table. The CSV reader uses the schema of the table instead of auto-detecting types from the file. This speeds up the auto-detection, and prevents mistakes from being made during auto-detection. + +```sql +COPY tbl FROM 'test.csv'; +``` + +## Use `union_by_name` When Loading Files with Different Schemas + +The `union_by_name` option can be used to unify the schema of files that have different or missing columns. For files that do not have certain columns, `NULL` values are filled in. + +```sql +SELECT * FROM read_csv('flights*.csv', union_by_name = true); +``` \ No newline at end of file diff --git a/docs/archive/1.0/data/insert.md b/docs/archive/1.0/data/insert.md new file mode 100644 index 00000000000..6ddd4904b5a --- /dev/null +++ b/docs/archive/1.0/data/insert.md @@ -0,0 +1,23 @@ +--- +layout: docu +title: INSERT Statements +--- + +`INSERT` statements are the standard way of loading data into a relational database. When using `INSERT` statements, the values are supplied row-by-row. While simple, there is significant overhead involved in parsing and processing individual `INSERT` statements. This makes lots of individual row-by-row insertions very inefficient for bulk insertion. + +> Bestpractice As a rule-of-thumb, avoid using lots of individual row-by-row `INSERT` statements when inserting more than a few rows (i.e., avoid using `INSERT` statements as part of a loop). When bulk inserting data, try to maximize the amount of data that is inserted per statement. + +If you must use `INSERT` statements to load data in a loop, avoid executing the statements in auto-commit mode. 
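+
+As a minimal sketch (using the `people` table from the syntax example below), explicitly wrapping the loop's statements in a transaction looks like this:
+
+```sql
+BEGIN TRANSACTION;
+INSERT INTO people VALUES (1, 'Mark');
+INSERT INTO people VALUES (2, 'Hannes');
+-- ... many more single-row inserts issued by the loop ...
+COMMIT;
+```
+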
After every commit, the database is required to sync the changes made to disk to ensure no data is lost. In auto-commit mode every single statement will be wrapped in a separate transaction, meaning `fsync` will be called for every statement. This is typically unnecessary when bulk loading and will significantly slow down your program. + +> Tip If you absolutely must use `INSERT` statements in a loop to load data, wrap them in calls to `BEGIN TRANSACTION` and `COMMIT`. + +## Syntax + +An example of using `INSERT INTO` to load data in a table is as follows: + +```sql +CREATE TABLE people (id INTEGER, name VARCHAR); +INSERT INTO people VALUES (1, 'Mark'), (2, 'Hannes'); +``` + +For a more detailed description together with syntax diagram can be found, see the [page on the `INSERT statement`]({% link docs/archive/1.0/sql/statements/insert.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/data/json/overview.md b/docs/archive/1.0/data/json/overview.md new file mode 100644 index 00000000000..ba2b74e1a30 --- /dev/null +++ b/docs/archive/1.0/data/json/overview.md @@ -0,0 +1,320 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/data/json +title: JSON Loading +--- + +## Examples + +Read a JSON file from disk, auto-infer options: + +```sql +SELECT * FROM 'todos.json'; +``` + +Use the `read_json` function with custom options: + +```sql +SELECT * +FROM read_json('todos.json', + format = 'array', + columns = {userId: 'UBIGINT', + id: 'UBIGINT', + title: 'VARCHAR', + completed: 'BOOLEAN'}); +``` + +Read a JSON file from stdin, auto-infer options: + +```bash +cat data/json/todos.json | duckdb -c "SELECT * FROM read_json('/dev/stdin')" +``` + +Read a JSON file into a table: + +```sql +CREATE TABLE todos (userId UBIGINT, id UBIGINT, title VARCHAR, completed BOOLEAN); +COPY todos FROM 'todos.json'; +``` + +Alternatively, create a table without specifying the schema manually with a [`CREATE TABLE ... AS SELECT` clause]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas): + +```sql +CREATE TABLE todos AS + SELECT * FROM 'todos.json'; +``` + +Write the result of a query to a JSON file: + +```sql +COPY (SELECT * FROM todos) TO 'todos.json'; +``` + +## JSON Loading + +JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values). +While it is not a very efficient format for tabular data, it is very commonly used, especially as a data interchange format. + +The DuckDB JSON reader can automatically infer which configuration flags to use by analyzing the JSON file. This will work correctly in most situations, and should be the first option attempted. In rare situations where the JSON reader cannot figure out the correct configuration, it is possible to manually configure the JSON reader to correctly parse the JSON file. + +Below are parameters that can be passed in to the JSON reader. + +## Parameters + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `auto_detect` | Whether to auto-detect detect the names of the keys and data types of the values automatically | `BOOL` | `false` | +| `columns` | A struct that specifies the key names and value types contained within the JSON file (e.g., `{key1: 'INTEGER', key2: 'VARCHAR'}`). If `auto_detect` is enabled these will be inferred | `STRUCT` | `(empty)` | +| `compression` | The compression type for the file. 
By default this will be detected automatically from the file extension (e.g., `t.json.gz` will use gzip, `t.json` will use none). Options are `'uncompressed'`, `'gzip'`, `'zstd'`, and `'auto_detect'`. | `VARCHAR` | `'auto_detect'` | +| `convert_strings_to_integers` | Whether strings representing integer values should be converted to a numerical type. | `BOOL` | `false` | +| `dateformat` | Specifies the date format to use when parsing dates. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | `'iso'` | +| `filename` | Whether or not an extra `filename` column should be included in the result. | `BOOL` | `false` | +| `format` | Can be one of `['auto', 'unstructured', 'newline_delimited', 'array']` | `VARCHAR` | `'array'` | +| `hive_partitioning` | Whether or not to interpret the path as a [Hive partitioned path]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}). | `BOOL` | `false` | +| `ignore_errors` | Whether to ignore parse errors (only possible when `format` is `'newline_delimited'`) | `BOOL` | `false` | +| `maximum_depth` | Maximum nesting depth to which the automatic schema detection detects types. Set to -1 to fully detect nested JSON types | `BIGINT` | `-1` | +| `maximum_object_size` | The maximum size of a JSON object (in bytes) | `UINTEGER` | `16777216` | +| `records` | Can be one of `['auto', 'true', 'false']` | `VARCHAR` | `'records'` | +| `sample_size` | Option to define number of sample objects for automatic JSON type detection. Set to -1 to scan the entire input file | `UBIGINT` | `20480` | +| `timestampformat` | Specifies the date format to use when parsing timestamps. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | `'iso'`| +| `union_by_name` | Whether the schema's of multiple JSON files should be [unified]({% link docs/archive/1.0/data/multiple_files/combining_schemas.md %}). | `BOOL` | `false` | + +## Examples of Format Settings + +The JSON extension can attempt to determine the format of a JSON file when setting `format` to `auto`. +Here are some example JSON files and the corresponding `format` settings that should be used. + +In each of the below cases, the `format` setting was not needed, as DuckDB was able to infer it correctly, but it is included for illustrative purposes. +A query of this shape would work in each case: + +```sql +SELECT * +FROM filename.json; +``` + +### Format: `newline_delimited` + +With `format = 'newline_delimited'` newline-delimited JSON can be parsed. +Each line is a JSON. + +We use the example file [`records.json`](/data/records.json) with the following content: + +```json +{"key1":"value1", "key2": "value1"} +{"key1":"value2", "key2": "value2"} +{"key1":"value3", "key2": "value3"} +``` + +```sql +SELECT * +FROM read_json('records.json', format = 'newline_delimited'); +``` + +
+ +| key1 | key2 | +|--------|--------| +| value1 | value1 | +| value2 | value2 | +| value3 | value3 | + +### Format: `array` + +If the JSON file contains a JSON array of objects (pretty-printed or not), `array_of_objects` may be used. +To demonstrate its use, we use the example file [`records-in-array.json`](/data/records-in-array.json): + +```json +[ + {"key1":"value1", "key2": "value1"}, + {"key1":"value2", "key2": "value2"}, + {"key1":"value3", "key2": "value3"} +] +``` + +```sql +SELECT * +FROM read_json('records-in-array.json', format = 'array'); +``` + +
+ +| key1 | key2 | +|--------|--------| +| value1 | value1 | +| value2 | value2 | +| value3 | value3 | + +### Format: `unstructured` + +If the JSON file contains JSON that is not newline-delimited or an array, `unstructured` may be used. +To demonstrate its use, we use the example file [`unstructured.json`](/data/unstructured.json): + +```json +{ + "key1":"value1", + "key2":"value1" +} +{ + "key1":"value2", + "key2":"value2" +} +{ + "key1":"value3", + "key2":"value3" +} +``` + +```sql +SELECT * +FROM read_json('unstructured.json', format = 'unstructured'); +``` + +
+ +| key1 | key2 | +|--------|--------| +| value1 | value1 | +| value2 | value2 | +| value3 | value3 | + +## Examples of Records Settings + +The JSON extension can attempt to determine whether a JSON file contains records when setting `records = auto`. +When `records = true`, the JSON extension expects JSON objects, and will unpack the fields of JSON objects into individual columns. + +Continuing with the same example file, [`records.json`](/data/records.json): + +```json +{"key1":"value1", "key2": "value1"} +{"key1":"value2", "key2": "value2"} +{"key1":"value3", "key2": "value3"} +``` + +```sql +SELECT * +FROM read_json('records.json', records = true); +``` + +
+ +| key1 | key2 | +|--------|--------| +| value1 | value1 | +| value2 | value2 | +| value3 | value3 | + +When `records = false`, the JSON extension will not unpack the top-level objects, and create `STRUCT`s instead: + +```sql +SELECT * +FROM read_json('records.json', records = false); +``` + +
+ +| json | +|----------------------------------| +| {'key1': value1, 'key2': value1} | +| {'key1': value2, 'key2': value2} | +| {'key1': value3, 'key2': value3} | + +This is especially useful if we have non-object JSON, for example, [`arrays.json`](/data/arrays.json): + +```json +[1, 2, 3] +[4, 5, 6] +[7, 8, 9] +``` + +```sql +SELECT * +FROM read_json('arrays.json', records = false); +``` + +
+
+| json |
+|-----------|
+| [1, 2, 3] |
+| [4, 5, 6] |
+| [7, 8, 9] |
+
+## Writing
+
+The contents of tables or the result of queries can be written directly to a JSON file using the `COPY` statement. See the [COPY documentation]({% link docs/archive/1.0/sql/statements/copy.md %}#copy-to) for more information.
+
+## `read_json` Function
+
+The `read_json` function is the simplest method of loading JSON files: it automatically attempts to figure out the correct configuration of the JSON reader. It also automatically deduces the types of columns.
+
+```sql
+SELECT *
+FROM read_json('todos.json')
+LIMIT 5;
+```
+
+ +| userId | id | title | completed | +|-------:|---:|-----------------------------------------------------------------|-----------| +| 1 | 1 | delectus aut autem | false | +| 1 | 2 | quis ut nam facilis et officia qui | false | +| 1 | 3 | fugiat veniam minus | false | +| 1 | 4 | et porro tempora | true | +| 1 | 5 | laboriosam mollitia et enim quasi adipisci quia provident illum | false | + +The path can either be a relative path (relative to the current working directory) or an absolute path. + +We can use `read_json` to create a persistent table as well: + +```sql +CREATE TABLE todos AS + SELECT * + FROM read_json('todos.json'); +DESCRIBE todos; +``` + +
+ +| column_name | column_type | null | key | default | extra | +|-------------|-------------|------|-----|---------|-------| +| userId | UBIGINT | YES | | | | +| id | UBIGINT | YES | | | | +| title | VARCHAR | YES | | | | +| completed | BOOLEAN | YES | | | | + +If we specify the columns, we can bypass the automatic detection. Note that not all columns need to be specified: + +```sql +SELECT * +FROM read_json('todos.json', + columns = {userId: 'UBIGINT', + completed: 'BOOLEAN'}); +``` + +Multiple files can be read at once by providing a glob or a list of files. Refer to the [multiple files section]({% link docs/archive/1.0/data/multiple_files/overview.md %}) for more information. + +## `COPY` Statement + +The `COPY` statement can be used to load data from a JSON file into a table. For the `COPY` statement, we must first create a table with the correct schema to load the data into. We then specify the JSON file to load from plus any configuration options separately. + +```sql +CREATE TABLE todos (userId UBIGINT, id UBIGINT, title VARCHAR, completed BOOLEAN); +COPY todos FROM 'todos.json'; +SELECT * FROM todos LIMIT 5; +``` + +
+ +| userId | id | title | completed | +|--------|----|-----------------------------------------------------------------|-----------| +| 1 | 1 | delectus aut autem | false | +| 1 | 2 | quis ut nam facilis et officia qui | false | +| 1 | 3 | fugiat veniam minus | false | +| 1 | 4 | et porro tempora | true | +| 1 | 5 | laboriosam mollitia et enim quasi adipisci quia provident illum | false | + +For more details, see the [page on the `COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}). + + \ No newline at end of file diff --git a/docs/archive/1.0/data/multiple_files/combining_schemas.md b/docs/archive/1.0/data/multiple_files/combining_schemas.md new file mode 100644 index 00000000000..13a683f9b77 --- /dev/null +++ b/docs/archive/1.0/data/multiple_files/combining_schemas.md @@ -0,0 +1,96 @@ +--- +layout: docu +title: Combining Schemas +--- + + + +## Examples + +Read a set of CSV files combining columns by position: + +```sql +SELECT * FROM read_csv('flights*.csv'); +``` + +Read a set of CSV files combining columns by name: + +```sql +SELECT * FROM read_csv('flights*.csv', union_by_name = true); +``` + +## Combining Schemas + +When reading from multiple files, we have to **combine schemas** from those files. That is because each file has its own schema that can differ from the other files. DuckDB offers two ways of unifying schemas of multiple files: **by column position** and **by column name**. + +By default, DuckDB reads the schema of the first file provided, and then unifies columns in subsequent files by column position. This works correctly as long as all files have the same schema. If the schema of the files differs, you might want to use the `union_by_name` option to allow DuckDB to construct the schema by reading all of the names instead. + +Below is an example of how both methods work. + +## Union by Position + +By default, DuckDB unifies the columns of these different files **by position**. This means that the first column in each file is combined together, as well as the second column in each file, etc. For example, consider the following two files. + +[`flights1.csv`](/data/flights1.csv): + +```csv +FlightDate|UniqueCarrier|OriginCityName|DestCityName +1988-01-01|AA|New York, NY|Los Angeles, CA +1988-01-02|AA|New York, NY|Los Angeles, CA +``` + +[`flights2.csv`](/data/flights2.csv): + +```csv +FlightDate|UniqueCarrier|OriginCityName|DestCityName +1988-01-03|AA|New York, NY|Los Angeles, CA +``` + +Reading the two files at the same time will produce the following result set: + +
+ +| FlightDate | UniqueCarrier | OriginCityName | DestCityName | +|------------|---------------|----------------|-----------------| +| 1988-01-01 | AA | New York, NY | Los Angeles, CA | +| 1988-01-02 | AA | New York, NY | Los Angeles, CA | +| 1988-01-03 | AA | New York, NY | Los Angeles, CA | + +This is equivalent to the SQL construct [`UNION ALL`]({% link docs/archive/1.0/sql/query_syntax/setops.md %}#union-all). + +## Union by Name + +If you are processing multiple files that have different schemas, perhaps because columns have been added or renamed, it might be desirable to unify the columns of different files **by name** instead. This can be done by providing the `union_by_name` option. For example, consider the following two files, where `flights4.csv` has an extra column (`UniqueCarrier`). + +[`flights3.csv`](/data/flights3.csv): + +```csv +FlightDate|OriginCityName|DestCityName +1988-01-01|New York, NY|Los Angeles, CA +1988-01-02|New York, NY|Los Angeles, CA +``` + +[`flights4.csv`](/data/flights4.csv): + +```csv +FlightDate|UniqueCarrier|OriginCityName|DestCityName +1988-01-03|AA|New York, NY|Los Angeles, CA +``` + +Reading these when unifying column names **by position** results in an error – as the two files have a different number of columns. When specifying the `union_by_name` option, the columns are correctly unified, and any missing values are set to `NULL`. + +```sql +SELECT * FROM read_csv(['flights3.csv', 'flights4.csv'], union_by_name = true); +``` + +
+ +| FlightDate | OriginCityName | DestCityName | UniqueCarrier | +|------------|----------------|-----------------|---------------| +| 1988-01-01 | New York, NY | Los Angeles, CA | NULL | +| 1988-01-02 | New York, NY | Los Angeles, CA | NULL | +| 1988-01-03 | New York, NY | Los Angeles, CA | AA | + +This is equivalent to the SQL construct [`UNION ALL BY NAME`]({% link docs/archive/1.0/sql/query_syntax/setops.md %}#union-all-by-name). + +> Using the `union_by_name` option increases memory consumption. \ No newline at end of file diff --git a/docs/archive/1.0/data/multiple_files/overview.md b/docs/archive/1.0/data/multiple_files/overview.md new file mode 100644 index 00000000000..5eeae9b36b9 --- /dev/null +++ b/docs/archive/1.0/data/multiple_files/overview.md @@ -0,0 +1,169 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/data/csv/multiple_files +title: Reading Multiple Files +--- + +DuckDB can read multiple files of different types (CSV, Parquet, JSON files) at the same time using either the glob syntax, or by providing a list of files to read. +See the [combining schemas]({% link docs/archive/1.0/data/multiple_files/combining_schemas.md %}) page for tips on reading files with different schemas. + +## CSV + +Read all files with a name ending in `.csv` in the folder `dir`: + +```sql +SELECT * +FROM 'dir/*.csv'; +``` + +Read all files with a name ending in `.csv`, two directories deep: + +```sql +SELECT * +FROM '*/*/*.csv'; +``` + +Read all files with a name ending in `.csv`, at any depth in the folder `dir`: + +```sql +SELECT * +FROM 'dir/**/*.csv'; +``` + +Read the CSV files `flights1.csv` and `flights2.csv`: + +```sql +SELECT * +FROM read_csv(['flights1.csv', 'flights2.csv']); +``` + +Read the CSV files `flights1.csv` and `flights2.csv`, unifying schemas by name and outputting a `filename` column: + +```sql +SELECT * +FROM read_csv(['flights1.csv', 'flights2.csv'], union_by_name = true, filename = true); +``` + +## Parquet + +Read all files that match the glob pattern: + +```sql +SELECT * +FROM 'test/*.parquet'; +``` + +Read three Parquet files and treat them as a single table: + +```sql +SELECT * +FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']); +``` + +Read all Parquet files from two specific folders: + +```sql +SELECT * +FROM read_parquet(['folder1/*.parquet', 'folder2/*.parquet']); +``` + +Read all Parquet files that match the glob pattern at any depth: + +```sql +SELECT * +FROM read_parquet('dir/**/*.parquet'); +``` + +## Multi-File Reads and Globs + +DuckDB can also read a series of Parquet files and treat them as if they were a single table. Note that this only works if the Parquet files have the same schema. You can specify which Parquet files you want to read using a list parameter, glob pattern matching syntax, or a combination of both. + +### List Parameter + +The `read_parquet` function can accept a list of filenames as the input parameter. + +Read three Parquet files and treat them as a single table: + +```sql +SELECT * +FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']); +``` + +### Glob Syntax + +Any file name input to the `read_parquet` function can either be an exact filename, or use a glob syntax to read multiple files that match a pattern. + +
+ +| Wildcard | Description | +|------------|-----------------------------------------------------------| +| `*` | matches any number of any characters (including none) | +| `**` | matches any number of subdirectories (including none) | +| `?` | matches any single character | +| `[abc]` | matches one character given in the bracket | +| `[a-z]` | matches one character from the range given in the bracket | + +Note that the `?` wildcard in globs is not supported for reads over S3 due to HTTP encoding issues. + +Here is an example that reads all the files that end with `.parquet` located in the `test` folder: + +Read all files that match the glob pattern: + +```sql +SELECT * +FROM read_parquet('test/*.parquet'); +``` + +### List of Globs + +The glob syntax and the list input parameter can be combined to scan files that meet one of multiple patterns. + +Read all Parquet files from 2 specific folders. + +```sql +SELECT * +FROM read_parquet(['folder1/*.parquet', 'folder2/*.parquet']); +``` + +DuckDB can read multiple CSV files at the same time using either the glob syntax, or by providing a list of files to read. + +## Filename + +The `filename` argument can be used to add an extra `filename` column to the result that indicates which row came from which file. For example: + +```sql +SELECT * +FROM read_csv(['flights1.csv', 'flights2.csv'], union_by_name = true, filename = true); +``` + +
+ +| FlightDate | OriginCityName | DestCityName | UniqueCarrier | filename | +|------------|----------------|-----------------|---------------|--------------| +| 1988-01-01 | New York, NY | Los Angeles, CA | NULL | flights1.csv | +| 1988-01-02 | New York, NY | Los Angeles, CA | NULL | flights1.csv | +| 1988-01-03 | New York, NY | Los Angeles, CA | AA | flights2.csv | + +## Glob Function to Find Filenames + +The glob pattern matching syntax can also be used to search for filenames using the `glob` table function. +It accepts one parameter: the path to search (which may include glob patterns). + +Search the current directory for all files. + +```sql +SELECT * +FROM glob('*'); +``` + +
+ +| file | +|---------------| +| test.csv | +| test.json | +| test.parquet | +| test2.csv | +| test2.parquet | +| todos.json | \ No newline at end of file diff --git a/docs/archive/1.0/data/overview.md b/docs/archive/1.0/data/overview.md new file mode 100644 index 00000000000..12795004a6e --- /dev/null +++ b/docs/archive/1.0/data/overview.md @@ -0,0 +1,100 @@ +--- +layout: docu +title: Importing Data +--- + +The first step to using a database system is to insert data into that system. DuckDB provides several data ingestion methods that allow you to easily and efficiently fill up the database. In this section, we provide an overview of these methods so you can select which one is best suited for your use case. + +## Insert Statements + +Insert statements are the standard way of loading data into a database system. They are suitable for quick prototyping, but should be avoided for bulk loading as they have significant per-row overhead. + +```sql +INSERT INTO people VALUES (1, 'Mark'); +``` + +For a more detailed description, see the [page on the `INSERT statement`]({% link docs/archive/1.0/data/insert.md %}). + +## CSV Loading + +Data can be efficiently loaded from CSV files using several methods. The simplest is to use the CSV file's name: + +```sql +SELECT * FROM 'test.csv'; +``` + +Alternatively, use the [`read_csv` function]({% link docs/archive/1.0/data/csv/overview.md %}) to pass along options: + +```sql +SELECT * FROM read_csv('test.csv', header = false); +``` + +Or use the [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}#copy--from): + +```sql +COPY tbl FROM 'test.csv' (HEADER false); +``` + +It is also possible to read data directly from **compressed CSV files** (e.g., compressed with [gzip](https://www.gzip.org/)): + +```sql +SELECT * FROM 'test.csv.gz'; +``` + +DuckDB can create a table from the loaded data using the [`CREATE TABLE ... AS SELECT` statement]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas): + +```sql +CREATE TABLE test AS + SELECT * FROM 'test.csv'; +``` + +For more details, see the [page on CSV loading]({% link docs/archive/1.0/data/csv/overview.md %}). + +## Parquet Loading + +Parquet files can be efficiently loaded and queried using their filename: + +```sql +SELECT * FROM 'test.parquet'; +``` + +Alternatively, use the [`read_parquet` function]({% link docs/archive/1.0/data/parquet/overview.md %}): + +```sql +SELECT * FROM read_parquet('test.parquet'); +``` + +Or use the [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}#copy--from): + +```sql +COPY tbl FROM 'test.parquet'; +``` + +For more details, see the [page on Parquet loading]({% link docs/archive/1.0/data/parquet/overview.md %}). + +## JSON Loading + +JSON files can be efficiently loaded and queried using their filename: + +```sql +SELECT * FROM 'test.json'; +``` + +Alternatively, use the [`read_json_auto` function]({% link docs/archive/1.0/data/json/overview.md %}): + +```sql +SELECT * FROM read_json_auto('test.json'); +``` + +Or use the [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}#copy--from): + +```sql +COPY tbl FROM 'test.json'; +``` + +For more details, see the [page on JSON loading]({% link docs/archive/1.0/data/json/overview.md %}). + +## Appender + +In several APIs (C, C++, Go, Java, and Rust), the [Appender]({% link docs/archive/1.0/data/appender.md %}) can be used as an alternative for bulk data loading. 
+This class can be used to efficiently add rows to the database system without using SQL statements. \ No newline at end of file diff --git a/docs/archive/1.0/data/parquet/encryption.md b/docs/archive/1.0/data/parquet/encryption.md new file mode 100644 index 00000000000..e770a4fa7b9 --- /dev/null +++ b/docs/archive/1.0/data/parquet/encryption.md @@ -0,0 +1,68 @@ +--- +layout: docu +title: Parquet Encryption +--- + +Starting with version 0.10.0, DuckDB supports reading and writing encrypted Parquet files. +DuckDB broadly follows the [Parquet Modular Encryption specification](https://github.com/apache/parquet-format/blob/master/Encryption.md) with some [limitations](#limitations). + +## Reading and Writing Encrypted Files + +Using the `PRAGMA add_parquet_key` function, named encryption keys of 128, 192, or 256 bits can be added to a session. These keys are stored in-memory: + +```sql +PRAGMA add_parquet_key('key128', '0123456789112345'); +PRAGMA add_parquet_key('key192', '012345678911234501234567'); +PRAGMA add_parquet_key('key256', '01234567891123450123456789112345'); +``` + +### Writing Encrypted Parquet Files + +After specifying the key (e.g., `key256`), files can be encrypted as follows: + +```sql +COPY tbl TO 'tbl.parquet' (ENCRYPTION_CONFIG {footer_key: 'key256'}); +``` + +### Reading Encrpyted Parquet Files + +An encrypted Parquet file using a specific key (e.g., `key256`), can then be read as follows: + +```sql +COPY tbl FROM 'tbl.parquet' (ENCRYPTION_CONFIG {footer_key: 'key256'}); +``` + +Or: + +```sql +SELECT * +FROM read_parquet('tbl.parquet', encryption_config = {footer_key: 'key256'}); +``` + +## Limitations + +DuckDB's Parquet encryption currently has the following limitations. + +1. It is not compatible with the encryption of, e.g., PyArrow, until the missing details are implemented. + +2. DuckDB encrypts the footer and all columns using the `footer_key`. The Parquet specification allows encryption of individual columns with different keys, e.g.: + + ```sql + COPY tbl TO 'tbl.parquet' + (ENCRYPTION_CONFIG { + footer_key: 'key256', + column_keys: {key256: ['col0', 'col1']} + }); + ``` + + However, this is unsupported at the moment and will cause an error to be thrown (for now): + + ```console + Not implemented Error: Parquet encryption_config column_keys not yet implemented + ``` + +## Performance Implications + +Note that encryption has some performance implications. +Without encryption, reading/writing the `lineitem` table from [`TPC-H`]({% link docs/archive/1.0/extensions/tpch.md %}) at SF1, which is 6M rows and 15 columns, from/to a Parquet file takes 0.26 and 0.99 seconds, respectively. +With encryption, this takes 0.64 and 2.21 seconds, both approximately 2.5× slower than the unencrypted version. \ No newline at end of file diff --git a/docs/archive/1.0/data/parquet/metadata.md b/docs/archive/1.0/data/parquet/metadata.md new file mode 100644 index 00000000000..9ee5806169b --- /dev/null +++ b/docs/archive/1.0/data/parquet/metadata.md @@ -0,0 +1,121 @@ +--- +layout: docu +title: Querying Parquet Metadata +--- + +## Parquet Metadata + +The `parquet_metadata` function can be used to query the metadata contained within a Parquet file, which reveals various internal details of the Parquet file such as the statistics of the different columns. 
This can be useful for figuring out what kind of skipping is possible in Parquet files, or even to obtain a quick overview of what the different columns contain: + +```sql +SELECT * +FROM parquet_metadata('test.parquet'); +``` + +Below is a table of the columns returned by `parquet_metadata`. + +
+ +| Field | Type | +| ----------------------- | --------------- | +| file_name | VARCHAR | +| row_group_id | BIGINT | +| row_group_num_rows | BIGINT | +| row_group_num_columns | BIGINT | +| row_group_bytes | BIGINT | +| column_id | BIGINT | +| file_offset | BIGINT | +| num_values | BIGINT | +| path_in_schema | VARCHAR | +| type | VARCHAR | +| stats_min | VARCHAR | +| stats_max | VARCHAR | +| stats_null_count | BIGINT | +| stats_distinct_count | BIGINT | +| stats_min_value | VARCHAR | +| stats_max_value | VARCHAR | +| compression | VARCHAR | +| encodings | VARCHAR | +| index_page_offset | BIGINT | +| dictionary_page_offset | BIGINT | +| data_page_offset | BIGINT | +| total_compressed_size | BIGINT | +| total_uncompressed_size | BIGINT | +| key_value_metadata | MAP(BLOB, BLOB) | + +## Parquet Schema + +The `parquet_schema` function can be used to query the internal schema contained within a Parquet file. Note that this is the schema as it is contained within the metadata of the Parquet file. If you want to figure out the column names and types contained within a Parquet file it is easier to use `DESCRIBE`. + +Fetch the column names and column types: + +```sql +DESCRIBE SELECT * FROM 'test.parquet'; +``` + +Fetch the internal schema of a Parquet file: + +```sql +SELECT * +FROM parquet_schema('test.parquet'); +``` + +Below is a table of the columns returned by `parquet_schema`. + +
+ +| Field | Type | +| --------------- | ------- | +| file_name | VARCHAR | +| name | VARCHAR | +| type | VARCHAR | +| type_length | VARCHAR | +| repetition_type | VARCHAR | +| num_children | BIGINT | +| converted_type | VARCHAR | +| scale | BIGINT | +| precision | BIGINT | +| field_id | BIGINT | +| logical_type | VARCHAR | + +## Parquet File Metadata + +The `parquet_file_metadata` function can be used to query file-level metadata such as the format version and the encryption algorithm used: + +```sql +SELECT * +FROM parquet_file_metadata('test.parquet'); +``` + +Below is a table of the columns returned by `parquet_file_metadata`. + +
+ +| Field | Type | +| ----------------------------| ------- | +| file_name | VARCHAR | +| created_by | VARCHAR | +| num_rows | BIGINT | +| num_row_groups | BIGINT | +| format_version | BIGINT | +| encryption_algorithm | VARCHAR | +| footer_signing_key_metadata | VARCHAR | + +## Parquet Key-Value Metadata + +The `parquet_kv_metadata` function can be used to query custom metadata defined as key-value pairs: + +```sql +SELECT * +FROM parquet_kv_metadata('test.parquet'); +``` + +Below is a table of the columns returned by `parquet_kv_metadata`. + +
+ +| Field | Type | +| --------- | ------- | +| file_name | VARCHAR | +| key | BLOB | +| value | BLOB | \ No newline at end of file diff --git a/docs/archive/1.0/data/parquet/overview.md b/docs/archive/1.0/data/parquet/overview.md new file mode 100644 index 00000000000..ed554edb896 --- /dev/null +++ b/docs/archive/1.0/data/parquet/overview.md @@ -0,0 +1,302 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/data/parquet +- /docs/archive/1.0/extensions/parquet +title: Reading and Writing Parquet Files +--- + +## Examples + +Read a single Parquet file: + +```sql +SELECT * FROM 'test.parquet'; +``` + +Figure out which columns/types are in a Parquet file: + +```sql +DESCRIBE SELECT * FROM 'test.parquet'; +``` + +Create a table from a Parquet file: + +```sql +CREATE TABLE test AS + SELECT * FROM 'test.parquet'; +``` + +If the file does not end in `.parquet`, use the `read_parquet` function: + +```sql +SELECT * +FROM read_parquet('test.parq'); +``` + +Use list parameter to read three Parquet files and treat them as a single table: + +```sql +SELECT * +FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']); +``` + +Read all files that match the glob pattern: + +```sql +SELECT * +FROM 'test/*.parquet'; +``` + +Read all files that match the glob pattern, and include a `filename` column: + +That specifies which file each row came from: + +```sql +SELECT * +FROM read_parquet('test/*.parquet', filename = true); +``` + +Use a list of globs to read all Parquet files from two specific folders: + +```sql +SELECT * +FROM read_parquet(['folder1/*.parquet', 'folder2/*.parquet']); +``` + +Read over HTTPS: + +```sql +SELECT * +FROM read_parquet('https://some.url/some_file.parquet'); +``` + +Query the [metadata of a Parquet file]({% link docs/archive/1.0/data/parquet/metadata.md %}#parquet-metadata): + +```sql +SELECT * +FROM parquet_metadata('test.parquet'); +``` + +Query the [file metadata of a Parquet file]({% link docs/archive/1.0/data/parquet/metadata.md %}#parquet-file-metadata): + +```sql +SELECT * +FROM parquet_file_metadata('test.parquet'); +``` + +Query the [key-value metadata of a Parquet file]({% link docs/archive/1.0/data/parquet/metadata.md %}#parquet-key-value-metadata): + +```sql +SELECT * +FROM parquet_kv_metadata('test.parquet'); +``` + +Query the [schema of a Parquet file]({% link docs/archive/1.0/data/parquet/metadata.md %}#parquet-schema): + +```sql +SELECT * +FROM parquet_schema('test.parquet'); +``` + +Write the results of a query to a Parquet file using the default compression (Snappy): + +```sql +COPY + (SELECT * FROM tbl) + TO 'result-snappy.parquet' + (FORMAT 'parquet'); +``` + +Write the results from a query to a Parquet file with specific compression and row group size: + +```sql +COPY + (FROM generate_series(100_000)) + TO 'test.parquet' + (FORMAT 'parquet', COMPRESSION 'zstd', ROW_GROUP_SIZE 100_000); +``` + +Export the table contents of the entire database as parquet: + +```sql +EXPORT DATABASE 'target_directory' (FORMAT PARQUET); +``` + +## Parquet Files + +Parquet files are compressed columnar files that are efficient to load and process. DuckDB provides support for both reading and writing Parquet files in an efficient manner, as well as support for pushing filters and projections into the Parquet file scans. + +> Parquet data sets differ based on the number of files, the size of individual files, the compression algorithm used row group size, etc. These have a significant effect on performance. 
Please consult the [Performance Guide]({% link docs/archive/1.0/guides/performance/file_formats.md %}) for details. + +## `read_parquet` Function + +| Function | Description | Example | +|:--|:--|:-----| +| `read_parquet(path_or_list_of_paths)` | Read Parquet file(s) | `SELECT * FROM read_parquet('test.parquet');` | +| `parquet_scan(path_or_list_of_paths)` | Alias for `read_parquet` | `SELECT * FROM parquet_scan('test.parquet');` | + +If your file ends in `.parquet`, the function syntax is optional. The system will automatically infer that you are reading a Parquet file: + +```sql +SELECT * FROM 'test.parquet'; +``` + +Multiple files can be read at once by providing a glob or a list of files. Refer to the [multiple files section]({% link docs/archive/1.0/data/multiple_files/overview.md %}) for more information. + +### Parameters + +There are a number of options exposed that can be passed to the `read_parquet` function or the [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}). + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `binary_as_string` | Parquet files generated by legacy writers do not correctly set the `UTF8` flag for strings, causing string columns to be loaded as `BLOB` instead. Set this to true to load binary columns as strings. | `BOOL` | `false` | +| `encryption_config` | Configuration for [Parquet encryption]({% link docs/archive/1.0/data/parquet/encryption.md %}). | `STRUCT` | - | +| `filename` | Whether or not an extra `filename` column should be included in the result. | `BOOL` | `false` | +| `file_row_number` | Whether or not to include the `file_row_number` column. | `BOOL` | `false` | +| `hive_partitioning` | Whether or not to interpret the path as a [Hive partitioned path]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}). | `BOOL` | `true` | +| `union_by_name` | Whether the columns of multiple schemas should be [unified by name]({% link docs/archive/1.0/data/multiple_files/combining_schemas.md %}), rather than by position. | `BOOL` | `false` | + +## Partial Reading + +DuckDB supports projection pushdown into the Parquet file itself. That is to say, when querying a Parquet file, only the columns required for the query are read. This allows you to read only the part of the Parquet file that you are interested in. This will be done automatically by DuckDB. + +DuckDB also supports filter pushdown into the Parquet reader. When you apply a filter to a column that is scanned from a Parquet file, the filter will be pushed down into the scan, and can even be used to skip parts of the file using the built-in zonemaps. Note that this will depend on whether or not your Parquet file contains zonemaps. + +Filter and projection pushdown provide significant performance benefits. See [our blog post “Querying Parquet with Precision Using DuckDB”]({% post_url 2021-06-25-querying-parquet %}) for more information. + +## Inserts and Views + +You can also insert the data into a table or create a table from the Parquet file directly. This will load the data from the Parquet file and insert it into the database: + +Insert the data from the Parquet file in the table: + +```sql +INSERT INTO people + SELECT * FROM read_parquet('test.parquet'); +``` + +Create a table directly from a Parquet file: + +```sql +CREATE TABLE people AS + SELECT * FROM read_parquet('test.parquet'); +``` + +If you wish to keep the data stored inside the Parquet file, but want to query the Parquet file directly, you can create a view over the `read_parquet` function. 
You can then query the Parquet file as if it were a built-in table: + +Create a view over the Parquet file: + +```sql +CREATE VIEW people AS + SELECT * FROM read_parquet('test.parquet'); +``` + +Query the Parquet file: + +```sql +SELECT * FROM people; +``` + +## Writing to Parquet Files + +DuckDB also has support for writing to Parquet files using the `COPY` statement syntax. See the [`COPY` Statement page]({% link docs/archive/1.0/sql/statements/copy.md %}) for details, including all possible parameters for the `COPY` statement. + +Write a query to a snappy compressed Parquet file: + +```sql +COPY + (SELECT * FROM tbl) + TO 'result-snappy.parquet' + (FORMAT 'parquet'); +``` + +Write `tbl` to a zstd-compressed Parquet file: + +```sql +COPY tbl + TO 'result-zstd.parquet' + (FORMAT 'parquet', CODEC 'zstd'); +``` + +Write `tbl` to a zstd-compressed Parquet file with the lowest compression level yielding the fastest compression: + +```sql +COPY tbl + TO 'result-zstd.parquet' + (FORMAT 'parquet', CODEC 'zstd', COMPRESSION_LEVEL 1); +``` + +Write to Parquet file with [key-value metadata]({% link docs/archive/1.0/data/parquet/metadata.md %}): + +```sql +COPY ( + SELECT + 42 AS number, + true AS is_even +) TO 'kv_metadata.parquet' ( + FORMAT PARQUET, + KV_METADATA { + number: 'Answer to life, universe, and everything', + is_even: 'not ''odd''' -- single quotes in values must be escaped + } +); +``` + +Write a CSV file to an uncompressed Parquet file: + +```sql +COPY + 'test.csv' + TO 'result-uncompressed.parquet' + (FORMAT 'parquet', CODEC 'uncompressed'); +``` + +Write a query to a Parquet file with zstd-compression (same as `CODEC`) and row group size: + +```sql +COPY + (FROM generate_series(100_000)) + TO 'row-groups-zstd.parquet' + (FORMAT PARQUET, COMPRESSION ZSTD, ROW_GROUP_SIZE 100_000); +``` + +> LZ4 compression is currently only available in the nightly and source builds: + +Write a CSV file to an `LZ4_RAW`-compressed Parquet file: + +```sql +COPY + (FROM generate_series(100_000)) + TO 'result-lz4.parquet' + (FORMAT PARQUET, COMPRESSION LZ4); +``` + +Or: + +```sql +COPY + (FROM generate_series(100_000)) + TO 'result-lz4.parquet' + (FORMAT PARQUET, COMPRESSION LZ4_RAW); +``` + +DuckDB's `EXPORT` command can be used to export an entire database to a series of Parquet files. See the [Export statement documentation]({% link docs/archive/1.0/sql/statements/export.md %}) for more details: + +Export the table contents of the entire database as Parquet: + +```sql +EXPORT DATABASE 'target_directory' (FORMAT PARQUET); +``` + +## Encryption + +DuckDB supports reading and writing [encrypted Parquet files]({% link docs/archive/1.0/data/parquet/encryption.md %}). + +## Installing and Loading the Parquet Extension + +The support for Parquet files is enabled via extension. The `parquet` extension is bundled with almost all clients. However, if your client does not bundle the `parquet` extension, the extension must be installed separately: + +```sql +INSTALL parquet; +``` \ No newline at end of file diff --git a/docs/archive/1.0/data/parquet/tips.md b/docs/archive/1.0/data/parquet/tips.md new file mode 100644 index 00000000000..476cb2e4267 --- /dev/null +++ b/docs/archive/1.0/data/parquet/tips.md @@ -0,0 +1,54 @@ +--- +layout: docu +title: Parquet Tips +--- + +Below is a collection of tips to help when dealing with Parquet files. 
+ +## Tips for Reading Parquet Files + +### Use `union_by_name` When Loading Files with Different Schemas + +The `union_by_name` option can be used to unify the schema of files that have different or missing columns. For files that do not have certain columns, `NULL` values are filled in: + +```sql +SELECT * +FROM read_parquet('flights*.parquet', union_by_name = true); +``` + +## Tips for Writing Parquet Files + +Using a [glob pattern]({% link docs/archive/1.0/data/multiple_files/overview.md %}#glob-syntax) upon read or a [Hive partitioning]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}) structure are good ways to transparently handle multiple files. + +### Enabling `PER_THREAD_OUTPUT` + +If the final number of Parquet files is not important, writing one file per thread can significantly improve performance: + +```sql +COPY + (FROM generate_series(10_000_000)) + TO 'test.parquet' + (FORMAT PARQUET, PER_THREAD_OUTPUT); +``` + +### Selecting a `ROW_GROUP_SIZE` + +The `ROW_GROUP_SIZE` parameter specifies the minimum number of rows in a Parquet row group, with a minimum value equal to DuckDB's vector size, 2,048, and a default of 122,880. +A Parquet row group is a partition of rows, consisting of a column chunk for each column in the dataset. + +Compression algorithms are only applied per row group, so the larger the row group size, the more opportunities to compress the data. +DuckDB can read Parquet row groups in parallel even within the same file and uses predicate pushdown to only scan the row groups whose metadata ranges match the `WHERE` clause of the query. +However there is some overhead associated with reading the metadata in each group. +A good approach would be to ensure that within each file, the total number of row groups is at least as large as the number of CPU threads used to query that file. +More row groups beyond the thread count would improve the speed of highly selective queries, but slow down queries that must scan the whole file like aggregations. + +To write a query to a Parquet file with a different row group size, run: + +```sql +COPY + (FROM generate_series(100_000)) + TO 'row-groups.parquet' + (FORMAT PARQUET, ROW_GROUP_SIZE 100_000); +``` + +See the [Performance Guide on file formats]({% link docs/archive/1.0/guides/performance/file_formats.md %}#parquet-file-sizes) for more tips. \ No newline at end of file diff --git a/docs/archive/1.0/data/partitioning/hive_partitioning.md b/docs/archive/1.0/data/partitioning/hive_partitioning.md new file mode 100644 index 00000000000..b1c0071157d --- /dev/null +++ b/docs/archive/1.0/data/partitioning/hive_partitioning.md @@ -0,0 +1,106 @@ +--- +layout: docu +title: Hive Partitioning +--- + +## Examples + +Read data from a Hive partitioned data set: + +```sql +SELECT * +FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true); +``` + +Write a table to a Hive partitioned data set: + +```sql +COPY orders +TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month)); +``` + +Note that the `PARTITION_BY` options cannot use expressions. You can produce columns on the fly using the following syntax: + +```sql +COPY (SELECT *, year(timestamp) AS year, month(timestamp) AS month FROM services) +TO 'test' (PARTITION_BY (year, month)); +``` + +## Hive Partitioning + +Hive partitioning is a [partitioning strategy](https://en.wikipedia.org/wiki/Partition_(database)) that is used to split a table into multiple files based on **partition keys**. The files are organized into folders. 
Within each folder, the **partition key** has a value that is determined by the name of the folder. + +Below is an example of a Hive partitioned file hierarchy. The files are partitioned on two keys (`year` and `month`). + +```text +orders +├── year=2021 +│ ├── month=1 +│ │ ├── file1.parquet +│ │ └── file2.parquet +│ └── month=2 +│ └── file3.parquet +└── year=2022 + ├── month=11 + │ ├── file4.parquet + │ └── file5.parquet + └── month=12 + └── file6.parquet +``` + +Files stored in this hierarchy can be read using the `hive_partitioning` flag. + +```sql +SELECT * +FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true); +``` + +When we specify the `hive_partitioning` flag, the values of the columns will be read from the directories. + +### Filter Pushdown + +Filters on the partition keys are automatically pushed down into the files. This way the system skips reading files that are not necessary to answer a query. For example, consider the following query on the above dataset: + +```sql +SELECT * +FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true) +WHERE year = 2022 + AND month = 11; +``` + +When executing this query, only the following files will be read: + +```text +orders +└── year=2022 + └── month=11 + ├── file4.parquet + └── file5.parquet +``` + +### Autodetection + +By default the system tries to infer if the provided files are in a hive partitioned hierarchy. And if so, the `hive_partitioning` flag is enabled automatically. The autodetection will look at the names of the folders and search for a `'key' = 'value'` pattern. This behavior can be overridden by using the `hive_partitioning` configuration option: + +```sql +SET hive_partitioning = false; +``` + +### Hive Types + +`hive_types` is a way to specify the logical types of the hive partitions in a struct: + +```sql +SELECT * +FROM read_parquet( + 'dir/**/*.parquet', + hive_partitioning = true, + hive_types = {'release': DATE, 'orders': BIGINT} +); +``` + +`hive_types` will be autodetected for the following types: `DATE`, `TIMESTAMP` and `BIGINT`. To switch off the autodetection, the flag `hive_types_autocast = 0` can be set. + +### Writing Partitioned Files + +See the [Partitioned Writes]({% link docs/archive/1.0/data/partitioning/partitioned_writes.md %}) section. \ No newline at end of file diff --git a/docs/archive/1.0/data/partitioning/partitioned_writes.md b/docs/archive/1.0/data/partitioning/partitioned_writes.md new file mode 100644 index 00000000000..848ceff37b5 --- /dev/null +++ b/docs/archive/1.0/data/partitioning/partitioned_writes.md @@ -0,0 +1,73 @@ +--- +layout: docu +title: Partitioned Writes +--- + +## Examples + +Write a table to a Hive partitioned data set of Parquet files: + +```sql +COPY orders TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month)); +``` + +Write a table to a Hive partitioned data set of CSV files, allowing overwrites: + +```sql +COPY orders TO 'orders' (FORMAT CSV, PARTITION_BY (year, month), OVERWRITE_OR_IGNORE); +``` + +Write a table to a Hive partitioned data set of GZIP-compressed CSV files, setting explicit data files' extension: + +```sql +COPY orders TO 'orders' (FORMAT CSV, PARTITION_BY (year, month), COMPRESSION GZIP, FILE_EXTENSION 'csv.gz'); +``` + +## Partitioned Writes + +When the `partition_by` clause is specified for the [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}), the files are written in a [Hive partitioned]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}) folder hierarchy. 
The target is the name of the root directory (in the example above: `orders`). The files are written in-order in the file hierarchy. Currently, one file is written per thread to each directory. + +```text +orders +├── year=2021 +│ ├── month=1 +│ │ ├── data_1.parquet +│ │ └── data_2.parquet +│ └── month=2 +│ └── data_1.parquet +└── year=2022 + ├── month=11 + │ ├── data_1.parquet + │ └── data_2.parquet + └── month=12 + └── data_1.parquet +``` + +The values of the partitions are automatically extracted from the data. Note that it can be very expensive to write many partitions as many files will be created. The ideal partition count depends on how large your data set is. + +> Bestpractice Writing data into many small partitions is expensive. It is generally recommended to have at least `100 MB` of data per partition. + +### Overwriting + +By default the partitioned write will not allow overwriting existing directories. Use the `OVERWRITE_OR_IGNORE` option to allow overwriting an existing directory. + +### Filename Pattern + +By default, files will be named `data_0.parquet` or `data_0.csv`. With the flag `FILENAME_PATTERN` a pattern with `{i}` or `{uuid}` can be defined to create specific filenames: + +* `{i}` will be replaced by an index +* `{uuid}` will be replaced by a 128 bits long UUID + +Write a table to a Hive partitioned data set of .parquet files, with an index in the filename: + +```sql +COPY orders TO 'orders' + (FORMAT PARQUET, PARTITION_BY (year, month), OVERWRITE_OR_IGNORE, FILENAME_PATTERN 'orders_{i}'); +``` + +Write a table to a Hive partitioned data set of .parquet files, with unique filenames: + +```sql +COPY orders TO 'orders' + (FORMAT PARQUET, PARTITION_BY (year, month), OVERWRITE_OR_IGNORE, FILENAME_PATTERN 'file_{uuid}'); +``` \ No newline at end of file diff --git a/docs/archive/1.0/dev/benchmark.md b/docs/archive/1.0/dev/benchmark.md new file mode 100644 index 00000000000..dd4c56333c4 --- /dev/null +++ b/docs/archive/1.0/dev/benchmark.md @@ -0,0 +1,101 @@ +--- +layout: docu +redirect_from: +- /dev/benchmark +title: Benchmark Suite +--- + +DuckDB has an extensive benchmark suite. +When making changes that have potential performance implications, it is important to run these benchmarks to detect potential performance regressions. + +## Getting Started + +To build the benchmark suite, run the following command in the [DuckDB repository](https://github.com/duckdb/duckdb): + +```bash +BUILD_BENCHMARK=1 BUILD_TPCH=1 make +``` + +## Listing Benchmarks + +To list all available benchmarks, run: + +```bash +build/release/benchmark/benchmark_runner --list +``` + +## Running Benchmarks + +### Running a Single Benchmark + +To run a single benchmark, issue the following command: + +```bash +build/release/benchmark/benchmark_runner benchmark/micro/nulls/no_nulls_addition.benchmark +``` + +The output will be printed to `stdout` in CSV format, in the following format: + +```text +name run timing +benchmark/micro/nulls/no_nulls_addition.benchmark 1 0.121234 +benchmark/micro/nulls/no_nulls_addition.benchmark 2 0.121702 +benchmark/micro/nulls/no_nulls_addition.benchmark 3 0.122948 +benchmark/micro/nulls/no_nulls_addition.benchmark 4 0.122534 +benchmark/micro/nulls/no_nulls_addition.benchmark 5 0.124102 +``` + +You can also specify an output file using the `--out` flag. This will write only the timings (delimited by newlines) to that file. 
+ +```bash +build/release/benchmark/benchmark_runner benchmark/micro/nulls/no_nulls_addition.benchmark --out=timings.out +``` + +The output will contain the following: + +```text +0.182472 +0.185027 +0.184163 +0.185281 +0.182948 +``` + +### Running Multiple Benchmark Using a Regular Expression + +You can also use a regular expression to specify which benchmarks to run. +Be careful of shell expansion of certain regex characters (e.g., `*` will likely be expanded by your shell, hence this requires proper quoting or escaping). + +```bash +build/release/benchmark/benchmark_runner "benchmark/micro/nulls/.*" +``` + +#### Running All Benchmarks + +Not specifying any argument will run all benchmarks. + +```bash +build/release/benchmark/benchmark_runner +``` + +#### Other Options + +The `--info` flag gives you some other information about the benchmark. + +```bash +build/release/benchmark/benchmark_runner benchmark/micro/nulls/no_nulls_addition.benchmark --info +``` + +```text +display_name:NULL Addition (no nulls) +group:micro +subgroup:nulls +``` + +The `--query` flag will print the query that is run by the benchmark. + +```sql +SELECT min(i + 1) FROM integers; +``` + +The `--profile` flag will output a query tree. \ No newline at end of file diff --git a/docs/archive/1.0/dev/building/build_configuration.md b/docs/archive/1.0/dev/building/build_configuration.md new file mode 100644 index 00000000000..c6c315cb0e3 --- /dev/null +++ b/docs/archive/1.0/dev/building/build_configuration.md @@ -0,0 +1,104 @@ +--- +layout: docu +title: Building Configuration +--- + +## Build Types + +DuckDB can be built in many different settings, most of these correspond directly to CMake but not all of them. + +### `release` + +This build has been stripped of all the assertions and debug symbols and code, optimized for performance. + +### `debug` + +This build runs with all the debug information, including symbols, assertions and `#ifdef DEBUG` blocks. +Due to these, binaries of this build are expected to be slow. +Note: the special debug defines are not automatically set for this build. + +### `relassert` + +This build does not trigger the `#ifdef DEBUG` code blocks but it still has debug symbols that make it possible to step through the execution with line number information and `D_ASSERT` lines are still checked in this build. +Binaries of this build mode are significantly faster than those of the `debug` mode. + +### `reldebug` + +This build is similar to `relassert` in many ways, only assertions are also stripped in this build. + +### `benchmark` + +This build is a shorthand for `release` with `BUILD_BENCHMARK=1` set. + +### `tidy-check` + +This creates a build and then runs [Clang-Tidy](https://clang.llvm.org/extra/clang-tidy/) to check for issues or style violations through static analysis. +The CI will also run this check, causing it to fail if this check fails. + +### `format-fix` | `format-changes` | `format-main` + +This doesn't actually create a build, but uses the following format checkers to check for style issues: + +* [clang-format](https://clang.llvm.org/docs/ClangFormat.html) to fix format issues in the code. +* [cmake-format](https://cmake-format.readthedocs.io/en/latest/) to fix format issues in the `CMakeLists.txt` files. + +The CI will also run this check, causing it to fail if this check fails. + +## Package Flags + +For every package that is maintained by core DuckDB, there exists a flag in the Makefile to enable building the package. 
+These can be enabled by either setting them in the current `env`, through set up files like `bashrc` or `zshrc`, or by setting them before the call to `make`, for example: + +```bash +BUILD_PYTHON=1 make debug +``` + +### `BUILD_PYTHON` + +When this flag is set, the [Python]({% link docs/archive/1.0/api/python/overview.md %}) package is built. + +### `BUILD_SHELL` + +When this flag is set, the [CLI]({% link docs/archive/1.0/api/cli/overview.md %}) is built, this is usually enabled by default. + +### `BUILD_BENCHMARK` + +When this flag is set, DuckDB's in-house benchmark suite is built. +More information about this can be found [here](https://github.com/duckdb/duckdb/blob/main/benchmark/README.md). + +### `BUILD_JDBC` + +When this flag is set, the [Java]({% link docs/archive/1.0/api/java.md %}) package is built. + +### `BUILD_ODBC` + +When this flag is set, the [ODBC]({% link docs/archive/1.0/api/odbc/overview.md %}) package is built. + +## Miscellaneous Flags + +### `DISABLE_UNITY` + +To improve compilation time, we use [Unity Build](https://cmake.org/cmake/help/latest/prop_tgt/UNITY_BUILD.html) to combine translation units. +This can however hide include bugs, this flag disables using the unity build so these errors can be detected. + +### `DISABLE_SANITIZER` + +In some situations, running an executable that has been built with sanitizers enabled is not support / can cause problems. Julia is an example of this. +With this flag enabled, the sanitizers are disabled for the build. + +## Overriding Git Hash and Version + +It is possible to override the Git hash and version when building from source using the `OVERRIDE_GIT_DESCRIBE` environment variable. +This is useful when building from sources that are not part of a complete Git repository (e.g., an archive file with no information on commit hashes and tags). +For example: + +```bash +OVERRIDE_GIT_DESCRIBE=v0.10.0-843-g09ea97d0a9 GEN=ninja make +``` + +Will result in the following output when running `./build/release/duckdb`: + +```text +v0.10.1-dev843 09ea97d0a9 +... +``` \ No newline at end of file diff --git a/docs/archive/1.0/dev/building/build_instructions.md b/docs/archive/1.0/dev/building/build_instructions.md new file mode 100644 index 00000000000..cadcd7d0613 --- /dev/null +++ b/docs/archive/1.0/dev/building/build_instructions.md @@ -0,0 +1,87 @@ +--- +layout: docu +title: Building Instructions +--- + +## Prerequisites + +DuckDB needs CMake and a C++11-compliant compiler (e.g., GCC, Apple-Clang, MSVC). +Additionally, we recommend using the [Ninja build system](https://ninja-build.org/), which automatically parallelizes the build process. + +Clone the DuckDB repository. + +```bash +git clone https://github.com/duckdb/duckdb +``` + +We recommend creating a full clone of the repository. Note that the directory uses approximately 1.2 GB of disk space. + +### Linux Packages + +Install the required packages with the package manager of your distribution. + +Ubuntu and Debian: + +```bash +sudo apt-get update && sudo apt-get install -y git g++ cmake ninja-build libssl-dev +``` + +Fedora, CentOS, and Red Hat: + +```bash +sudo yum install -y git g++ cmake ninja-build openssl-devel +``` + +Alpine Linux: + +```bash +apk add g++ git make cmake ninja +``` + +### macOS + +Install Xcode and [Homebrew](https://brew.sh/). 
Then, install the required packages with:
+
+```bash
+brew install cmake ninja
+```
+
+### Windows
+
+Consult the [Windows CI workflow](https://github.com/duckdb/duckdb/blob/v0.10.2/.github/workflows/Windows.yml#L234) for a list of packages used to build DuckDB on Windows.
+
+On Windows, the DuckDB Python package requires the [Microsoft Visual C++ Redistributable package](https://learn.microsoft.com/en-US/cpp/windows/latest-supported-vc-redist) to be built and [to run]({% link docs/archive/1.0/api/python/known_issues.md %}#error-when-importing-the-duckdb-python-package-on-windows).
+
+## Building DuckDB
+
+To build DuckDB, we use a Makefile which in turn calls into CMake. We also advise using [Ninja](https://ninja-build.org/manual.html) as the generator for CMake.
+
+```bash
+GEN=ninja make
+```
+
+> Bestpractice It is not advised to directly call CMake, as the Makefile sets certain variables that are crucial to properly building the package.
+
+For testing, it can be useful to build DuckDB with statically linked core extensions. To do so, run:
+
+```bash
+CORE_EXTENSIONS='autocomplete;icu;parquet;json' GEN=ninja make
+```
+
+This option also accepts out-of-tree extensions:
+
+```bash
+CORE_EXTENSIONS='autocomplete;icu;parquet;json;delta' GEN=ninja make
+```
+
+For more details, see the [“Building Extensions” page]({% link docs/archive/1.0/dev/building/building_extensions.md %}).
+
+## Troubleshooting
+
+### The Build Runs Out of Memory
+
+Ninja parallelizes the build, which can cause out-of-memory issues on systems with limited resources. Such issues also occur on Alpine Linux. In these cases, avoid using Ninja:
+
+```bash
+make
+```
\ No newline at end of file
diff --git a/docs/archive/1.0/dev/building/building_extensions.md b/docs/archive/1.0/dev/building/building_extensions.md
new file mode 100644
index 00000000000..d0291556dce
--- /dev/null
+++ b/docs/archive/1.0/dev/building/building_extensions.md
@@ -0,0 +1,139 @@
+---
+layout: docu
+title: Building Extensions
+---
+
+[Extensions]({% link docs/archive/1.0/extensions/overview.md %}) can be built from source and installed from the resulting local binary.
+
+## Building Extensions using Build Flags
+
+To build using extension flags, set the corresponding [`BUILD_[EXTENSION_NAME]` extension flag](#extension-flags) when running the build, then use the `INSTALL` command.
+
+For example, to install the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}), run the following script:
+
+```bash
+GEN=ninja BUILD_HTTPFS=1 make
+```
+
+For release builds:
+
+```bash
+build/release/duckdb -c "INSTALL 'build/release/extension/httpfs/httpfs.duckdb_extension';"
+```
+
+For debug builds:
+
+```bash
+build/debug/duckdb -c "INSTALL 'build/debug/extension/httpfs/httpfs.duckdb_extension';"
+```
+
+### Extension Flags
+
+For every in-tree extension that is maintained by core DuckDB there exists a flag to enable building and statically linking the extension into the build.
+
+#### `BUILD_AUTOCOMPLETE`
+
+When this flag is set, the [`autocomplete` extension]({% link docs/archive/1.0/extensions/autocomplete.md %}) is built.
+
+#### `BUILD_ICU`
+
+When this flag is set, the [`icu` extension]({% link docs/archive/1.0/extensions/icu.md %}) is built.
+
+#### `BUILD_TPCH`
+
+When this flag is set, the [`tpch` extension]({% link docs/archive/1.0/extensions/tpch.md %}) is built. This enables TPC-H data generation and query support using `dbgen`.
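+
+As an illustration, the following is a rough sketch of building with this flag and then smoke-testing the generator from the command line. The scale factor and query number are arbitrary; `CALL dbgen` and the `tpch` pragma are the entry points referenced elsewhere in these docs:
+
+```bash
+# build DuckDB with the TPC-H extension statically linked
+GEN=ninja BUILD_TPCH=1 make
+# generate a small data set and run TPC-H query 1
+build/release/duckdb -c "CALL dbgen(sf = 0.1); PRAGMA tpch(1);"
+```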
+ +#### `BUILD_TPCDS` + +When this flag is set, the [`tpcds` extension]({% link docs/archive/1.0/extensions/tpcds.md %}) is built, this enables TPC-DS data generation and query support using `dsdgen`. + +#### `BUILD_TPCE` + +When this flag is set, the [TPCE](https://www.tpc.org/tpce/) extension is built. Unlike TPC-H and TPC-DS this does not enable data generation and query support. Instead, it enables tests for TPC-E through our test suite. + +#### `BUILD_FTS` + +When this flag is set, the [`fts` (full text search) extension]({% link docs/archive/1.0/extensions/full_text_search.md %}) is built. + +#### `BUILD_HTTPFS` + +When this flag is set, the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) is built. + +#### `BUILD_JEMALLOC` + +When this flag is set, the [`jemalloc` extension]({% link docs/archive/1.0/extensions/jemalloc.md %}) is built. + +#### `BUILD_JSON` + +When this flag is set, the [`json` extension]({% link docs/archive/1.0/extensions/json.md %}) is built. + +#### `BUILD_INET` + +When this flag is set, the [`inet` extension]({% link docs/archive/1.0/extensions/inet.md %}) is built. + +#### `BUILD_SQLSMITH` + +When this flag is set, the [SQLSmith extension](https://github.com/duckdb/duckdb/pull/3410) is built. + +### Debug Flags + +#### `CRASH_ON_ASSERT` + +`D_ASSERT(condition)` is used all throughout the code, these will throw an InternalException in debug builds. +With this flag enabled, when the assertion triggers it will instead directly cause a crash. + +#### `DISABLE_STRING_INLINE` + +In our execution format `string_t` has the feature to “inline” strings that are under a certain length (12 bytes), this means they don't require a separate allocation. +When this flag is set, we disable this and don't inline small strings. + +#### `DISABLE_MEMORY_SAFETY` + +Our data structures that are used extensively throughout the non-performance-critical code have extra checks to ensure memory safety, these checks include: + +* Making sure `nullptr` is never dereferenced. +* Making sure index out of bounds accesses don't trigger a crash. + +With this flag enabled we remove these checks, this is mostly done to check that the performance hit of these checks is negligible. + +#### `DESTROY_UNPINNED_BLOCKS` + +When previously pinned blocks in the BufferManager are unpinned, with this flag enabled we destroy them instantly to make sure that there aren't situations where this memory is still being used, despite not being pinned. + +#### `DEBUG_STACKTRACE` + +When a crash or assertion hit occurs in a test, print a stack trace. +This is useful when debugging a crash that is hard to pinpoint with a debugger attached. 
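+
+As a minimal sketch, such a flag can be enabled for a debug build in the same way as the package flags above, i.e., by setting it in the environment before calling `make`. Note that this usage is an assumption based on the package flags, not something spelled out in this section:
+
+```bash
+# assumption: debug flags are picked up from the environment like the package flags
+CRASH_ON_ASSERT=1 make debug
+```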
+
+## Using a CMake Configuration File
+
+To build using a CMake configuration file, create an extension configuration file named `extension_config.cmake` with e.g., the following content:
+
+```cmake
+duckdb_extension_load(autocomplete)
+duckdb_extension_load(fts)
+duckdb_extension_load(httpfs)
+duckdb_extension_load(inet)
+duckdb_extension_load(icu)
+duckdb_extension_load(json)
+duckdb_extension_load(parquet)
+```
+
+Build DuckDB as follows:
+
+```bash
+GEN=ninja EXTENSION_CONFIGS="extension_config.cmake" make
+```
+
+Then, to install the extensions in one go, run:
+
+```bash
+# for release builds
+cd build/release/extension/
+# for debug builds
+cd build/debug/extension/
+# install extensions
+for EXTENSION in *; do
+    ../duckdb -c "INSTALL '${EXTENSION}/${EXTENSION}.duckdb_extension';"
+done
+```
\ No newline at end of file
diff --git a/docs/archive/1.0/dev/building/overview.md b/docs/archive/1.0/dev/building/overview.md
new file mode 100644
index 00000000000..bc8df443a1b
--- /dev/null
+++ b/docs/archive/1.0/dev/building/overview.md
@@ -0,0 +1,10 @@
+---
+layout: docu
+redirect_from:
+- /dev/building
+- /docs/archive/1.0/dev/building
+title: Building DuckDB from Source
+---
+
+> DuckDB binaries are available for stable and nightly builds on the [installation page](/docs/installation/index).
+> You should only build DuckDB under specific circumstances, such as when running on a specific architecture or building an unmerged pull request.
\ No newline at end of file
diff --git a/docs/archive/1.0/dev/building/supported_platforms.md b/docs/archive/1.0/dev/building/supported_platforms.md
new file mode 100644
index 00000000000..29a6d15c620
--- /dev/null
+++ b/docs/archive/1.0/dev/building/supported_platforms.md
@@ -0,0 +1,18 @@
+---
+layout: docu
+title: Supported Platforms
+---
+
+DuckDB officially supports the following platforms:
+
+| Platform name | Description |
+|--------------------|--------------------------------------------|
+| `linux_amd64` | Linux AMD64 |
+| `linux_arm64` | Linux ARM64 |
+| `osx_amd64` | macOS 12+ (Intel CPUs) |
+| `osx_arm64` | macOS 12+ (Apple Silicon: M1, M2, M3 CPUs) |
+| `windows_amd64` | Windows 10+ on Intel and AMD CPUs (x86_64) |
+
+DuckDB can be [built from source]({% link docs/archive/1.0/dev/building/build_instructions.md %}) for several other platforms such as Windows with ARM64 CPUs (Qualcomm, Snapdragon, etc.) and macOS 11.
+
+For details on free and commercial support, see the [support policy blog post](https://duckdblabs.com/news/2023/10/02/support-policy#platforms).
\ No newline at end of file
diff --git a/docs/archive/1.0/dev/building/troubleshooting.md b/docs/archive/1.0/dev/building/troubleshooting.md
new file mode 100644
index 00000000000..b82f22a9ad2
--- /dev/null
+++ b/docs/archive/1.0/dev/building/troubleshooting.md
@@ -0,0 +1,129 @@
+---
+layout: docu
+title: Troubleshooting
+---
+
+## R Package: The Build Only Uses a Single Thread
+
+**Problem:**
+By default, R compiles packages using a single thread, which causes the build to be slow.
+
+**Solution:**
+To parallelize the compilation, create or edit the `~/.R/Makevars` file, and add a line like the following:
+
+```ini
+MAKEFLAGS = -j8
+```
+
+The above will parallelize the compilation using 8 threads. On Linux/macOS, you can add the following to use all of the machine's threads:
+
+```ini
+MAKEFLAGS = -j$(nproc)
+```
+
+However, note that the more threads are used, the higher the RAM consumption. If the system runs out of RAM while compiling, then the R session will crash.
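+
+For example, a rough sketch of applying this setting and then installing the R package from a local checkout of the sources might look as follows (the checkout location is hypothetical):
+
+```bash
+# append the parallel build flag to the R Makevars file
+mkdir -p ~/.R
+echo 'MAKEFLAGS = -j8' >> ~/.R/Makevars
+# assuming the duckdb-r sources are checked out in the current directory
+R CMD INSTALL .
+```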
+
+## R Package on Linux AArch64: `too many GOT entries` Build Error
+
+**Problem:**
+Building the R package on Linux running on an ARM64 architecture (AArch64) may result in the following error message:
+
+```console
+/usr/bin/ld: /usr/include/c++/10/bits/basic_string.tcc:206:
+warning: too many GOT entries for -fpic, please recompile with -fPIC
+```
+
+**Solution:**
+Create or edit the `~/.R/Makevars` file. This example also contains the [flag to parallelize the build](#r-package-the-build-only-uses-a-single-thread):
+
+```ini
+ALL_CXXFLAGS = $(PKG_CXXFLAGS) -fPIC $(SHLIB_CXXFLAGS) $(CXXFLAGS)
+MAKEFLAGS = -j$(nproc)
+```
+
+When making this change, also consider [making the build parallel](#r-package-the-build-only-uses-a-single-thread).
+
+## Python Package: `No module named 'duckdb.duckdb'` Build Error
+
+**Problem:**
+Building the Python package succeeds but the package cannot be imported:
+
+```batch
+cd tools/pythonpkg/
+python3 -m pip install .
+python3 -c "import duckdb"
+```
+
+This returns the following error message:
+
+```console
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+  File "/duckdb/tools/pythonpkg/duckdb/__init__.py", line 4, in <module>
+    import duckdb.functional as functional
+  File "/duckdb/tools/pythonpkg/duckdb/functional/__init__.py", line 1, in <module>
+    from duckdb.duckdb.functional import (
+ModuleNotFoundError: No module named 'duckdb.duckdb'
+```
+
+**Solution:**
+The problem is caused by Python trying to import from the current working directory.
+To work around this, navigate to a different directory (e.g., `cd ..`) and try running Python import again.
+
+## Python Package on macOS: Building the httpfs Extension Fails
+
+**Problem:**
+The build fails on macOS when both the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) and the Python package are included:
+
+```bash
+GEN=ninja BUILD_PYTHON=1 BUILD_HTTPFS=1 make
+```
+
+```console
+ld: library not found for -lcrypto
+clang: error: linker command failed with exit code 1 (use -v to see invocation)
+error: command '/usr/bin/clang++' failed with exit code 1
+ninja: build stopped: subcommand failed.
+make: *** [release] Error 1
+```
+
+**Solution:**
+In general, we recommend using the nightly builds, available under GitHub main on the [installation page]({% link docs/archive/1.0/installation/index.html %}).
+If you would like to build DuckDB from source, avoid using the `BUILD_PYTHON=1` flag unless you are actively developing the Python library.
+Instead, first build the `httpfs` extension (if required), then build and install the Python package separately using pip:
+
+```bash
+GEN=ninja BUILD_HTTPFS=1 make
+```
+
+If the next line complains about pybind11 being missing, or `--use-pep517` not being supported, make sure you're using a modern version of pip and setuptools.
+`python3-pip` on your OS may not be modern, so you may need to run `python3 -m pip install pip -U` first.
+
+```bash
+python3 -m pip install tools/pythonpkg --use-pep517 --user
+```
+
+## Linux: Building the httpfs Extension Fails
+
+**Problem:**
+When building the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) on Linux, the build may fail with the following error.
+ +```console +CMake Error at /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message): + Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the + system variable OPENSSL_ROOT_DIR (missing: OPENSSL_CRYPTO_LIBRARY + OPENSSL_INCLUDE_DIR) +``` + +**Solution:** +Install the `libssl-dev` library. + +```bash +sudo apt-get install -y libssl-dev +``` + +Then, build with: + +```bash +GEN=ninja BUILD_HTTPFS=1 make +``` \ No newline at end of file diff --git a/docs/archive/1.0/dev/internal_errors.md b/docs/archive/1.0/dev/internal_errors.md new file mode 100644 index 00000000000..4ae6b274ec8 --- /dev/null +++ b/docs/archive/1.0/dev/internal_errors.md @@ -0,0 +1,15 @@ +--- +layout: docu +title: Internal Errors +--- + +Internal errors signal an assertion failure within DuckDB. They usually occur due to unexpected conditions or errors in the program's logic. + +After encountering an internal error, DuckDB enters safe mode where any further operations will result in the following error message: + +```console +FATAL Error: Failed: database has been invalidated because of a previous fatal error. +The database must be restarted prior to being used again. +``` + +If you encounter an internal error, please consider creating a minimal reproducible example and submitting an issue to the [DuckDB issue tracker](https://github.com/duckdb/duckdb/issues/new/choose). \ No newline at end of file diff --git a/docs/archive/1.0/dev/profiling.md b/docs/archive/1.0/dev/profiling.md new file mode 100644 index 00000000000..bccfc22089c --- /dev/null +++ b/docs/archive/1.0/dev/profiling.md @@ -0,0 +1,196 @@ +--- +layout: docu +redirect_from: +- /dev/profiling +title: Profiling +--- + +Profiling is important to help understand why certain queries exhibit specific performance characteristics. DuckDB contains several built-in features to enable query profiling that will be explained on this page. + +For the examples on this page we will use the following example data set: + +```sql +CREATE TABLE students (sid INTEGER PRIMARY KEY, name VARCHAR); +CREATE TABLE exams (cid INTEGER, sid INTEGER, grade INTEGER, PRIMARY KEY (cid, sid)); + +INSERT INTO students VALUES (1, 'Mark'), (2, 'Hannes'), (3, 'Pedro'); +INSERT INTO exams VALUES (1, 1, 8), (1, 2, 8), (1, 3, 7), (2, 1, 9), (2, 2, 10); +``` + +## `EXPLAIN` Statement + +The first step to profiling a database engine is figuring out what execution plan the engine is using. The `EXPLAIN` statement allows you to peek into the query plan and see what is going on under the hood. + +The `EXPLAIN` statement displays the physical plan, i.e., the query plan that will get executed. 
+ +To demonstrate, see the below example: + +```sql +CREATE TABLE students (name VARCHAR, sid INTEGER); +CREATE TABLE exams (eid INTEGER, subject VARCHAR, sid INTEGER); +INSERT INTO students VALUES ('Mark', 1), ('Joe', 2), ('Matthew', 3); +INSERT INTO exams VALUES (10, 'Physics', 1), (20, 'Chemistry', 2), (30, 'Literature', 3); +EXPLAIN SELECT name FROM students JOIN exams USING (sid) WHERE name LIKE 'Ma%'; +``` + +```text +┌─────────────────────────────┐ +│┌───────────────────────────┐│ +││ Physical Plan ││ +│└───────────────────────────┘│ +└─────────────────────────────┘ +┌───────────────────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ name │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HASH_JOIN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ INNER │ +│ sid = sid ├──────────────┐ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ +│ EC: 1 │ │ +└─────────────┬─────────────┘ │ +┌─────────────┴─────────────┐┌─────────────┴─────────────┐ +│ SEQ_SCAN ││ FILTER │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ exams ││ prefix(name, 'Ma') │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ sid ││ EC: 1 │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ │ +│ EC: 3 ││ │ +└───────────────────────────┘└─────────────┬─────────────┘ + ┌─────────────┴─────────────┐ + │ SEQ_SCAN │ + │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ + │ students │ + │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ + │ sid │ + │ name │ + │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ + │ Filters: name>=Ma AND name│ + │ =Ma AND name│ + │ This file is overwritten with each query that is issued. If you want to store the profile output for later it should be copied to a different file. + +Now let us run the query that we inspected before: + +```sql +SELECT name +FROM students +JOIN exams USING (sid) +WHERE name LIKE 'Ma%'; +``` + +After the query is completed, the JSON file containing the profiling output has been written to the specified file. We can then render the query graph using the Python script, provided we have the `duckdb` python module installed. This script will generate a HTML file and open it in your web browser. + +```bash +python -m duckdb.query_graph /path/to/file.json +``` + +## Notation in Query Plans + +In query plans, the [hash join](https://en.wikipedia.org/wiki/Hash_join) operators adhere to the following convention: +the _probe side_ of the join is the left operand, +while the _build side_ is the right operand. + +Join operators in the query plan show the join type used: + +* Inner joins are denoted as `INNER`. +* Left outer joins and right outer joins are denoted as `LEFT` and `RIGHT`, respectively. +* Full outer joins are denoted as `FULL`. \ No newline at end of file diff --git a/docs/archive/1.0/dev/release_calendar.md b/docs/archive/1.0/dev/release_calendar.md new file mode 100644 index 00000000000..dcd8078e9a4 --- /dev/null +++ b/docs/archive/1.0/dev/release_calendar.md @@ -0,0 +1,54 @@ +--- +layout: docu +redirect_from: +- /dev/release-dates +- /dev/release-dates/ +- /dev/release-calendar +- /dev/release-calendar/ +- /dev/docs/archive/1.0/dev/release_calendar +- /dev/docs/archive/1.0/dev/release_calendar/ +title: Release Calendar +--- + +DuckDB follows [semantic versioning](https://semver.org/spec/v2.0.0.html). +Patch versions only ship bugfixes, while minor versions also introduce new features. + +## Upcoming Releases + +The planned dates of upcoming DuckDB releases are shown below. +**Please note that these dates are tentative** and DuckDB maintainers may decide to push back release dates to ensure the stability and quality of releases. + +
+ + + +| Date | Version | +|:-----|--------:| +{%- for release in site.data.upcoming_releases reversed %} +| {{ release.start_date }} | {{ release.title | replace: "Release ", "" }} | +{%- endfor %} + + + +## Past Releases + +
+ +In the following, we list DuckDB's past releases along with their codename where applicable. +Between versions 0.2.2 and 0.3.3, all releases (including patch versions) received a codename. +Since version 0.4.0, only major and minor versions get a codename. + + + +| Date | Version | Codename | Named after | | +|:-----|--------:|----------|-------------|------| +{% for row in site.data.past_releases %} + {%- capture logo_filename %}images/release-icons/{{ row.version_number }}.svg{% endcapture -%} + {%- capture logo_exists %}{% file_exists {{ logo_filename }} %}{% endcapture -%} + | {{ row.release_date }} | [{{ row.version_number }}](https://github.com/duckdb/duckdb/releases/tag/v{{ row.version_number }}) | {% if row.blog_post %}[{{ row.codename }}]({{ row.blog_post }}){% else %}{{ row.codename | default: "–" }}{% endif %} | {% if row.duck_wikipage %}{% endif %}{{ row.duck_species_primary | default: "–" }}{% if row.duck_wikipage %}{% endif %} {% if row.duck_species_secondary != nil %}_({{ row.duck_species_secondary }})_{% endif %} | {% if logo_exists == "true" %}![Logo of version {{ row.version_number }}](/{{ logo_filename }}){% endif %} | +{% endfor %} + + + +You can get a [CSV file containing past DuckDB releases](/data/duckdb-releases.csv) and analyze it using DuckDB's [CSV reader]({% link docs/archive/1.0/data/csv/overview.md %}). + \ No newline at end of file diff --git a/docs/archive/1.0/dev/repositories.md b/docs/archive/1.0/dev/repositories.md new file mode 100644 index 00000000000..284b0ec0a3f --- /dev/null +++ b/docs/archive/1.0/dev/repositories.md @@ -0,0 +1,39 @@ +--- +layout: docu +redirect_from: +- /internals/repositories +title: DuckDB Repositories +--- + +Several components of DuckDB are maintained in separate repositories. + +## Main repositories + +* [`duckdb`](https://github.com/duckdb/duckdb): core DuckDB project +* [`duckdb-wasm`](https://github.com/duckdb/duckdb-wasm): WebAssembly version of DuckDB +* [`duckdb-web`](https://github.com/duckdb/duckdb-web): documentation and blog + +## Clients + +* [`duckdb-java`](https://github.com/duckdb/duckdb-java): Java (JDBC) client +* [`duckdb-node`](https://github.com/duckdb/duckdb-node): Node.js client +* [`duckdb-node-neo`](https://github.com/duckdb/duckdb-node): Node.js client, second iteration (currently experimental) +* [`duckdb-odbc`](https://github.com/duckdb/duckdb-odbc): ODBC client +* [`duckdb-r`](https://github.com/duckdb/duckdb-r): R client +* [`duckdb-rs`](https://github.com/duckdb/duckdb-rs): Rust client +* [`duckdb-swift`](https://github.com/duckdb/duckdb-swift): Swift client +* [`duckplyr`](https://github.com/tidyverse/duckplyr): a drop-in replacement for dplyr in R +* [`go-duckdb`](https://github.com/marcboeker/go-duckdb): Go client + +## Connectors + +* [`dbt-duckdb`](https://github.com/duckdb/dbt-duckdb): dbt +* [`duckdb_mysql`](https://github.com/duckdb/duckdb_mysql): MySQL connector +* [`pg_duckdb`](https://github.com/duckdb/pg_duckdb): official PostgreSQL extension for DuckDB (run DuckDB in PostgreSQL) +* [`postgres_scanner`](https://github.com/duckdb/postgres_scanner): PostgreSQL connector (connect to PostgreSQL from DuckdB) +* [`sqlite_scanner`](https://github.com/duckdb/sqlite_scanner): SQLite connector + +## Extensions + +* Core extension repositories are linked in the [Official Extensions page]({% link docs/archive/1.0/extensions/core_extensions.md %}) +* Community extensions are built in the [Community Extensions repository](https://github.com/duckdb/community-extensions) \ No newline at end of file 
diff --git a/docs/archive/1.0/dev/sqllogictest/catch.md b/docs/archive/1.0/dev/sqllogictest/catch.md
new file mode 100644
index 00000000000..00983bfdcd1
--- /dev/null
+++ b/docs/archive/1.0/dev/sqllogictest/catch.md
@@ -0,0 +1,46 @@
+---
+layout: docu
+redirect_from:
+- /dev/sqllogictest/catch
+title: Catch C/C++ Tests
+---
+
+While we prefer the sqllogic tests for testing most functionality, for certain tests SQL alone is not sufficient. This typically happens when you want to test the C++ API. When using pure SQL is really not an option, it might be necessary to write a C++ test using Catch.
+
+Catch tests reside in the test directory as well. Here is an example of a catch test that tests the storage of the system:
+
+```cpp
+#include "catch.hpp"
+#include "test_helpers.hpp"
+
+TEST_CASE("Test simple storage", "[storage]") {
+    auto config = GetTestConfig();
+    unique_ptr<MaterializedQueryResult> result;
+    auto storage_database = TestCreatePath("storage_test");
+
+    // make sure the database does not exist
+    DeleteDatabase(storage_database);
+    {
+        // create a database and insert values
+        DuckDB db(storage_database, config.get());
+        Connection con(db);
+        REQUIRE_NO_FAIL(con.Query("CREATE TABLE test (a INTEGER, b INTEGER);"));
+        REQUIRE_NO_FAIL(con.Query("INSERT INTO test VALUES (11, 22), (13, 22), (12, 21), (NULL, NULL)"));
+        REQUIRE_NO_FAIL(con.Query("CREATE TABLE test2 (a INTEGER);"));
+        REQUIRE_NO_FAIL(con.Query("INSERT INTO test2 VALUES (13), (12), (11)"));
+    }
+    // reload the database from disk a few times
+    for (idx_t i = 0; i < 2; i++) {
+        DuckDB db(storage_database, config.get());
+        Connection con(db);
+        result = con.Query("SELECT * FROM test ORDER BY a");
+        REQUIRE(CHECK_COLUMN(result, 0, {Value(), 11, 12, 13}));
+        REQUIRE(CHECK_COLUMN(result, 1, {Value(), 22, 21, 22}));
+        result = con.Query("SELECT * FROM test2 ORDER BY a");
+        REQUIRE(CHECK_COLUMN(result, 0, {11, 12, 13}));
+    }
+    DeleteDatabase(storage_database);
+}
+```
+
+The test uses the `TEST_CASE` wrapper to create each test. The database is created and queried using the C++ API. Results are checked using either `REQUIRE_NO_FAIL`/`REQUIRE_FAIL` (corresponding to `statement ok` and `statement error`, respectively) or `REQUIRE(CHECK_COLUMN(...))` (corresponding to a `query` with a result check). Every test that is created in this way needs to be added to the corresponding `CMakeLists.txt`.
\ No newline at end of file
diff --git a/docs/archive/1.0/dev/sqllogictest/debugging.md b/docs/archive/1.0/dev/sqllogictest/debugging.md
new file mode 100644
index 00000000000..03e882cf296
--- /dev/null
+++ b/docs/archive/1.0/dev/sqllogictest/debugging.md
@@ -0,0 +1,73 @@
+---
+layout: docu
+redirect_from:
+- /dev/sqllogictest/debugging
+title: Debugging
+---
+
+The purpose of the tests is to figure out when things break. Inevitably changes made to the system will cause one of the tests to fail, and when that happens the test needs to be debugged.
+
+First, it is always recommended to run in debug mode. This can be done by compiling the system using the command `make debug`. Second, it is recommended to only run the test that breaks. This can be done by passing the filename of the breaking test to the test suite as a command line parameter (e.g., `build/debug/test/unittest test/sql/projection/test_simple_projection.test`). For more options on running a subset of the tests see the [Triggering which tests to run](#triggering-which-tests-to-run) section.
+
+After that, a debugger can be attached to the program and the test can be debugged.
In the sqllogictests it is normally difficult to break on a specific query, however, we have expanded the test suite so that a function called `query_break` is called with the line number `line` as parameter for every query that is run. This allows you to put a conditional breakpoint on a specific query. For example, if we want to break on line number 43 of the test file we can create the following break point: + +```text +gdb: break query_break if line==43 +lldb: break s -n query_break -c line==43 +``` + +You can also skip certain queries from executing by placing `mode skip` in the file, followed by an optional `mode unskip`. Any queries between the two statements will not be executed. + +## Triggering Which Tests to Run + +When running the unittest program, by default all the fast tests are run. A specific test can be run by adding the name of the test as an argument. For the sqllogictests, this is the relative path to the test file. +To run only a single test: + +```bash +build/debug/test/unittest test/sql/projection/test_simple_projection.test +``` + +All tests in a given directory can be executed by providing the directory as a parameter with square brackets. +To run all tests in the “projection” directory: + +```bash +build/debug/test/unittest "[projection]" +``` + +All tests, including the slow tests, can be run by running the tests with an asterisk. +To run all tests, including the slow tests: + +```bash +build/debug/test/unittest "*" +``` + +We can run a subset of the tests using the `--start-offset` and `--end-offset` parameters. +To run the tests 200..250: + +```bash +build/debug/test/unittest --start-offset=200 --end-offset=250 +``` + +These are also available in percentages. To run tests 10% - 20%: + +```bash +build/debug/test/unittest --start-offset-percentage=10 --end-offset-percentage=20 +``` + +The set of tests to run can also be loaded from a file containing one test name per line, and loaded using the `-f` command. + +```bash +cat test.list +``` + +```text +test/sql/join/full_outer/test_full_outer_join_issue_4252.test +test/sql/join/full_outer/full_outer_join_cache.test +test/sql/join/full_outer/test_full_outer_join.test +``` + +To run only the tests labeled in the file: + +```bash +build/debug/test/unittest -f test.list +``` \ No newline at end of file diff --git a/docs/archive/1.0/dev/sqllogictest/intro.md b/docs/archive/1.0/dev/sqllogictest/intro.md new file mode 100644 index 00000000000..9cbe2218e5a --- /dev/null +++ b/docs/archive/1.0/dev/sqllogictest/intro.md @@ -0,0 +1,75 @@ +--- +layout: docu +redirect_from: +- /dev/sqllogictest/intro +title: sqllogictest Introduction +--- + +For testing plain SQL, we use an extended version of the SQL logic test suite, adopted from [SQLite](https://www.sqlite.org/sqllogictest/doc/trunk/about.wiki). Every test is a single self-contained file located in the `test/sql` directory. +To run tests located outside of the default `test` directory, specify `--test-dir ` and make sure provided test file paths are relative to that root directory. + +The test describes a series of SQL statements, together with either the expected result, a `statement ok` indicator, or a `statement error` indicator. 
An example of a test file is shown below: + +```sql +# name: test/sql/projection/test_simple_projection.test +# group [projection] + +# enable query verification +statement ok +PRAGMA enable_verification + +# create table +statement ok +CREATE TABLE a (i INTEGER, j INTEGER); + +# insertion: 1 affected row +statement ok +INSERT INTO a VALUES (42, 84); + +query II +SELECT * FROM a; +---- +42 84 +``` + +In this example, three statements are executed. The first statements are expected to succeed (prefixed by `statement ok`). The third statement is expected to return a single row with two columns (indicated by `query II`). The values of the row are expected to be `42` and `84` (separated by a tab character). For more information on query result verification, see the [result verification section]({% link docs/archive/1.0/dev/sqllogictest/result_verification.md %}). + +The top of every file should contain a comment describing the name and group of the test. The name of the test is always the relative file path of the file. The group is the folder that the file is in. The name and group of the test are relevant because they can be used to execute *only* that test in the unittest group. For example, if we wanted to execute *only* the above test, we would run the command `unittest test/sql/projection/test_simple_projection.test`. If we wanted to run all tests in a specific directory, we would run the command `unittest "[projection]"`. + +Any tests that are placed in the `test` directory are automatically added to the test suite. Note that the extension of the test is significant. The sqllogictests should either use the `.test` extension, or the `.test_slow` extension. The `.test_slow` extension indicates that the test takes a while to run, and will only be run when all tests are explicitly run using `unittest *`. Tests with the extension `.test` will be included in the fast set of tests. + +## Query Verification + +Many simple tests start by enabling query verification. This can be done through the following `PRAGMA` statement: + +```sql +statement ok +PRAGMA enable_verification +``` + +Query verification performs extra validation to ensure that the underlying code runs correctly. The most important part of that is that it verifies that optimizers do not cause bugs in the query. It does this by running both an unoptimized and optimized version of the query, and verifying that the results of these queries are identical. + +Query verification is very useful because it not only discovers bugs in optimizers, but also finds bugs in e.g., join implementations. This is because the unoptimized version will typically run using cross products instead. Because of this, query verification can be very slow to do when working with larger data sets. It is therefore recommended to turn on query verification for all unit tests, except those involving larger data sets (more than ~10-100 rows). + +## Editors & Syntax Highlighting + +The sqllogictests are not exactly an industry standard, but several other systems have adopted them as well. Parsing sqllogictests is intentionally simple. All statements have to be separated by empty lines. For that reason, writing a syntax highlighter is not extremely difficult. + +A syntax highlighter exists for [Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=benesch.sqllogictest). We have also [made a fork that supports the DuckDB dialect of the sqllogictests](https://github.com/Mytherin/vscode-sqllogictest). 
You can use the fork by installing the original, then copying the `syntaxes/sqllogictest.tmLanguage.json` into the installed extension (on macOS this is located in `~/.vscode/extensions/benesch.sqllogictest-0.1.1`).
+
+A syntax highlighter is also available for [CLion](https://plugins.jetbrains.com/plugin/15295-sqltest). It can be installed directly on the IDE by searching SQLTest on the marketplace. A [GitHub repository](https://github.com/pdet/SQLTest) is also available, with extensions and bug reports being welcome.
+
+### Temporary Files
+
+For some tests (e.g., CSV/Parquet file format tests) it is necessary to create temporary files. Any temporary files should be created in the temporary testing directory. This directory can be used by placing the string `__TEST_DIR__` in a query. This string will be replaced by the path of the temporary testing directory.
+
+```sql
+statement ok
+COPY csv_data TO '__TEST_DIR__/output_file.csv.gz' (COMPRESSION GZIP);
+```
+
+### Require & Extensions
+
+To avoid bloating the core system, certain functionality of DuckDB is available only as an extension. Tests can be built for those extensions by adding a `require` field in the test. If the extension is not loaded, any statements that occur after the require field will be skipped. Examples of this are `require parquet` or `require icu`.
+
+Another usage is to limit a test to a specific vector size. For example, adding `require vector_size 512` to a test will prevent the test from being run unless the vector size is greater than or equal to 512. This is useful because certain functionality is not supported for low vector sizes, but we run tests using a vector size of 2 in our CI.
\ No newline at end of file
diff --git a/docs/archive/1.0/dev/sqllogictest/loops.md b/docs/archive/1.0/dev/sqllogictest/loops.md
new file mode 100644
index 00000000000..4ca65b36ae3
--- /dev/null
+++ b/docs/archive/1.0/dev/sqllogictest/loops.md
@@ -0,0 +1,77 @@
+---
+layout: docu
+redirect_from:
+- /dev/sqllogictest/loops
+title: Loops
+---
+
+Loops can be used in sqllogictests when it is required to execute the same query many times but with slight modifications in constant values. For example, suppose we want to fire off 100 queries that check for the presence of the values `0..100` in a table:
+
+```sql
+# create the table 'integers' with values 0..100
+statement ok
+CREATE TABLE integers AS SELECT * FROM range(0, 100, 1) t1(i);
+
+# verify individually that all 100 values are there
+loop i 0 100
+
+# execute the query, replacing the value
+query I
+SELECT count(*) FROM integers WHERE i = ${i};
+----
+1
+
+# end the loop (note that multiple statements can be part of a loop)
+endloop
+```
+
+Similarly, `foreach` can be used to iterate over a set of values.
+
+```sql
+foreach partcode millennium century decade year quarter month day hour minute second millisecond microsecond epoch
+
+query III
+SELECT i, date_part('${partcode}', i) AS p, date_part(['${partcode}'], i) AS st
+FROM intervals
+WHERE p <> st['${partcode}'];
+----
+
+endloop
+```
+
+`foreach` also has a number of preset combinations that should be used when required. In this manner, when new combinations are added to the preset, old tests will automatically pick up these new combinations.
+
+| Preset | Expansion |
+|----------------|--------------------------------------------------------------|
+| `<compression>` | none uncompressed rle bitpacking dictionary fsst chimp patas |
+| `<signed>` | tinyint smallint integer bigint hugeint |
+| `<unsigned>` | utinyint usmallint uinteger ubigint uhugeint |
+| `<integral>` | `<signed>` `<unsigned>` |
+| `<numeric>` | float double |
+| `<alltypes>` | `<integral>` `<numeric>` bool interval varchar json |
+
+> Use large loops sparingly. Executing hundreds of thousands of SQL statements will slow down tests unnecessarily. Do not use loops for inserting data.
+
+## Data Generation without Loops
+
+Loops should be used sparingly. While it might be tempting to use loops for inserting data using insert statements, this will considerably slow down the test cases. Instead, it is better to generate data using the built-in `range` and `repeat` functions.
+
+To create the table `integers` with the values `[0, 1, .., 98, 99]`, run:
+
+```sql
+CREATE TABLE integers AS SELECT * FROM range(0, 100, 1) t1(i);
+```
+
+To create the table `strings` with 100 times the value `hello`, run:
+
+```sql
+CREATE TABLE strings AS SELECT 'hello' AS s FROM range(0, 100, 1);
+```
+
+Using these two functions, together with clever use of cross products and other expressions, many different types of datasets can be efficiently generated. The `random()` function can also be used to generate random data.
+
+An alternative option is to read data from an existing CSV or Parquet file. There are several large CSV files that can be loaded from the directory `test/sql/copy/csv/data/real` using a `COPY INTO` statement or the `read_csv_auto` function.
+
+The TPC-H and TPC-DS extensions can also be used to generate synthetic data, using e.g. `CALL dbgen(sf = 1)` or `CALL dsdgen(sf = 1)`.
\ No newline at end of file
diff --git a/docs/archive/1.0/dev/sqllogictest/multiple_connections.md b/docs/archive/1.0/dev/sqllogictest/multiple_connections.md
new file mode 100644
index 00000000000..3b1dc07a010
--- /dev/null
+++ b/docs/archive/1.0/dev/sqllogictest/multiple_connections.md
@@ -0,0 +1,53 @@
+---
+layout: docu
+redirect_from:
+- /dev/sqllogictest/multiple_connections
+title: Multiple Connections
+---
+
+For tests whose purpose is to verify that the transactional management or versioning of data works correctly, it is generally necessary to use multiple connections. For example, if we want to verify that the creation of tables is correctly transactional, we might want to start a transaction and create a table in `con1`, then fire a query in `con2` that checks that the table is not accessible until the transaction is committed.
+
+We can use multiple connections in the sqllogictests using `connection labels`. The connection label can be optionally appended to any `statement` or `query`. All queries with the same connection label will be executed in the same connection. A test that would verify the above property would look as follows:
+
+```sql
+statement ok con1
+BEGIN TRANSACTION
+
+statement ok con1
+CREATE TABLE integers (i INTEGER);
+
+statement error con2
+SELECT * FROM integers;
+```
+
+## Concurrent Connections
+
+Using connection modifiers on the statement and queries will result in testing of multiple connections, but all the queries will still be run *sequentially* on a single thread. If we want to run code from multiple connections *concurrently* over multiple threads, we can use the `concurrentloop` construct. The queries in `concurrentloop` will be run concurrently on separate threads at the same time.
+ +```sql +concurrentloop i 0 10 + +statement ok +CREATE TEMP TABLE t2 AS (SELECT 1); + +statement ok +INSERT INTO t2 VALUES (42); + +statement ok +DELETE FROM t2 + +endloop +``` + +One caveat with `concurrentloop` is that results are often unpredictable - as multiple clients can hammer the database at the same time we might end up with (expected) transaction conflicts. `statement maybe` can be used to deal with these situations. `statement maybe` essentially accepts both a success, and a failure with a specific error message. + +```sql +concurrentloop i 1 10 + +statement maybe +CREATE OR REPLACE TABLE t2 AS (SELECT -54124033386577348004002656426531535114 FROM t2 LIMIT 70%); +---- +write-write conflict + +endloop +``` \ No newline at end of file diff --git a/docs/archive/1.0/dev/sqllogictest/overview.md b/docs/archive/1.0/dev/sqllogictest/overview.md new file mode 100644 index 00000000000..db47e25f9e5 --- /dev/null +++ b/docs/archive/1.0/dev/sqllogictest/overview.md @@ -0,0 +1,12 @@ +--- +layout: docu +redirect_from: +- /dev/testing +title: Overview +--- + +## How is DuckDB Tested? + +Testing is vital to make sure that DuckDB works properly and keeps working properly. For that reason, we put a large emphasis on thorough and frequent testing: +* We run a batch of small tests on every commit using [GitHub Actions](https://github.com/duckdb/duckdb/actions), and run a more exhaustive batch of tests on pull requests and commits in the master branch. +* We use a [fuzzer](https://github.com/duckdb/duckdb-fuzzer), which automatically reports of issues found through fuzzing DuckDB. \ No newline at end of file diff --git a/docs/archive/1.0/dev/sqllogictest/persistent_testing.md b/docs/archive/1.0/dev/sqllogictest/persistent_testing.md new file mode 100644 index 00000000000..6333a0a5470 --- /dev/null +++ b/docs/archive/1.0/dev/sqllogictest/persistent_testing.md @@ -0,0 +1,43 @@ +--- +layout: docu +redirect_from: +- /dev/sqllogictest/persistent_testing +title: Persistent Testing +--- + +By default, all tests are run in in-memory mode (unless `--force-storage` is enabled). In certain cases, we want to force the usage of a persistent database. We can initiate a persistent database using the `load` command, and trigger a reload of the database using the `restart` command. + +```sql +# load the DB from disk +load __TEST_DIR__/storage_scan.db + +statement ok +CREATE TABLE test (a INTEGER); + +statement ok +INSERT INTO test VALUES (11), (12), (13), (14), (15), (NULL) + +# ... + +restart + +query I +SELECT * FROM test ORDER BY a +---- +NULL +11 +12 +13 +14 +15 +``` + +Note that by default the tests run with `SET wal_autocheckpoint = '0KB'` – meaning a checkpoint is triggered after every statement. WAL tests typically run with the following settings to disable this behavior: + +```sql +statement ok +PRAGMA disable_checkpoint_on_shutdown + +statement ok +PRAGMA wal_autocheckpoint = '1TB' +``` \ No newline at end of file diff --git a/docs/archive/1.0/dev/sqllogictest/result_verification.md b/docs/archive/1.0/dev/sqllogictest/result_verification.md new file mode 100644 index 00000000000..0bc662d2a38 --- /dev/null +++ b/docs/archive/1.0/dev/sqllogictest/result_verification.md @@ -0,0 +1,159 @@ +--- +layout: docu +redirect_from: +- /dev/sqllogictest/result_verification +title: Result Verification +--- + +The standard way of verifying results of queries is using the `query` statement, followed by the letter `I` times the number of columns that are expected in the result. 
After the query, four dashes (`----`) are expected followed by the result values separated by tabs. For example,
+
+```sql
+query II
+SELECT 42, 84 UNION ALL SELECT 10, 20;
+----
+42	84
+10	20
+```
+
+For legacy reasons the letters `R` and `T` are also accepted to denote columns.
+
+> Deprecated DuckDB deprecated the usage of types in the sqllogictest. The DuckDB test runner does not use or need them internally – therefore, only `I` should be used to denote columns.
+
+## NULL Values and Empty Strings
+
+Empty lines have special significance for the SQLLogic test runner: they signify an end of the current statement or query. For that reason, empty strings and NULL values have special syntax that must be used in result verification. NULL values should use the string `NULL`, and empty strings should use the string `(empty)`, e.g.:
+
+```sql
+query II
+SELECT NULL, ''
+----
+NULL
+(empty)
+```
+
+## Error Verification
+
+In order to signify that an error is expected, the `statement error` indicator can be used. The `statement error` also takes an optional expected result – which is interpreted as the *expected error message*. Similar to `query`, the expected error should be placed after the four dashes (`----`) following the query. The test passes if the error message *contains* the text under `statement error` – the entire error message does not need to be provided. It is recommended that you only use a subset of the error message, so that the test does not break unnecessarily if the formatting of error messages is changed.
+
+```sql
+statement error
+SELECT * FROM non_existent_table;
+----
+Table with name non_existent_table does not exist!
+```
+
+## Regex
+
+In certain cases result values might be very large or complex, and we might only be interested in whether or not the result *contains* a snippet of text. In that case, we can use the `<REGEX>:` modifier followed by a certain regex. If the result value matches the regex the test is passed. This is primarily used for query plan analysis.
+
+```sql
+query II
+EXPLAIN SELECT tbl.a FROM "data/parquet-testing/arrow/alltypes_plain.parquet" tbl(a) WHERE a = 1 OR a = 2
+----
+physical_plan	<REGEX>:.*PARQUET_SCAN.*Filters: a=1 OR a=2.*
+```
+
+If we instead want the result *not* to contain a snippet of text, we can use the `<!REGEX>:` modifier.
+
+## File
+
+As results can grow quite large, and we might want to re-use results over multiple files, it is also possible to read expected results from files using the `<FILE>:` command. The expected result is read from the given file. As a convention, the file path should be provided relative to the root of the GitHub repository.
+
+```sql
+query I
+PRAGMA tpch(1)
+----
+<FILE>:extension/tpch/dbgen/answers/sf1/q01.csv
+```
+
+## Row-Wise vs. Value-Wise Result Ordering
+
+The result values of a query can be either supplied in row-wise order, with the individual values separated by tabs, or in value-wise order. In value-wise order, the individual *values* of the query must appear in row, column order, each on an individual line. Consider the following example in both row-wise and value-wise order:
+
+```sql
+# row-wise
+query II
+SELECT 42, 84 UNION ALL SELECT 10, 20;
+----
+42	84
+10	20
+
+# value-wise
+query II
+SELECT 42, 84 UNION ALL SELECT 10, 20;
+----
+42
+84
+10
+20
+```
+
+## Hashes and Outputting Values
+
+Besides direct result verification, the sqllogic test suite also has the option of using MD5 hashes for value comparisons.
A test using hashes for result verification looks like this: + +```sql +query I +SELECT g, string_agg(x,',') FROM strings GROUP BY g +---- +200 values hashing to b8126ea73f21372cdb3f2dc483106a12 +``` + +This approach is useful for reducing the size of tests when results have many output rows. However, it should be used sparingly, as hash values make the tests more difficult to debug if they do break. + +After it is ensured that the system outputs the correct result, hashes of the queries in a test file can be computed by adding `mode output_hash` to the test file. For example: + +```sql +mode output_hash + +query II +SELECT 42, 84 UNION ALL SELECT 10, 20; +---- +42 84 +10 20 +``` + +The expected output hashes for every query in the test file will then be printed to the terminal, as follows: + +```text +================================================================================ +SQL Query +SELECT 42, 84 UNION ALL SELECT 10, 20; +================================================================================ +4 values hashing to 498c69da8f30c24da3bd5b322a2fd455 +================================================================================ +``` + +In a similar manner, `mode output_result` can be used in order to force the program to print the result to the terminal for every query run in the test file. + +## Result Sorting + +Queries can have an optional field that indicates that the result should be sorted in a specific manner. This field goes in the same location as the connection label. Because of that, connection labels and result sorting cannot be mixed. + +The possible values of this field are `nosort`, `rowsort` and `valuesort`. An example of how this might be used is given below: + +```sql +query I rowsort +SELECT 'world' UNION ALL SELECT 'hello' +---- +hello +world +``` + +In general, we prefer not to use this field and rely on `ORDER BY` in the query to generate deterministic query answers. However, existing sqllogictests use this field extensively, hence it is important to know of its existence. + +## Query Labels + +Another feature that can be used for result verification are `query labels`. These can be used to verify that different queries provide the same result. This is useful for comparing queries that are logically equivalent, but formulated differently. Query labels are provided after the connection label or sorting specifier. + +Queries that have a query label do not need to have a result provided. Instead, the results of each of the queries with the same label are compared to each other. For example, the following script verifies that the queries `SELECT 42+1` and `SELECT 44-1` provide the same result: + +```sql +query I nosort r43 +SELECT 42+1; +---- + +query I nosort r43 +SELECT 44-1; +---- +``` \ No newline at end of file diff --git a/docs/archive/1.0/dev/sqllogictest/writing_tests.md b/docs/archive/1.0/dev/sqllogictest/writing_tests.md new file mode 100644 index 00000000000..5894795a9b9 --- /dev/null +++ b/docs/archive/1.0/dev/sqllogictest/writing_tests.md @@ -0,0 +1,98 @@ +--- +layout: docu +redirect_from: +- /dev/writing_tests +title: Writing Tests +--- + +## Development and Testing + +It is crucial that any new features that get added have correct tests that not only test the “happy path”, but also test edge cases and incorrect usage of the feature. In this section, we describe how DuckDB tests are structured and how to make new tests for DuckDB. + +The tests can be run by running the `unittest` program located in the `test` folder. 
For the default compilations this is located in either `build/release/test/unittest` (release) or `build/debug/test/unittest` (debug). + +## Philosophy + +When testing DuckDB, we aim to route all the tests through SQL. We try to avoid testing components individually because that makes those components more difficult to change later on. As such, almost all of our tests can (and should) be expressed in pure SQL. There are certain exceptions to this, which we will discuss in [Catch Tests]({% link docs/archive/1.0/dev/sqllogictest/catch.md %}). However, in most cases you should write your tests in plain SQL. + +## Frameworks + +SQL tests should be written using the [sqllogictest framework]({% link docs/archive/1.0/dev/sqllogictest/intro.md %}). + +C++ tests can be written using the [Catch framework]({% link docs/archive/1.0/dev/sqllogictest/catch.md %}). + +## Client Connector Tests + +DuckDB also has tests for various client connectors. These are generally written in the relevant client language, and can be found in `tools/*/tests`. +They also double as documentation of what should be doable from a given client. + +## Functions for Generating Test Data + +DuckDB has built-in functions for generating test data. + +### `test_all_types` Function + +The `test_all_types` table function generates a table whose columns correspond to types (`BOOL`, `TINYINT`, etc.). +The table has three rows encoding the minimum value, the maximum value, and the null value for each type. + +```sql +FROM test_all_types(); +``` + +```text +┌─────────┬─────────┬──────────┬─────────────┬──────────────────────┬──────────────────────┬───┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐ +│ bool │ tinyint │ smallint │ int │ bigint │ hugeint │ … │ struct │ struct_of_arrays │ array_of_structs │ map │ union │ +│ boolean │ int8 │ int16 │ int32 │ int64 │ int128 │ │ struct(a integer, … │ struct(a integer[]… │ struct(a integer, … │ map(varchar, varch… │ union("name" varch… │ +├─────────┼─────────┼──────────┼─────────────┼──────────────────────┼──────────────────────┼───┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤ +│ false │ -128 │ -32768 │ -2147483648 │ -9223372036854775808 │ -17014118346046923… │ … │ {'a': NULL, 'b': N… │ {'a': NULL, 'b': N… │ [] │ {} │ Frank │ +│ true │ 127 │ 32767 │ 2147483647 │ 9223372036854775807 │ 170141183460469231… │ … │ {'a': 42, 'b': 🦆… │ {'a': [42, 999, NU… │ [{'a': NULL, 'b': … │ {key1=🦆🦆🦆🦆🦆🦆… │ 5 │ +│ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ … │ NULL │ NULL │ NULL │ NULL │ NULL │ +├─────────┴─────────┴──────────┴─────────────┴──────────────────────┴──────────────────────┴───┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┤ +│ 3 rows 44 columns (11 shown) │ +└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ +``` + +### `test_vector_types` Function + +The `test_vector_types` table function takes _n_ arguments `col1`, ..., `coln` and an optional `BOOLEAN` argument `all_flat`. +The function generates a table with _n_ columns `test_vector`, `test_vector2`, ..., `test_vectorn`. +In each row, each field contains values conforming to the type of their respective column. 
+ +```sql +FROM test_vector_types(NULL::BIGINT); +``` + +```text +┌──────────────────────┐ +│ test_vector │ +│ int64 │ +├──────────────────────┤ +│ -9223372036854775808 │ +│ 9223372036854775807 │ +│ NULL │ +│ ... │ +└──────────────────────┘ +``` + +```sql +FROM test_vector_types(NULL::ROW(i INTEGER, j VARCHAR, k DOUBLE), NULL::TIMESTAMP); +``` + +```text +┌──────────────────────────────────────────────────────────────────────┬──────────────────────────────┐ +│ test_vector │ test_vector2 │ +│ struct(i integer, j varchar, k double) │ timestamp │ +├──────────────────────────────────────────────────────────────────────┼──────────────────────────────┤ +│ {'i': -2147483648, 'j': 🦆🦆🦆🦆🦆🦆, 'k': -1.7976931348623157e+308} │ 290309-12-22 (BC) 00:00:00 │ +│ {'i': 2147483647, 'j': goo\0se, 'k': 1.7976931348623157e+308} │ 294247-01-10 04:00:54.775806 │ +│ {'i': NULL, 'j': NULL, 'k': NULL} │ NULL │ +│ ... │ +└─────────────────────────────────────────────────────────────────────────────────────────────────────┘ +``` + +`test_vector_types` has an optional argument called `all_flat` of type `BOOL`. This only affects the internal representation of the vector. + +```sql +FROM test_vector_types(NULL::ROW(i INTEGER, j VARCHAR, k DOUBLE), NULL::TIMESTAMP, all_flat = true); +-- the output is the same as above but with a different internal representation +``` \ No newline at end of file diff --git a/docs/archive/1.0/extensions/arrow.md b/docs/archive/1.0/extensions/arrow.md new file mode 100644 index 00000000000..4d5e5312b76 --- /dev/null +++ b/docs/archive/1.0/extensions/arrow.md @@ -0,0 +1,26 @@ +--- +github_repository: https://github.com/duckdb/arrow +layout: docu +title: Arrow Extension +--- + +The `arrow` extension implements features for using [Apache Arrow](https://arrow.apache.org/), a cross-language development platform for in-memory analytics. + +## Installing and Loading + +The `arrow` extension will be transparently autoloaded on first use from the official extension repository. +If you would like to install and load it manually, run: + +```sql +INSTALL arrow; +LOAD arrow; +``` + +## Functions + +
+
+| Function | Type | Description |
+|--|----|-------|
+| `to_arrow_ipc` | Table in-out function | Serializes a table into a stream of blobs containing Arrow IPC buffers |
+| `scan_arrow_ipc` | Table function | Scan a list of pointers pointing to Arrow IPC buffers |
\ No newline at end of file
diff --git a/docs/archive/1.0/extensions/autocomplete.md b/docs/archive/1.0/extensions/autocomplete.md
new file mode 100644
index 00000000000..8a2c0fd3a0d
--- /dev/null
+++ b/docs/archive/1.0/extensions/autocomplete.md
@@ -0,0 +1,54 @@
+---
+github_directory: https://github.com/duckdb/duckdb/tree/main/extension/autocomplete
+layout: docu
+title: AutoComplete Extension
+---
+
+The `autocomplete` extension adds support for autocomplete in the [CLI client]({% link docs/archive/1.0/api/cli/overview.md %}).
+The extension is shipped by default with the CLI client.
+
+## Behavior
+
+For the behavior of the `autocomplete` extension, see the [documentation of the CLI client]({% link docs/archive/1.0/api/cli/autocomplete.md %}).
+
+## Functions
+
+ +| Function | Description | +|:----------------------------------|:-----------------------------------------------------| +| `sql_auto_complete(query_string)` | Attempts autocompletion on the given `query_string`. | + +## Example + +```sql +SELECT * +FROM sql_auto_complete('SEL'); +``` + +Returns: + +
+ +| suggestion | suggestion_start | +|-------------|------------------| +| SELECT | 0 | +| DELETE | 0 | +| INSERT | 0 | +| CALL | 0 | +| LOAD | 0 | +| CALL | 0 | +| ALTER | 0 | +| BEGIN | 0 | +| EXPORT | 0 | +| CREATE | 0 | +| PREPARE | 0 | +| EXECUTE | 0 | +| EXPLAIN | 0 | +| ROLLBACK | 0 | +| DESCRIBE | 0 | +| SUMMARIZE | 0 | +| CHECKPOINT | 0 | +| DEALLOCATE | 0 | +| UPDATE | 0 | +| DROP | 0 | \ No newline at end of file diff --git a/docs/archive/1.0/extensions/aws.md b/docs/archive/1.0/extensions/aws.md new file mode 100644 index 00000000000..85ac7d4925b --- /dev/null +++ b/docs/archive/1.0/extensions/aws.md @@ -0,0 +1,75 @@ +--- +github_repository: https://github.com/duckdb/duckdb_aws +layout: docu +title: AWS Extension +--- + +The `aws` extension adds functionality (e.g., authentication) on top of the `httpfs` extension's [S3 capabilities]({% link docs/archive/1.0/extensions/httpfs/overview.md %}#s3-api), using the AWS SDK. + +## Installing and Loading + +The `aws` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. +If you would like to install and load it manually, run: + +```sql +INSTALL aws; +LOAD aws; +``` + +## Related Extensions + +`aws` depends on `httpfs` extension capabilities, and both will be autoloaded on the first call to `load_aws_credentials`. +If autoinstall or autoload are disabled, you can always explicitly install and load them as follows: + +```sql +INSTALL aws; +INSTALL httpfs; +LOAD aws; +LOAD httpfs; +``` + +## Usage + +In most cases, you will not need to explicitly interact with the `aws` extension. It will automatically be invoked +whenever you use DuckDB's [S3 Secret functionality]({% link docs/archive/1.0/sql/statements/create_secret.md %}). See the [httpfs extension's S3 capabilities]({% link docs/archive/1.0/extensions/httpfs/overview.md %}#s3) for instructions. + +## Legacy Features + +Prior to version 0.10.0, DuckDB did not have a [Secrets manager]({% link docs/archive/1.0/sql/statements/create_secret.md %}), to load the credentials automatically, the AWS extension provided +a special function to load the AWS credentials in the [legacy authentication method]({% link docs/archive/1.0/extensions/httpfs/s3api_legacy_authentication.md %}). + +| Function | Type | Description | +|---|---|-------| +| `load_aws_credentials` | `PRAGMA` function | Loads the AWS credentials through the [AWS Default Credentials Provider Chain](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials-chain.html). 
| + +### Load AWS Credentials (Legacy) + +To load the AWS credentials, run: + +```sql +CALL load_aws_credentials(); +``` + +| loaded_access_key_id | loaded_secret_access_key | loaded_session_token | loaded_region | +|----------------------|--------------------------|----------------------|---------------| +| AKIAIOSFODNN7EXAMPLE | | NULL | us-east-2 | + +The function takes a string parameter to specify a specific profile: + +```sql +CALL load_aws_credentials('minio-testing-2'); +``` + +| loaded_access_key_id | loaded_secret_access_key | loaded_session_token | loaded_region | +|----------------------|--------------------------|----------------------|---------------| +| minio_duckdb_user_2 | | NULL | NULL | + +There are several parameters to tweak the behavior of the call: + +```sql +CALL load_aws_credentials('minio-testing-2', set_region = false, redact_secret = false); +``` + +| loaded_access_key_id | loaded_secret_access_key | loaded_session_token | loaded_region | +|----------------------|------------------------------|----------------------|---------------| +| minio_duckdb_user_2 | minio_duckdb_user_password_2 | NULL | NULL | \ No newline at end of file diff --git a/docs/archive/1.0/extensions/azure.md b/docs/archive/1.0/extensions/azure.md new file mode 100644 index 00000000000..046128b5893 --- /dev/null +++ b/docs/archive/1.0/extensions/azure.md @@ -0,0 +1,355 @@ +--- +github_repository: https://github.com/duckdb/duckdb_azure +layout: docu +title: Azure Extension +--- + +The `azure` extension is a loadable extension that adds a filesystem abstraction for the [Azure Blob storage](https://azure.microsoft.com/en-us/products/storage/blobs) to DuckDB. + +## Installing and Loading + +The `azure` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. 
+If you would like to install and load it manually, run: + +```sql +INSTALL azure; +LOAD azure; +``` + +## Usage + +Once the [authentication](#authentication) is set up, you can query Azure storage as follows: + +### For Azure Blob Storage + +Allowed URI schemes: `az` or `azure` + +```sql +SELECT count(*) +FROM 'az://⟨my_container⟩/⟨path⟩/⟨my_file⟩.⟨parquet_or_csv⟩'; +``` + +Globs are also supported: + +```sql +SELECT * +FROM 'az://⟨my_container⟩/⟨path⟩/*.csv'; +``` + +```sql +SELECT * +FROM 'az://⟨my_container⟩/⟨path⟩/**'; +``` + +Or with a fully qualified path syntax: + +```sql +SELECT count(*) +FROM 'az://⟨my_storage_account⟩.blob.core.windows.net/⟨my_container⟩/⟨path⟩/⟨my_file⟩.⟨parquet_or_csv⟩'; +``` + +```sql +SELECT * +FROM 'az://⟨my_storage_account⟩.blob.core.windows.net/⟨my_container⟩/⟨path⟩/*.csv'; +``` + +### For Azure Data Lake Storage (ADLS) + +Allowed URI schemes: `abfss` + +```sql +SELECT count(*) +FROM 'abfss://⟨my_filesystem⟩/⟨path⟩/⟨my_file⟩.⟨parquet_or_csv⟩'; +``` + +Globs are also supported: + +```sql +SELECT * +FROM 'abfss://⟨my_filesystem⟩/⟨path⟩/*.csv'; +``` + +```sql +SELECT * +FROM 'abfss://⟨my_filesystem⟩/⟨path⟩/**'; +``` + +Or with a fully qualified path syntax: + +```sql +SELECT count(*) +FROM 'abfss://⟨my_storage_account⟩.dfs.core.windows.net/⟨my_filesystem⟩/⟨path⟩/⟨my_file⟩.⟨parquet_or_csv⟩'; +``` + +```sql +SELECT * +FROM 'abfss://⟨my_storage_account⟩.dfs.core.windows.net/⟨my_filesystem⟩/⟨path⟩/*.csv'; +``` + +## Configuration + +Use the following [configuration options]({% link docs/archive/1.0/configuration/overview.md %}) how the extension reads remote files: + +| Name | Description | Type | Default | +|:---|:---|:---|:---| +| `azure_http_stats` | Include http info from Azure Storage in the [`EXPLAIN ANALYZE` statement]({% link docs/archive/1.0/dev/profiling.md %}). | `BOOLEAN` | `false` | +| `azure_read_transfer_concurrency` | Maximum number of threads the Azure client can use for a single parallel read. If `azure_read_transfer_chunk_size` is less than `azure_read_buffer_size` then setting this > 1 will allow the Azure client to do concurrent requests to fill the buffer. | `BIGINT` | `5` | +| `azure_read_transfer_chunk_size` | Maximum size in bytes that the Azure client will read in a single request. It is recommended that this is a factor of `azure_read_buffer_size`. | `BIGINT` | `1024*1024` | +| `azure_read_buffer_size` | Size of the read buffer. It is recommended that this is evenly divisible by `azure_read_transfer_chunk_size`. | `UBIGINT` | `1024*1024` | +| `azure_transport_option_type` | Underlying [adapter](https://github.com/Azure/azure-sdk-for-cpp/blob/main/doc/HttpTransportAdapter.md) to use in the Azure SDK. Valid values are: `default` or `curl`. | `VARCHAR` | `default` | +| `azure_context_caching` | Enable/disable the caching of the underlying Azure SDK HTTP connection in the DuckDB connection context when performing queries. If you suspect that this is causing some side effect, you can try to disable it by setting it to false (not recommended). | `BOOLEAN` | `true` | + +> Setting `azure_transport_option_type` explicitly to `curl` with have the following effect: +> * On Linux, this may solve certificates issue (`Error: Invalid Error: Fail to get a new connection for: https://⟨storage account name⟩.blob.core.windows.net/. Problem with the SSL CA cert (path? 
access rights?)`) because when specifying the extension will try to find the bundle certificate in various paths (that is not done by *curl* by default and might be wrong due to static linking). +> * On Windows, this replaces the default adapter (*WinHTTP*) allowing you to use all *curl* capabilities (for example using a socks proxies). +> * On all operating systems, it will honor the following environment variables: +> * `CURL_CA_INFO`: Path to a PEM encoded file containing the certificate authorities sent to libcurl. Note that this option is known to only work on Linux and might throw if set on other platforms. +> * `CURL_CA_PATH`: Path to a directory which holds PEM encoded file, containing the certificate authorities sent to libcurl. + +Example: + +```sql +SET azure_http_stats = false; +SET azure_read_transfer_concurrency = 5; +SET azure_read_transfer_chunk_size = 1_048_576; +SET azure_read_buffer_size = 1_048_576; +``` + +## Authentication + +The Azure extension has two ways to configure the authentication. The preferred way is to use Secrets. + +### Authentication with Secret + +Multiple [Secret Providers]({% link docs/archive/1.0/configuration/secrets_manager.md %}#secret-providers) are available for the Azure extension: + +> * If you need to define different secrets for different storage accounts you can use [the `SCOPE` configuration]({% link docs/archive/1.0/configuration/secrets_manager.md %}#creating-multiple-secrets-for-the-same-service-type). +> * If you use fully qualified path then the `ACCOUNT_NAME` attribute is optional. + +#### `CONFIG` Provider + +The default provider, `CONFIG` (i.e., user-configured), allows access to the storage account using a connection string or anonymously. For example: + +```sql +CREATE SECRET secret1 ( + TYPE AZURE, + CONNECTION_STRING '⟨value⟩' +); +``` + +If you do not use authentication, you still need to specify the storage account name. For example: + +```sql +CREATE SECRET secret2 ( + TYPE AZURE, + PROVIDER CONFIG, + ACCOUNT_NAME '⟨storage account name⟩' +); +``` + +The default `PROVIDER` is `CONFIG`. + +#### `CREDENTIAL_CHAIN` Provider + +The `CREDENTIAL_CHAIN` provider allows connecting using credentials automatically fetched by the Azure SDK via the Azure credential chain. +By default, the `DefaultAzureCredential` chain used, which tries credentials according to the order specified by the [Azure documentation](https://learn.microsoft.com/en-us/javascript/api/@azure/identity/defaultazurecredential?view=azure-node-latest#@azure-identity-defaultazurecredential-constructor). +For example: + +```sql +CREATE SECRET secret3 ( + TYPE AZURE, + PROVIDER CREDENTIAL_CHAIN, + ACCOUNT_NAME '⟨storage account name⟩' +); +``` + +DuckDB also allows specifying a specific chain using the `CHAIN` keyword. This takes a semicolon-separated list (`a;b;c`) of providers that will be tried in order. 
For example: + +```sql +CREATE SECRET secret4 ( + TYPE AZURE, + PROVIDER CREDENTIAL_CHAIN, + CHAIN 'cli;env', + ACCOUNT_NAME '⟨storage account name⟩' +); +``` + +The possible values are the following: +[`cli`](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli); +[`managed_identity`](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview); +[`env`](https://github.com/Azure/azure-sdk-for-cpp/blob/azure-identity_1.6.0/sdk/identity/azure-identity/README.md#environment-variables); +[`default`](https://github.com/Azure/azure-sdk-for-cpp/blob/azure-identity_1.6.0/sdk/identity/azure-identity/README.md#defaultazurecredential); + +If no explicit `CHAIN` is provided, the default one will be [`default`](https://github.com/Azure/azure-sdk-for-cpp/blob/azure-identity_1.6.0/sdk/identity/azure-identity/README.md#defaultazurecredential) + +#### `SERVICE_PRINCIPAL` Provider + +The `SERVICE_PRINCIPAL` provider allows connecting using a [Azure Service Principal (SPN)](https://learn.microsoft.com/en-us/entra/architecture/service-accounts-principal). + +Either with a secret: + +```sql +CREATE SECRET azure_spn ( + TYPE AZURE, + PROVIDER SERVICE_PRINCIPAL, + TENANT_ID '⟨tenant id⟩', + CLIENT_ID '⟨client id⟩', + CLIENT_SECRET '⟨client secret⟩', + ACCOUNT_NAME '⟨storage account name⟩' +); +``` + +Or with a certificate: + +```sql +CREATE SECRET azure_spn_cert ( + TYPE AZURE, + PROVIDER SERVICE_PRINCIPAL, + TENANT_ID '⟨tenant id⟩', + CLIENT_ID '⟨client id⟩', + CLIENT_CERTIFICATE_PATH '⟨client cert path⟩', + ACCOUNT_NAME '⟨storage account name⟩' +); +``` + +#### Configuring a Proxy + +To configure proxy information when using secrets, you can add `HTTP_PROXY`, `PROXY_USER_NAME`, and `PROXY_PASSWORD` in the secret definition. For example: + +```sql +CREATE SECRET secret5 ( + TYPE AZURE, + CONNECTION_STRING '⟨value⟩', + HTTP_PROXY 'http://localhost:3128', + PROXY_USER_NAME 'john', + PROXY_PASSWORD 'doe' +); +``` + +> * When using secrets, the `HTTP_PROXY` environment variable will still be honored except if you provide an explicit value for it. +> * When using secrets, the `SET` variable of the *Authentication with variables* session will be ignored. +> * The Azure `CREDENTIAL_CHAIN` provider, the actual token is fetched at query time, not at the time of creating the secret. + +### Authentication with Variables (Deprecated) + +```sql +SET variable_name = variable_value; +``` + +Where `variable_name` can be one of the following: + +| Name | Description | Type | Default | +|:---|:---|:---|:---| +| `azure_storage_connection_string` | Azure connection string, used for authenticating and configuring Azure requests. | `STRING` | - | +| `azure_account_name` | Azure account name, when set, the extension will attempt to automatically detect credentials (not used if you pass the connection string). | `STRING` | - | +| `azure_endpoint` | Override the Azure endpoint for when the Azure credential providers are used. | `STRING` | `blob.core.windows.net` | +| `azure_credential_chain`| Ordered list of Azure credential providers, in string format separated by `;`. For example: `'cli;managed_identity;env'`. See the list of possible values in the [`CREDENTIAL_CHAIN` provider section](#credential_chain-provider). Not used if you pass the connection string. | `STRING` | - | +| `azure_http_proxy`| Proxy to use when login & performing request to Azure. | `STRING` | `HTTP_PROXY` environment variable (if set). | +| `azure_proxy_user_name`| Http proxy username if needed. 
| `STRING` | - | +| `azure_proxy_password`| Http proxy password if needed. | `STRING` | - | + +## Additional Information + +### Logging + +The Azure extension relies on the Azure SDK to connect to Azure Blob storage and supports printing the SDK logs to the console. +To control the log level, set the [`AZURE_LOG_LEVEL`](https://github.com/Azure/azure-sdk-for-cpp/blob/main/sdk/core/azure-core/README.md#sdk-log-messages) environment variable. + +For instance, verbose logs can be enabled as follows in Python: + +```python +import os +import duckdb + +os.environ["AZURE_LOG_LEVEL"] = "verbose" + +duckdb.sql("CREATE SECRET myaccount (TYPE AZURE, PROVIDER CREDENTIAL_CHAIN, SCOPE 'az://myaccount.blob.core.windows.net/')") +duckdb.sql("SELECT count(*) FROM 'az://myaccount.blob.core.windows.net/path/to/blob.parquet'") +``` + +### Difference between ADLS and Blob Storage + +Even though ADLS implements similar functionality as the Blob storage, there are some important performance benefits to using the ADLS endpoints for globbing, especially when using (complex) glob patterns. + +To demonstrate, lets look at an example of how the a glob is performed internally using respectively the Glob and ADLS endpoints. + +Using the following filesystem: + +```text +root +├── l_receipmonth=1997-10 +│ ├── l_shipmode=AIR +│ │ └── data_0.csv +│ ├── l_shipmode=SHIP +│ │ └── data_0.csv +│ └── l_shipmode=TRUCK +│ └── data_0.csv +├── l_receipmonth=1997-11 +│ ├── l_shipmode=AIR +│ │ └── data_0.csv +│ ├── l_shipmode=SHIP +│ │ └── data_0.csv +│ └── l_shipmode=TRUCK +│ └── data_0.csv +└── l_receipmonth=1997-12 + ├── l_shipmode=AIR + │ └── data_0.csv + ├── l_shipmode=SHIP + │ └── data_0.csv + └── l_shipmode=TRUCK + └── data_0.csv +``` + +The following query performed through the blob endpoint + +```sql +SELECT count(*) +FROM 'az://root/l_receipmonth=1997-*/l_shipmode=SHIP/*.csv'; +``` + +will perform the following steps: + +* List all the files with the prefix `root/l_receipmonth=1997-` + * `root/l_receipmonth=1997-10/l_shipmode=SHIP/data_0.csv` + * `root/l_receipmonth=1997-10/l_shipmode=AIR/data_0.csv` + * `root/l_receipmonth=1997-10/l_shipmode=TRUCK/data_0.csv` + * `root/l_receipmonth=1997-11/l_shipmode=SHIP/data_0.csv` + * `root/l_receipmonth=1997-11/l_shipmode=AIR/data_0.csv` + * `root/l_receipmonth=1997-11/l_shipmode=TRUCK/data_0.csv` + * `root/l_receipmonth=1997-12/l_shipmode=SHIP/data_0.csv` + * `root/l_receipmonth=1997-12/l_shipmode=AIR/data_0.csv` + * `root/l_receipmonth=1997-12/l_shipmode=TRUCK/data_0.csv` +* Filter the result with the requested pattern `root/l_receipmonth=1997-*/l_shipmode=SHIP/*.csv` + * `root/l_receipmonth=1997-10/l_shipmode=SHIP/data_0.csv` + * `root/l_receipmonth=1997-11/l_shipmode=SHIP/data_0.csv` + * `root/l_receipmonth=1997-12/l_shipmode=SHIP/data_0.csv` + +Meanwhile, the same query performed through the datalake endpoint, + +```sql +SELECT count(*) +FROM 'abfss://root/l_receipmonth=1997-*/l_shipmode=SHIP/*.csv'; +``` + +will perform the following steps: + +* List all directories in `root/` + * `root/l_receipmonth=1997-10` + * `root/l_receipmonth=1997-11` + * `root/l_receipmonth=1997-12` +* Filter and list subdirectories: `root/l_receipmonth=1997-10`, `root/l_receipmonth=1997-11`, `root/l_receipmonth=1997-12` + * `root/l_receipmonth=1997-10/l_shipmode=SHIP` + * `root/l_receipmonth=1997-10/l_shipmode=AIR` + * `root/l_receipmonth=1997-10/l_shipmode=TRUCK` + * `root/l_receipmonth=1997-11/l_shipmode=SHIP` + * `root/l_receipmonth=1997-11/l_shipmode=AIR` + * 
`root/l_receipmonth=1997-11/l_shipmode=TRUCK` + * `root/l_receipmonth=1997-12/l_shipmode=SHIP` + * `root/l_receipmonth=1997-12/l_shipmode=AIR` + * `root/l_receipmonth=1997-12/l_shipmode=TRUCK` +* Filter and list subdirectories: `root/l_receipmonth=1997-10/l_shipmode=SHIP`, `root/l_receipmonth=1997-11/l_shipmode=SHIP`, `root/l_receipmonth=1997-12/l_shipmode=SHIP` + * `root/l_receipmonth=1997-10/l_shipmode=SHIP/data_0.csv` + * `root/l_receipmonth=1997-11/l_shipmode=SHIP/data_0.csv` + * `root/l_receipmonth=1997-12/l_shipmode=SHIP/data_0.csv` + +As you can see because the Blob endpoint does not support the notion of directories, the filter can only be performed after the listing, whereas the ADLS endpoint will list files recursively. Especially with higher partition/directory counts, the performance difference can be very significant. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/community_extensions.md b/docs/archive/1.0/extensions/community_extensions.md new file mode 100644 index 00000000000..5fb54fb4433 --- /dev/null +++ b/docs/archive/1.0/extensions/community_extensions.md @@ -0,0 +1,71 @@ +--- +layout: docu +title: Community Extensions +--- + +DuckDB recently launched a [Community Extensions repository](https://github.com/duckdb/community-extensions). +For details, see the [announcement blog post]({% post_url 2024-07-05-community-extensions %}). + +## User Experience + +We are going to use the [`h3` extension](https://github.com/isaacbrodsky/h3-duckdb) as our example. +This extension implements [hierarchical hexagonal indexing](https://github.com/uber/h3) for geospatial data. + +Using the DuckDB Community Extensions repository, you can install and load the `h3` extension as follows: + +```sql +INSTALL h3 FROM community; +LOAD h3; +``` + +Then, you can instantly start using it. Note that the sample data is 500 MB: + +```sql +SELECT + h3_latlng_to_cell(pickup_latitude, pickup_longitude, 9) AS cell_id, + h3_cell_to_boundary_wkt(cell_id) AS boundary, + count() AS cnt +FROM read_parquet('https://blobs.duckdb.org/data/yellow_tripdata_2010-01.parquet') +GROUP BY cell_id +HAVING cnt > 10; +``` + +On load, the extension’s signature is checked, both to ensure platform and versions are compatible, and to verify that the source of the binary is the community extensions repository. Extensions are built, signed and distributed for Linux, macOS, Windows, and WebAssembly. This allows extensions to be available to any DuckDB client using version 1.0.0 and upcoming versions. + +The `h3` extension’s documentation is available at . + +## Developer Experience + +From the developer’s perspective, the Community Extensions repository performs the steps required for publishing extensions, including building the extensions for all relevant [platforms]({% link docs/archive/1.0/dev/building/supported_platforms.md %}), signing the extension binaries and serving them from the repository. + +For the [maintainer of `h3`](https://github.com/isaacbrodsky/), the publication process required performing the following steps: + +1. Sending a PR with a metadata file `description.yml` contains the description of the extension: + + ```yaml + extension: + name: h3 + description: Hierarchical hexagonal indexing for geospatial data + version: 1.0.0 + language: C++ + build: cmake + license: Apache-2.0 + maintainers: + - isaacbrodsky + + repo: + github: isaacbrodsky/h3-duckdb + ref: 3c8a5358e42ab8d11e0253c70f7cc7d37781b2ef + ``` + +2. The CI will build and test the extension. 
The checks performed by the CI are aligned with the [`extension-template` repository](https://github.com/duckdb/extension-template), so iterations can be done independently. + +3. Wait for approval from the DuckDB Community Extension repository’s maintainers and for the build process to complete. + +## Security Considerations + +See the [Securing Extensions page]({% link docs/archive/1.0/operations_manual/securing_duckdb/securing_extensions.md %}) for details. + +## List of Community Extensions + +See the [DuckDB Community Extensions repository site](https://community-extensions.duckdb.org/). \ No newline at end of file diff --git a/docs/archive/1.0/extensions/core_extensions.md b/docs/archive/1.0/extensions/core_extensions.md new file mode 100644 index 00000000000..ebe723478ac --- /dev/null +++ b/docs/archive/1.0/extensions/core_extensions.md @@ -0,0 +1,56 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/extensions/official_extensions +title: Core Extensions +--- + +## List of Core Extensions + +| Name | GitHub | Description | Autoloadable | Aliases | +|:-----------------------------|----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:--------------|:------------------------| +| [arrow]({% link docs/archive/1.0/extensions/arrow.md %}) | [GitHub](https://github.com/duckdb/arrow) | A zero-copy data integration between Apache Arrow and DuckDB | no | | +| [autocomplete]({% link docs/archive/1.0/extensions/autocomplete.md %}) | | Adds support for autocomplete in the shell | yes | | +| [aws]({% link docs/archive/1.0/extensions/aws.md %}) | [GitHub](https://github.com/duckdb/duckdb_aws) | Provides features that depend on the AWS SDK | yes | | +| [azure]({% link docs/archive/1.0/extensions/azure.md %}) | [GitHub](https://github.com/duckdb/duckdb_azure) | Adds a filesystem abstraction for Azure blob storage to DuckDB | yes | | +| [delta]({% link docs/archive/1.0/extensions/delta.md %}) | [GitHub](https://github.com/duckdb/duckdb_delta) | Adds support for Delta Lake | yes | | +| [excel]({% link docs/archive/1.0/extensions/excel.md %}) | [GitHub](https://github.com/duckdb/duckdb_excel) | Adds support for Excel-like format strings | yes | | +| [fts]({% link docs/archive/1.0/extensions/full_text_search.md %}) | | Adds support for Full-Text Search Indexes | yes | | +| [httpfs]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) | | Adds support for reading and writing files over an HTTP(S) or S3 connection | yes | http, https, s3 | +| [iceberg]({% link docs/archive/1.0/extensions/iceberg.md %}) | [GitHub](https://github.com/duckdb/duckdb_iceberg) | Adds support for Apache Iceberg | no | | +| [icu]({% link docs/archive/1.0/extensions/icu.md %}) | | Adds support for time zones and collations using the ICU library | yes | | +| [inet]({% link docs/archive/1.0/extensions/inet.md %}) | | Adds support for IP-related data types and functions | yes | | +| [jemalloc]({% link docs/archive/1.0/extensions/jemalloc.md %}) | | Overwrites system allocator with jemalloc | no | | +| [json]({% link docs/archive/1.0/extensions/json.md %}) | | Adds support for JSON operations | yes | | +| [mysql]({% link docs/archive/1.0/extensions/mysql.md %}) | [GitHub](https://github.com/duckdb/duckdb_mysql) | Adds support for reading from and writing to a MySQL database | no | | +| [parquet]({% link docs/archive/1.0/data/parquet/overview.md %}) | | Adds support for reading and writing Parquet files | 
(built-in) | | +| [postgres]({% link docs/archive/1.0/extensions/postgres.md %}) | [GitHub](https://github.com/duckdb/postgres_scanner) | Adds support for reading from and writing to a PostgreSQL database | yes | postgres_scanner | +| [spatial]({% link docs/archive/1.0/extensions/spatial.md %}) | [GitHub](https://github.com/duckdb/duckdb_spatial) | Geospatial extension that adds support for working with spatial data and functions | no | | +| [sqlite]({% link docs/archive/1.0/extensions/sqlite.md %}) | [GitHub](https://github.com/duckdb/sqlite_scanner) | Adds support for reading from and writing to SQLite database files | yes | sqlite_scanner, sqlite3 | +| [substrait]({% link docs/archive/1.0/extensions/substrait.md %}) | [GitHub](https://github.com/duckdb/substrait) | Adds support for the Substrait integration | no | | +| [tpcds]({% link docs/archive/1.0/extensions/tpcds.md %}) | | Adds TPC-DS data generation and query support | yes | | +| [tpch]({% link docs/archive/1.0/extensions/tpch.md %}) | | Adds TPC-H data generation and query support | yes | | +| [vss]({% link docs/archive/1.0/extensions/vss.md %}) | [GitHub](https://github.com/duckdb/duckdb_vss) | Adds support for vector similarity search queries | no | | + +## Default Extensions + +Different DuckDB clients ship a different set of extensions. +We summarize the main distributions in the table below. + +
+ +| Name | CLI (duckdb.org) | CLI (Homebrew) | Python | R | Java | Node.js | +|------|------|------|---|---|---|---|---| +| [autocomplete]({% link docs/archive/1.0/extensions/autocomplete.md %}) | yes | yes | | | | | +| [excel]({% link docs/archive/1.0/extensions/excel.md %}) | yes | | | | | | +| [fts]({% link docs/archive/1.0/extensions/full_text_search.md %}) | yes | | yes | | | | +| [httpfs]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) | | | yes | | | | +| [icu]({% link docs/archive/1.0/extensions/icu.md %}) | yes | yes | yes | | yes | yes | +| [json]({% link docs/archive/1.0/extensions/json.md %}) | yes | yes | yes | | yes | yes | +| [parquet]({% link docs/archive/1.0/data/parquet/overview.md %}) | yes | yes | yes | yes | yes | yes | +| [tpcds]({% link docs/archive/1.0/extensions/tpcds.md %}) | | | yes | | | | +| [tpch]({% link docs/archive/1.0/extensions/tpch.md %}) | yes | | yes | | | | + +The [jemalloc]({% link docs/archive/1.0/extensions/jemalloc.md %}) extension's availability is based on the operating system. +Starting with version 0.10.1, `jemalloc` is a built-in extension on Linux x86_64 (AMD64) distributions, while it will be optionally available on Linux ARM64 distributions and on macOS (via compiling from source). +On Windows, it is not available. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/delta.md b/docs/archive/1.0/extensions/delta.md new file mode 100644 index 00000000000..bfbbff62226 --- /dev/null +++ b/docs/archive/1.0/extensions/delta.md @@ -0,0 +1,87 @@ +--- +github_repository: https://github.com/duckdb/duckdb_delta +layout: docu +title: Delta Extension +--- + +The `delta` extension adds support for the [Delta Lake open-source storage format](https://delta.io/). It is built using the [Delta Kernel](https://github.com/delta-incubator/delta-kernel-rs). The extension offers **read support** for delta tables, both local and remote. + +For implementation details, see the [announcement blog post]({% post_url 2024-06-10-delta %}). + +> Warning The `delta` extension is currently experimental and is [only supported on given platforms](#supported-duckdb-versions-and-platforms). + +## Installing and Loading + +The `delta` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. 
+If you would like to install and load it manually, run: + +```sql +INSTALL delta; +LOAD delta; +``` + +## Usage + +To scan a local delta table, run: + +```sql +SELECT * +FROM delta_scan('file:///some/path/on/local/machine'); +``` + +To scan a delta table in an S3 bucket, run: + +```sql +SELECT * +FROM delta_scan('s3://some/delta/table'); +``` + +For authenticating to S3 buckets, DuckDB [Secrets]({% link docs/archive/1.0/configuration/secrets_manager.md %}) are supported: + +```sql +CREATE SECRET ( + TYPE S3, + PROVIDER CREDENTIAL_CHAIN +); +SELECT * +FROM delta_scan('s3://some/delta/table/with/auth'); +``` + +To scan public buckets on S3, you may need to pass the correct region by creating a secret containing the region of your public S3 bucket: + +```sql +CREATE SECRET ( + TYPE S3, + REGION 'my-region' +); +SELECT * +FROM delta_scan('s3://some/public/table/in/my-region'); +``` + +## Features + +While the `delta` extension is still experimental, many (scanning) features and optimizations are already supported: + +* multithreaded scans and Parquet metadata reading +* data skipping/filter pushdown + * skipping row-groups in file (based on Parquet metadata) + * skipping complete files (based on Delta partition information) +* projection pushdown +* scanning tables with deletion vectors +* all primitive types +* structs +* S3 support with secrets + +More optimizations are going to be released in the future. + +## Supported DuckDB Versions and Platforms + +The `delta` extension requires DuckDB version 0.10.3 or newer. + +The `delta` extension currently only supports the following platforms: + +* Linux AMD64 (x86_64 and ARM64): `linux_amd64`, `linux_amd64_gcc4`, and `linux_arm64` +* macOS Intel and Apple Silicon: `osx_amd64` and `osx_arm64` +* Windows AMD64: `windows_amd64` + +Support for the [other DuckDB platforms]({% link docs/archive/1.0/extensions/working_with_extensions.md %}#platforms) is work-in-progress. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/excel.md b/docs/archive/1.0/extensions/excel.md new file mode 100644 index 00000000000..bf29ea5fe8a --- /dev/null +++ b/docs/archive/1.0/extensions/excel.md @@ -0,0 +1,45 @@ +--- +github_repository: https://github.com/duckdb/duckdb_excel +layout: docu +title: Excel Extension +--- + +The `excel` extension, unlike what its name may suggest, does not provide support for reading Excel files. +Instead, provides a function that wraps the number formatting functionality of the [i18npool library](https://www.openoffice.org/l10n/i18n_framework/index.html), which formats numbers per Excel's formatting rules. + +Excel files can be currently handled through the [`spatial` extension]({% link docs/archive/1.0/extensions/spatial.md %}): see the [Excel Import]({% link docs/archive/1.0/guides/file_formats/excel_import.md %}) and [Excel Export]({% link docs/archive/1.0/guides/file_formats/excel_export.md %}) pages for instructions. + +## Installing and Loading + +The `excel` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. +If you would like to install and load it manually, run: + +```sql +INSTALL excel; +LOAD excel; +``` + +## Functions + +| Function | Description | +|:--|:---| +| `excel_text(number, format_string)`| Format the given `number` per the rules given in the `format_string`. | +| `text(number, format_string)` | Alias for `excel_text`. 
| + +## Examples + +```sql +SELECT excel_text(1_234_567.897, 'h:mm AM/PM') AS timestamp; +``` + +| timestamp | +|-----------| +| 9:31 PM | + +```sql +SELECT excel_text(1_234_567.897, 'h AM/PM') AS timestamp; +``` + +| timestamp | +|-----------| +| 9 PM | \ No newline at end of file diff --git a/docs/archive/1.0/extensions/full_text_search.md b/docs/archive/1.0/extensions/full_text_search.md new file mode 100644 index 00000000000..1b75193316a --- /dev/null +++ b/docs/archive/1.0/extensions/full_text_search.md @@ -0,0 +1,167 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/fts +layout: docu +title: Full-Text Search Extension +--- + +Full-Text Search is an extension to DuckDB that allows for search through strings, similar to [SQLite's FTS5 extension](https://www.sqlite.org/fts5.html). + +## Installing and Loading + +The `fts` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. +If you would like to install and load it manually, run: + +```sql +INSTALL fts; +LOAD fts; +``` + +## Usage + +The extension adds two `PRAGMA` statements to DuckDB: one to create, and one to drop an index. Additionally, a scalar macro `stem` is added, which is used internally by the extension. + +### `PRAGMA create_fts_index` + +```python +create_fts_index(input_table, input_id, *input_values, stemmer = 'porter', + stopwords = 'english', ignore = '(\\.|[^a-z])+', + strip_accents = 1, lower = 1, overwrite = 0) +``` + +`PRAGMA` that creates a FTS index for the specified table. + + + +| Name | Type | Description | +|:--|:--|:----------| +| `input_table` | `VARCHAR` | Qualified name of specified table, e.g., `'table_name'` or `'main.table_name'` | +| `input_id` | `VARCHAR` | Column name of document identifier, e.g., `'document_identifier'` | +| `input_values…` | `VARCHAR` | Column names of the text fields to be indexed (vararg), e.g., `'text_field_1'`, `'text_field_2'`, ..., `'text_field_N'`, or `'\*'` for all columns in input_table of type `VARCHAR` | +| `stemmer` | `VARCHAR` | The type of stemmer to be used. One of `'arabic'`, `'basque'`, `'catalan'`, `'danish'`, `'dutch'`, `'english'`, `'finnish'`, `'french'`, `'german'`, `'greek'`, `'hindi'`, `'hungarian'`, `'indonesian'`, `'irish'`, `'italian'`, `'lithuanian'`, `'nepali'`, `'norwegian'`, `'porter'`, `'portuguese'`, `'romanian'`, `'russian'`, `'serbian'`, `'spanish'`, `'swedish'`, `'tamil'`, `'turkish'`, or `'none'` if no stemming is to be used. Defaults to `'porter'` | +| `stopwords` | `VARCHAR` | Qualified name of table containing a single `VARCHAR` column containing the desired stopwords, or `'none'` if no stopwords are to be used. Defaults to `'english'` for a pre-defined list of 571 English stopwords | +| `ignore` | `VARCHAR` | Regular expression of patterns to be ignored. Defaults to `'(\\.|[^a-z])+'`, ignoring all escaped and non-alphabetic lowercase characters | +| `strip_accents` | `BOOLEAN` | Whether to remove accents (e.g., convert `á` to `a`). Defaults to `1` | +| `lower` | `BOOLEAN` | Whether to convert all text to lowercase. Defaults to `1` | +| `overwrite` | `BOOLEAN` | Whether to overwrite an existing index on a table. Defaults to `0` | + + + +This `PRAGMA` builds the index under a newly created schema. The schema will be named after the input table: if an index is created on table `'main.table_name'`, then the schema will be named `'fts_main_table_name'`. 
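+
+For example, under the naming scheme described above, an index built on a hypothetical `reviews` table places its retrieval macro in the `fts_main_reviews` schema. The table, column names, and search string below are illustrative; this is a minimal sketch following the signature documented above, not a prescribed setup:
+
+```sql
+PRAGMA create_fts_index(
+    'main.reviews', 'review_id', 'review_text',
+    stemmer = 'english', overwrite = 1
+);
+
+-- The generated macros live in the schema fts_main_reviews
+SELECT review_id, score
+FROM (
+    SELECT *, fts_main_reviews.match_bm25(review_id, 'fast shipping') AS score
+    FROM reviews
+) sq
+WHERE score IS NOT NULL
+ORDER BY score DESC;
+```
+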
+ +### `PRAGMA drop_fts_index` + +```python +drop_fts_index(input_table) +``` + +Drops a FTS index for the specified table. + +
+ +| Name | Type | Description | +|:--|:--|:-----------| +| `input_table` | `VARCHAR` | Qualified name of input table, e.g., `'table_name'` or `'main.table_name'` | + +### `match_bm25` Function + +```python +match_bm25(input_id, query_string, fields := NULL, k := 1.2, b := 0.75, conjunctive := 0) +``` + +When an index is built, this retrieval macro is created that can be used to search the index. + +| Name | Type | Description | +|:--|:--|:----------| +| `input_id` | `VARCHAR` | Column name of document identifier, e.g., `'document_identifier'` | +| `query_string` | `VARCHAR` | The string to search the index for | +| `fields` | `VARCHAR` | Comma-separarated list of fields to search in, e.g., `'text_field_2, text_field_N'`. Defaults to `NULL` to search all indexed fields | +| `k` | `DOUBLE` | Parameter _k1_ in the Okapi BM25 retrieval model. Defaults to `1.2` | +| `b` | `DOUBLE` | Parameter _b_ in the Okapi BM25 retrieval model. Defaults to `0.75` | +| `conjunctive` | `BOOLEAN` | Whether to make the query conjunctive i.e., all terms in the query string must be present in order for a document to be retrieved | + +### `stem` Function + +```python +stem(input_string, stemmer) +``` + +Reduces words to their base. Used internally by the extension. + +| Name | Type | Description | +|:--|:--|:----------| +| `input_string` | `VARCHAR` | The column or constant to be stemmed. | +| `stemmer` | `VARCHAR` | The type of stemmer to be used. One of `'arabic'`, `'basque'`, `'catalan'`, `'danish'`, `'dutch'`, `'english'`, `'finnish'`, `'french'`, `'german'`, `'greek'`, `'hindi'`, `'hungarian'`, `'indonesian'`, `'irish'`, `'italian'`, `'lithuanian'`, `'nepali'`, `'norwegian'`, `'porter'`, `'portuguese'`, `'romanian'`, `'russian'`, `'serbian'`, `'spanish'`, `'swedish'`, `'tamil'`, `'turkish'`, or `'none'` if no stemming is to be used. | + +## Example Usage + +Create a table and fill it with text data: + +```sql +CREATE TABLE documents ( + document_identifier VARCHAR, + text_content VARCHAR, + author VARCHAR, + doc_version INTEGER +); +INSERT INTO documents + VALUES ('doc1', + 'The mallard is a dabbling duck that breeds throughout the temperate.', + 'Hannes Mühleisen', + 3), + ('doc2', + 'The cat is a domestic species of small carnivorous mammal.', + 'Laurens Kuiper', + 2 + ); +``` + +Build the index, and make both the `text_content` and `author` columns searchable. + +```sql +PRAGMA create_fts_index( + 'documents', 'document_identifier', 'text_content', 'author' +); +``` + +Search the `author` field index for documents that are authored by `Muhleisen`. This retrieves `doc1`: + +```sql +SELECT document_identifier, text_content, score +FROM ( + SELECT *, fts_main_documents.match_bm25( + document_identifier, + 'Muhleisen', + fields := 'author' + ) AS score + FROM documents +) sq +WHERE score IS NOT NULL + AND doc_version > 2 +ORDER BY score DESC; +``` + +| document_identifier | text_content | score | +|---------------------|----------------------------------------------------------------------|------:| +| doc1 | The mallard is a dabbling duck that breeds throughout the temperate. | 0.0 | + +Search for documents about `small cats`. 
This retrieves `doc2`: + +```sql +SELECT document_identifier, text_content, score +FROM ( + SELECT *, fts_main_documents.match_bm25( + document_identifier, + 'small cats' + ) AS score + FROM documents +) sq +WHERE score IS NOT NULL +ORDER BY score DESC; +``` + +| document_identifier | text_content | score | +|---------------------|------------------------------------------------------------|------:| +| doc2 | The cat is a domestic species of small carnivorous mammal. | 0.0 | + +> Warning The FTS index will not update automatically when input table changes. +> A workaround of this limitation can be recreating the index to refresh. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/httpfs/https.md b/docs/archive/1.0/extensions/httpfs/https.md new file mode 100644 index 00000000000..c90a78e2792 --- /dev/null +++ b/docs/archive/1.0/extensions/httpfs/https.md @@ -0,0 +1,52 @@ +--- +layout: docu +title: HTTP(S) Support +--- + +With the `httpfs` extension, it is possible to directly query files over the HTTP(S) protocol. This works for all files supported by DuckDB or its various extensions, and provides read-only access. + +```sql +SELECT * +FROM 'https://domain.tld/file.extension'; +``` + +## Partial Reading + +For CSV files, files will be downloaded entirely in most cases, due to the row-based nature of the format. +For Parquet files, DuckDB supports [partial reading]({% link docs/archive/1.0/data/parquet/overview.md %}#partial-reading), i.e., it can use a combination of the Parquet metadata and [HTTP range requests](https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests) to only download the parts of the file that are actually required by the query. For example, the following query will only read the Parquet metadata and the data for the `column_a` column: + +```sql +SELECT column_a +FROM 'https://domain.tld/file.parquet'; +``` + +In some cases, no actual data needs to be read at all as they only require reading the metadata: + +```sql +SELECT count(*) +FROM 'https://domain.tld/file.parquet'; +``` + +## Scanning Multiple Files + +Scanning multiple files over HTTP(S) is also supported: + +```sql +SELECT * +FROM read_parquet([ + 'https://domain.tld/file1.parquet', + 'https://domain.tld/file2.parquet' +]); +``` + +## Using a Custom Certificate File + +> This feature is currently only available in the nightly build. It will be [released]({% link docs/archive/1.0/dev/release_calendar.md %}) in version 0.10.1. + +To use the `httpfs` extension with a custom certificate file, set the following [configuration options]({% link docs/archive/1.0/configuration/pragmas.md %}) prior to loading the extension: + +```sql +LOAD httpfs; +SET ca_cert_file = '⟨certificate_file⟩'; +SET enable_server_cert_verification = true; +``` \ No newline at end of file diff --git a/docs/archive/1.0/extensions/httpfs/hugging_face.md b/docs/archive/1.0/extensions/httpfs/hugging_face.md new file mode 100644 index 00000000000..d438e2760d1 --- /dev/null +++ b/docs/archive/1.0/extensions/httpfs/hugging_face.md @@ -0,0 +1,150 @@ +--- +layout: docu +title: Hugging Face Support +--- + +The `httpfs` extension introduces support for the `hf://` protocol to access data sets hosted in [Hugging Face](https://huggingface.co/) repositories. +See the [announcement blog post]({% post_url 2024-05-29-access-150k-plus-datasets-from-hugging-face-with-duckdb %}) for details. 
+ +## Usage + +Hugging Face repositories can be queried using the following URL pattern: + +```text +hf://datasets/⟨my_username⟩/⟨my_dataset⟩/⟨path_to_file⟩ +``` + +For example, to read a CSV file, you can use the following query: + +```sql +SELECT * +FROM 'hf://datasets/datasets-examples/doc-formats-csv-1/data.csv'; +``` + +Where: + +* `datasets-examples` is the name of the user/organization +* `doc-formats-csv-1` is the name of the dataset repository +* `data.csv` is the file path in the repository + +The result of the query is: + +| kind | sound | +|---------|-------| +| dog | woof | +| cat | meow | +| pokemon | pika | +| human | hello | + +To read a JSONL file, you can run: + +```sql +SELECT * +FROM 'hf://datasets/datasets-examples/doc-formats-jsonl-1/data.jsonl'; +``` + +Finally, for reading a Parquet file, use the following query: + +```sql +SELECT * +FROM 'hf://datasets/datasets-examples/doc-formats-parquet-1/data/train-00000-of-00001.parquet'; +``` + +Each of these commands reads the data from the specified file format and displays it in a structured tabular format. Choose the appropriate command based on the file format you are working with. + +## Creating a local table + +To avoid accessing the remote endpoint for every query, you can save the data in a DuckDB table by running a [`CREATE TABLE ... AS` command]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas). For example: + +```sql +CREATE TABLE data AS + SELECT * + FROM 'hf://datasets/datasets-examples/doc-formats-csv-1/data.csv'; +``` + +Then, simply query the `data` table as follows: + +```sql +SELECT * +FROM data; +``` + +## Multiple files + +To query all files under a specific directory, you can use a [glob pattern]({% link docs/archive/1.0/data/multiple_files/overview.md %}#multi-file-reads-and-globs). For example: + +```sql +SELECT count(*) AS count +FROM 'hf://datasets/cais/mmlu/astronomy/*.parquet'; +``` + +| count | +|------:| +| 173 | + +By using glob patterns, you can efficiently handle large datasets and perform comprehensive queries across multiple files, simplifying your data inspections and processing tasks. +Here, you can see how you can look for questions that contain the word “planet” in astronomy: + +```sql +SELECT count(*) AS count +FROM 'hf://datasets/cais/mmlu/astronomy/*.parquet' +WHERE question LIKE '%planet%'; +``` + +| count | +|------:| +| 21 | + +## Versioning and revisions + +In Hugging Face repositories, dataset versions or revisions are different dataset updates. Each version is a snapshot at a specific time, allowing you to track changes and improvements. In git terms, it can be understood as a branch or specific commit. + +You can query different dataset versions/revisions by using the following URL: + +```text +hf://datasets/⟨my-username⟩/⟨my-dataset⟩@⟨my_branch⟩/⟨path_to_file⟩ +``` + +For example: + +```sql +SELECT * +FROM 'hf://datasets/datasets-examples/doc-formats-csv-1@~parquet/**/*.parquet'; +``` + +| kind | sound | +|---------|-------| +| dog | woof | +| cat | meow | +| pokemon | pika | +| human | hello | + +The previous query will read all parquet files under the `~parquet` revision. This is a special branch where Hugging Face automatically generates the Parquet files of every dataset to enable efficient scanning. + +## Authentication + +Configure your Hugging Face Token in the DuckDB Secrets Manager to access private or gated datasets. 
+First, visit [Hugging Face Settings – Tokens](https://huggingface.co/settings/tokens) to obtain your access token. +Second, set it in your DuckDB session using DuckDB’s [Secrets Manager]({% link docs/archive/1.0/configuration/secrets_manager.md %}). DuckDB supports two providers for managing secrets: + +### `CONFIG` + +The user must pass all configuration information into the `CREATE SECRET` statement. To create a secret using the `CONFIG` provider, use the following command: + +```sql +CREATE SECRET hf_token ( + TYPE HUGGINGFACE, + TOKEN 'your_hf_token' +); +``` + +### `CREDENTIAL_CHAIN` + +Automatically tries to fetch credentials. For the Hugging Face token, it will try to get it from `~/.cache/huggingface/token`. To create a secret using the `CREDENTIAL_CHAIN` provider, use the following command: + +```sql +CREATE SECRET hf_token ( + TYPE HUGGINGFACE, + PROVIDER CREDENTIAL_CHAIN +); +``` \ No newline at end of file diff --git a/docs/archive/1.0/extensions/httpfs/overview.md b/docs/archive/1.0/extensions/httpfs/overview.md new file mode 100644 index 00000000000..8ac53cf624f --- /dev/null +++ b/docs/archive/1.0/extensions/httpfs/overview.md @@ -0,0 +1,30 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/httpfs +layout: docu +redirect_from: +- /docs/archive/1.0/extensions/httpfs +- /docs/archive/1.0/extensions/httpfs/ +title: httpfs Extension for HTTP and S3 Support +--- + +The `httpfs` extension is an autoloadable extension implementing a file system that allows reading remote/writing remote files. +For plain HTTP(S), only file reading is supported. For object storage using the S3 API, the `httpfs` extension supports reading/writing/[globbing]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#globbing) files. + +## Installation and Loading + +The `httpfs` extension will be, by default, autoloaded on first use of any functionality exposed by this extension. + +To manually install and load the `httpfs` extension, run: + +```sql +INSTALL httpfs; +LOAD httpfs; +``` + +## HTTP(S) + +The `httpfs` extension supports connecting to [HTTP(S) endpoints]({% link docs/archive/1.0/extensions/httpfs/https.md %}). + +## S3 API + +The `httpfs` extension supports connecting to [S3 API endpoints]({% link docs/archive/1.0/extensions/httpfs/s3api.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/extensions/httpfs/s3api.md b/docs/archive/1.0/extensions/httpfs/s3api.md new file mode 100644 index 00000000000..4d1c4a89ded --- /dev/null +++ b/docs/archive/1.0/extensions/httpfs/s3api.md @@ -0,0 +1,256 @@ +--- +layout: docu +title: S3 API Support +--- + +The `httpfs` extension supports reading/writing/[globbing](#globbing) files on object storage servers using the S3 API. S3 offers a standard API to read and write to remote files (while regular http servers, predating S3, do not offer a common write API). DuckDB conforms to the S3 API, that is now common among industry storage providers. + +## Platforms + +The `httpfs` filesystem is tested with [AWS S3](https://aws.amazon.com/s3/), [Minio](https://min.io/), [Google Cloud](https://cloud.google.com/storage/docs/interoperability), and [lakeFS](https://docs.lakefs.io/integrations/duckdb.html). Other services that implement the S3 API (such as [Cloudflare R2](https://www.cloudflare.com/en-gb/developer-platform/r2/)) should also work, but not all features may be supported. + +The following table shows which parts of the S3 API are required for each `httpfs` feature. + +
+ +| Feature | Required S3 API features | +|:---|:---| +| Public file reads | HTTP Range requests | +| Private file reads | Secret key or session token authentication | +| File glob | [ListObjectV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html) | +| File writes | [Multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html) | + +## Configuration and Authentication + +The preferred way to configure and authenticate to S3 endpoints is to use [secrets]({% link docs/archive/1.0/sql/statements/create_secret.md %}). Multiple secret providers are available. + +> Deprecated Prior to version 0.10.0, DuckDB did not have a [Secrets manager]({% link docs/archive/1.0/sql/statements/create_secret.md %}). Hence, the configuration of and authentication to S3 endpoints was handled via variables. See the [legacy authentication scheme for the S3 API]({% link docs/archive/1.0/extensions/httpfs/s3api_legacy_authentication.md %}). + +### `CONFIG` Provider + +The default provider, `CONFIG` (i.e., user-configured), allows access to the S3 bucket by manually providing a key. For example: + +```sql +CREATE SECRET secret1 ( + TYPE S3, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', + REGION 'us-east-1' +); +``` + +> Tip If you get an IO Error (`Connection error for HTTP HEAD`), configure the endpoint explicitly via `ENDPOINT 's3.⟨your-region⟩.amazonaws.com'`. + +Now, to query using the above secret, simply query any `s3://` prefixed file: + +```sql +SELECT * +FROM 's3://my-bucket/file.parquet'; +``` + +### `CREDENTIAL_CHAIN` Provider + +The `CREDENTIAL_CHAIN` provider allows automatically fetching credentials using mechanisms provided by the AWS SDK. For example, to use the AWS SDK default provider: + +```sql +CREATE SECRET secret2 ( + TYPE S3, + PROVIDER CREDENTIAL_CHAIN +); +``` + +Again, to query a file using the above secret, simply query any `s3://` prefixed file. + +DuckDB also allows specifying a specific chain using the `CHAIN` keyword. This takes a semicolon-separated list (`a;b;c`) of providers that will be tried in order. For example: + +```sql +CREATE SECRET secret3 ( + TYPE S3, + PROVIDER CREDENTIAL_CHAIN, + CHAIN 'env;config' +); +``` + +The possible values for `CHAIN` are the following: + +* [`config`](https://sdk.amazonaws.com/cpp/api/LATEST/aws-cpp-sdk-core/html/class_aws_1_1_auth_1_1_profile_config_file_a_w_s_credentials_provider.html) +* [`sts`](https://sdk.amazonaws.com/cpp/api/LATEST/aws-cpp-sdk-core/html/class_aws_1_1_auth_1_1_s_t_s_assume_role_web_identity_credentials_provider.html) +* [`sso`](https://sdk.amazonaws.com/cpp/api/LATEST/aws-cpp-sdk-core/html/class_aws_1_1_auth_1_1_s_s_o_credentials_provider.html) +* [`env`](https://sdk.amazonaws.com/cpp/api/LATEST/aws-cpp-sdk-core/html/class_aws_1_1_auth_1_1_environment_a_w_s_credentials_provider.html) +* [`instance`](https://sdk.amazonaws.com/cpp/api/LATEST/aws-cpp-sdk-core/html/class_aws_1_1_auth_1_1_instance_profile_credentials_provider.html) +* [`process`](https://sdk.amazonaws.com/cpp/api/LATEST/aws-cpp-sdk-core/html/class_aws_1_1_auth_1_1_process_credentials_provider.html) + +The `CREDENTIAL_CHAIN` provider also allows overriding the automatically fetched config. 
For example, to automatically load credentials, and then override the region, run: + +```sql +CREATE SECRET secret4 ( + TYPE S3, + PROVIDER CREDENTIAL_CHAIN, + CHAIN 'config', + REGION 'eu-west-1' +); +``` + +### Overview of S3 Secret Parameters + +Below is a complete list of the supported parameters that can be used for both the `CONFIG` and `CREDENTIAL_CHAIN` providers: + +| Name | Description | Secret | Type | Default | +|:------------------------------|:--------------------------------------------------------------------------------------|:------------------|:----------|:--------------------------------------------| +| `KEY_ID` | The ID of the key to use | `S3`, `GCS`, `R2` | `STRING` | - | +| `SECRET` | The secret of the key to use | `S3`, `GCS`, `R2` | `STRING` | - | +| `REGION` | The region for which to authenticate (should match the region of the bucket to query) | `S3`, `GCS`, `R2` | `STRING` | `us-east-1` | +| `SESSION_TOKEN` | Optionally, a session token can be passed to use temporary credentials | `S3`, `GCS`, `R2` | `STRING` | - | +| `ENDPOINT` | Specify a custom S3 endpoint | `S3`, `GCS`, `R2` | `STRING` | `s3.amazonaws.com` for `S3`, | +| `URL_STYLE` | Either `vhost` or `path` | `S3`, `GCS`, `R2` | `STRING` | `vhost` for `S3`, `path` for `R2` and `GCS` | +| `USE_SSL` | Whether to use HTTPS or HTTP | `S3`, `GCS`, `R2` | `BOOLEAN` | `true` | +| `URL_COMPATIBILITY_MODE` | Can help when urls contain problematic characters. | `S3`, `GCS`, `R2` | `BOOLEAN` | `true` | +| `ACCOUNT_ID` | The R2 account ID to use for generating the endpoint url | `R2` | `STRING` | - | + +### Platform-Specific Secret Types + +#### R2 Secrets + +While [Cloudflare R2](https://www.cloudflare.com/developer-platform/r2) uses the regular S3 API, DuckDB has a special Secret type, `R2`, to make configuring it a bit simpler: + +```sql +CREATE SECRET secret5 ( + TYPE R2, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', + ACCOUNT_ID 'my_account_id' +); +``` + +Note the addition of the `ACCOUNT_ID` which is used to generate to correct endpoint url for you. Also note that for `R2` Secrets can also use both the `CONFIG` and `CREDENTIAL_CHAIN` providers. Finally, `R2` secrets are only available when using urls starting with `r2://`, for example: + +```sql +SELECT * +FROM read_parquet('r2://some/file/that/uses/r2/secret/file.parquet'); +``` + +#### GCS Secrets + +While [Google Cloud Storage](https://cloud.google.com/storage) is accessed by DuckDB using the S3 API, DuckDB has a special Secret type, `GCS`, to make configuring it a bit simpler: + +```sql +CREATE SECRET secret6 ( + TYPE GCS, + KEY_ID 'my_key', + SECRET 'my_secret' +); +``` + +Note that the above secret, will automatically have the correct Google Cloud Storage endpoint configured. Also note that for `GCS` Secrets can also use both the `CONFIG` and `CREDENTIAL_CHAIN` providers. Finally, `GCS` secrets are only available when using urls starting with `gcs://` or `gs://`, for example: + +```sql +SELECT * +FROM read_parquet('gcs://some/file/that/uses/gcs/secret/file.parquet'); +``` + +## Reading + +Reading files from S3 is now as simple as: + +```sql +SELECT * +FROM 's3://bucket/file.extension'; +``` + +### Partial Reading + +The `httpfs` extension supports [partial reading]({% link docs/archive/1.0/extensions/httpfs/https.md %}#partial-reading) from S3 buckets. 
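+
+For example, a projection query over a Parquet file only needs to fetch the file footer and the byte ranges of the projected columns, not the entire file (the bucket and column names below are hypothetical):
+
+```sql
+-- Only the Parquet metadata and the two selected columns are read from S3
+SELECT l_orderkey, l_extendedprice
+FROM read_parquet('s3://bucket/lineitem.parquet');
+```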
+
+### Reading Multiple Files
+
+Reading multiple files is also possible, for example:
+
+```sql
+SELECT *
+FROM read_parquet([
+    's3://bucket/file1.parquet',
+    's3://bucket/file2.parquet'
+]);
+```
+
+### Globbing
+
+File [globbing]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#globbing) is implemented using the ListObjectsV2 API call and allows the use of filesystem-like glob patterns to match multiple files, for example:
+
+```sql
+SELECT *
+FROM read_parquet('s3://bucket/*.parquet');
+```
+
+This query matches all files in the root of the bucket with the [Parquet extension]({% link docs/archive/1.0/data/parquet/overview.md %}).
+
+Several features for matching are supported, such as `*` to match any number of characters, `?` to match any single character, or `[0-9]` to match a single character in a range of characters:
+
+```sql
+SELECT count(*) FROM read_parquet('s3://bucket/folder*/100?/t[0-9].parquet');
+```
+
+A useful feature when using globs is the `filename` option, which adds a column named `filename` that encodes the file that a particular row originated from:
+
+```sql
+SELECT *
+FROM read_parquet('s3://bucket/*.parquet', filename = true);
+```
+
+This could, for example, result in:
+
+ +| column_a | column_b | filename | +|:---|:---|:---| +| 1 | examplevalue1 | s3://bucket/file1.parquet | +| 2 | examplevalue1 | s3://bucket/file2.parquet | + +### Hive Partitioning + +DuckDB also offers support for the [Hive partitioning scheme]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}), which is available when using HTTP(S) and S3 endpoints. + +## Writing + +Writing to S3 uses the multipart upload API. This allows DuckDB to robustly upload files at high speed. Writing to S3 works for both CSV and Parquet: + +```sql +COPY table_name TO 's3://bucket/file.extension'; +``` + +Partitioned copy to S3 also works: + +```sql +COPY table TO 's3://my-bucket/partitioned' ( + FORMAT PARQUET, + PARTITION_BY (part_col_a, part_col_b) +); +``` + +An automatic check is performed for existing files/directories, which is currently quite conservative (and on S3 will add a bit of latency). To disable this check and force writing, an `OVERWRITE_OR_IGNORE` flag is added: + +```sql +COPY table TO 's3://my-bucket/partitioned' ( + FORMAT PARQUET, + PARTITION_BY (part_col_a, part_col_b), + OVERWRITE_OR_IGNORE true +); +``` + +The naming scheme of the written files looks like this: + +```text +s3://my-bucket/partitioned/part_col_a=⟨val⟩/part_col_b=⟨val⟩/data_⟨thread_number⟩.parquet +``` + +### Configuration + +Some additional configuration options exist for the S3 upload, though the default values should suffice for most use cases. + +
+ +| Name | Description | +|:---|:---| +| `s3_uploader_max_parts_per_file` | used for part size calculation, see [AWS docs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html) | +| `s3_uploader_max_filesize` | used for part size calculation, see [AWS docs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html) | +| `s3_uploader_thread_limit` | maximum number of uploader threads | \ No newline at end of file diff --git a/docs/archive/1.0/extensions/httpfs/s3api_legacy_authentication.md b/docs/archive/1.0/extensions/httpfs/s3api_legacy_authentication.md new file mode 100644 index 00000000000..cf0ce691fd5 --- /dev/null +++ b/docs/archive/1.0/extensions/httpfs/s3api_legacy_authentication.md @@ -0,0 +1,86 @@ +--- +layout: docu +title: Legacy Authentication Scheme for S3 API +--- + +Prior to version 0.10.0, DuckDB did not have a [Secrets manager]({% link docs/archive/1.0/sql/statements/create_secret.md %}). Hence, the configuration of and authentication to S3 endpoints was handled via variables. This page documents the legacy authentication scheme for the S3 API. + +> The recommended way to configuration and authentication of S3 endpoints is to use [secrets]({% link docs/archive/1.0/extensions/httpfs/s3api.md %}#configuration-and-authentication). + +## Legacy Authentication Scheme + +To be able to read or write from S3, the correct region should be set: + +```sql +SET s3_region = 'us-east-1'; +``` + +Optionally, the endpoint can be configured in case a non-AWS object storage server is used: + +```sql +SET s3_endpoint = '⟨domain⟩.⟨tld⟩:⟨port⟩'; +``` + +If the endpoint is not SSL-enabled then run: + +```sql +SET s3_use_ssl = false; +``` + +Switching between [path-style](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#path-style-access) and [vhost-style](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#virtual-hosted-style-access) URLs is possible using: + +```sql +SET s3_url_style = 'path'; +``` + +However, note that this may also require updating the endpoint. For example for AWS S3 it is required to change the endpoint to `s3.⟨region⟩.amazonaws.com`. + +After configuring the correct endpoint and region, public files can be read. To also read private files, authentication credentials can be added: + +```sql +SET s3_access_key_id = '⟨AWS access key id⟩'; +SET s3_secret_access_key = '⟨AWS secret access key⟩'; +``` + +Alternatively, session tokens are also supported and can be used instead: + +```sql +SET s3_session_token = '⟨AWS session token⟩'; +``` + +The [`aws` extension]({% link docs/archive/1.0/extensions/aws.md %}) allows for loading AWS credentials. + +## Per-Request Configuration + +Aside from the global S3 configuration described above, specific configuration values can be used on a per-request basis. This allows for use of multiple sets of credentials, regions, etc. These are used by including them on the S3 URI as query parameters. All the individual configuration values listed above can be set as query parameters. 
For instance: + +```sql +SELECT * +FROM 's3://bucket/file.parquet?s3_access_key_id=accessKey&s3_secret_access_key=secretKey'; +``` + +Multiple configurations per query are also allowed: + +```sql +SELECT * +FROM 's3://bucket/file.parquet?s3_region=region&s3_session_token=session_token' t1 +INNER JOIN 's3://bucket/file.csv?s3_access_key_id=accessKey&s3_secret_access_key=secretKey' t2; +``` + +## Configuration + +Some additional configuration options exist for the S3 upload, though the default values should suffice for most use cases. + +Additionally, most of the configuration options can be set via environment variables: + +
+ +| DuckDB setting | Environment variable | Note | +|:-----------------------|:------------------------|:-----------------------------------------| +| `s3_region` | `AWS_REGION` | Takes priority over `AWS_DEFAULT_REGION` | +| `s3_region` | `AWS_DEFAULT_REGION` | | +| `s3_access_key_id` | `AWS_ACCESS_KEY_ID` | | +| `s3_secret_access_key` | `AWS_SECRET_ACCESS_KEY` | | +| `s3_session_token` | `AWS_SESSION_TOKEN` | | +| `s3_endpoint` | `DUCKDB_S3_ENDPOINT` | | +| `s3_use_ssl` | `DUCKDB_S3_USE_SSL` | | \ No newline at end of file diff --git a/docs/archive/1.0/extensions/iceberg.md b/docs/archive/1.0/extensions/iceberg.md new file mode 100644 index 00000000000..d53829b7584 --- /dev/null +++ b/docs/archive/1.0/extensions/iceberg.md @@ -0,0 +1,75 @@ +--- +github_repository: https://github.com/duckdb/duckdb_iceberg +layout: docu +title: Iceberg Extension +--- + +The `iceberg` extension is a loadable extension that implements support for the [Apache Iceberg format](https://iceberg.apache.org/). + +## Installing and Loading + +To install and load the `iceberg` extension, run: + +```sql +INSTALL iceberg; +LOAD iceberg; +``` + +## Usage + +To test the examples, download the [`iceberg_data.zip`](/data/iceberg_data.zip) file and unzip it. + +### Querying Individual Tables + +```sql +SELECT count(*) +FROM iceberg_scan('data/iceberg/lineitem_iceberg', allow_moved_paths = true); +``` + +| count_star() | +|--------------| +| 51793 | + +> The `allow_moved_paths` option ensures that some path resolution is performed, which allows scanning Iceberg tables that are moved. + +You can also address specify the current manifest directly in the query, this may be resolved from the catalog prior to the query, in this example the manifest version is a UUID. + +```sql +SELECT count(*) +FROM iceberg_scan('data/iceberg/lineitem_iceberg/metadata/02701-1e474dc7-4723-4f8d-a8b3-b5f0454eb7ce.metadata.json'); +``` + +This extension can be paired with the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) to access Iceberg tables in object stores such as S3. 
+ +```sql +SELECT count(*) +FROM iceberg_scan('s3://bucketname/lineitem_iceberg/metadata/02701-1e474dc7-4723-4f8d-a8b3-b5f0454eb7ce.metadata.json', allow_moved_paths = true); +``` + +### Access Iceberg Metadata + +```sql +SELECT * +FROM iceberg_metadata('data/iceberg/lineitem_iceberg', allow_moved_paths = true); +``` + +| manifest_path | manifest_sequence_number | manifest_content | status | content | file_path | file_format | record_count | +|------------------------------------------------------------------------|--------------------------|------------------|---------|----------|------------------------------------------------------------------------------------|-------------|--------------| +| lineitem_iceberg/metadata/10eaca8a-1e1c-421e-ad6d-b232e5ee23d3-m1.avro | 2 | DATA | ADDED | EXISTING | lineitem_iceberg/data/00041-414-f3c73457-bbd6-4b92-9c15-17b241171b16-00001.parquet | PARQUET | 51793 | +| lineitem_iceberg/metadata/10eaca8a-1e1c-421e-ad6d-b232e5ee23d3-m0.avro | 2 | DATA | DELETED | EXISTING | lineitem_iceberg/data/00000-411-0792dcfe-4e25-4ca3-8ada-175286069a47-00001.parquet | PARQUET | 60175 | + +### Visualizing Snapshots + +```sql +SELECT * +FROM iceberg_snapshots('data/iceberg/lineitem_iceberg'); +``` + +| sequence_number | snapshot_id | timestamp_ms | manifest_list | +|-----------------|---------------------|-------------------------|------------------------------------------------------------------------------------------------| +| 1 | 3776207205136740581 | 2023-02-15 15:07:54.504 | lineitem_iceberg/metadata/snap-3776207205136740581-1-cf3d0be5-cf70-453d-ad8f-48fdc412e608.avro | +| 2 | 7635660646343998149 | 2023-02-15 15:08:14.73 | lineitem_iceberg/metadata/snap-7635660646343998149-1-10eaca8a-1e1c-421e-ad6d-b232e5ee23d3.avro | + +## Limitations + +Writing (i.e., exporting to) Iceberg files is currently not supported. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/icu.md b/docs/archive/1.0/extensions/icu.md new file mode 100644 index 00000000000..fd18b8010a3 --- /dev/null +++ b/docs/archive/1.0/extensions/icu.md @@ -0,0 +1,24 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/icu +layout: docu +title: ICU Extension +--- + +The `icu` extension contains an easy-to-use version of the collation/timezone part of the [ICU library](https://github.com/unicode-org/icu). + +## Installing and Loading + +The `icu` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. 
+If you would like to install and load it manually, run: + +```sql +INSTALL icu; +LOAD icu; +``` + +## Features + +The `icu` extension introduces the following features: + +* [region-dependent collations]({% link docs/archive/1.0/sql/expressions/collations.md %}) +* [time zones]({% link docs/archive/1.0/sql/data_types/timezones.md %}), used for [timestamp data types]({% link docs/archive/1.0/sql/data_types/timestamp.md %}) and [timestamp functions]({% link docs/archive/1.0/sql/functions/timestamptz.md %}) \ No newline at end of file diff --git a/docs/archive/1.0/extensions/inet.md b/docs/archive/1.0/extensions/inet.md new file mode 100644 index 00000000000..149439c711b --- /dev/null +++ b/docs/archive/1.0/extensions/inet.md @@ -0,0 +1,86 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/inet +layout: docu +title: inet Extension +--- + +The `inet` extension defines the `INET` data type for storing [IPv4](https://en.wikipedia.org/wiki/Internet_Protocol_version_4) and [IPv6](https://en.wikipedia.org/wiki/IPv6) Internet addresses. It supports the [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation) for subnet masks (e.g., `198.51.100.0/22`, `2001:db8:3c4d::/48`). + +## Installing and Loading + +The `inet` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. +If you would like to install and load it manually, run: + +```sql +INSTALL inet; +LOAD inet; +``` + +## Examples + +```sql +SELECT '127.0.0.1'::INET AS ipv4, '2001:db8:3c4d::/48'::INET AS ipv6; +``` + +| ipv4 | ipv6 | +|-----------|--------------------| +| 127.0.0.1 | 2001:db8:3c4d::/48 | + +```sql +CREATE TABLE tbl (id INTEGER, ip INET); +INSERT INTO tbl VALUES + (1, '192.168.0.0/16'), + (2, '127.0.0.1'), + (3, '8.8.8.8'), + (4, 'fe80::/10'), + (5, '2001:db8:3c4d:15::1a2f:1a2b'); +SELECT * FROM tbl; +``` + +| id | ip | +|---:|-----------------------------| +| 1 | 192.168.0.0/16 | +| 2 | 127.0.0.1 | +| 3 | 8.8.8.8 | +| 4 | fe80::/10 | +| 5 | 2001:db8:3c4d:15::1a2f:1a2b | + +## Operations on `INET` Values + +`INET` values can be compared naturally, and IPv4 will sort before IPv6. Additionally, IP addresses can be modified by adding or subtracting integers. + +```sql +CREATE TABLE tbl (cidr INET); +INSERT INTO tbl VALUES + ('127.0.0.1'::INET + 10), + ('fe80::10'::INET - 9), + ('127.0.0.1'), + ('2001:db8:3c4d:15::1a2f:1a2b'); +SELECT cidr FROM tbl ORDER BY cidr ASC; +``` + +| cidr | +|-----------------------------| +| 127.0.0.1 | +| 127.0.0.11 | +| 2001:db8:3c4d:15::1a2f:1a2b | +| fe80::7 | + +## `host` Function + +The host component of an `INET` value can be extracted using the `HOST()` function. 
+ +```sql +CREATE TABLE tbl (cidr INET); +INSERT INTO tbl VALUES + ('192.168.0.0/16'), + ('127.0.0.1'), + ('2001:db8:3c4d:15::1a2f:1a2b/96'); +SELECT cidr, host(cidr) FROM tbl; +``` + +| cidr | host(cidr) | +|--------------------------------|-----------------------------| +| 192.168.0.0/16 | 192.168.0.0 | +| 127.0.0.1 | 127.0.0.1 | +| 2001:db8:3c4d:15::1a2f:1a2b/96 | 2001:db8:3c4d:15::1a2f:1a2b | \ No newline at end of file diff --git a/docs/archive/1.0/extensions/jemalloc.md b/docs/archive/1.0/extensions/jemalloc.md new file mode 100644 index 00000000000..cfe6d4842f5 --- /dev/null +++ b/docs/archive/1.0/extensions/jemalloc.md @@ -0,0 +1,36 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/jemalloc +layout: docu +title: jemalloc Extension +--- + +The `jemalloc` extension replaces the system's memory allocator with [jemalloc](https://jemalloc.net/). +Unlike other DuckDB extensions, the `jemalloc` extension is statically linked and cannot be installed or loaded during runtime. + +## Operating System Support + +The availability of the `jemalloc` extension depends on the operating system. + +### Linux + +The Linux version of DuckDB ships with the `jemalloc` extension by default. + +> DuckDB v0.10.1 introduced a change: on ARM64 architecture, DuckDB is shipped without `jemalloc`, while on x86_64 (AMD64) architectures, it is shipped with `jemalloc`. + +To disable the `jemalloc` extension, [build DuckDB from source]({% link docs/archive/1.0/dev/building/build_instructions.md %}) and set the `SKIP_EXTENSIONS` flag as follows: + +```bash +GEN=ninja SKIP_EXTENSIONS="jemalloc" make +``` + +### macOS + +The macOS version of DuckDB does not ship with the `jemalloc` extension but can be [built from source]({% link docs/archive/1.0/dev/building/build_instructions.md %}) to include it: + +```bash +GEN=ninja BUILD_JEMALLOC=1 make +``` + +### Windows + +On Windows, this extension is not available. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/json.md b/docs/archive/1.0/extensions/json.md new file mode 100644 index 00000000000..d7432dcd4d1 --- /dev/null +++ b/docs/archive/1.0/extensions/json.md @@ -0,0 +1,1041 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/json +layout: docu +title: JSON Extension +--- + +The `json` extension is a loadable extension that implements SQL functions that are useful for reading values from existing JSON, and creating new JSON data. + +## Installing and Loading + +The `json` extension is shipped by default in DuckDB builds, otherwise, it will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use. 
+If you would like to install and load it manually, run: + +```sql +INSTALL json; +LOAD json; +``` + +## Example Uses + +Read a JSON file from disk, auto-infer options: + +```sql +SELECT * FROM 'todos.json'; +``` + +`read_json` with custom options: + +```sql +SELECT * +FROM read_json('todos.json', + format = 'array', + columns = {userId: 'UBIGINT', + id: 'UBIGINT', + title: 'VARCHAR', + completed: 'BOOLEAN'}); +``` + +Write the result of a query to a JSON file: + +```sql +COPY (SELECT * FROM todos) TO 'todos.json'; +``` + +See more examples of loading JSON data on the [JSON data page]({% link docs/archive/1.0/data/json/overview.md %}#examples): + +Create a table with a column for storing JSON data: + +```sql +CREATE TABLE example (j JSON); +``` + +Insert JSON data into the table: + +```sql +INSERT INTO example VALUES + ('{ "family": "anatidae", "species": [ "duck", "goose", "swan", null ] }'); +``` + +Retrieve the family key's value: + +```sql +SELECT j.family FROM example; +``` + +```text +"anatidae" +``` + +Extract the family key's value with a JSONPath expression: + +```sql +SELECT j->'$.family' FROM example; +``` + +```text +"anatidae" +``` + +Extract the family key's value with a JSONPath expression as a VARCHAR: + +```sql +SELECT j->>'$.family' FROM example; +``` + +```text +anatidae +``` + +## JSON Type + +The `json` extension makes use of the `JSON` logical type. +The `JSON` logical type is interpreted as JSON, i.e., parsed, in JSON functions rather than interpreted as `VARCHAR`, i.e., a regular string (modulo the equality-comparison caveat at the bottom of this page). +All JSON creation functions return values of this type. + +We also allow any of DuckDB's types to be casted to JSON, and JSON to be casted back to any of DuckDB's types, for example, to cast `JSON` to DuckDB's `STRUCT` type, run: + +```sql +SELECT '{"duck": 42}'::JSON::STRUCT(duck INTEGER); +``` + +```text +{'duck': 42} +``` + +And back: + +```sql +SELECT {duck: 42}::JSON; +``` + +```text +{"duck":42} +``` + +This works for our nested types as shown in the example, but also for non-nested types: + +```sql +SELECT '2023-05-12'::DATE::JSON; +``` + +```text +"2023-05-12" +``` + +The only exception to this behavior is the cast from `VARCHAR` to `JSON`, which does not alter the data, but instead parses and validates the contents of the `VARCHAR` as JSON. + +## JSON Table Functions + +The following table functions are used to read JSON: + +| Function | Description | +|:---|:---| +| `read_json_objects(filename)` | Read a JSON object from `filename`, where `filename` can also be a list of files or a glob pattern. | +| `read_ndjson_objects(filename)` | Alias for `read_json_objects` with parameter `format` set to `'newline_delimited'`. | +| `read_json_objects_auto(filename)` | Alias for `read_json_objects` with parameter `format` set to `'auto'`. | + +These functions have the following parameters: + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `compression` | The compression type for the file. By default this will be detected automatically from the file extension (e.g., `t.json.gz` will use gzip, `t.json` will use none). Options are `'none'`, `'gzip'`, `'zstd'`, and `'auto'`. | `VARCHAR` | `'auto'` | +| `filename` | Whether or not an extra `filename` column should be included in the result. | `BOOL` | `false` | +| `format` | Can be one of `['auto', 'unstructured', 'newline_delimited', 'array']`. 
| `VARCHAR` | `'array'` | +| `hive_partitioning` | Whether or not to interpret the path as a [Hive partitioned path]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}). | `BOOL` | `false` | +| `ignore_errors` | Whether to ignore parse errors (only possible when `format` is `'newline_delimited'`). | `BOOL` | `false` | +| `maximum_sample_files` | The maximum number of JSON files sampled for auto-detection. | `BIGINT` | `32` | +| `maximum_object_size` | The maximum size of a JSON object (in bytes). | `UINTEGER` | `16777216` | + +The `format` parameter specifies how to read the JSON from a file. +With `'unstructured'`, the top-level JSON is read, e.g.: + +```json +{ + "duck": 42 +} +{ + "goose": [1, 2, 3] +} +``` + +will result in two objects being read. + +With `'newline_delimited'`, [NDJSON](http://ndjson.org) is read, where each JSON is separated by a newline (`\n`), e.g.: + +```json +{"duck": 42} +{"goose": [1, 2, 3]} +``` + +will also result in two objects being read. + +With `'array'`, each array element is read, e.g.: + +```json +[ + { + "duck": 42 + }, + { + "goose": [1, 2, 3] + } +] +``` + +Again, will result in two objects being read. + +Example usage: + +```sql +SELECT * FROM read_json_objects('my_file1.json'); +``` + +```text +{"duck":42,"goose":[1,2,3]} +``` + +```sql +SELECT * FROM read_json_objects(['my_file1.json', 'my_file2.json']); +``` + +```text +{"duck":42,"goose":[1,2,3]} +{"duck":43,"goose":[4,5,6],"swan":3.3} +``` + +```sql +SELECT * FROM read_ndjson_objects('*.json.gz'); +``` + +```text +{"duck":42,"goose":[1,2,3]} +{"duck":43,"goose":[4,5,6],"swan":3.3} +``` + +DuckDB also supports reading JSON as a table, using the following functions: + +| Function | Description | +|:----|:-------| +| `read_json(filename)` | Read JSON from `filename`, where `filename` can also be a list of files, or a glob pattern. | +| `read_json_auto(filename)` | Alias for `read_json` with all auto-detection enabled. | +| `read_ndjson(filename)` | Alias for `read_json` with parameter `format` set to `'newline_delimited'`. | +| `read_ndjson_auto(filename)` | Alias for `read_json_auto` with parameter `format` set to `'newline_delimited'`. | + +Besides the `maximum_object_size`, `format`, `ignore_errors` and `compression`, these functions have additional parameters: + +| Name | Description | Type | Default | +|:--|:------|:-|:-| +| `auto_detect` | Whether to auto-detect the names of the keys and data types of the values automatically | `BOOL` | `false` | +| `columns` | A struct that specifies the key names and value types contained within the JSON file (e.g., `{key1: 'INTEGER', key2: 'VARCHAR'}`). If `auto_detect` is enabled these will be inferred | `STRUCT` | `(empty)` | +| `dateformat` | Specifies the date format to use when parsing dates. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | `'iso'` | +| `maximum_depth` | Maximum nesting depth to which the automatic schema detection detects types. Set to -1 to fully detect nested JSON types | `BIGINT` | `-1` | +| `records` | Can be one of `['auto', 'true', 'false']` | `VARCHAR` | `'records'` | +| `sample_size` | Option to define number of sample objects for automatic JSON type detection. Set to -1 to scan the entire input file | `UBIGINT` | `20480` | +| `timestampformat` | Specifies the date format to use when parsing timestamps. 
See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | `'iso'`| +| `union_by_name` | Whether the schema's of multiple JSON files should be [unified]({% link docs/archive/1.0/data/multiple_files/combining_schemas.md %}) | `BOOL` | `false` | + +Example usage: + +```sql +SELECT * FROM read_json('my_file1.json', columns = {duck: 'INTEGER'}); +``` + +
+ +| duck | +|:---| +| 42 | + +DuckDB can convert JSON arrays directly to its internal `LIST` type, and missing keys become `NULL`: + +```sql +SELECT * +FROM read_json( + ['my_file1.json', 'my_file2.json'], + columns = {duck: 'INTEGER', goose: 'INTEGER[]', swan: 'DOUBLE'} + ); +``` + +
+ +| duck | goose | swan | +|:---|:---|:---| +| 42 | [1, 2, 3] | NULL | +| 43 | [4, 5, 6] | 3.3 | + +DuckDB can automatically detect the types like so: + +```sql +SELECT goose, duck FROM read_json('*.json.gz'); +SELECT goose, duck FROM '*.json.gz'; -- equivalent +``` + +
+ +| goose | duck | +|:---|:---| +| [1, 2, 3] | 42 | +| [4, 5, 6] | 43 | + +DuckDB can read (and auto-detect) a variety of formats, specified with the `format` parameter. +Querying a JSON file that contains an `'array'`, e.g.: + +```json +[ + { + "duck": 42, + "goose": 4.2 + }, + { + "duck": 43, + "goose": 4.3 + } +] +``` + +Can be queried exactly the same as a JSON file that contains `'unstructured'` JSON, e.g.: + +```json +{ + "duck": 42, + "goose": 4.2 +} +{ + "duck": 43, + "goose": 4.3 +} +``` + +Both can be read as the table: + +
+
+| duck | goose |
+|:---|:---|
+| 42 | 4.2 |
+| 43 | 4.3 |
+
+If your JSON file does not contain 'records', i.e., it contains any type of JSON other than objects, DuckDB can still read it.
+This is controlled by the `records` parameter, which specifies whether the JSON consists of records that should be unpacked into individual columns.
+For example, reading the following file with `records` enabled:
+
+```json
+{"duck": 42, "goose": [1, 2, 3]}
+{"duck": 43, "goose": [4, 5, 6]}
+```
+
+results in two columns:
+
+
+| duck | goose |
+|:---|:---|
+| 42 | [1,2,3] |
+| 43 | [4,5,6] |
+
+You can read the same file with `records` set to `'false'` to get a single column, which is a `STRUCT` containing the data:
+
+
+| json |
+|:---|
+| {'duck': 42, 'goose': [1,2,3]} |
+| {'duck': 43, 'goose': [4,5,6]} |
+
+For additional examples of reading more complex data, please see the [“Shredding Deeply Nested JSON, One Vector at a Time” blog post]({% post_url 2023-03-03-json %}).
+
+## JSON Import/Export
+
+When the `json` extension is installed, `FORMAT JSON` is supported for `COPY FROM`, `COPY TO`, `EXPORT DATABASE` and `IMPORT DATABASE`. See [Copy]({% link docs/archive/1.0/sql/statements/copy.md %}) and [Import/Export]({% link docs/archive/1.0/sql/statements/export.md %}).
+
+By default, `COPY` expects newline-delimited JSON. If you prefer copying data to/from a JSON array, you can specify `ARRAY true`, e.g.,
+
+```sql
+COPY (SELECT * FROM range(5)) TO 'my.json' (ARRAY true);
+```
+
+will create the following file:
+
+```json
+[
+    {"range":0},
+    {"range":1},
+    {"range":2},
+    {"range":3},
+    {"range":4}
+]
+```
+
+This can be read like so:
+
+```sql
+CREATE TABLE test (range BIGINT);
+COPY test FROM 'my.json' (ARRAY true);
+```
+
+The format can also be detected automatically:
+
+```sql
+COPY test FROM 'my.json' (AUTO_DETECT true);
+```
+
+## JSON Creation Functions
+
+The following functions are used to create JSON.
+
+ +| Function | Description | +|:--|:----| +| `to_json(any)` | Create `JSON` from a value of `any` type. Our `LIST` is converted to a JSON array, and our `STRUCT` and `MAP` are converted to a JSON object. | +| `json_quote(any)` | Alias for `to_json`. | +| `array_to_json(list)` | Alias for `to_json` that only accepts `LIST`. | +| `row_to_json(list)` | Alias for `to_json` that only accepts `STRUCT`. | +| `json_array([any, ...])` | Create a JSON array from `any` number of values. | +| `json_object([key, value, ...])` | Create a JSON object from any number of `key`, `value` pairs. | +| `json_merge_patch(json, json)` | Merge two JSON documents together. | + +Examples: + +```sql +SELECT to_json('duck'); +``` + +```text +"duck" +``` + +```sql +SELECT to_json([1, 2, 3]); +``` + +```text +[1,2,3] +``` + +```sql +SELECT to_json({duck : 42}); +``` + +```text +{"duck":42} +``` + +```sql +SELECT to_json(map(['duck'],[42])); +``` + +```text +{"duck":42} +``` + +```sql +SELECT json_array(42, 'duck', NULL); +``` + +```text +[42,"duck",null] +``` + +```sql +SELECT json_object('duck', 42); +``` + +```text +{"duck":42} +``` + +```sql +SELECT json_merge_patch('{"duck": 42}', '{"goose": 123}'); +``` + +```text +{"goose":123,"duck":42} +``` + + + +## JSON Extraction Functions + +There are two extraction functions, which have their respective operators. The operators can only be used if the string is stored as the `JSON` logical type. +These functions supports the same two location notations as the previous functions. + +| Function | Alias | Operator | Description | +|:---|:---|:-| +| `json_extract(json, path)` | `json_extract_path` | `->` | Extract `JSON` from `json` at the given `path`. If `path` is a `LIST`, the result will be a `LIST` of `JSON` | +| `json_extract_string(json, path)` | `json_extract_path_text` | `->>` | Extract `VARCHAR` from `json` at the given `path`. If `path` is a `LIST`, the result will be a `LIST` of `VARCHAR` | + +Note that the equality comparison operator (`=`) has a higher precedence than the `->` JSON extract operator. Therefore, surround the uses of the `->` operator with parentheses when making equality comparisons. For example: + +```sql +SELECT ((JSON '{"field": 42}')->'field') = 42; +``` + +> Warning DuckDB's JSON data type uses [0-based indexing](#indexing). 
+ +Examples: + +```sql +CREATE TABLE example (j JSON); +INSERT INTO example VALUES + ('{ "family": "anatidae", "species": [ "duck", "goose", "swan", null ] }'); +``` + +```sql +SELECT json_extract(j, '$.family') FROM example; +``` + +```text +"anatidae" +``` + +```sql +SELECT j->'$.family' FROM example; +``` + +```text +"anatidae" +``` + +```sql +SELECT j->'$.species[0]' FROM example; +``` + +```text +"duck" +``` + +```sql +SELECT j->'$.species[*]' FROM example; +``` + +```text +["duck", "goose", "swan", null] +``` + +```sql +SELECT j->>'$.species[*]' FROM example; +``` + +```text +[duck, goose, swan, null] +``` + +```sql +SELECT j->'$.species'->0 FROM example; +``` + +```text +"duck" +``` + +```sql +SELECT j->'species'->['0','1'] FROM example; +``` + +```text +["duck", "goose"] +``` + +```sql +SELECT json_extract_string(j, '$.family') FROM example; +``` + +```text +anatidae +``` + +```sql +SELECT j->>'$.family' FROM example; +``` + +```text +anatidae +``` + +```sql +SELECT j->>'$.species[0]' FROM example; +``` + +```text +duck +``` + +```sql +SELECT j->'species'->>0 FROM example; +``` + +```text +duck +``` + +```sql +SELECT j->'species'->>['0','1'] FROM example; +``` + +```text +[duck, goose] +``` + +Note that DuckDB's JSON data type uses [0-based indexing](#indexing). + +If multiple values need to be extracted from the same JSON, it is more efficient to extract a list of paths: + +The following will cause the JSON to be parsed twice,: + +Resulting in a slower query that uses more memory: + +```sql +SELECT + json_extract(j, 'family') AS family, + json_extract(j, 'species') AS species +FROM example; +``` + +
+ +| family | species | +|------------|------------------------------| +| "anatidae" | ["duck","goose","swan",null] | + +The following produces the same result but is faster and more memory-efficient: + +```sql +WITH extracted AS ( + SELECT json_extract(j, ['family', 'species']) AS extracted_list + FROM example +) +SELECT + extracted_list[1] AS family, + extracted_list[2] AS species +FROM extracted; +``` + +## JSON Scalar Functions + +The following scalar JSON functions can be used to gain information about the stored JSON values. +With the exception of `json_valid(json)`, all JSON functions produce an error when invalid JSON is supplied. + +We support two kinds of notations to describe locations within JSON: [JSON Pointer](https://datatracker.ietf.org/doc/html/rfc6901) and JSONPath. + +| Function | Description | +|:---|:----| +| `json_array_length(json[, path])` | Return the number of elements in the JSON array `json`, or `0` if it is not a JSON array. If `path` is specified, return the number of elements in the JSON array at the given `path`. If `path` is a `LIST`, the result will be `LIST` of array lengths. | +| `json_contains(json_haystack, json_needle)` | Returns `true` if `json_needle` is contained in `json_haystack`. Both parameters are of JSON type, but `json_needle` can also be a numeric value or a string, however the string must be wrapped in double quotes. | +| `json_keys(json[, path])` | Returns the keys of `json` as a `LIST` of `VARCHAR`, if `json` is a JSON object. If `path` is specified, return the keys of the JSON object at the given `path`. If `path` is a `LIST`, the result will be `LIST` of `LIST` of `VARCHAR`. | +| `json_structure(json)` | Return the structure of `json`. Defaults to `JSON` if the structure is inconsistent (e.g., incompatible types in an array). | +| `json_type(json[, path])` | Return the type of the supplied `json`, which is one of `ARRAY`, `BIGINT`, `BOOLEAN`, `DOUBLE`, `OBJECT`, `UBIGINT`, `VARCHAR`, and `NULL`. If `path` is specified, return the type of the element at the given `path`. If `path` is a `LIST`, the result will be `LIST` of types. | +| `json_valid(json)` | Return whether `json` is valid JSON. | +| `json(json)` | Parse and minify `json`. | + +The JSONPointer syntax separates each field with a `/`. +For example, to extract the first element of the array with key `duck`, you can do: + +```sql +SELECT json_extract('{"duck": [1, 2, 3]}', '/duck/0'); +``` + +```text +1 +``` + +The JSONPath syntax separates fields with a `.`, and accesses array elements with `[i]`, and always starts with `$`. Using the same example, we can do the following: + +```sql +SELECT json_extract('{"duck": [1, 2, 3]}', '$.duck[0]'); +``` + +```text +1 +``` + +Note that DuckDB's JSON data type uses [0-based indexing](#indexing). 
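+
+As a minimal illustration of the difference (the values here are arbitrary), the same element is addressed with index `0` via a JSON path but with index `1` via DuckDB's list indexing:
+
+```sql
+SELECT json_extract('["duck", "goose"]', '$[0]') AS json_first,  -- "duck" (0-based)
+       (['duck', 'goose'])[1] AS list_first;                     -- duck  (1-based)
+```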
+ +JSONPath is more expressive, and can also access from the back of lists: + +```sql +SELECT json_extract('{"duck": [1, 2, 3]}', '$.duck[#-1]'); +``` + +```text +3 +``` + +JSONPath also allows escaping syntax tokens, using double quotes: + +```sql +SELECT json_extract('{"duck.goose": [1, 2, 3]}', '$."duck.goose"[1]'); +``` + +```text +2 +``` + +Examples using the [anatidae biological family](https://en.wikipedia.org/wiki/Anatidae): + +```sql +CREATE TABLE example (j JSON); +INSERT INTO example VALUES + ('{ "family": "anatidae", "species": [ "duck", "goose", "swan", null ] }'); +``` + +```sql +SELECT json(j) FROM example; +``` + +```text +{"family":"anatidae","species":["duck","goose","swan",null]} +``` + +```sql +SELECT j.family FROM example; +``` + +```text +"anatidae" +``` + +```sql +SELECT j.species[0] FROM example; +``` + +```text +"duck" +``` + +```sql +SELECT json_valid(j) FROM example; +``` + +```text +true +``` + +```sql +SELECT json_valid('{'); +``` + +```text +false +``` + +```sql +SELECT json_array_length('["duck", "goose", "swan", null]'); +``` + +```text +4 +``` + +```sql +SELECT json_array_length(j, 'species') FROM example; +``` + +```text +4 +``` + +```sql +SELECT json_array_length(j, '/species') FROM example; +``` + +```text +4 +``` + +```sql +SELECT json_array_length(j, '$.species') FROM example; +``` + +```text +4 +``` + +```sql +SELECT json_array_length(j, ['$.species']) FROM example; +``` + +```text +[4] +``` + +```sql +SELECT json_type(j) FROM example; +``` + +```text +OBJECT +``` + +```sql +SELECT json_keys(j) FROM example; +``` + +```text +[family, species] +``` + +```sql +SELECT json_structure(j) FROM example; +``` + +```text +{"family":"VARCHAR","species":["VARCHAR"]} +``` + +```sql +SELECT json_structure('["duck", {"family": "anatidae"}]'); +``` + +```text +["JSON"] +``` + +```sql +SELECT json_contains('{"key": "value"}', '"value"'); +``` + +```text +true +``` + +```sql +SELECT json_contains('{"key": 1}', '1'); +``` + +```text +true +``` + +```sql +SELECT json_contains('{"top_key": {"key": "value"}}', '{"key": "value"}'); +``` + +```text +true +``` + +## JSON Aggregate Functions + +There are three JSON aggregate functions. + +
+ +| Function | Description | +|:---|:----| +| `json_group_array(any)` | Return a JSON array with all values of `any` in the aggregation. | +| `json_group_object(key, value)` | Return a JSON object with all `key`, `value` pairs in the aggregation. | +| `json_group_structure(json)` | Return the combined `json_structure` of all `json` in the aggregation. | + +Examples: + +```sql +CREATE TABLE example1 (k VARCHAR, v INTEGER); +INSERT INTO example1 VALUES ('duck', 42), ('goose', 7); +``` + +```sql +SELECT json_group_array(v) FROM example1; +``` + +```text +[42, 7] +``` + +```sql +SELECT json_group_object(k, v) FROM example1; +``` + +```text +{"duck":42,"goose":7} +``` + +```sql +CREATE TABLE example2 (j JSON); +INSERT INTO example2 VALUES + ('{"family": "anatidae", "species": ["duck", "goose"], "coolness": 42.42}'), + ('{"family": "canidae", "species": ["labrador", "bulldog"], "hair": true}'); +``` + +```sql +SELECT json_group_structure(j) FROM example2; +``` + +```text +{"family":"VARCHAR","species":["VARCHAR"],"coolness":"DOUBLE","hair":"BOOLEAN"} +``` + +## Transforming JSON + +In many cases, it is inefficient to extract values from JSON one-by-one. +Instead, we can “extract” all values at once, transforming JSON to the nested types `LIST` and `STRUCT`. + +
+ +| Function | Description | +|:---|:---| +| `json_transform(json, structure)` | Transform `json` according to the specified `structure`. | +| `from_json(json, structure)` | Alias for `json_transform`. | +| `json_transform_strict(json, structure)` | Same as `json_transform`, but throws an error when type casting fails. | +| `from_json_strict(json, structure)` | Alias for `json_transform_strict`. | + +The `structure` argument is JSON of the same form as returned by `json_structure`. +The `structure` argument can be modified to transform the JSON into the desired structure and types. +It is possible to extract fewer key/value pairs than are present in the JSON, and it is also possible to extract more: missing keys become `NULL`. + +Examples: + +```sql +CREATE TABLE example (j JSON); +INSERT INTO example VALUES + ('{"family": "anatidae", "species": ["duck", "goose"], "coolness": 42.42}'), + ('{"family": "canidae", "species": ["labrador", "bulldog"], "hair": true}'); +``` + +```sql +SELECT json_transform(j, '{"family": "VARCHAR", "coolness": "DOUBLE"}') FROM example; +``` + +```text +{'family': anatidae, 'coolness': 42.420000} +{'family': canidae, 'coolness': NULL} +``` + +```sql +SELECT json_transform(j, '{"family": "TINYINT", "coolness": "DECIMAL(4, 2)"}') FROM example; +``` + +```text +{'family': NULL, 'coolness': 42.42} +{'family': NULL, 'coolness': NULL} +``` + +```sql +SELECT json_transform_strict(j, '{"family": "TINYINT", "coolness": "DOUBLE"}') FROM example; +``` + +```console +Invalid Input Error: Failed to cast value: "anatidae" +``` + +## Serializing and Deserializing SQL to JSON and Vice Versa + +The `json` extension also provides functions to serialize and deserialize `SELECT` statements between SQL and JSON, as well as executing JSON serialized statements. + +| Function | Type | Description | +|:------|:-|:---------| +| `json_deserialize_sql(json)` | Scalar | Deserialize one or many `json` serialized statements back to an equivalent SQL string. | +| `json_execute_serialized_sql(varchar)` | Table | Execute `json` serialized statements and return the resulting rows. Only one statement at a time is supported for now. | +| `json_serialize_sql(varchar, skip_empty := boolean, skip_null := boolean, format := boolean)` | Scalar | Serialize a set of semicolon-separated (`;`) select statements to an equivalent list of `json` serialized statements. | +| `PRAGMA json_execute_serialized_sql(varchar)` | Pragma | Pragma version of the `json_execute_serialized_sql` function. | + +The `json_serialize_sql(varchar)` function takes three optional parameters, `skip_empty`, `skip_null`, and `format` that can be used to control the output of the serialized statements. + +If you run the `json_execute_serialize_sql(varchar)` table function inside of a transaction the serialized statements will not be able to see any transaction local changes. This is because the statements are executed in a separate query context. You can use the `PRAGMA json_execute_serialize_sql(varchar)` pragma version to execute the statements in the same query context as the pragma, although with the limitation that the serialized JSON must be provided as a constant string, i.e., you cannot do `PRAGMA json_execute_serialize_sql(json_serialize_sql(...))`. 
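+
+A minimal sketch of that workflow: serialize the statement first, then pass the resulting JSON string to the pragma as a constant (the placeholder below stands for the string produced by the first query):
+
+```sql
+-- Step 1: produce the serialized statement
+SELECT json_serialize_sql('SELECT 1 + 2');
+-- Step 2: pass the resulting string as a constant
+PRAGMA json_execute_serialized_sql('⟨serialized JSON from step 1⟩');
+```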
+ +Note that these functions do not preserve syntactic sugar such as `FROM * SELECT ...`, so a statement round-tripped through `json_deserialize_sql(json_serialize_sql(...))` may not be identical to the original statement, but should always be semantically equivalent and produce the same output. + +Examples: + +Simple example: + +```sql +SELECT json_serialize_sql('SELECT 2'); +``` + +```text +'{"error":false,"statements":[{"node":{"type":"SELECT_NODE","modifiers":[],"cte_map":{"map":[]},"select_list":[{"class":"CONSTANT","type":"VALUE_CONSTANT","alias":"","value":{"type":{"id":"INTEGER","type_info":null},"is_null":false,"value":2}}],"from_table":{"type":"EMPTY","alias":"","sample":null},"where_clause":null,"group_expressions":[],"group_sets":[],"aggregate_handling":"STANDARD_HANDLING","having":null,"sample":null,"qualify":null}}]}' +``` + +Example with multiple statements and skip options: + +```sql +SELECT json_serialize_sql('SELECT 1 + 2; SELECT a + b FROM tbl1', skip_empty := true, skip_null := true); +``` + +```text +'{"error":false,"statements":[{"node":{"type":"SELECT_NODE","select_list":[{"class":"FUNCTION","type":"FUNCTION","function_name":"+","children":[{"class":"CONSTANT","type":"VALUE_CONSTANT","value":{"type":{"id":"INTEGER"},"is_null":false,"value":1}},{"class":"CONSTANT","type":"VALUE_CONSTANT","value":{"type":{"id":"INTEGER"},"is_null":false,"value":2}}],"order_bys":{"type":"ORDER_MODIFIER"},"distinct":false,"is_operator":true,"export_state":false}],"from_table":{"type":"EMPTY"},"aggregate_handling":"STANDARD_HANDLING"}},{"node":{"type":"SELECT_NODE","select_list":[{"class":"FUNCTION","type":"FUNCTION","function_name":"+","children":[{"class":"COLUMN_REF","type":"COLUMN_REF","column_names":["a"]},{"class":"COLUMN_REF","type":"COLUMN_REF","column_names":["b"]}],"order_bys":{"type":"ORDER_MODIFIER"},"distinct":false,"is_operator":true,"export_state":false}],"from_table":{"type":"BASE_TABLE","table_name":"tbl1"},"aggregate_handling":"STANDARD_HANDLING"}}]}' +``` + +Example with a syntax error: + +```sql +SELECT json_serialize_sql('TOTALLY NOT VALID SQL'); +``` + +```text +'{"error":true,"error_type":"parser","error_message":"syntax error at or near \"TOTALLY\"\nLINE 1: TOTALLY NOT VALID SQL\n ^"}' +``` + +Example with deserialize: + +```sql +SELECT json_deserialize_sql(json_serialize_sql('SELECT 1 + 2')); +``` + +```text +'SELECT (1 + 2)' +``` + +Example with deserialize and syntax sugar: + +```sql +SELECT json_deserialize_sql(json_serialize_sql('FROM x SELECT 1 + 2')); +``` + +```text +'SELECT (1 + 2) FROM x' +``` + +Example with execute: + +```sql +SELECT * FROM json_execute_serialized_sql(json_serialize_sql('SELECT 1 + 2')); +``` + +```text +3 +``` + +Example with error: + +```sql +SELECT * FROM json_execute_serialized_sql(json_serialize_sql('TOTALLY NOT VALID SQL')); +``` + +```console +Error: Parser Error: Error parsing json: parser: syntax error at or near "TOTALLY" +``` + +## Indexing + +> Warning Following PostgreSQL's conventions, DuckDB uses 1-based indexing for [arrays]({% link docs/archive/1.0/sql/data_types/array.md %}) and [lists]({% link docs/archive/1.0/sql/data_types/list.md %}) but [0-based indexing for the JSON data type](https://www.postgresql.org/docs/16/functions-json.html#FUNCTIONS-JSON-PROCESSING). + +## Equality Comparison + +> Warning Currently, equality comparison of JSON files can differ based on the context. In some cases, it is based on raw text comparison, while in other cases, it uses logical content comparison. 
+ +The following query returns true for all fields: + +```sql +SELECT + a != b, -- Space is part of physical JSON content. Despite equal logical content, values are treated as not equal. + c != d, -- Same. + c[0] = d[0], -- Equality because space was removed from physical content of fields: + a = c[0], -- Indeed, field is equal to empty list without space... + b != c[0], -- ... but different from empty list with space. +FROM ( + SELECT + '[]'::JSON AS a, + '[ ]'::JSON AS b, + '[[]]'::JSON AS c, + '[[ ]]'::JSON AS d + ); +``` + +
+ +| (a != b) | (c != d) | (c[0] = d[0]) | (a = c[0]) | (b != c[0]) | +|----------|----------|---------------|------------|-------------| +| true | true | true | true | true | \ No newline at end of file diff --git a/docs/archive/1.0/extensions/mysql.md b/docs/archive/1.0/extensions/mysql.md new file mode 100644 index 00000000000..fd1314626d7 --- /dev/null +++ b/docs/archive/1.0/extensions/mysql.md @@ -0,0 +1,263 @@ +--- +github_repository: https://github.com/duckdb/duckdb_mysql +layout: docu +title: MySQL Extension +--- + +The [`mysql` extension](https://github.com/duckdb/duckdb_mysql) allows DuckDB to directly read and write data from/to a running MySQL instance. The data can be queried directly from the underlying MySQL database. Data can be loaded from MySQL tables into DuckDB tables, or vice versa. + +## Installing and Loading + +To install the `mysql` extension, run: + +```sql +INSTALL mysql; +``` + +The extension is loaded automatically upon first use. If you prefer to load it manually, run: + +```sql +LOAD mysql; +``` + +## Reading Data from MySQL + +To make a MySQL database accessible to DuckDB use the `ATTACH` command with the `MYSQL` or the `MYSQL_SCANNER` type: + +```sql +ATTACH 'host=localhost user=root port=0 database=mysql' AS mysqldb (TYPE MYSQL); +USE mysqldb; +``` + +### Configuration + +The connection string determines the parameters for how to connect to MySQL as a set of `key=value` pairs. Any options not provided are replaced by their default values, as per the table below. Connection information can also be specified with [environment variables](https://dev.mysql.com/doc/refman/8.3/en/environment-variables.html). If no option is provided explicitly, the MySQL extension tries to read it from an environment variable. + +
+ +| Setting | Default | Environment variable | +|----------|----------------|----------------------| +| database | NULL | MYSQL_DATABASE | +| host | localhost | MYSQL_HOST | +| password | | MYSQL_PWD | +| port | 0 | MYSQL_TCP_PORT | +| socket | NULL | MYSQL_UNIX_PORT | +| user | ⟨current user⟩ | MYSQL_USER | + +### Reading MySQL Tables + +The tables in the MySQL database can be read as if they were normal DuckDB tables, but the underlying data is read directly from MySQL at query time. + +```sql +SHOW TABLES; +``` + +
+ +| name | +|-----------------| +| signed_integers | + +```sql +SELECT * FROM signed_integers; +``` + +
+ +| t | s | m | i | b | +|-----:|-------:|---------:|------------:|---------------------:| +| -128 | -32768 | -8388608 | -2147483648 | -9223372036854775808 | +| 127 | 32767 | 8388607 | 2147483647 | 9223372036854775807 | +| NULL | NULL | NULL | NULL | NULL | + +It might be desirable to create a copy of the MySQL databases in DuckDB to prevent the system from re-reading the tables from MySQL continuously, particularly for large tables. + +Data can be copied over from MySQL to DuckDB using standard SQL, for example: + +```sql +CREATE TABLE duckdb_table AS FROM mysqlscanner.mysql_table; +``` + +## Writing Data to MySQL + +In addition to reading data from MySQL, create tables, ingest data into MySQL and make other modifications to a MySQL database using standard SQL queries. + +This allows you to use DuckDB to, for example, export data that is stored in a MySQL database to Parquet, or read data from a Parquet file into MySQL. + +Below is a brief example of how to create a new table in MySQL and load data into it. + +```sql +ATTACH 'host=localhost user=root port=0 database=mysqlscanner' AS mysql_db (TYPE MYSQL); +CREATE TABLE mysql_db.tbl (id INTEGER, name VARCHAR); +INSERT INTO mysql_db.tbl VALUES (42, 'DuckDB'); +``` + +Many operations on MySQL tables are supported. All these operations directly modify the MySQL database, and the result of subsequent operations can then be read using MySQL. +Note that if modifications are not desired, `ATTACH` can be run with the `READ_ONLY` property which prevents making modifications to the underlying database. For example: + +```sql +ATTACH 'host=localhost user=root port=0 database=mysqlscanner' AS mysql_db (TYPE MYSQL, READ_ONLY); +``` + +## Supported Operations + +Below is a list of supported operations. + +### `CREATE TABLE` + +```sql +CREATE TABLE mysql_db.tbl (id INTEGER, name VARCHAR); +``` + +### `INSERT INTO` + +```sql +INSERT INTO mysql_db.tbl VALUES (42, 'DuckDB'); +``` + +### `SELECT` + +```sql +SELECT * FROM mysql_db.tbl; +``` + +| id | name | +|---:|--------| +| 42 | DuckDB | + +### `COPY` + +```sql +COPY mysql_db.tbl TO 'data.parquet'; +COPY mysql_db.tbl FROM 'data.parquet'; +``` + +You may also create a full copy of the database using the [`COPY FROM DATABASE` statement]({% link docs/archive/1.0/sql/statements/copy.md %}#copy-from-database--to): + +```sql +COPY FROM DATABASE mysql_db TO my_duckdb_db; +``` + +### `UPDATE` + +```sql +UPDATE mysql_db.tbl +SET name = 'Woohoo' +WHERE id = 42; +``` + +### `DELETE` + +```sql +DELETE FROM mysql_db.tbl +WHERE id = 42; +``` + +### `ALTER TABLE` + +```sql +ALTER TABLE mysql_db.tbl +ADD COLUMN k INTEGER; +``` + +### `DROP TABLE` + +```sql +DROP TABLE mysql_db.tbl; +``` + +### `CREATE VIEW` + +```sql +CREATE VIEW mysql_db.v1 AS SELECT 42; +``` + +### `CREATE SCHEMA` and `DROP SCHEMA` + +```sql +CREATE SCHEMA mysql_db.s1; +CREATE TABLE mysql_db.s1.integers (i INTEGER); +INSERT INTO mysql_db.s1.integers VALUES (42); +SELECT * FROM mysql_db.s1.integers; +``` + +| i | +|---:| +| 42 | + +```sql +DROP SCHEMA mysql_db.s1; +``` + +### Transactions + +```sql +CREATE TABLE mysql_db.tmp (i INTEGER); +BEGIN; +INSERT INTO mysql_db.tmp VALUES (42); +SELECT * FROM mysql_db.tmp; +``` + +This returns: + +| i | +|---:| +| 42 | + +```sql +ROLLBACK; +SELECT * FROM mysql_db.tmp; +``` + +This returns an empty table. + +> The DDL statements are not transactional in MySQL. 
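+
+As a minimal sketch of this caveat (the table name is hypothetical): because MySQL implicitly commits DDL statements, a table created inside a transaction is not removed by a subsequent rollback.
+
+```sql
+BEGIN;
+CREATE TABLE mysql_db.ddl_example (i INTEGER);
+ROLLBACK;
+-- The CREATE TABLE was implicitly committed on the MySQL side,
+-- so the table still exists despite the rollback.
+SELECT * FROM mysql_db.ddl_example;
+```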
+ +## Running SQL Queries in MySQL + +### The `mysql_query` Table Function + +The `mysql_query` table function allows you to run arbitrary read queries within an attached database. `mysql_query` takes the name of the attached MySQL database to execute the query in, as well as the SQL query to execute. The result of the query is returned. Single-quote strings are escaped by repeating the single quote twice. + +```sql +mysql_query(attached_database::VARCHAR, query::VARCHAR) +``` + +For example: + +```sql +ATTACH 'host=localhost database=mysql' AS mysqldb (TYPE MYSQL); +SELECT * FROM mysql_query('mysqldb', 'SELECT * FROM cars LIMIT 3'); +``` + +> Warning This function is only available on DuckDB v0.10.1+, using the latest MySQL extension. +> To upgrade your extension, run `FORCE INSTALL mysql;`. + +### The `mysql_execute` Function + +The `mysql_execute` function allows running arbitrary queries within MySQL, including statements that update the schema and content of the database. + +```sql +ATTACH 'host=localhost database=mysql' AS mysqldb (TYPE MYSQL); +CALL mysql_execute('mysqldb', 'CREATE TABLE my_table (i INTEGER)'); +``` + +> Warning This function is only available on DuckDB v0.10.1+, using the latest MySQL extension. +> To upgrade your extension, run `FORCE INSTALL mysql;`. + +## Settings + +| Name | Description | Default | +|--------------------------------------|----------------------------------------------------------------|-----------| +| `mysql_bit1_as_boolean` | Whether or not to convert `BIT(1)` columns to `BOOLEAN` | `true` | +| `mysql_debug_show_queries` | DEBUG SETTING: print all queries sent to MySQL to stdout | `false` | +| `mysql_experimental_filter_pushdown` | Whether or not to use filter pushdown (currently experimental) | `false` | +| `mysql_tinyint1_as_boolean` | Whether or not to convert `TINYINT(1)` columns to `BOOLEAN` | `true` | + +## Schema Cache + +To avoid having to continuously fetch schema data from MySQL, DuckDB keeps schema information – such as the names of tables, their columns, etc. – cached. If changes are made to the schema through a different connection to the MySQL instance, such as new columns being added to a table, the cached schema information might be outdated. In this case, the function `mysql_clear_cache` can be executed to clear the internal caches. + +```sql +CALL mysql_clear_cache(); +``` \ No newline at end of file diff --git a/docs/archive/1.0/extensions/overview.md b/docs/archive/1.0/extensions/overview.md new file mode 100644 index 00000000000..db652183011 --- /dev/null +++ b/docs/archive/1.0/extensions/overview.md @@ -0,0 +1,179 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/extensions +- /docs/archive/1.0/extensions/ +title: Extensions +--- + +## Overview + +DuckDB has a flexible extension mechanism that allows for dynamically loading extensions. +These may extend DuckDB's functionality by providing support for additional file formats, introducing new types, and domain-specific functionality. + +> Extensions are loadable on all clients (e.g., Python and R). +> Extensions distributed via the Core and Community repositories are built and tested on macOS (AMD64 and ARM64), Windows (AMD64) and Linux (AMD64 and ARM64). + +## Listing Extensions + +To get a list of extensions, use `duckdb_extensions`: + +```sql +SELECT extension_name, installed, description +FROM duckdb_extensions(); +``` + +
+
+| extension_name | installed | description |
+|-------------------|-----------|--------------------------------------------------------------|
+| arrow | false | A zero-copy data integration between Apache Arrow and DuckDB |
+| autocomplete | false | Adds support for autocomplete in the shell |
+| ... | ... | ... |
+
+This list will show which extensions are available, which extensions are installed, at which version, where they are installed, and more.
+The list includes most, but not all, available core extensions. For the full list, we maintain a [list of core extensions]({% link docs/archive/1.0/extensions/core_extensions.md %}).
+
+## Built-In Extensions
+
+DuckDB's binary distribution comes standard with a few built-in extensions. They are statically linked into the binary and can be used as is.
+For example, to use the built-in [`json` extension]({% link docs/archive/1.0/extensions/json.md %}) to read a JSON file:
+
+```sql
+SELECT *
+FROM 'test.json';
+```
+
+To make the DuckDB distribution lightweight, only a few essential extensions are built-in, varying slightly per distribution. Which extension is built-in on which platform is documented in the [list of core extensions]({% link docs/archive/1.0/extensions/core_extensions.md %}#default-extensions).
+
+## Installing More Extensions
+
+To make an extension that is not built-in available in DuckDB, two steps need to happen:
+
+1. **Extension installation** is the process of downloading the extension binary and verifying its metadata. During installation, DuckDB stores the downloaded extension and some metadata in a local directory. From this directory, DuckDB can then load the extension whenever it needs to. This means that installation needs to happen only once.
+
+2. **Extension loading** is the process of dynamically loading the binary into a DuckDB instance. DuckDB will search the local extension
+directory for the installed extension, then load it to make its features available. This means that every time DuckDB is restarted, all
+extensions that are used need to be (re)loaded.
+
+> Extension installation and loading are subject to a few [limitations]({% link docs/archive/1.0/extensions/working_with_extensions.md %}#limitations).
+
+There are two main methods of making DuckDB perform the **installation** and **loading** steps for an installable extension: **explicitly** and through **autoloading**.
+
+### Explicit `INSTALL` and `LOAD`
+
+In DuckDB, extensions can also be explicitly installed and loaded. Both non-autoloadable and autoloadable extensions can be installed this way.
+To explicitly install and load an extension, DuckDB has the dedicated SQL statements `LOAD` and `INSTALL`. For example,
+to install and load the [`spatial` extension]({% link docs/archive/1.0/extensions/spatial.md %}), run:
+
+```sql
+INSTALL spatial;
+LOAD spatial;
+```
+
+With these statements, DuckDB will ensure the spatial extension is installed (ignoring the `INSTALL` statement if it is already installed), then proceed
+to `LOAD` the spatial extension (again ignoring the statement if it is already loaded).
+
+After installing/loading an extension, the [`duckdb_extensions` function](#listing-extensions) can be used to get more information.
+
+### Autoloading Extensions
+
+For many of DuckDB's core extensions, explicitly loading and installing them is not necessary. DuckDB contains an autoloading mechanism
+which can install and load the core extensions as soon as they are used in a query. 
For example, when running: + +```sql +SELECT * +FROM 'https://raw.githubusercontent.com/duckdb/duckdb-web/main/data/weather.csv'; +``` + +DuckDB will automatically install and load the [`httpfs`]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) extension. No explicit `INSTALL` or `LOAD` statements are required. + +Not all extensions can be autoloaded. This can have various reasons: some extensions make several changes to the running DuckDB instance, making autoloading technically not (yet) possible. For others, it is preferred to have users opt-in to the extension explicitly before use due to the way they modify behavior in DuckDB. + +To see which extensions can be autoloaded, check the [core extensions list]({% link docs/archive/1.0/extensions/core_extensions.md %}). + +### Community Extensions + +DuckDB supports installing third-party [Community Extensions]({% link docs/archive/1.0/extensions/community_extensions.md %}). +These are contributed by community members but they are built, signed, and distributed in a centralized repository. + +### Installing Extensions through Client APIs + +For many clients, using SQL to load and install extensions is the preferred method. However, some clients have a dedicated +API to install and load extensions. For example the [Python API client]({% link docs/archive/1.0/api/python/overview.md %}#loading-and-installing-extensions), +which has dedicated `install_extension(name: str)` and `load_extension(name: str)` methods. For more details on a specific Client API, refer +to the [Client API docs]({% link docs/archive/1.0/api/overview.md %}) + +## Updating Extensions + +> This feature was introduced in DuckDB 0.10.3. + +While built-in extensions are tied to a DuckDB release due to their nature of being built into the DuckDB binary, installable extensions +can and do receive updates. To ensure all currently installed extensions are on the most recent version, call: + +```sql +UPDATE EXTENSIONS; +``` + +For more details on extension version refer to [Extension Versioning]({% link docs/archive/1.0/extensions/versioning_of_extensions.md %}). + +## Installation Location + +By default, extensions are installed under the user's home directory: + +```text +~/.duckdb/extensions/⟨duckdb_version⟩/⟨platform_name⟩/ +``` + +For stable DuckDB releases, the `⟨duckdb_version⟩` will be equal to the version tag of that release. For nightly DuckDB builds, it will be equal +to the short git hash of the build. So for example, the extensions for DuckDB version v0.10.3 on macOS ARM64 (Apple Silicon) are installed to `~/.duckdb/extensions/v0.10.3/osx_arm64/`. +An example installation path for a nightly DuckDB build could be `~/.duckdb/extensions/fc2e4b26a6/linux_amd64_gcc4`. + +To change the default location where DuckDB stores its extensions, use the `extension_directory` configuration option: + +```sql +SET extension_directory = '/path/to/your/extension/directory'; +``` + +Note that setting the value of the `home_directory` configuration option has no effect on the location of the extensions. + +## Binary Compatibility + +To avoid binary compatibility issues, the binary extensions distributed by DuckDB are tied both to a specific DuckDB version and a platform. This means that DuckDB can automatically detect binary compatibility between it and a loadable extension. When trying to load an extension that was compiled for a different version or platform, DuckDB will throw an error and refuse to load the extension. 
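+
+As a quick way to see which version and platform identifiers your own DuckDB build reports – and therefore which extension binaries it will accept – you can query the corresponding pragmas (a sketch, assuming a build that exposes them):
+
+```sql
+PRAGMA version;   -- library version and source id of the running DuckDB binary
+PRAGMA platform;  -- platform identifier, e.g., osx_arm64 or linux_amd64
+```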
+ +See the [Working with Extensions page]({% link docs/archive/1.0/extensions/working_with_extensions.md %}#platforms) for details on available platforms. + +## Developing Extensions + +The same API that the core extensions use is available for developing extensions. This allows users to extend the functionality of DuckDB such that it suits their domain the best. +A template for creating extensions is available in the [`extension-template` repository](https://github.com/duckdb/extension-template/). This template also holds some documentation on how to get started building your own extension. + +## Extension Signing + +Extensions are signed with a cryptographic key, which also simplifies distribution (this is why they are served over HTTP and not HTTPS). +By default, DuckDB uses its built-in public keys to verify the integrity of extension before loading them. +All extensions provided by the DuckDB core team are signed. + +### Unsigned Extensions + +> Warning Only load unsigned extensions from sources you trust. Also, avoid loading them over HTTP. + +If you wish to load your own extensions or extensions from third-parties you will need to enable the `allow_unsigned_extensions` flag. +To load unsigned extensions using the [CLI client]({% link docs/archive/1.0/api/cli/overview.md %}), pass the `-unsigned` flag to it on startup: + +```bash +duckdb -unsigned +``` + +Now any extension can be loaded, signed or not: + +```sql +LOAD './some/local/ext.duckdb_extension'; +``` + +For Client APIs, the `allow_unsigned_extensions` database configuration options needs to be set, see the respective [Client API docs]({% link docs/archive/1.0/api/overview.md %}). +For example, for the Python client, see the [Loading and Installing Extensions section in the Python API documentation]({% link docs/archive/1.0/api/python/overview.md %}#loading-and-installing-extensions). + +## Working with Extensions + +For advanced installation instructions and more details on extensions, see the [Working with Extensions page]({% link docs/archive/1.0/extensions/working_with_extensions.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/extensions/postgres.md b/docs/archive/1.0/extensions/postgres.md new file mode 100644 index 00000000000..362ef1a3aac --- /dev/null +++ b/docs/archive/1.0/extensions/postgres.md @@ -0,0 +1,343 @@ +--- +github_repository: https://github.com/duckdb/postgres_scanner +layout: docu +redirect_from: +- docs/archive/1.0/extensions/postgres_scanner +- docs/archive/1.0/extensions/postgresql +title: PostgreSQL Extension +--- + +The `postgres` extension allows DuckDB to directly read and write data from a running PostgreSQL database instance. The data can be queried directly from the underlying PostgreSQL database. Data can be loaded from PostgreSQL tables into DuckDB tables, or vice versa. See the [official announcement]({% post_url 2022-09-30-postgres-scanner %}) for implementation details and background. + +## Installing and Loading + +The `postgres` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. +If you would like to install and load it manually, run: + +```sql +INSTALL postgres; +LOAD postgres; +``` + +## Connecting + +To make a PostgreSQL database accessible to DuckDB, use the `ATTACH` command with the `POSTGRES` or `POSTGRES_SCANNER` type. 
+ +To connect to the `public` schema of the PostgreSQL instance running on localhost in read-write mode, run: + +```sql +ATTACH '' AS postgres_db (TYPE POSTGRES); +``` + +To connect to the PostgreSQL instance with the given parameters in read-only mode, run: + +```sql +ATTACH 'dbname=postgres user=postgres host=127.0.0.1' AS db (TYPE POSTGRES, READ_ONLY); +``` + +### Configuration + +The `ATTACH` command takes as input either a [`libpq` connection string](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING) +or a [PostgreSQL URI](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING-URIS). + +Below are some example connection strings and commonly used parameters. A full list of available parameters can be found [in the PostgreSQL documentation](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS). + +```text +dbname=postgresscanner +host=localhost port=5432 dbname=mydb connect_timeout=10 +``` + +| Name | Description | Default | +| ---------- | ------------------------------------ | ------------ | +| `dbname` | Database name | [user] | +| `host` | Name of host to connect to | `localhost` | +| `hostaddr` | Host IP address | `localhost` | +| `passfile` | Name of file passwords are stored in | `~/.pgpass` | +| `password` | PostgreSQL password | (empty) | +| `port` | Port number | `5432` | +| `user` | PostgreSQL user name | current user | + +An example URI is `postgresql://username@hostname/dbname`. + +### Configuring via Environment Variables + +PostgreSQL connection information can also be specified with [environment variables](https://www.postgresql.org/docs/current/libpq-envars.html). +This can be useful in a production environment where the connection information is managed externally +and passed in to the environment. + +```bash +export PGPASSWORD="secret" +export PGHOST=localhost +export PGUSER=owner +export PGDATABASE=mydatabase +``` + +Then, to connect, start the `duckdb` process and run: + +```sql +ATTACH '' AS p (TYPE POSTGRES); +``` + +## Usage + +The tables in the PostgreSQL database can be read as if they were normal DuckDB tables, but the underlying data is read directly from PostgreSQL at query time. + +```sql +SHOW ALL TABLES; +``` + +
+ +| name | +| ----- | +| uuids | + +```sql +SELECT * FROM uuids; +``` + +
+ +| u | +| ------------------------------------ | +| 6d3d2541-710b-4bde-b3af-4711738636bf | +| NULL | +| 00000000-0000-0000-0000-000000000001 | +| ffffffff-ffff-ffff-ffff-ffffffffffff | + +It might be desirable to create a copy of the PostgreSQL databases in DuckDB to prevent the system from re-reading the tables from PostgreSQL continuously, particularly for large tables. + +Data can be copied over from PostgreSQL to DuckDB using standard SQL, for example: + +```sql +CREATE TABLE duckdb_table AS FROM postgres_db.postgres_tbl; +``` + +## Writing Data to PostgreSQL + +In addition to reading data from PostgreSQL, the extension allows you to create tables, ingest data into PostgreSQL and make other modifications to a PostgreSQL database using standard SQL queries. + +This allows you to use DuckDB to, for example, export data that is stored in a PostgreSQL database to Parquet, or read data from a Parquet file into PostgreSQL. + +Below is a brief example of how to create a new table in PostgreSQL and load data into it. + +```sql +ATTACH 'dbname=postgresscanner' AS postgres_db (TYPE POSTGRES); +CREATE TABLE postgres_db.tbl (id INTEGER, name VARCHAR); +INSERT INTO postgres_db.tbl VALUES (42, 'DuckDB'); +``` + +Many operations on PostgreSQL tables are supported. All these operations directly modify the PostgreSQL database, and the result of subsequent operations can then be read using PostgreSQL. +Note that if modifications are not desired, `ATTACH` can be run with the `READ_ONLY` property which prevents making modifications to the underlying database. For example: + +```sql +ATTACH 'dbname=postgresscanner' AS postgres_db (TYPE POSTGRES, READ_ONLY); +``` + +Below is a list of supported operations. + +### `CREATE TABLE` + +```sql +CREATE TABLE postgres_db.tbl (id INTEGER, name VARCHAR); +``` + +### `INSERT INTO` + +```sql +INSERT INTO postgres_db.tbl VALUES (42, 'DuckDB'); +``` + +### `SELECT` + +```sql +SELECT * FROM postgres_db.tbl; +``` + +| id | name | +| ---: | ------ | +| 42 | DuckDB | + +### `COPY` + +You can copy tables back and forth between PostgreSQL and DuckDB: + +```sql +COPY postgres_db.tbl TO 'data.parquet'; +COPY postgres_db.tbl FROM 'data.parquet'; +``` + +These copies use [PostgreSQL binary wire encoding](https://www.postgresql.org/docs/current/sql-copy.html). 
+DuckDB can also write data using this encoding to a file which you can then load into PostgreSQL using a client of your choosing if you would like to do your own connection management: + +```sql +COPY 'data.parquet' TO 'pg.bin' WITH (FORMAT POSTGRES_BINARY); +``` + +The file produced will be the equivalent of copying the file to PostgreSQL using DuckDB and then dumping it from PostgreSQL using `psql` or another client: + +DuckDB: + +```sql +COPY postgres_db.tbl FROM 'data.parquet'; +``` + +PostgreSQL: + +```sql +\copy tbl TO 'data.bin' WITH (FORMAT BINARY); +``` + +You may also create a full copy of the database using the [`COPY FROM DATABASE` statement]({% link docs/archive/1.0/sql/statements/copy.md %}#copy-from-database--to): + +```sql +COPY FROM DATABASE postgres_db TO my_duckdb_db; +``` + +### `UPDATE` + +```sql +UPDATE postgres_db.tbl +SET name = 'Woohoo' +WHERE id = 42; +``` + +### `DELETE` + +```sql +DELETE FROM postgres_db.tbl +WHERE id = 42; +``` + +### `ALTER TABLE` + +```sql +ALTER TABLE postgres_db.tbl +ADD COLUMN k INTEGER; +``` + +### `DROP TABLE` + +```sql +DROP TABLE postgres_db.tbl; +``` + +### `CREATE VIEW` + +```sql +CREATE VIEW postgres_db.v1 AS SELECT 42; +``` + +### `CREATE SCHEMA` / `DROP SCHEMA` + +```sql +CREATE SCHEMA postgres_db.s1; +CREATE TABLE postgres_db.s1.integers (i INTEGER); +INSERT INTO postgres_db.s1.integers VALUES (42); +SELECT * FROM postgres_db.s1.integers; +``` + +| i | +| ---: | +| 42 | + +```sql +DROP SCHEMA postgres_db.s1; +``` + +## `DETACH` + +```sql +DETACH postgres_db; +``` + +### Transactions + +```sql +CREATE TABLE postgres_db.tmp (i INTEGER); +BEGIN; +INSERT INTO postgres_db.tmp VALUES (42); +SELECT * FROM postgres_db.tmp; +``` + +This returns: + +| i | +| ---: | +| 42 | + +```sql +ROLLBACK; +SELECT * FROM postgres_db.tmp; +``` + +This returns an empty table. + +## Running SQL Queries in PostgreSQL + +### The `postgres_query` Table Function + +The `postgres_query` table function allows you to run arbitrary read queries within an attached database. `postgres_query` takes the name of the attached PostgreSQL database to execute the query in, as well as the SQL query to execute. The result of the query is returned. Single-quote strings are escaped by repeating the single quote twice. + +```sql +postgres_query(attached_database::VARCHAR, query::VARCHAR) +``` + +For example: + +```sql +ATTACH 'dbname=postgresscanner' AS postgres_db (TYPE POSTGRES); +SELECT * FROM postgres_query('postgres_db', 'SELECT * FROM cars LIMIT 3'); +``` + + + +| brand | model | color | +| ------------ | ---------- | ----- | +| Ferrari | Testarossa | red | +| Aston Martin | DB2 | blue | +| Bentley | Mulsanne | gray | + +### The `postgres_execute` Function + +The `postgres_execute` function allows running arbitrary queries within PostgreSQL, including statements that update the schema and content of the database. + +```sql +ATTACH 'dbname=postgresscanner' AS postgres_db (TYPE POSTGRES); +CALL postgres_execute('postgres_db', 'CREATE TABLE my_table (i INTEGER)'); +``` + +> Warning This function is only available on DuckDB v0.10.1+, using the latest PostgreSQL extension. +> To upgrade your extension, run `FORCE INSTALL postgres;`. + +## Settings + +The extension exposes the following configuration parameters. 
+
+| Name | Description | Default |
+| --------------------------------- | ---------------------------------------------------------------------------- | ------- |
+| `pg_array_as_varchar` | Read PostgreSQL arrays as varchar - enables reading mixed dimensional arrays | `false` |
+| `pg_connection_cache` | Whether or not to use the connection cache | `true` |
+| `pg_connection_limit` | The maximum amount of concurrent PostgreSQL connections | `64` |
+| `pg_debug_show_queries` | DEBUG SETTING: print all queries sent to PostgreSQL to stdout | `false` |
+| `pg_experimental_filter_pushdown` | Whether or not to use filter pushdown (currently experimental) | `false` |
+| `pg_pages_per_task` | The amount of pages per task | `1000` |
+| `pg_use_binary_copy` | Whether or not to use BINARY copy to read data | `true` |
+| `pg_use_ctid_scan` | Whether or not to parallelize scanning using table ctids | `true` |
+
+## Schema Cache
+
+To avoid having to continuously fetch schema data from PostgreSQL, DuckDB keeps schema information – such as the names of tables, their columns, etc. – cached. If changes are made to the schema through a different connection to the PostgreSQL instance, such as new columns being added to a table, the cached schema information might be outdated. In this case, the function `pg_clear_cache` can be executed to clear the internal caches.
+
+```sql
+CALL pg_clear_cache();
+```
+
+> Deprecated The old `postgres_attach` function is deprecated. It is recommended to switch over to the new `ATTACH` syntax.
\ No newline at end of file
diff --git a/docs/archive/1.0/extensions/spatial.md b/docs/archive/1.0/extensions/spatial.md
new file mode 100644
index 00000000000..b69ab92032d
--- /dev/null
+++ b/docs/archive/1.0/extensions/spatial.md
@@ -0,0 +1,308 @@
+---
+github_repository: https://github.com/duckdb/duckdb_spatial
+layout: docu
+title: Spatial Extension
+---
+
+The `spatial` extension provides support for geospatial data processing in DuckDB.
+For an overview of the extension, see our [blog post]({% post_url 2023-04-28-spatial %}).
+
+## Installing and Loading
+
+To install and load the `spatial` extension, run:
+
+```sql
+INSTALL spatial;
+LOAD spatial;
+```
+
+## `GEOMETRY` Type
+
+The core of the spatial extension is the `GEOMETRY` type. If you're unfamiliar with geospatial data and GIS tooling, this type probably works very differently from what you'd expect.
+
+In short, while the `GEOMETRY` type is a binary representation of “geometry” data made up of sets of vertices (pairs of X and Y `double` precision floats), it actually stores one of several geometry subtypes. These are `POINT`, `LINESTRING`, `POLYGON`, as well as their “collection” equivalents, `MULTIPOINT`, `MULTILINESTRING` and `MULTIPOLYGON`. Lastly, there is `GEOMETRYCOLLECTION`, which can contain any of the other subtypes, as well as other `GEOMETRYCOLLECTION`s recursively.
+
+This may seem strange at first, since DuckDB already has types like `LIST`, `STRUCT` and `UNION` which could be used in a similar way, but the design and behavior of the `GEOMETRY` type is actually based on the [Simple Features](https://en.wikipedia.org/wiki/Simple_Features) geometry model, which is a standard used by many other databases and GIS software.
+
+That said, the spatial extension also includes a couple of experimental non-standard explicit geometry types, such as `POINT_2D`, `LINESTRING_2D`, `POLYGON_2D` and `BOX_2D`, that are based on DuckDB's native nested types, such as structs and lists. 
In theory it should be possible to optimize a lot of operations for these types much better than for the `GEOMETRY` type (which is just a binary blob), but only a couple functions are implemented so far. + +All of these are implicitly castable to `GEOMETRY` but with a conversion cost, so the `GEOMETRY` type is still the recommended type to use for now if you are planning to work with a lot of different spatial functions. + +`GEOMETRY` is not currently capable of storing additional geometry types, Z/M coordinates, or SRID information. These features may be added in the future. + +## Spatial Scalar Functions + +The spatial extension implements a large number of scalar functions and overloads. Most of these are implemented using the [GEOS](https://libgeos.org/) library, but we'd like to implement more of them natively in this extension to better utilize DuckDB's vectorized execution and memory management. The following symbols are used to indicate which implementation is used: + +🧭 – GEOS – functions that are implemented using the [GEOS](https://libgeos.org/) library + +🦆 – DuckDB – functions that are implemented natively in this extension that are capable of operating directly on the DuckDB types + +🔄 – `CAST(GEOMETRY)` – functions that are supported by implicitly casting to `GEOMETRY` and then using the `GEOMETRY` implementation + +The currently implemented spatial functions can roughly be categorized into the following groups: + +### Geometry Conversion + +Convert between geometries and other formats. + +| Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D | +|-----|---|--|--|--|---| +| `VARCHAR ST_AsText(GEOMETRY)` | 🧭 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `WKB_BLOB ST_AsWKB(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `VARCHAR ST_AsHEXWKB(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `VARCHAR ST_AsGeoJSON(GEOMETRY)` | 🔄 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_GeomFromText(VARCHAR)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_GeomFromWKB(BLOB)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_GeomFromHEXWKB(VARCHAR)` | 🦆 | | | | | +| `GEOMETRY ST_GeomFromGeoJSON(VARCHAR)` | 🦆 | | | | | + +### Geometry Construction + +Construct new geometries from other geometries or other data. 
+ +| Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D | +|-----|---|--|--|--|---| +| `GEOMETRY ST_Point(DOUBLE, DOUBLE)` | 🦆 | 🦆 | | | | +| `GEOMETRY ST_ConvexHull(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Boundary(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Buffer(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Centroid(GEOMETRY)` | 🧭 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_Collect(GEOMETRY[]) ` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_Normalize(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_SimplifyPreserveTopology(GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Simplify(GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Union(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_Intersection(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_MakeLine(GEOMETRY[]) ` | 🦆 | | 🦆 | | | +| `GEOMETRY ST_Envelope(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_FlipCoordinates(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_Transform(GEOMETRY, VARCHAR, VARCHAR)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `BOX_2D ST_Extent(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GEOMETRY ST_PointN(GEOMETRY, INTEGER)` | 🦆 | | 🦆 | | | +| `GEOMETRY ST_StartPoint(GEOMETRY)` | 🦆 | | 🦆 | | | +| `GEOMETRY ST_EndPoint(GEOMETRY)` | 🦆 | | 🦆 | | | +| `GEOMETRY ST_ExteriorRing(GEOMETRY)` | 🦆 | | | 🦆 | | +| `GEOMETRY ST_Reverse(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 | +| `GEOMETRY ST_RemoveRepeatedPoints(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON` ) | +| `GEOMETRY ST_RemoveRepeatedPoints(GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON` ) | +| `GEOMETRY ST_ReducePrecision(GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON` ) | +| `GEOMETRY ST_PointOnSurface(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `GEOMETRY ST_CollectionExtract(GEOMETRY)` | 🦆 | | | | | +| `GEOMETRY ST_CollectionExtract(GEOMETRY, INTEGER)` | 🦆 | | | | | + +### Spatial Properties + +Calculate and access spatial properties of geometries. + +| Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D | +|-----|---|--|--|--|---| +| `DOUBLE ST_Area(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `BOOLEAN ST_IsClosed(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsEmpty(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsRing(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsSimple(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `BOOLEAN ST_IsValid(GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) | +| `DOUBLE ST_X(GEOMETRY)` | 🧭 | 🦆 | | | | +| `DOUBLE ST_Y(GEOMETRY)` | 🧭 | 🦆 | | | | +| `DOUBLE ST_XMax(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `DOUBLE ST_YMax(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `DOUBLE ST_XMin(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `DOUBLE ST_YMin(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `GeometryType ST_GeometryType(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `DOUBLE ST_Length(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🔄 (as `POLYGON`) | +| `INTEGER ST_NGeometries(GEOMETRY)` | 🦆 | | | | | +| `INTEGER ST_NPoints(GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 | +| `INTEGER ST_NInteriorRings(GEOMETRY)` | 🦆 | | | 🦆 | | + +### Spatial Relationships + +Compute relationships and spatial predicates between geometries. 
+
+| Scalar functions | GEOMETRY | POINT_2D | LINESTRING_2D | POLYGON_2D | BOX_2D |
+|-----|---|--|--|--|---|
+| `BOOLEAN ST_Within(GEOMETRY, GEOMETRY)` | 🧭 | 🦆 or 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Touches(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Overlaps(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Contains(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🦆 or 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_CoveredBy(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Covers(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Crosses(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `GEOMETRY ST_Difference(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Disjoint(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Intersects(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🦆 |
+| `BOOLEAN ST_Equals(GEOMETRY, GEOMETRY)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `DOUBLE ST_Distance(GEOMETRY, GEOMETRY)` | 🧭 | 🦆 or 🔄 | 🦆 or 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_DWithin(GEOMETRY, GEOMETRY, DOUBLE)` | 🧭 | 🔄 | 🔄 | 🔄 | 🔄 (as `POLYGON`) |
+| `BOOLEAN ST_Intersects_Extent(GEOMETRY, GEOMETRY)` | 🦆 | 🦆 | 🦆 | 🦆 | 🦆 |
+
+## Spatial Aggregate Functions
+
+
+ +| Aggregate functions | Implemented with | +|-------------------------------------------|------------------| +| `GEOMETRY ST_Envelope_Agg(GEOMETRY)` | 🦆 | +| `GEOMETRY ST_Union_Agg(GEOMETRY)` | 🧭 | +| `GEOMETRY ST_Intersection_Agg(GEOMETRY)` | 🧭 | + +## Spatial Table Functions + +### `ST_Read()` – Read Spatial Data from Files + +The spatial extension provides a `ST_Read` table function based on the [GDAL](https://github.com/OSGeo/gdal) translator library to read spatial data from a variety of geospatial vector file formats as if they were DuckDB tables. For example to create a new table from a GeoJSON file, you can use the following query: + +```sql +CREATE TABLE ⟨table⟩ AS SELECT * FROM ST_Read('some/file/path/filename.json'); +``` + +`ST_Read` can take a number of optional arguments, the full signature is: + +```sql +ST_Read( + VARCHAR, + sequential_layer_scan : BOOLEAN, + spatial_filter : WKB_BLOB, + open_options : VARCHAR[], + layer : VARCHAR, + allowed_drivers : VARCHAR[], + sibling_files : VARCHAR[], + spatial_filter_box : BOX_2D, + keep_wkb : BOOLEAN +) +``` + +* `sequential_layer_scan` (default: `false`): If set to `true`, the table function will scan through all layers sequentially and return the first layer that matches the given `layer` name. This is required for some drivers to work properly, e.g., the `OSM` driver. +* `spatial_filter` (default: `NULL`): If set to a WKB blob, the table function will only return rows that intersect with the given WKB geometry. Some drivers may support efficient spatial filtering natively, in which case it will be pushed down. Otherwise the filtering is done by GDAL which may be much slower. +* `open_options` (default: `[]`): A list of key-value pairs that are passed to the GDAL driver to control the opening of the file. E.g., the `GeoJSON` driver supports a `FLATTEN_NESTED_ATTRIBUTES=YES` option to flatten nested attributes. +* `layer` (default: `NULL`): The name of the layer to read from the file. If `NULL`, the first layer is returned. Can also be a layer index (starting at 0). +* `allowed_drivers` (default: `[]`): A list of GDAL driver names that are allowed to be used to open the file. If empty, all drivers are allowed. +* `sibling_files` (default: `[]`): A list of sibling files that are required to open the file. E.g., the `ESRI Shapefile` driver requires a `.shx` file to be present. Although most of the time these can be discovered automatically. +* `spatial_filter_box` (default: `NULL`): If set to a `BOX_2D`, the table function will only return rows that intersect with the given bounding box. Similar to `spatial_filter`. +* `keep_wkb` (default: `false`): If set, the table function will return geometries in a `wkb_geometry` column with the type `WKB_BLOB` (which can be cast to `BLOB`) instead of `GEOMETRY`. This is useful if you want to use DuckDB with more exotic geometry subtypes that DuckDB spatial doesn't support representing in the `GEOMETRY` type yet. + +Note that GDAL is single-threaded, so this table function will not be able to make full use of parallelism. We're planning to implement support for the most common vector formats natively in this extension with additional table functions in the future. + +We currently support over 50 different formats. You can generate the following table of supported GDAL drivers yourself by executing `SELECT * FROM ST_Drivers()`. 
+ +| short_name | long_name | can_create | can_copy | can_open | help_url | +|--|---|-|-|-|---| +| ESRI Shapefile | ESRI Shapefile | true | false | true | | +| MapInfo File | MapInfo File | true | false | true | | +| UK .NTF | UK .NTF | false | false | true | | +| LVBAG | Kadaster LV BAG Extract 2.0 | false | false | true | | +| S57 | IHO S-57 (ENC) | true | false | true | | +| DGN | Microstation DGN | true | false | true | | +| OGR_VRT | VRT – Virtual Datasource | false | false | true | | +| Memory | Memory | true | false | true | | +| CSV | Comma Separated Value (.csv) | true | false | true | | +| GML | Geography Markup Language (GML) | true | false | true | | +| GPX | GPX | true | false | true | | +| KML | Keyhole Markup Language (KML) | true | false | true | | +| GeoJSON | GeoJSON | true | false | true | | +| GeoJSONSeq | GeoJSON Sequence | true | false | true | | +| ESRIJSON | ESRIJSON | false | false | true | | +| TopoJSON | TopoJSON | false | false | true | | +| OGR_GMT | GMT ASCII Vectors (.gmt) | true | false | true | | +| GPKG | GeoPackage | true | true | true | | +| SQLite | SQLite / Spatialite | true | false | true | | +| WAsP | WAsP .map format | true | false | true | | +| OpenFileGDB | ESRI FileGDB | true | false | true | | +| DXF | AutoCAD DXF | true | false | true | | +| CAD | AutoCAD Driver | false | false | true | | +| FlatGeobuf | FlatGeobuf | true | false | true | | +| Geoconcept | Geoconcept | true | false | true | | +| GeoRSS | GeoRSS | true | false | true | | +| VFK | Czech Cadastral Exchange Data Format | false | false | true | | +| PGDUMP | PostgreSQL SQL dump | true | false | false | | +| OSM | OpenStreetMap XML and PBF | false | false | true | | +| GPSBabel | GPSBabel | true | false | true | | +| WFS | OGC WFS (Web Feature Service) | false | false | true | | +| OAPIF | OGC API – Features | false | false | true | | +| EDIGEO | French EDIGEO exchange format | false | false | true | | +| SVG | Scalable Vector Graphics | false | false | true | | +| ODS | Open Document/ LibreOffice / OpenOffice Spreadsheet | true | false | true | | +| XLSX | MS Office Open XML spreadsheet | true | false | true | | +| Elasticsearch | Elastic Search | true | false | true | | +| Carto | Carto | true | false | true | | +| AmigoCloud | AmigoCloud | true | false | true | | +| SXF | Storage and eXchange Format | false | false | true | | +| Selafin | Selafin | true | false | true | | +| JML | OpenJUMP JML | true | false | true | | +| PLSCENES | Planet Labs Scenes API | false | false | true | | +| CSW | OGC CSW (Catalog Service for the Web) | false | false | true | | +| VDV | VDV-451/VDV-452/INTREST Data Format | true | false | true | | +| MVT | Mapbox Vector Tiles | true | false | true | | +| NGW | NextGIS Web | true | true | true | | +| MapML | MapML | true | false | true | | +| TIGER | U.S. Census TIGER/Line | false | false | true | | +| AVCBin | Arc/Info Binary Coverage | false | false | true | | +| AVCE00 | Arc/Info E00 (ASCII) Coverage | false | false | true | | + +Note that far from all of these drivers have been tested properly, and some may require additional options to be passed to work as expected. +If you run into any issues please first [consult the GDAL docs](https://gdal.org/drivers/vector/index.html). + +### `ST_ReadOsm()` – Read Compressed OSM Data + +The spatial extension also provides an experimental `ST_ReadOsm()` table function to read compressed OSM data directly from a `.osm.pbf` file. 
+
+This will use multithreading and zero-copy protobuf parsing, which makes it a lot faster than using the `ST_Read()` `OSM` driver, but it only outputs the raw OSM data (Nodes, Ways, Relations), without constructing any geometries.
+For node entities you can trivially construct `POINT` geometries, but it is also possible to construct `LINESTRING` and `POLYGON` geometries by manually joining refs and nodes together in SQL.
+
+Example usage:
+
+```sql
+SELECT *
+FROM st_readosm('tmp/data/germany.osm.pbf')
+WHERE tags['highway'] != []
+LIMIT 5;
+```
+
+
+ +| kind | id | tags | refs | lat | lon | ref_roles | ref_types | +|------|--------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|------|--------------------|------------|-----------|-----------| +| node | 122351 | {bicycle=yes, button_operated=yes, crossing=traffic_signals, highway=crossing, tactile_paving=no, traffic_signals:sound=yes, traffic_signals:vibration=yes} | NULL | 53.5492951 | 9.977553 | NULL | NULL | +| node | 122397 | {crossing=no, highway=traffic_signals, traffic_signals=signal, traffic_signals:direction=forward} | NULL | 53.520990100000006 | 10.0156924 | NULL | NULL | +| node | 122493 | {TMC:cid_58:tabcd_1:Class=Point, TMC:cid_58:tabcd_1:Direction=negative, TMC:cid_58:tabcd_1:LCLversion=9.00, TMC:cid_58:tabcd_1:LocationCode=10744, ...} | NULL | 53.129614600000004 | 8.1970173 | NULL | NULL | +| node | 123566 | {highway=traffic_signals} | NULL | 54.617268200000005 | 8.9718171 | NULL | NULL | +| node | 125801 | {TMC:cid_58:tabcd_1:Class=Point, TMC:cid_58:tabcd_1:Direction=negative, TMC:cid_58:tabcd_1:LCLversion=10.1, TMC:cid_58:tabcd_1:LocationCode=25041, ...} | NULL | 53.070685000000005 | 8.7819939 | NULL | NULL | + +## Spatial Replacement Scans + +The spatial extension also provides “replacement scans” for common geospatial file formats, allowing you to query files of these formats as if they were tables. + +```sql +SELECT * FROM './path/to/some/shapefile/dataset.shp'; +``` + +In practice this is just syntax-sugar for calling `ST_Read`, so there is no difference in performance. If you want to pass additional options, you should use the `ST_Read` table function directly. + +The following formats are currently recognized by their file extension: + +* ESRI ShapeFile, `.shp` +* GeoPackage, `.gpkg` +* FlatGeoBuf, `.fgb` + +Similarly there is a `.osm.pbf` replacement scan for `ST_ReadOsm`. + +## Spatial Copy Functions + +Much like the `ST_Read` table function the spatial extension provides a GDAL based `COPY` function to export DuckDB tables to different geospatial vector formats. +For example to export a table to a GeoJSON file, with generated bounding boxes, you can use the following query: + +```sql +COPY ⟨table⟩ TO 'some/file/path/filename.geojson' +WITH (FORMAT GDAL, DRIVER 'GeoJSON', LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES'); +``` + +Available options: + +* `FORMAT`: is the only required option and must be set to `GDAL` to use the GDAL based copy function. +* `DRIVER`: is the GDAL driver to use for the export. See the table above for a list of available drivers. +* `LAYER_CREATION_OPTIONS`: list of options to pass to the GDAL driver. See the GDAL docs for the driver you are using for a list of available options. +* `SRS`: Set a spatial reference system as metadata to use for the export. This can be a WKT string, an EPSG code or a proj-string, basically anything you would normally be able to pass to GDAL/OGR. This will not perform any reprojection of the input geometry though, it just sets the metadata if the target driver supports it. + +## Limitations + +Raster types are not supported and there is currently no plan to add them to the extension. 
\ No newline at end of file diff --git a/docs/archive/1.0/extensions/sqlite.md b/docs/archive/1.0/extensions/sqlite.md new file mode 100644 index 00000000000..692487772d0 --- /dev/null +++ b/docs/archive/1.0/extensions/sqlite.md @@ -0,0 +1,270 @@ +--- +github_repository: https://github.com/duckdb/sqlite_scanner +layout: docu +redirect_from: +- docs/archive/1.0/extensions/sqlite_scanner +title: SQLite Extension +--- + +The SQLite extension allows DuckDB to directly read and write data from a SQLite database file. The data can be queried directly from the underlying SQLite tables. Data can be loaded from SQLite tables into DuckDB tables, or vice versa. + +## Installing and Loading + +The `sqlite` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. +If you would like to install and load it manually, run: + +```sql +INSTALL sqlite; +LOAD sqlite; +``` + +## Usage + +To make a SQLite file accessible to DuckDB, use the `ATTACH` statement with the `SQLITE` or `SQLITE_SCANNER` type. Attached SQLite databases support both read and write operations. + +For example, to attach to the [`sakila.db` file](https://github.com/duckdb/sqlite_scanner/raw/main/data/db/sakila.db), run: + +```sql +ATTACH 'sakila.db' (TYPE SQLITE); +USE sakila; +``` + +The tables in the file can be read as if they were normal DuckDB tables, but the underlying data is read directly from the SQLite tables in the file at query time. + +```sql +SHOW TABLES; +``` + +
+ +| name | +|------------------------| +| actor | +| address | +| category | +| city | +| country | +| customer | +| customer_list | +| film | +| film_actor | +| film_category | +| film_list | +| film_text | +| inventory | +| language | +| payment | +| rental | +| sales_by_film_category | +| sales_by_store | +| staff | +| staff_list | +| store | + +You can query the tables using SQL, e.g., using the example queries from [`sakila-examples.sql`](https://github.com/duckdb/sqlite_scanner/blob/main/data/sql/sakila-examples.sql): + +```sql +SELECT + cat.name AS category_name, + sum(ifnull(pay.amount, 0)) AS revenue +FROM category cat +LEFT JOIN film_category flm_cat + ON cat.category_id = flm_cat.category_id +LEFT JOIN film fil + ON flm_cat.film_id = fil.film_id +LEFT JOIN inventory inv + ON fil.film_id = inv.film_id +LEFT JOIN rental ren + ON inv.inventory_id = ren.inventory_id +LEFT JOIN payment pay + ON ren.rental_id = pay.rental_id +GROUP BY cat.name +ORDER BY revenue DESC +LIMIT 5; +``` + +## Data Types + +SQLite is a [weakly typed database system](https://www.sqlite.org/datatype3.html). As such, when storing data in a SQLite table, types are not enforced. The following is valid SQL in SQLite: + +```sql +CREATE TABLE numbers (i INTEGER); +INSERT INTO numbers VALUES ('hello'); +``` + +DuckDB is a strongly typed database system, as such, it requires all columns to have defined types and the system rigorously checks data for correctness. + +When querying SQLite, DuckDB must deduce a specific column type mapping. DuckDB follows SQLite's [type affinity rules](https://www.sqlite.org/datatype3.html#type_affinity) with a few extensions. + +1. If the declared type contains the string `INT` then it is translated into the type `BIGINT` +2. If the declared type of the column contains any of the strings `CHAR`, `CLOB`, or `TEXT` then it is translated into `VARCHAR`. +3. If the declared type for a column contains the string `BLOB` or if no type is specified then it is translated into `BLOB`. +4. If the declared type for a column contains any of the strings `REAL`, `FLOA`, `DOUB`, `DEC` or `NUM` then it is translated into `DOUBLE`. +5. If the declared type is `DATE`, then it is translated into `DATE`. +6. If the declared type contains the string `TIME`, then it is translated into `TIMESTAMP`. +7. If none of the above apply, then it is translated into `VARCHAR`. + +As DuckDB enforces the corresponding columns to contain only correctly typed values, we cannot load the string “hello” into a column of type `BIGINT`. As such, an error is thrown when reading from the “numbers” table above: + +```console +Error: Mismatch Type Error: Invalid type in column "i": column was declared as integer, found "hello" of type "text" instead. +``` + +This error can be avoided by setting the `sqlite_all_varchar` option: + +```sql +SET GLOBAL sqlite_all_varchar = true; +``` + +When set, this option overrides the type conversion rules described above, and instead always converts the SQLite columns into a `VARCHAR` column. Note that this setting must be set *before* `sqlite_attach` is called. + +## Opening SQLite Databases Directly + +SQLite databases can also be opened directly and can be used transparently instead of a DuckDB database file. In any client, when connecting, a path to a SQLite database file can be provided and the SQLite database will be opened instead. 
+ +For example, with the shell, a SQLite database can be opened as follows: + +```bash +duckdb sakila.db +``` + +```sql +SELECT first_name +FROM actor +LIMIT 3; +``` + +| first_name | +|------------| +| PENELOPE | +| NICK | +| ED | + +## Writing Data to SQLite + +In addition to reading data from SQLite, the extension also allows you to create new SQLite database files, create tables, ingest data into SQLite and make other modifications to SQLite database files using standard SQL queries. + +This allows you to use DuckDB to, for example, export data that is stored in a SQLite database to Parquet, or read data from a Parquet file into SQLite. + +Below is a brief example of how to create a new SQLite database and load data into it. + +```sql +ATTACH 'new_sqlite_database.db' AS sqlite_db (TYPE SQLITE); +CREATE TABLE sqlite_db.tbl (id INTEGER, name VARCHAR); +INSERT INTO sqlite_db.tbl VALUES (42, 'DuckDB'); +``` + +The resulting SQLite database can then be read into from SQLite. + +```bash +sqlite3 new_sqlite_database.db +``` + +```sql +SQLite version 3.39.5 2022-10-14 20:58:05 +sqlite> SELECT * FROM tbl; +``` + +```text +id name +-- ------ +42 DuckDB +``` + +Many operations on SQLite tables are supported. All these operations directly modify the SQLite database, and the result of subsequent operations can then be read using SQLite. + +## Concurrency + +DuckDB can read or modify a SQLite database while DuckDB or SQLite reads or modifies the same database from a different thread or a separate process. More than one thread or process can read the SQLite database at the same time, but only a single thread or process can write to the database at one time. Database locking is handled by the SQLite library, not DuckDB. Within the same process, SQLite uses mutexes. When accessed from different processes, SQLite uses file system locks. The locking mechanisms also depend on SQLite configuration, like WAL mode. Refer to the [SQLite documentation on locking](https://www.sqlite.org/lockingv3.html) for more information. + +> Warning Linking multiple copies of the SQLite library into the same application can lead to application errors. See [sqlite_scanner Issue #82](https://github.com/duckdb/sqlite_scanner/issues/82) for more information. + +## Supported Operations + +Below is a list of supported operations. + +### `CREATE TABLE` + +```sql +CREATE TABLE sqlite_db.tbl (id INTEGER, name VARCHAR); +``` + +### `INSERT INTO` + +```sql +INSERT INTO sqlite_db.tbl VALUES (42, 'DuckDB'); +``` + +### `SELECT` + +```sql +SELECT * FROM sqlite_db.tbl; +``` + +| id | name | +|---:|--------| +| 42 | DuckDB | + +### `COPY` + +```sql +COPY sqlite_db.tbl TO 'data.parquet'; +COPY sqlite_db.tbl FROM 'data.parquet'; +``` + +### `UPDATE` + +```sql +UPDATE sqlite_db.tbl SET name = 'Woohoo' WHERE id = 42; +``` + +### `DELETE` + +```sql +DELETE FROM sqlite_db.tbl WHERE id = 42; +``` + +### `ALTER TABLE` + +```sql +ALTER TABLE sqlite_db.tbl ADD COLUMN k INTEGER; +``` + +### `DROP TABLE` + +```sql +DROP TABLE sqlite_db.tbl; +``` + +### `CREATE VIEW` + +```sql +CREATE VIEW sqlite_db.v1 AS SELECT 42; +``` + +### Transactions + +```sql +CREATE TABLE sqlite_db.tmp (i INTEGER); +``` + +```sql +BEGIN; +INSERT INTO sqlite_db.tmp VALUES (42); +SELECT * FROM sqlite_db.tmp; +``` + +| i | +|---:| +| 42 | + +```sql +ROLLBACK; +SELECT * FROM sqlite_db.tmp; +``` + +| i | +|--:| +| | + +> Deprecated The old `sqlite_attach` function is deprecated. 
It is recommended to switch over to the new [`ATTACH` syntax]({% link docs/archive/1.0/sql/statements/attach.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/extensions/substrait.md b/docs/archive/1.0/extensions/substrait.md new file mode 100644 index 00000000000..1c692acf841 --- /dev/null +++ b/docs/archive/1.0/extensions/substrait.md @@ -0,0 +1,147 @@ +--- +github_repository: https://github.com/duckdb/substrait +layout: docu +title: Substrait Extension +--- + +The main goal of the `substrait` extension is to support both production and consumption of [Substrait](https://substrait.io/) query plans in DuckDB. + +This extension is mainly exposed via 3 different APIs – the SQL API, the Python API, and the R API. +Here we depict how to consume and produce Substrait query plans in each API. + +> The Substrait integration is currently experimental. Support is currently only available on request. +> If you have not asked for permission to ask for support, [contact us prior to opening an issue](https://duckdblabs.com/contact/). +> If you open an issue without doing so, we will close it without further review. + +## Installing and Loading + +The Substrait extension is an autoloadable extensions, meaning that it will be loaded at runtime whenever one of the substrait functions is called. To explicitly install and load the released version of the Substrait extension, you can also use the following SQL commands. + +```sql +INSTALL substrait; +LOAD substrait; +``` + +## SQL + +In the SQL API, users can generate Substrait plans (into a BLOB or a JSON) and consume Substrait plans. + +### BLOB Generation + +To generate a Substrait BLOB the `get_substrait(sql)` function must be called with a valid SQL select query. + +```sql +CREATE TABLE crossfit (exercise TEXT, difficulty_level INTEGER); +INSERT INTO crossfit VALUES ('Push Ups', 3), ('Pull Ups', 5), ('Push Jerk', 7), ('Bar Muscle Up', 10); +``` + +```sql +.mode line +CALL get_substrait('SELECT count(exercise) AS exercise FROM crossfit WHERE difficulty_level <= 5'); +``` + +```text +Plan BLOB = \x12\x09\x1A\x07\x10\x01\x1A\x03lte\x12\x11\x1A\x0F\x10\x02\x1A\x0Bis_not_null\x12\x09\x1A\x07\x10\x03\x1A\x03and\x12\x0B\x1A\x09\x10\x04\x1A\x05count\x1A\xC8\x01\x12\xC5\x01\x0A\xB8\x01:\xB5\x01\x12\xA8\x01\x22\xA5\x01\x12\x94\x01\x0A\x91\x01\x12/\x0A\x08exercise\x0A\x10difficulty_level\x12\x11\x0A\x07\xB2\x01\x04\x08\x0D\x18\x01\x0A\x04*\x02\x10\x01\x18\x02\x1AJ\x1AH\x08\x03\x1A\x04\x0A\x02\x10\x01\x22\x22\x1A \x1A\x1E\x08\x01\x1A\x04*\x02\x10\x01\x22\x0C\x1A\x0A\x12\x08\x0A\x04\x12\x02\x08\x01\x22\x00\x22\x06\x1A\x04\x0A\x02(\x05\x22\x1A\x1A\x18\x1A\x16\x08\x02\x1A\x04*\x02\x10\x01\x22\x0C\x1A\x0A\x12\x08\x0A\x04\x12\x02\x08\x01\x22\x00\x22\x06\x0A\x02\x0A\x00\x10\x01:\x0A\x0A\x08crossfit\x1A\x00\x22\x0A\x0A\x08\x08\x04*\x04:\x02\x10\x01\x1A\x08\x12\x06\x0A\x02\x12\x00\x22\x00\x12\x08exercise2\x0A\x10\x18*\x06DuckDB +``` + +### JSON Generation + +To generate a JSON representing the Substrait plan the `get_substrait_json(sql)` function must be called with a valid SQL select query. 
+ +```sql +CALL get_substrait_json('SELECT count(exercise) AS exercise FROM crossfit WHERE difficulty_level <= 5'); +``` + +```json + Json = {"extensions":[{"extensionFunction":{"functionAnchor":1,"name":"lte"}},{"extensionFunction":{"functionAnchor":2,"name":"is_not_null"}},{"extensionFunction":{"functionAnchor":3,"name":"and"}},{"extensionFunction":{"functionAnchor":4,"name":"count"}}],"relations":[{"root":{"input":{"project":{"input":{"aggregate":{"input":{"read":{"baseSchema":{"names":["exercise","difficulty_level"],"struct":{"types":[{"varchar":{"length":13,"nullability":"NULLABILITY_NULLABLE"}},{"i32":{"nullability":"NULLABILITY_NULLABLE"}}],"nullability":"NULLABILITY_REQUIRED"}},"filter":{"scalarFunction":{"functionReference":3,"outputType":{"bool":{"nullability":"NULLABILITY_NULLABLE"}},"arguments":[{"value":{"scalarFunction":{"functionReference":1,"outputType":{"i32":{"nullability":"NULLABILITY_NULLABLE"}},"arguments":[{"value":{"selection":{"directReference":{"structField":{"field":1}},"rootReference":{}}}},{"value":{"literal":{"i32":5}}}]}}},{"value":{"scalarFunction":{"functionReference":2,"outputType":{"i32":{"nullability":"NULLABILITY_NULLABLE"}},"arguments":[{"value":{"selection":{"directReference":{"structField":{"field":1}},"rootReference":{}}}}]}}}]}},"projection":{"select":{"structItems":[{}]},"maintainSingularStruct":true},"namedTable":{"names":["crossfit"]}}},"groupings":[{}],"measures":[{"measure":{"functionReference":4,"outputType":{"i64":{"nullability":"NULLABILITY_NULLABLE"}}}}]}},"expressions":[{"selection":{"directReference":{"structField":{}},"rootReference":{}}}]}},"names":["exercise"]}}],"version":{"minorNumber":24,"producer":"DuckDB"}} +``` + +### BLOB Consumption + +To consume a Substrait BLOB the `from_substrait(blob)` function must be called with a valid Substrait BLOB plan. + +```sql +CALL from_substrait('\x12\x09\x1A\x07\x10\x01\x1A\x03lte\x12\x11\x1A\x0F\x10\x02\x1A\x0Bis_not_null\x12\x09\x1A\x07\x10\x03\x1A\x03and\x12\x0B\x1A\x09\x10\x04\x1A\x05count\x1A\xC8\x01\x12\xC5\x01\x0A\xB8\x01:\xB5\x01\x12\xA8\x01\x22\xA5\x01\x12\x94\x01\x0A\x91\x01\x12/\x0A\x08exercise\x0A\x10difficulty_level\x12\x11\x0A\x07\xB2\x01\x04\x08\x0D\x18\x01\x0A\x04*\x02\x10\x01\x18\x02\x1AJ\x1AH\x08\x03\x1A\x04\x0A\x02\x10\x01\x22\x22\x1A \x1A\x1E\x08\x01\x1A\x04*\x02\x10\x01\x22\x0C\x1A\x0A\x12\x08\x0A\x04\x12\x02\x08\x01\x22\x00\x22\x06\x1A\x04\x0A\x02(\x05\x22\x1A\x1A\x18\x1A\x16\x08\x02\x1A\x04*\x02\x10\x01\x22\x0C\x1A\x0A\x12\x08\x0A\x04\x12\x02\x08\x01\x22\x00\x22\x06\x0A\x02\x0A\x00\x10\x01:\x0A\x0A\x08crossfit\x1A\x00\x22\x0A\x0A\x08\x08\x04*\x04:\x02\x10\x01\x1A\x08\x12\x06\x0A\x02\x12\x00\x22\x00\x12\x08exercise2\x0A\x10\x18*\x06DuckDB'::BLOB); +``` + +```text +exercise = 2 +``` + +## Python + +Substrait extension is autoloadable, but if you prefer to do so explicitly, you can use the relevant Python syntax within a connection: + +```python +import duckdb + +con = duckdb.connect() +con.install_extension("substrait") +con.load_extension("substrait") +``` + +### BLOB Generation + +To generate a Substrait BLOB the `get_substrait(sql)` function must be called, from a connection, with a valid SQL select query. 
+ +```python +con.execute(query = "CREATE TABLE crossfit (exercise TEXT, difficulty_level INTEGER)") +con.execute(query = "INSERT INTO crossfit VALUES ('Push Ups', 3), ('Pull Ups', 5), ('Push Jerk', 7), ('Bar Muscle Up', 10)") + +proto_bytes = con.get_substrait(query="SELECT count(exercise) AS exercise FROM crossfit WHERE difficulty_level <= 5").fetchone()[0] +``` + +### JSON Generation + +To generate a JSON representing the Substrait plan the `get_substrait_json(sql)` function, from a connection, must be called with a valid SQL select query. + +```python +json = con.get_substrait_json("SELECT count(exercise) AS exercise FROM crossfit WHERE difficulty_level <= 5").fetchone()[0] +``` + +### BLOB Consumption + +To consume a Substrait BLOB the `from_substrait(blob)` function must be called, from the connection, with a valid Substrait BLOB plan. + +```python +query_result = con.from_substrait(proto=proto_bytes) +``` + +## R + +By default the extension will be autoloaded on first use. To explicitly install and load this extension in R, use the following commands: + +```r +library("duckdb") +con <- dbConnect(duckdb::duckdb()) +dbExecute(con, "INSTALL substrait") +dbExecute(con, "LOAD substrait") +``` + +### BLOB Generation + +To generate a Substrait BLOB the `duckdb_get_substrait(con, sql)` function must be called, with a connection and a valid SQL select query. + +```r +dbExecute(con, "CREATE TABLE crossfit (exercise TEXT, difficulty_level INTEGER)") +dbExecute(con, "INSERT INTO crossfit VALUES ('Push Ups', 3), ('Pull Ups', 5), ('Push Jerk', 7), ('Bar Muscle Up', 10)") + +proto_bytes <- duckdb::duckdb_get_substrait(con, "SELECT * FROM crossfit LIMIT 5") +``` + +### JSON Generation + +To generate a JSON representing the Substrait plan `duckdb_get_substrait_json(con, sql)` function, with a connection and a valid SQL select query. + +```r +json <- duckdb::duckdb_get_substrait_json(con, "SELECT count(exercise) AS exercise FROM crossfit WHERE difficulty_level <= 5") +``` + +### BLOB Consumption + +To consume a Substrait BLOB the `duckdb_prepare_substrait(con, blob)` function must be called, with a connection and a valid Substrait BLOB plan. + +```r +result <- duckdb::duckdb_prepare_substrait(con, proto_bytes) +df <- dbFetch(result) +``` \ No newline at end of file diff --git a/docs/archive/1.0/extensions/tpcds.md b/docs/archive/1.0/extensions/tpcds.md new file mode 100644 index 00000000000..c156481c04d --- /dev/null +++ b/docs/archive/1.0/extensions/tpcds.md @@ -0,0 +1,44 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/tpcds +layout: docu +title: TPC-DS Extension +--- + +The `tpcds` extension implements the data generator and queries for the [TPC-DS benchmark](https://www.tpc.org/tpcds/). + +## Installing and Loading + +The `tpcds` extension will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use from the official extension repository. 
+If you would like to install and load it manually, run: + +```sql +INSTALL tpcds; +LOAD tpcds; +``` + +## Usage + +To generate data for scale factor 1, use: + +```sql +CALL dsdgen(sf = 1); +``` + +To run a query, e.g., query 8, use: + +```sql +PRAGMA tpcds(8); +``` + +| s_store_name | sum(ss_net_profit) | +|--------------|-------------------:| +| able | -10354620.18 | +| ation | -10576395.52 | +| bar | -10625236.01 | +| ese | -10076698.16 | +| ought | -10994052.78 | + +## Limitations + +The `tpchds(⟨query_id⟩)` function runs a fixed TPC-DS query with pre-defined bind parameters (a.k.a. substitution parameters). +It is not possible to change the query parameters using the `tpcds` extension. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/tpch.md b/docs/archive/1.0/extensions/tpch.md new file mode 100644 index 00000000000..20812df4c2b --- /dev/null +++ b/docs/archive/1.0/extensions/tpch.md @@ -0,0 +1,108 @@ +--- +github_directory: https://github.com/duckdb/duckdb/tree/main/extension/tpch +layout: docu +title: TPC-H Extension +--- + +The `tpch` extension implements the data generator and queries for the [TPC-H benchmark](https://www.tpc.org/tpch/). + +## Installing and Loading + +The `tpch` extension is shipped by default in some DuckDB builds, otherwise it will be transparently [autoloaded]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) on first use. +If you would like to install and load it manually, run: + +```sql +INSTALL tpch; +LOAD tpch; +``` + +## Usage + +### Generating Data + +To generate data for scale factor 1, use: + +```sql +CALL dbgen(sf = 1); +``` + +Calling `dbgen` does not clean up existing TPC-H tables. +To clean up existing tables, use `DROP TABLE` before running `dbgen`: + +```sql +DROP TABLE IF EXISTS customer; +DROP TABLE IF EXISTS lineitem; +DROP TABLE IF EXISTS nation; +DROP TABLE IF EXISTS orders; +DROP TABLE IF EXISTS part; +DROP TABLE IF EXISTS partsupp; +DROP TABLE IF EXISTS region; +DROP TABLE IF EXISTS supplier; +``` + +### Running a Query + +To run a query, e.g., query 4, use: + +```sql +PRAGMA tpch(4); +``` + +| o_orderpriority | order_count | +|-----------------|------------:| +| 1-URGENT | 10594 | +| 2-HIGH | 10476 | +| 3-MEDIUM | 10410 | +| 4-NOT SPECIFIED | 10556 | +| 5-LOW | 10487 | + +### Listing Queries + +To list all 22 queries, run: + +```sql +FROM tpch_queries(); +``` + +This function returns a table with columns `query_nr` and `query`. + +### Listing Expected Answers + +To produced the expected results for all queries on scale factors 0.01, 0.1, and 1, run: + +```sql +FROM tpch_answers(); +``` + +This function returns a table with columns `query_nr`, `scale_factor`, and `answer`. + +## Data Generator Parameters + +The data generator function `dbgen` has the following parameters: + +
+ +| Name | Type | Description | +|--|--|------------| +| `catalog` | `VARCHAR` | Target catalog | +| `children` | `UINTEGER` | Number of partitions | +| `overwrite` | `BOOLEAN` | (Not used) | +| `sf` | `DOUBLE` | Scale factor | +| `step` | `UINTEGER` | Defines the partition to be generated, indexed from 0 to `children` - 1. Must be defined when the `children` arguments is defined | +| `suffix` | `VARCHAR` | Append the `suffix` to table names | + +## Generating Larger Than Memory Data Sets + +To generate data sets for large scale factors, which yield larger than memory data sets, run the `dbgen` function in steps. For example, you may generate SF300 in 10 steps: + +```sql +CALL dbgen(sf = 300, children = 10, step = 0); +CALL dbgen(sf = 300, children = 10, step = 1); +... +CALL dbgen(sf = 300, children = 10, step = 9); +``` + +## Limitations + +* The data generator function `dbgen` is single-threaded and does not support concurrency. Running multiple steps to parallelize over different partitions is also not supported at the moment. +* The `tpch(⟨query_id⟩)` function runs a fixed TPC-H query with pre-defined bind parameters (a.k.a. substitution parameters). It is not possible to change the query parameters using the `tpch` extension. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/versioning_of_extensions.md b/docs/archive/1.0/extensions/versioning_of_extensions.md new file mode 100644 index 00000000000..d99d38d48f5 --- /dev/null +++ b/docs/archive/1.0/extensions/versioning_of_extensions.md @@ -0,0 +1,80 @@ +--- +layout: docu +title: Versioning of Extensions +--- + +## Extension Versioning + +Just like DuckDB itself, DuckDB extensions have a version. This version can be used by users to determine which features are available +in the extension they have installed, and by developers to understand bug reports. DuckDB extensions can be versioned in different ways: + +**Extensions whose source lives in DuckDB's main repository** (in-tree extensions) are tagged with the short git hash of the repository. +For example, the parquet extension is built into DuckDB version `v0.10.3` (which has commit `70fd6a8a24`): + +```sql +SELECT extension_name, extension_version, install_mode +FROM duckdb_extensions() +WHERE extension_name='parquet'; +``` + +
+ +| extension_name | extension_version | install_mode | +|:------------------|:------------------|:---------------------| +| parquet | 70fd6a8a24 | STATICALLY_LINKED | + +**Extensions whose source lives in a separate repository** (out-of-tree extensions) have their own version. This version is **either** +the short git hash of the separate repository, **or** the git version tag in [Semantic Versioning](https://semver.org/) format. +For example, in DuckDB version `v0.10.3`, the azure extension could be versioned as follows: + +```sql +SELECT extension_name, extension_version, install_mode +FROM duckdb_extensions() +WHERE extension_name = 'azure'; +``` + +
+ +| extension_name | extension_version | install_mode | +|:---------------|:------------------|:---------------| +| azure | 49b63dc | REPOSITORY | + +## Updating Extensions + +> This feature was introduced in DuckDB 0.10.3. + +DuckDB has a dedicated statement that will automatically update all extensions to their latest version. The output will +give the user information on which extensions were updated to/from which version. For example: + +```sql +UPDATE EXTENSIONS; +``` + +
+ +| extension_name | repository | update_result | previous_version | current_version | +|:---------------|:-------------|:----------------------|:-----------------|:----------------| +| httpfs | core | NO_UPDATE_AVAILABLE | 70fd6a8a24 | 70fd6a8a24 | +| delta | core | UPDATED | d9e5cc1 | 04c61e4 | +| azure | core | NO_UPDATE_AVAILABLE | 49b63dc | 49b63dc | +| aws | core_nightly | NO_UPDATE_AVAILABLE | 42c78d3 | 42c78d3 | + +Note that DuckDB will look for updates in the source repository for each extension. So if an extension was installed from +`core_nightly`, it will be updated with the latest nightly build. + +The update statement can also be provided with a list of specific extensions to update: + +```sql +UPDATE EXTENSIONS (httpfs, azure); +``` + +
+ +| extension_name | repository | update_result | previous_version | current_version | +|:---------------|:-------------|:----------------------|:-----------------|:----------------| +| httpfs | core | NO_UPDATE_AVAILABLE | 70fd6a8a24 | 70fd6a8a24 | +| azure | core | NO_UPDATE_AVAILABLE | 49b63dc | 49b63dc | + +## Target DuckDB Version + +Currently, when extensions are compiled, they are tied to a specific version of DuckDB. What this means is that, for example, an extension binary compiled for v0.10.3 does not work for v1.0.0. In most cases, this will not cause any issues and is fully transparent; DuckDB will automatically ensure it installs the correct binary for its version. For extension developers, this means that they must ensure that new binaries are created whenever a new version of DuckDB is released. However, note that DuckDB provides an [extension template](https://github.com/duckdb/extension-template) that makes this fairly simple. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/vss.md b/docs/archive/1.0/extensions/vss.md new file mode 100644 index 00000000000..624d12c910f --- /dev/null +++ b/docs/archive/1.0/extensions/vss.md @@ -0,0 +1,114 @@ +--- +github_repository: https://github.com/duckdb/duckdb_vss +layout: docu +title: Vector Similarity Search Extension +--- + +The `vss` extension is an experimental extension for DuckDB that adds indexing support to accelerate vector similarity search queries using DuckDB's new fixed-size `ARRAY` type. + +See the [announcement blog post](/2024/05/03/vector-similarity-search-vss). + +## Usage + +To create a new HNSW index on a table with an `ARRAY` column, use the `CREATE INDEX` statement with the `USING HNSW` clause. For example: + +```sql +CREATE TABLE my_vector_table (vec FLOAT[3]); +INSERT INTO my_vector_table SELECT array_value(a, b, c) FROM range(1, 10) ra(a), range(1, 10) rb(b), range(1, 10) rc(c); +CREATE INDEX my_hnsw_index ON my_vector_table USING HNSW (vec); +``` + +The index will then be used to accelerate queries that use a `ORDER BY` clause evaluating one of the supported distance metric functions against the indexed columns and a constant vector, followed by a `LIMIT` clause. For example: + +```sql +SELECT * FROM my_vector_table ORDER BY array_distance(vec, [1, 2, 3]::FLOAT[3]) LIMIT 3; +``` + +We can verify that the index is being used by checking the `EXPLAIN` output and looking for the `HNSW_INDEX_SCAN` node in the plan: + +```sql +EXPLAIN SELECT * FROM my_vector_table ORDER BY array_distance(vec, [1, 2, 3]::FLOAT[3]) LIMIT 3; +``` + +```text +┌───────────────────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ #0 │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ vec │ +│array_distance(vec, [1.0, 2│ +│ .0, 3.0]) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HNSW_INDEX_SCAN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ t1 (HNSW INDEX SCAN : │ +│ my_idx) │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ vec │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ EC: 3 │ +└───────────────────────────┘ +``` + +By default the HNSW index will be created using the euclidean distance `l2sq` (L2-norm squared) metric, matching DuckDBs `array_distance` function, but other distance metrics can be used by specifying the `metric` option during index creation. 
For example:

```sql
CREATE INDEX my_hnsw_cosine_index
ON my_vector_table
USING HNSW (vec)
WITH (metric = 'cosine');
```

The following table shows the supported distance metrics and their corresponding DuckDB functions:

| Metric   | Function                  | Description        |
|----------|---------------------------|--------------------|
| `l2sq`   | `array_distance`          | Euclidean distance |
| `cosine` | `array_cosine_similarity` | Cosine similarity  |
| `ip`     | `array_inner_product`     | Inner product      |

Note that while each `HNSW` index only applies to a single column, you can create multiple `HNSW` indexes on the same table, each individually indexing a different column. Additionally, you can also create multiple `HNSW` indexes on the same column, each supporting a different distance metric.

## Index Options

Besides the `metric` option, the `HNSW` index creation statement also supports the following options to control the hyperparameters of the index construction and search process:

| Option | Default | Description |
|-------|--:|----------------------------|
| `ef_construction` | 128 | The number of candidate vertices to consider during the construction of the index. A higher value will result in a more accurate index, but will also increase the time it takes to build the index. |
| `ef_search` | 64 | The number of candidate vertices to consider during the search phase of the index. A higher value will result in a more accurate index, but will also increase the time it takes to perform a search. |
| `M` | 16 | The maximum number of neighbors to keep for each vertex in the graph. A higher value will result in a more accurate index, but will also increase the time it takes to build the index. |
| `M0` | 2 * `M` | The base connectivity, or the number of neighbors to keep for each vertex in the zero-th level of the graph. A higher value will result in a more accurate index, but will also increase the time it takes to build the index. |

Additionally, you can also override the `ef_search` parameter set at index construction time by setting the `SET hnsw_ef_search = ⟨int⟩` configuration option at runtime. This can be useful if you want to trade search performance for accuracy or vice versa on a per-connection basis. You can also unset the override by calling `RESET hnsw_ef_search`.

## Persistence

Due to some known issues related to persistence of custom extension indexes, the `HNSW` index can only be created on tables in in-memory databases by default, unless the `SET hnsw_enable_experimental_persistence = ⟨bool⟩` configuration option is set to `true`.

The reasoning for locking this feature behind an experimental flag is that “WAL” recovery is not yet properly implemented for custom indexes, meaning that if a crash occurs or the database is shut down unexpectedly while there are uncommitted changes to a `HNSW`-indexed table, you can end up with __data loss or corruption of the index__.

If you enable this option and experience an unexpected shutdown, you can try to recover the index by first starting DuckDB separately, loading the `vss` extension and then `ATTACH`ing the database file, which ensures that the `HNSW` index functionality is available during WAL playback, allowing DuckDB's recovery process to proceed without issues. But we still recommend that you do not use this feature in production environments.
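As a minimal sketch of that recovery sequence (with `my_database.db` standing in for your database file):

```sql
-- run in a freshly started DuckDB instance
INSTALL vss;
LOAD vss;
-- attach only after loading the extension so that the HNSW index
-- functionality is available while the WAL is replayed
ATTACH 'my_database.db' AS recovered;
```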
+ +With the `hnsw_enable_experimental_persistence` option enabled, the index will be persisted into the DuckDB database file (if you run DuckDB with a disk-backed database file), which means that after a database restart, the index can be loaded back into memory from disk instead of having to be re-created. With that in mind, there are no incremental updates to persistent index storage, so every time DuckDB performs a checkpoint the entire index will be serialized to disk and overwrite itself. Similarly, after a restart of the database, the index will be deserialized back into main memory in its entirety. Although this will be deferred until you first access the table associated with the index. Depending on how large the index is, the deserialization process may take some time, but it should still be faster than simply dropping and re-creating the index. + +## Inserts, Updates, Deletes and Re-Compaction + +The HNSW index does support inserting, updating and deleting rows from the table after index creation. However, there are two things to keep in mind: + +* It's faster to create the index after the table has been populated with data as the initial bulk load can make better use of parallelism on large tables. +* Deletes are not immediately reflected in the index, but are instead “marked” as deleted, which can cause the index to grow stale over time and negatively impact query quality and performance. + +To remedy the last point, you can call the `PRAGMA hnsw_compact_index('⟨index name⟩')` pragma function to trigger a re-compaction of the index pruning deleted items, or re-create the index after a significant number of updates. + +## Limitations + +* Only vectors consisting of `FLOAT`s (32-bit, single precision) are supported at the moment. +* The index itself is not buffer managed and must be able to fit into RAM memory. +* The size of the index in memory does not count towards DuckDB's `memory_limit` configuration parameter. +* `HNSW` indexes can only be created on tables in in-memory databases, unless the `SET hnsw_enable_experimental_persistence = ⟨bool⟩` configuration option is set to `true`, see [Persistence](#persistence) for more information. \ No newline at end of file diff --git a/docs/archive/1.0/extensions/working_with_extensions.md b/docs/archive/1.0/extensions/working_with_extensions.md new file mode 100644 index 00000000000..2cd20208263 --- /dev/null +++ b/docs/archive/1.0/extensions/working_with_extensions.md @@ -0,0 +1,233 @@ +--- +layout: docu +title: Working with Extensions +--- + +## Platforms + +Extension binaries must be built for each platform. Pre-built binaries are distributed for several platforms (see below). +For platforms where packages for certain extensions are not available, users can build them from source and [install the resulting binaries manually](#installing-an-extension-from-an-explicit-path). + +All official extensions are distributed for the following platforms. + +
+ +| Platform name | Operating system | Architecture | CPU types | Used by | +|--------------------|------------------|-----------------|---------------------------------|------------------------------| +| `linux_amd64` | Linux | x86_64 (AMD64) | | Node.js packages, etc. | +| `linux_amd64_gcc4` | Linux | x86_64 (AMD64) | | Python packages, CLI, etc. | +| `linux_arm64` | Linux | AArch64 (ARM64) | AWS Graviton, Snapdragon, etc. | all packages | +| `osx_amd64` | macOS | x86_64 (AMD64) | Intel | all packages | +| `osx_arm64` | macOS | AArch64 (ARM64) | Apple Silicon M1, M2, etc. | all packages | +| `windows_amd64` | Windows | x86_64 (AMD64) | Intel, AMD, etc. | all packages | + +> For some Linux ARM distributions (e.g., Python), two different binaries are distributed. These target either the `linux_arm64` or `linux_arm64_gcc4` platforms. Note that extension binaries are distributed for the first, but not the second. Effectively that means that on these platforms your glibc version needs to be 2.28 or higher to use the distributed extension binaries. + +Some extensions are distributed for the following platforms: + +* `windows_amd64_rtools` +* `wasm_eh` and `wasm_mvp` (see [DuckDB-Wasm's extensions]({% link docs/archive/1.0/api/wasm/extensions.md %})) + +For platforms outside the ones listed above, we do not officially distribute extensions (e.g., `linux_arm64_gcc4`, `windows_amd64_mingw`). + +### Sharing Extensions between Clients + +The shared installation location allows extensions to be shared between the client APIs _of the same DuckDB version_, as long as they share the same `platfrom` or ABI. For example, if an extension is installed with version 0.10.0 of the CLI client on macOS, it is available from the Python, R, etc. client libraries provided that they have access to the user's home directory and use DuckDB version 0.10.0. + +## Extension Repositories + +By default, DuckDB extensions are installed from a single repository containing extensions built and signed by the core +DuckDB team. This ensures the stability and security of the core set of extensions. These extensions live in the default `core` repository +which points to `http://extensions.duckdb.org`. + +Besides the core repository, DuckDB also supports installing extensions from other repositories. For example, the `core_nightly` repository contains nightly builds for core extensions +that are built for the latest stable release of DuckDB. This allows users to try out new features in extensions before they are officially published. 
+ +### Installing Extensions from a Repository + +To install extensions from the default repository (default repository: `core`): + +```sql +INSTALL httpfs; +``` + +To explicitly install an extension from the core repository, run either of: + +```sql +INSTALL httpfs FROM core; +``` + +Or: + +```sql +INSTALL httpfs FROM 'http://extensions.duckdb.org'; +``` + +To install an extension from the core nightly repository: + +```sql +INSTALL spatial FROM core_nightly; +``` + +Or: + +```sql +INSTALL spatial FROM 'http://nightly-extensions.duckdb.org'; +``` + +To install an extensions from a custom repository unknown to DuckDB: + +```sql +INSTALL custom_extension FROM 'https://my-custom-extension-repository'; +``` + +Alternatively, the `custom_extension_repository` setting can be used to change the default repository used by DuckDB: + +```sql +SET custom_extension_repository = 'http://nightly-extensions.duckdb.org'; +``` + +While any url or local path can be used as a repository, currently DuckDB contains the following predefined repositories: + +
+ +| Alias | Url | Description | +|:----------------------|:---------------------------------------|:---------------------------------------------------------------------------------------| +| `core` | `http://extensions.duckdb.org` | DuckDB core extensions | +| `core_nightly` | `http://nightly-extensions.duckdb.org` | Nightly builds for `core` | +| `local_build_debug` | `./build/debug/repository` | Repository created when building DuckDB from source in debug mode (for development) | +| `local_build_release` | `./build/release/repository` | Repository created when building DuckDB from source in release mode (for development) | + +### Working with Multiple Repositories + +When working with extensions from different repositories, especially mixing `core` and `core_nightly`, it is important to keep track of the origins +and version of the different extensions. For this reason, DuckDB keeps track of this in the extension installation metadata. For example: + +```sql +INSTALL httpfs FROM core; +INSTALL aws FROM core_nightly; +SELECT extension_name, extension_version, installed_from, install_mode FROM duckdb_extensions(); +``` + +Would output: + +| extensions_name | extensions_version | installed_from | install_mode | +|:----------------|:-------------------|:---------------|:-------------| +| httpfs | 62d61a417f | core | REPOSITORY | +| aws | 42c78d3 | core_nightly | REPOSITORY | +| ... | ... | ... | ... | + +### Creating a Custom Repository + +A DuckDB repository is an HTTP, HTTPS, S3, or local file based directory that serves the extensions files in a specific structure. +This structure is describe [here](#downloading-extensions-directly-from-s3), and is the same +for local paths and remote servers, for example: + +```text +base_repository_path_or_url +└── v1.0.0 + └── osx_arm64 + ├── autocomplete.duckdb_extension + ├── httpfs.duckdb_extension + ├── icu.duckdb_extension + ├── inet.duckdb_extension + ├── json.duckdb_extension + ├── parquet.duckdb_extension + ├── tpcds.duckdb_extension + ├── tpcds.duckdb_extension + └── tpch.duckdb_extension +``` + +See the [`extension-template` repository](https://github.com/duckdb/extension-template/) for all necessary code and scripts +to set up a repository. + +When installing an extension from a custom repository, DuckDB will search for both a gzipped and non-gzipped version. For example: + +```sql +INSTALL icu FROM '⟨custom repository⟩'; +``` + +The execution of this statement will first look `icu.duckdb_extension.gz`, then `icu.duckdb_extension` in the repository's directory structure. + +If the custom repository is served over HTTPS or S3, the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) is required. DuckDB will attempt to [autoload]({% link docs/archive/1.0/extensions/overview.md %}#autoloading-extensions) +the `httpfs` extension when an installation over HTTPS or S3 is attempted. + +## Force Installing to Upgrade Extensions + +When DuckDB installs an extension, it is copied to a local directory to be cached and avoid future network traffic. +Any subsequent calls to `INSTALL ⟨extension_name⟩` will use the local version instead of downloading the extension again. 
To force re-downloading the extension, run: + +```sql +FORCE INSTALL extension_name; +``` + +Force installing can also be used to overwrite an extension with an extension with the same name from another repository, + +For example, first, `spatial` is installed from the core repository: + +```sql +INSTALL spatial; +``` + +Then, to overwrite this installation with the `spatial` extension from the `core_nightly` repository: + +```sql +FORCE INSTALL spatial FROM core_nightly; +``` + +## Alternative Approaches to Loading and Installing Extensions + +### Downloading Extensions Directly from S3 + +Downloading an extension directly can be helpful when building a [Lambda service](https://aws.amazon.com/pm/lambda/) or container that uses DuckDB. +DuckDB extensions are stored in public S3 buckets, but the directory structure of those buckets is not searchable. +As a result, a direct URL to the file must be used. +To download an extension file directly, use the following format: + +```text +http://extensions.duckdb.org/v⟨duckdb_version⟩/⟨platform_name⟩/⟨extension_name⟩.duckdb_extension.gz +``` + +For example: + +```text +http://extensions.duckdb.org/v{{ site.currentduckdbversion }}/windows_amd64/json.duckdb_extension.gz +``` + +### Installing an Extension from an Explicit Path + +`INSTALL` can be used with the path to a `.duckdb_extension` file: + +```sql +INSTALL 'path/to/httpfs.duckdb_extension'; +``` + +Note that compressed `.duckdb_extension.gz` files need to be decompressed beforehand. It is also possible to specify remote paths. + +### Loading an Extension from an Explicit Path + +`LOAD` can be used with the path to a `.duckdb_extension`. +For example, if the file was available at the (relative) path `path/to/httpfs.duckdb_extension`, you can load it as follows: + +```sql +LOAD 'path/to/httpfs.duckdb_extension'; +``` + +This will skip any currently installed extensions and load the specified extension directly. + +Note that using remote paths for compressed files is currently not possible. + +### Building and Installing Extensions from Source + +For building and installing extensions from source, see the [building guide]({% link docs/archive/1.0/dev/building/build_instructions.md %}#building-and-installing-extensions-from-source). + +### Statically Linking Extensions + +To statically link extensions, follow the [developer documentation's “Using extension config files” section](https://github.com/duckdb/duckdb/blob/main/extension/README.md#using-extension-config-files). + +## Limitations + +DuckDB's extension mechanism has the following limitations: + +* Once loaded, an extension cannot be reinstalled. +* Extensions cannot be unloaded. \ No newline at end of file diff --git a/docs/archive/1.0/guides/data_viewers/tableau.md b/docs/archive/1.0/guides/data_viewers/tableau.md new file mode 100644 index 00000000000..3eaa2e53329 --- /dev/null +++ b/docs/archive/1.0/guides/data_viewers/tableau.md @@ -0,0 +1,137 @@ +--- +layout: docu +title: Tableau – A Data Visualization Tool +--- + +[Tableau](https://www.tableau.com/) is a popular commercial data visualization tool. +In addition to a large number of built in connectors, +it also provides generic database connectivity via ODBC and JDBC connectors. + +Tableau has two main versions: Desktop and Online (Server). + +* For Desktop, connecting to a DuckDB database is similar to working in an embedded environment like Python. 
+* For Online, since DuckDB is in-process, the data needs to be either on the server itself + +or in a remote data bucket that is accessible from the server. + +## Database Creation + +When using a DuckDB database file +the data sets do not actually need to be imported into DuckDB tables; +it suffices to create views of the data. +For example, this will create a view of the `h2oai` Parquet test file in the current DuckDB code base: + +```sql +CREATE VIEW h2oai AS ( + FROM read_parquet('/Users/username/duckdb/data/parquet-testing/h2oai/h2oai_group_small.parquet') +); +``` + +Note that you should use full path names to local files so that they can be found from inside Tableau. +Also note that you will need to use a version of the driver that is compatible (i.e., from the same release) +as the database format used by the DuckDB tool (e.g., Python module, command line) that was used to create the file. + +## Installing the JDBC Driver + +Tableau provides documentation on how to [install a JDBC driver](https://help.tableau.com/current/pro/desktop/en-gb/jdbc_tableau.htm) +for Tableau to use. + +> Tableau (both Desktop and Server versions) need to be restarted any time you add or modify drivers. + +### Driver Links + +The link here is for a recent version of the JDBC driver that is compatible with Tableau. +If you wish to connect to a database file, +you will need to make sure the file was created with a file-compatible version of DuckDB. +Also, check that there is only one version of the driver installed as there are multiple filenames in use. + + +Download the [JAR file](https://repo1.maven.org/maven2/org/duckdb/duckdb_jdbc/{{ site.currentjavaversion }}/duckdb_jdbc-{{ site.currentjavaversion }}.jar). + + +* macOS: Copy it to `~/Library/Tableau/Drivers/` +* Windows: Copy it to `C:\Program Files\Tableau\Drivers` +* Linux: Copy it to `/opt/tableau/tableau_driver/jdbc`. + +## Using the PostgreSQL Dialect + +If you just want to do something simple, you can try connecting directly to the JDBC driver +and using Tableau-provided PostgreSQL dialect. + +1. Create a DuckDB file containing your views and/or data. +2. Launch Tableau +3. Under Connect > To a Server > More… click on “Other Databases (JDBC)” This will bring up the connection dialogue box. For the URL, enter `jdbc:duckdb:/User/username/path/to/database.db`. For the Dialect, choose PostgreSQL. The rest of the fields can be ignored: + +![Tableau PostgreSQL](/images/guides/tableau/tableau-osx-jdbc.png) + +However, functionality will be missing such as `median` and `percentile` aggregate functions. +To make the data source connection more compatible with the PostgreSQL dialect, +please use the DuckDB taco connector as described below. + +## Installing the Tableau DuckDB Connector + +While it is possible to use the Tableau-provided PostgreSQL dialect to communicate with the DuckDB JDBC driver, +we strongly recommend using the [DuckDB "taco" connector](https://github.com/MotherDuck-Open-Source/duckdb-tableau-connector). +This connector has been fully tested against the Tableau dialect generator +and [is more compatible](https://github.com/MotherDuck-Open-Source/duckdb-tableau-connector/blob/main/tableau_connectors/duckdb_jdbc/dialect.tdd) +than the provided PostgreSQL dialect. + +The documentation on how to install and use the connector is in its repository, +but essentially you will need the +[`duckdb_jdbc.taco`](https://github.com/MotherDuck-Open-Source/duckdb-tableau-connector/raw/main/packaged-connector/duckdb_jdbc-v1.0.0-signed.taco) file. 
+(Despite what the Tableau documentation says, the real security risk is in the JDBC driver code, +not the small amount of JavaScript in the Taco.) + +### Server (Online) + +On Linux, copy the Taco file to `/opt/tableau/connectors`. +On Windows, copy the Taco file to `C:\Program Files\Tableau\Connectors`. +Then issue these commands to disable signature validation: + +```bash +tsm configuration set -k native_api.disable_verify_connector_plugin_signature -v true +``` + +```bash +tsm pending-changes apply +``` + +The last command will restart the server with the new settings. + +### macOS + +Copy the Taco file to the `/Users/[User]/Documents/My Tableau Repository/Connectors` folder. +Then launch Tableau Desktop from the Terminal with the command line argument to disable signature validation: + +```bash +/Applications/Tableau\ Desktop\ ⟨year⟩.⟨quarter⟩.app/Contents/MacOS/Tableau -DDisableVerifyConnectorPluginSignature=true +``` + +You can also package this up with AppleScript by using the following script: + +```tableau +do shell script "\"/Applications/Tableau Desktop 2023.2.app/Contents/MacOS/Tableau\" -DDisableVerifyConnectorPluginSignature=true" +quit +``` + +Create this file with [the Script Editor](https://support.apple.com/guide/script-editor/welcome/mac) +(located in `/Applications/Utilities`) +and [save it as a packaged application](https://support.apple.com/guide/script-editor/save-a-script-as-an-app-scpedt1072/mac): + +![tableau-applescript](/images/guides/tableau/applescript.png) + +You can then double-click it to launch Tableau. +You will need to change the application name in the script when you get upgrades. + +### Windows Desktop + +Copy the Taco file to the `C:\Users\[Windows User]\Documents\My Tableau Repository\Connectors` directory. +Then launch Tableau Desktop from a shell with the `-DDisableVerifyConnectorPluginSignature=true` argument +to disable signature validation. + +## Output + +Once loaded, you can run queries against your data! +Here is the result of the first H2O.ai benchmark query from the Parquet test file: + +![tableau-parquet](/images/guides/tableau/h2oai-group-by-1.png) \ No newline at end of file diff --git a/docs/archive/1.0/guides/data_viewers/youplot.md b/docs/archive/1.0/guides/data_viewers/youplot.md new file mode 100644 index 00000000000..ea248ef14d0 --- /dev/null +++ b/docs/archive/1.0/guides/data_viewers/youplot.md @@ -0,0 +1,77 @@ +--- +layout: docu +title: CLI Charting with YouPlot +--- + +DuckDB can be used with CLI graphing tools to quickly pipe input to stdout to graph your data in one line. + +[YouPlot](https://github.com/red-data-tools/YouPlot) is a Ruby-based CLI tool for drawing visually pleasing plots on the terminal. It can accept input from other programs by piping data from `stdin`. It takes tab-separated (or delimiter of your choice) data and can easily generate various types of plots including bar, line, histogram and scatter. + +With DuckDB, you can write to the console (`stdout`) by using the `TO '/dev/stdout'` command. And you can also write comma-separated values by using `WITH (FORMAT 'csv', HEADER)`. + +## Installing YouPlot + +Installation instructions for YouPlot can be found on the main [YouPlot repository](https://github.com/red-data-tools/YouPlot#installation). If you're on a Mac, you can use: + +```bash +brew install youplot +``` + +Run `uplot --help` to ensure you've installed it successfully! 
+ +## Piping DuckDB Queries to stdout + +By combining the [`COPY...TO`]({% link docs/archive/1.0/sql/statements/copy.md %}#copy-to) function with a CSV output file, data can be read from any format supported by DuckDB and piped to YouPlot. There are three important steps to doing this. + +1. As an example, this is how to read all data from `input.json`: + + ```bash + duckdb -s "SELECT * FROM read_json_auto('input.json')" + ``` + +2. To prepare the data for YouPlot, write a simple aggregate: + + ```bash + duckdb -s "SELECT date, sum(purchases) AS total_purchases FROM read_json_auto('input.json') GROUP BY 1 ORDER BY 2 DESC LIMIT 10" + ``` + +3. Finally, wrap the `SELECT` in the `COPY ... TO` function with an output location of `/dev/stdout`. + + The syntax looks like this: + + ```sql + COPY (⟨query⟩) TO '/dev/stdout' WITH (FORMAT 'csv', HEADER); + ``` + + The full DuckDB command below outputs the query in CSV format with a header: + + ```bash + duckdb -s "COPY (SELECT date, sum(purchases) AS total_purchases FROM read_json_auto('input.json') GROUP BY 1 ORDER BY 2 DESC LIMIT 10) TO '/dev/stdout' WITH (FORMAT 'csv', HEADER)" + ``` + +## Connecting DuckDB to YouPlot + +Finally, the data can now be piped to YouPlot! Let's assume we have an `input.json` file with dates and number of purchases made by somebody on that date. Using the query above, we'll pipe the data to the `uplot` command to draw a plot of the Top 10 Purchase Dates + +```bash +duckdb -s "COPY (SELECT date, sum(purchases) AS total_purchases FROM read_json_auto('input.json') GROUP BY 1 ORDER BY 2 DESC LIMIT 10) TO '/dev/stdout' WITH (FORMAT 'csv', HEADER)" \ + | uplot bar -d, -H -t "Top 10 Purchase Dates" +``` + +This tells `uplot` to draw a bar plot, use a comma-seperated delimiter (`-d,`), that the data has a header (`-H`), and give the plot a title (`-t`). + +![youplot-top-10](/images/guides/youplot/top-10-plot.png) + +## Bonus Round! stdin + stdout + +Maybe you're piping some data through `jq`. Maybe you're downloading a JSON file from somewhere. You can also tell DuckDB to read the data from another process by changing the filename to `/dev/stdin`. + +Let's combine this with a quick `curl` from GitHub to see what a certain user has been up to lately. + +```bash +curl -sL "https://api.github.com/users/dacort/events?per_page=100" \ + | duckdb -s "COPY (SELECT type, count(*) AS event_count FROM read_json_auto('/dev/stdin') GROUP BY 1 ORDER BY 2 DESC LIMIT 10) TO '/dev/stdout' WITH (FORMAT 'csv', HEADER)" \ + | uplot bar -d, -H -t "GitHub Events for @dacort" +``` + +![github-events](/images/guides/youplot/github-events.png) \ No newline at end of file diff --git a/docs/archive/1.0/guides/database_integration/mysql.md b/docs/archive/1.0/guides/database_integration/mysql.md new file mode 100644 index 00000000000..dd89f5dd780 --- /dev/null +++ b/docs/archive/1.0/guides/database_integration/mysql.md @@ -0,0 +1,53 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/query_mysql +title: MySQL Import +--- + +To run a query directly on a running MySQL database, the [`mysql` extension]({% link docs/archive/1.0/extensions/mysql.md %}) is required. + +## Installation and Loading + +The extension can be installed using the `INSTALL` SQL command. This only needs to be run once. 
+ +```sql +INSTALL mysql; +``` + +To load the `mysql` extension for usage, use the `LOAD` SQL command: + +```sql +LOAD mysql; +``` + +## Usage + +After the `mysql` extension is installed, you can attach to a MySQL database using the following command: + +```sql +ATTACH 'host=localhost user=root port=0 database=mysqlscanner' AS mysql_db (TYPE mysql_scanner, READ_ONLY); +USE mysql_db; +``` + +The string used by `ATTACH` is a PostgreSQL-style connection string (_not_ a MySQL connection string!). It is a list of connection arguments provided in `{key}={value}` format. Below is a list of valid arguments. Any options not provided are replaced by their default values. + +
+ +| Setting | Default | +|------------|--------------| +| `database` | `NULL` | +| `host` | `localhost` | +| `password` | | +| `port` | `0` | +| `socket` | `NULL` | +| `user` | current user | + +You can directly read and write the MySQL database: + +```sql +CREATE TABLE tbl (id INTEGER, name VARCHAR); +INSERT INTO tbl VALUES (42, 'DuckDB'); +``` + +For a list of supported operations, see the [MySQL extension documentation]({% link docs/archive/1.0/extensions/mysql.md %}#supported-operations). \ No newline at end of file diff --git a/docs/archive/1.0/guides/database_integration/overview.md b/docs/archive/1.0/guides/database_integration/overview.md new file mode 100644 index 00000000000..eca102c4858 --- /dev/null +++ b/docs/archive/1.0/guides/database_integration/overview.md @@ -0,0 +1,4 @@ +--- +layout: docu +title: Database Integration +--- \ No newline at end of file diff --git a/docs/archive/1.0/guides/database_integration/postgres.md b/docs/archive/1.0/guides/database_integration/postgres.md new file mode 100644 index 00000000000..a91bb00c243 --- /dev/null +++ b/docs/archive/1.0/guides/database_integration/postgres.md @@ -0,0 +1,60 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/query_postgres +title: PostgreSQL Import +--- + +To run a query directly on a running PostgreSQL database, the [`postgres` extension]({% link docs/archive/1.0/extensions/postgres.md %}) is required. + +## Installation and Loading + +The extension can be installed using the `INSTALL` SQL command. This only needs to be run once. + +```sql +INSTALL postgres; +``` + +To load the `postgres` extension for usage, use the `LOAD` SQL command: + +```sql +LOAD postgres; +``` + +## Usage + +After the `postgres` extension is installed, tables can be queried from PostgreSQL using the `postgres_scan` function: + +```sql +-- Scan the table "mytable" from the schema "public" in the database "mydb" +SELECT * FROM postgres_scan('host=localhost port=5432 dbname=mydb', 'public', 'mytable'); +``` + +The first parameter to the `postgres_scan` function is the [PostgreSQL connection string](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING), a list of connection arguments provided in `{key}={value}` format. Below is a list of valid arguments. + +
+ +| Name | Description | Default | +|------------|--------------------------------------|----------------| +| `host` | Name of host to connect to | `localhost` | +| `hostaddr` | Host IP address | `localhost` | +| `port` | Port number | `5432` | +| `user` | PostgreSQL user name | [OS user name] | +| `password` | PostgreSQL password | | +| `dbname` | Database name | [user] | +| `passfile` | Name of file passwords are stored in | `~/.pgpass` | + +Alternatively, the entire database can be attached using the `ATTACH` command. This allows you to query all tables stored within the PostgreSQL database as if it was a regular database. + +```sql +-- Attach the PostgreSQL database using the given connection string +ATTACH 'host=localhost port=5432 dbname=mydb' AS test (TYPE postgres); +-- The table "tbl_name" can now be queried as if it is a regular table +SELECT * FROM test.tbl_name; +-- Switch the active database to "test" +USE test; +-- List all tables in the file +SHOW TABLES; +``` + +For more information see the [PostgreSQL extension documentation]({% link docs/archive/1.0/extensions/postgres.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/database_integration/sqlite.md b/docs/archive/1.0/guides/database_integration/sqlite.md new file mode 100644 index 00000000000..ed1a4a0064f --- /dev/null +++ b/docs/archive/1.0/guides/database_integration/sqlite.md @@ -0,0 +1,46 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/query_sqlite +title: SQLite Import +--- + +To run a query directly on a SQLite file, the `sqlite` extension is required. + +## Installation and Loading + +The extension can be installed using the `INSTALL` SQL command. This only needs to be run once. + +```sql +INSTALL sqlite; +``` + +To load the `sqlite` extension for usage, use the `LOAD` SQL command: + +```sql +LOAD sqlite; +``` + +## Usage + +After the SQLite extension is installed, tables can be queried from SQLite using the `sqlite_scan` function: + +```sql +-- Scan the table "tbl_name" from the SQLite file "test.db" +SELECT * FROM sqlite_scan('test.db', 'tbl_name'); +``` + +Alternatively, the entire file can be attached using the `ATTACH` command. This allows you to query all tables stored within a SQLite database file as if they were a regular database. + +```sql +-- Attach the SQLite file "test.db" +ATTACH 'test.db' AS test (TYPE sqlite); +-- The table "tbl_name" can now be queried as if it is a regular table +SELECT * FROM test.tbl_name; +-- Switch the active database to "test" +USE test; +-- List all tables in the file +SHOW TABLES; +``` + +For more information see the [SQLite extension documentation]({% link docs/archive/1.0/extensions/sqlite.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/csv_export.md b/docs/archive/1.0/guides/file_formats/csv_export.md new file mode 100644 index 00000000000..2aca259bf8d --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/csv_export.md @@ -0,0 +1,20 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/csv_export +title: CSV Export +--- + +To export the data from a table to a CSV file, use the `COPY` statement: + +```sql +COPY tbl TO 'output.csv' (HEADER, DELIMITER ','); +``` + +The result of queries can also be directly exported to a CSV file: + +```sql +COPY (SELECT * FROM tbl) TO 'output.csv' (HEADER, DELIMITER ','); +``` + +For additional options, see the [`COPY` statement documentation]({% link docs/archive/1.0/sql/statements/copy.md %}#csv-options). 
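For instance, a sketch of an export that also sets quoting and compression (the full list of writer options is in the linked documentation) could look like this:

```sql
-- write a gzip-compressed CSV with an explicit quote character
COPY tbl TO 'output.csv.gz'
    (HEADER, DELIMITER ',', QUOTE '"', COMPRESSION 'gzip');
```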
\ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/csv_import.md b/docs/archive/1.0/guides/file_formats/csv_import.md new file mode 100644 index 00000000000..4104a28cc63 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/csv_import.md @@ -0,0 +1,47 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/csv_import +title: CSV Import +--- + +To read data from a CSV file, use the `read_csv` function in the `FROM` clause of a query: + +```sql +SELECT * FROM read_csv('input.csv'); +``` + +Alternatively, you can omit the `read_csv` function and let DuckDB infer it from the extension: + +```sql +SELECT * FROM 'input.csv'; +``` + +To create a new table using the result from a query, use [`CREATE TABLE ... AS SELECT` statement]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas): + +```sql +CREATE TABLE new_tbl AS + SELECT * FROM read_csv('input.csv'); +``` + +We can use DuckDB's [optional `FROM`-first syntax]({% link docs/archive/1.0/sql/query_syntax/from.md %}) to omit `SELECT *`: + +```sql +CREATE TABLE new_tbl AS + FROM read_csv('input.csv'); +``` + +To load data into an existing table from a query, use `INSERT INTO` from a `SELECT` statement: + +```sql +INSERT INTO tbl + SELECT * FROM read_csv('input.csv'); +``` + +Alternatively, the `COPY` statement can also be used to load data from a CSV file into an existing table: + +```sql +COPY tbl FROM 'input.csv'; +``` + +For additional options, see the [CSV import reference]({% link docs/archive/1.0/data/csv/overview.md %}) and the [`COPY` statement documentation]({% link docs/archive/1.0/sql/statements/copy.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/excel_export.md b/docs/archive/1.0/guides/file_formats/excel_export.md new file mode 100644 index 00000000000..b468cb0a2e3 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/excel_export.md @@ -0,0 +1,38 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/excel_export +title: Excel Export +--- + +## Installing the Extension + +To export the data from a table to an Excel file, install and load the [spatial extension]({% link docs/archive/1.0/extensions/spatial.md %}). +This is only needed once per DuckDB connection. + +```sql +INSTALL spatial; +LOAD spatial; +``` + +## Exporting Excel Sheets + +Then use the `COPY` statement. The file will contain one worksheet with the same name as the file, but without the `.xlsx` extension: + +```sql +COPY tbl TO 'output.xlsx' WITH (FORMAT GDAL, DRIVER 'xlsx'); +``` + +The result of a query can also be directly exported to an Excel file: + +```sql +COPY (SELECT * FROM tbl) TO 'output.xlsx' WITH (FORMAT GDAL, DRIVER 'xlsx'); +``` + +> Dates and timestamps are currently not supported by the `xlsx` writer. +> Cast columns of those types to `VARCHAR` prior to creating the `xlsx` file. + +## See Also + +DuckDB can also [import Excel files]({% link docs/archive/1.0/guides/file_formats/excel_import.md %}). +For additional details, see the [spatial extension page]({% link docs/archive/1.0/extensions/spatial.md %}) and the [GDAL XLSX driver page](https://gdal.org/drivers/vector/xlsx.html). 
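As a sketch of the cast suggested in the note above (using a hypothetical `order_date` column), the timestamp column can be replaced by its `VARCHAR` representation before writing the sheet:

```sql
-- replace the timestamp-typed column with a VARCHAR version, then export
COPY (
    SELECT * REPLACE (CAST(order_date AS VARCHAR) AS order_date)
    FROM tbl
) TO 'output.xlsx' WITH (FORMAT GDAL, DRIVER 'xlsx');
```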
\ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/excel_import.md b/docs/archive/1.0/guides/file_formats/excel_import.md new file mode 100644 index 00000000000..b105cfa6593 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/excel_import.md @@ -0,0 +1,106 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/excel_import +title: Excel Import +--- + +## Installing the Extension + +To read data from an Excel file, install and load the [spatial extension]({% link docs/archive/1.0/extensions/spatial.md %}). +This is only needed once per DuckDB connection. + +```sql +INSTALL spatial; +LOAD spatial; +``` + +## Importing Excel Sheets + +Use the `st_read` function in the `FROM` clause of a query: + +```sql +SELECT * FROM st_read('test_excel.xlsx'); +``` + +The `layer` parameter allows specifying the name of the Excel worksheet: + +```sql +SELECT * FROM st_read('test_excel.xlsx', layer = 'Sheet1'); +``` + +### Creating a New Table + +To create a new table using the result from a query, use `CREATE TABLE ... AS` from a `SELECT` statement: + +```sql +CREATE TABLE new_tbl AS + SELECT * FROM st_read('test_excel.xlsx', layer = 'Sheet1'); +``` + +### Loading to an Existing Table + +To load data into an existing table from a query, use `INSERT INTO` from a `SELECT` statement: + +```sql +INSERT INTO tbl + SELECT * FROM st_read('test_excel.xlsx', layer = 'Sheet1'); +``` + +### Options + +Several configuration options are also available for the underlying GDAL library that is doing the XLSX parsing. +You can pass them via the `open_options` parameter of the `st_read` function as a list of `'KEY=VALUE'` strings. + +#### Importing a Sheet with/without a Header + +The option `HEADERS` has three possible values: + +* `FORCE`: treat the first row as a header +* `DISABLE` treat the first row as a row of data +* `AUTO` attempt auto-detection (default) + +For example, to treat the first row as a header, run: + +```sql +SELECT * +FROM st_read( + 'test_excel.xlsx', + layer = 'Sheet1', + open_options = ['HEADERS=FORCE'] +); +``` + +#### Detecting Types + +The option `FIELD_TYPE` defines how field types should be treated: + +* `STRING`: all fields should be loaded as strings (`VARCHAR` type) +* `AUTO`: field types should be auto-detected (default) + +For example, to treat the first row as a header and use auto-detection for types, run: + +```sql +SELECT * +FROM st_read( + 'test_excel.xlsx', + layer = 'Sheet1', + open_options = ['HEADERS=FORCE', 'FIELD_TYPES=AUTO'] +); +``` + +To treat the fields as strings: + +```sql +SELECT * +FROM st_read( + 'test_excel.xlsx', + layer = 'Sheet1', + open_options = ['FIELD_TYPES=STRING'] +); +``` + +## See Also + +DuckDB can also [export Excel files]({% link docs/archive/1.0/guides/file_formats/excel_export.md %}). +For additional details on Excel support, see the [spatial extension page]({% link docs/archive/1.0/extensions/spatial.md %}), the [GDAL XLSX driver page](https://gdal.org/drivers/vector/xlsx.html), and the [GDAL configuration options page](https://gdal.org/user/configoptions.html#configoptions). 
\ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/json_export.md b/docs/archive/1.0/guides/file_formats/json_export.md new file mode 100644 index 00000000000..b6c8162df92 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/json_export.md @@ -0,0 +1,20 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/json_export +title: JSON Export +--- + +To export the data from a table to a JSON file, use the `COPY` statement: + +```sql +COPY tbl TO 'output.json'; +``` + +The result of queries can also be directly exported to a JSON file: + +```sql +COPY (SELECT * FROM tbl) TO 'output.json'; +``` + +For additional options, see the [`COPY` statement documentation]({% link docs/archive/1.0/sql/statements/copy.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/json_import.md b/docs/archive/1.0/guides/file_formats/json_import.md new file mode 100644 index 00000000000..9d852c9a8a4 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/json_import.md @@ -0,0 +1,37 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/json_import +title: JSON Import +--- + +To read data from a JSON file, use the `read_json_auto` function in the `FROM` clause of a query: + +```sql +SELECT * +FROM read_json_auto('input.json'); +``` + +To create a new table using the result from a query, use `CREATE TABLE AS` from a `SELECT` statement: + +```sql +CREATE TABLE new_tbl AS + SELECT * + FROM read_json_auto('input.json'); +``` + +To load data into an existing table from a query, use `INSERT INTO` from a `SELECT` statement: + +```sql +INSERT INTO tbl + SELECT * + FROM read_json_auto('input.json'); +``` + +Alternatively, the `COPY` statement can also be used to load data from a JSON file into an existing table: + +```sql +COPY tbl FROM 'input.json'; +``` + +For additional options, see the [JSON Loading reference]({% link docs/archive/1.0/data/json/overview.md %}) and the [`COPY` statement documentation]({% link docs/archive/1.0/sql/statements/copy.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/overview.md b/docs/archive/1.0/guides/file_formats/overview.md new file mode 100644 index 00000000000..a405748aa80 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/overview.md @@ -0,0 +1,4 @@ +--- +layout: docu +title: File Formats +--- \ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/parquet_export.md b/docs/archive/1.0/guides/file_formats/parquet_export.md new file mode 100644 index 00000000000..b5ed9bb94a5 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/parquet_export.md @@ -0,0 +1,20 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/parquet_export +title: Parquet Export +--- + +To export the data from a table to a Parquet file, use the `COPY` statement: + +```sql +COPY tbl TO 'output.parquet' (FORMAT PARQUET); +``` + +The result of queries can also be directly exported to a Parquet file: + +```sql +COPY (SELECT * FROM tbl) TO 'output.parquet' (FORMAT PARQUET); +``` + +The flags for setting compression, row group size, etc. are listed in the [Reading and Writing Parquet files]({% link docs/archive/1.0/data/parquet/overview.md %}) page. 
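For instance, a sketch that sets two of those flags (the values below are placeholders, not recommendations) could look like this:

```sql
-- Zstandard-compressed Parquet with roughly 100,000 rows per row group
COPY tbl TO 'output.parquet'
    (FORMAT PARQUET, COMPRESSION zstd, ROW_GROUP_SIZE 100_000);
```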
\ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/parquet_import.md b/docs/archive/1.0/guides/file_formats/parquet_import.md new file mode 100644 index 00000000000..57db4823d9a --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/parquet_import.md @@ -0,0 +1,40 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/parquet_import +title: Parquet Import +--- + +To read data from a Parquet file, use the `read_parquet` function in the `FROM` clause of a query: + +```sql +SELECT * FROM read_parquet('input.parquet'); +``` + +Alternatively, you can omit the `read_parquet` function and let DuckDB infer it from the extension: + +```sql +SELECT * FROM 'input.parquet'; +``` + +To create a new table using the result from a query, use [`CREATE TABLE ... AS SELECT` statement]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas): + +```sql +CREATE TABLE new_tbl AS + SELECT * FROM read_parquet('input.parquet'); +``` + +To load data into an existing table from a query, use `INSERT INTO` from a `SELECT` statement: + +```sql +INSERT INTO tbl + SELECT * FROM read_parquet('input.parquet'); +``` + +Alternatively, the `COPY` statement can also be used to load data from a Parquet file into an existing table: + +```sql +COPY tbl FROM 'input.parquet' (FORMAT PARQUET); +``` + +For additional options, see the [Parquet loading reference]({% link docs/archive/1.0/data/parquet/overview.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/query_parquet.md b/docs/archive/1.0/guides/file_formats/query_parquet.md new file mode 100644 index 00000000000..2f441fedd35 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/query_parquet.md @@ -0,0 +1,16 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/query_parquet +title: Querying Parquet Files +--- + +To run a query directly on a Parquet file, use the `read_parquet` function in the `FROM` clause of a query. + +```sql +SELECT * FROM read_parquet('input.parquet'); +``` + +The Parquet file will be processed in parallel. Filters will be automatically pushed down into the Parquet scan, and only the relevant columns will be read automatically. + +For more information see the blog post [“Querying Parquet with Precision using DuckDB”](/2021/06/25/querying-parquet). \ No newline at end of file diff --git a/docs/archive/1.0/guides/file_formats/read_file.md b/docs/archive/1.0/guides/file_formats/read_file.md new file mode 100644 index 00000000000..2927b6eda40 --- /dev/null +++ b/docs/archive/1.0/guides/file_formats/read_file.md @@ -0,0 +1,65 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/read_file +title: Directly Reading Files +--- + +DuckDB allows directly reading files via the [`read_text`](#read_text) and [`read_blob`](#read_blob) functions. +These functions accept a filename, a list of filenames or a glob pattern, and output the content of each file as a `VARCHAR` or `BLOB`, respectively, as well as additional metadata such as the file size and last modified time. + +## `read_text` + +The `read_text` table function reads from the selected source(s) to a `VARCHAR`. Each file results in a single row with the `content` field holding the entire content of the respective file. 
+ +```sql +SELECT size, parse_path(filename), content +FROM read_text('test/sql/table_function/files/*.txt'); +``` + +| size | parse_path(filename) | content | +|-----:|-----------------------------------------------|------------------| +| 12 | [test, sql, table_function, files, one.txt] | Hello World! | +| 2 | [test, sql, table_function, files, three.txt] | 42 | +| 10 | [test, sql, table_function, files, two.txt] | Foo Bar\nFöö Bär | + +The file content is first validated to be valid UTF-8. If `read_text` attempts to read a file with invalid UTF-8, an error is thrown suggesting to use [`read_blob`](#read_blob) instead. + +## `read_blob` + +The `read_blob` table function reads from the selected source(s) to a `BLOB`: + +```sql +SELECT size, content, filename +FROM read_blob('test/sql/table_function/files/*'); +``` + +| size | content | filename | +|-----:|--------------------------------------------------------------|-----------------------------------------| +| 178 | PK\x03\x04\x0A\x00\x00\x00\x00\x00\xACi=X\x14t\xCE\xC7\x0A… | test/sql/table_function/files/four.blob | +| 12 | Hello World! | test/sql/table_function/files/one.txt | +| 2 | 42 | test/sql/table_function/files/three.txt | +| 10 | F\xC3\xB6\xC3\xB6 B\xC3\xA4r | test/sql/table_function/files/two.txt | + +## Schema + +The schemas of the tables returned by `read_text` and `read_blob` are identical: + +```sql +DESCRIBE FROM read_text('README.md'); +``` + +| column_name | column_type | null | key | default | extra | +|---------------|-------------|------|------|---------|-------| +| filename | VARCHAR | YES | NULL | NULL | NULL | +| content | VARCHAR | YES | NULL | NULL | NULL | +| size | BIGINT | YES | NULL | NULL | NULL | +| last_modified | TIMESTAMP | YES | NULL | NULL | NULL | + +## Handling Missing Metadata + +In cases where the underlying filesystem is unable to provide some of this data due (e.g., because HTTPFS can't always return a valid timestamp), the cell is set to `NULL` instead. + +## Support for Projection Pushdown + +The table functions also utilize projection pushdown to avoid computing properties unnecessarily. So you could e.g., use this to glob a directory full of huge files to get the file size in the size column, as long as you omit the content column the data won't be read into DuckDB. \ No newline at end of file diff --git a/docs/archive/1.0/guides/glossary.md b/docs/archive/1.0/guides/glossary.md new file mode 100644 index 00000000000..1b0c2a12086 --- /dev/null +++ b/docs/archive/1.0/guides/glossary.md @@ -0,0 +1,24 @@ +--- +layout: docu +title: Glossary of Terms +--- + +This page contains a glossary of a few common terms used in DuckDB. + +## Terms + +### In-Process Database Management System + +The DBMS runs in the client application's process instead of running as a separate process, which is common in the traditional client–server setup. An alternative term is **embeddable** database management system. In general, the term _“embedded database management system”_ should be avoided, as it can be confused with DBMSs targeting _embedded systems_ (which run on e.g., microcontrollers). + +### Replacement Scan + +In DuckDB, replacement scans are used when a table name used by a query does not exist in the catalog. These scans can substitute another data source instead of the table. 
Using replacement scans allows DuckDB to, e.g., seamlessly read [Pandas DataFrames]({% link docs/archive/1.0/guides/python/sql_on_pandas.md %}) or read input data from remote sources without explicitly invoking the functions that perform this (e.g., [reading Parquet files from https]({% link docs/archive/1.0/guides/network_cloud_storage/http_import.md %})). For details, see the [C API - Replacement Scans page]({% link docs/archive/1.0/api/c/replacement_scans.md %}). + +### Extension + +DuckDB has a flexible extension mechanism that allows for dynamically loading extensions. These may extend DuckDB's functionality by providing support for additional file formats, introducing new types, and domain-specific functionality. For details, see the [Extensions page]({% link docs/archive/1.0/extensions/overview.md %}). + +### Platform + +The platform is a combination of the operating system (e.g., Linux, macOS, Windows), system architecture (e.g., AMD64, ARM64), and, optionally, the compiler used (e.g., GCC4). Platforms are used to distributed DuckDB binaries and [extension packages]({% link docs/archive/1.0/extensions/working_with_extensions.md %}#platforms). \ No newline at end of file diff --git a/docs/archive/1.0/guides/meta/describe.md b/docs/archive/1.0/guides/meta/describe.md new file mode 100644 index 00000000000..aa58358a1e3 --- /dev/null +++ b/docs/archive/1.0/guides/meta/describe.md @@ -0,0 +1,75 @@ +--- +layout: docu +title: Describe +--- + +## Describing a Table + +In order to view the schema of a table, use `DESCRIBE` or `SHOW` followed by the table name. + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j VARCHAR); +DESCRIBE tbl; +SHOW tbl; -- equivalent to DESCRIBE tbl; +``` + +
+ +| column_name | column_type | null | key | default | extra | +|-------------|-------------|------|------|---------|-------| +| i | INTEGER | NO | PRI | NULL | NULL | +| j | VARCHAR | YES | NULL | NULL | NULL | + +## Describing a Query + +In order to view the schema of the result of a query, prepend `DESCRIBE` to a query. + +```sql +DESCRIBE SELECT * FROM tbl; +``` + +
+ +| column_name | column_type | null | key | default | extra | +|-------------|-------------|------|------|---------|-------| +| i | INTEGER | YES | NULL | NULL | NULL | +| j | VARCHAR | YES | NULL | NULL | NULL | + +Note that there are subtle differences: compared to the result when [describing a table](#describing-a-table), nullability (`null`) and key information (`key`) are lost. + +## Using `DESCRIBE` in a Subquery + +`DESCRIBE` can be used a subquery. This allows creating a table from the description, for example: + +```sql +CREATE TABLE tbl_description AS SELECT * FROM (DESCRIBE tbl); +``` + +## Describing Remote Tables + +It is possible to describe remote tables via the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) using the `DESCRIBE TABLE` statement. For example: + +```sql +DESCRIBE TABLE 'https://blobs.duckdb.org/data/Star_Trek-Season_1.csv'; +``` + +| column_name | column_type | null | key | default | extra | +|-----------------------------------------|-------------|------|------|---------|-------| +| season_num | BIGINT | YES | NULL | NULL | NULL | +| episode_num | BIGINT | YES | NULL | NULL | NULL | +| aired_date | DATE | YES | NULL | NULL | NULL | +| cnt_kirk_hookups | BIGINT | YES | NULL | NULL | NULL | +| cnt_downed_redshirts | BIGINT | YES | NULL | NULL | NULL | +| bool_aliens_almost_took_over_planet | BIGINT | YES | NULL | NULL | NULL | +| bool_aliens_almost_took_over_enterprise | BIGINT | YES | NULL | NULL | NULL | +| cnt_vulcan_nerve_pinch | BIGINT | YES | NULL | NULL | NULL | +| cnt_warp_speed_orders | BIGINT | YES | NULL | NULL | NULL | +| highest_warp_speed_issued | BIGINT | YES | NULL | NULL | NULL | +| bool_hand_phasers_fired | BIGINT | YES | NULL | NULL | NULL | +| bool_ship_phasers_fired | BIGINT | YES | NULL | NULL | NULL | +| bool_ship_photon_torpedos_fired | BIGINT | YES | NULL | NULL | NULL | +| cnt_transporter_pax | BIGINT | YES | NULL | NULL | NULL | +| cnt_damn_it_jim_quote | BIGINT | YES | NULL | NULL | NULL | +| cnt_im_givin_her_all_shes_got_quote | BIGINT | YES | NULL | NULL | NULL | +| cnt_highly_illogical_quote | BIGINT | YES | NULL | NULL | NULL | +| bool_enterprise_saved_the_day | BIGINT | YES | NULL | NULL | NULL | \ No newline at end of file diff --git a/docs/archive/1.0/guides/meta/duckdb_environment.md b/docs/archive/1.0/guides/meta/duckdb_environment.md new file mode 100644 index 00000000000..8e60dcc7758 --- /dev/null +++ b/docs/archive/1.0/guides/meta/duckdb_environment.md @@ -0,0 +1,82 @@ +--- +layout: docu +title: DuckDB Environment +--- + +DuckDB provides a number of functions and `PRAGMA` options to retrieve information on the running DuckDB instance and its environment. + +## Version + +The `version()` function returns the version number of DuckDB. + +```sql +SELECT version() AS version; +``` + +
+ +| version | +|-----------| +| v{{ site.currentduckdbversion }} | + +Using a `PRAGMA`: + +```sql +PRAGMA version; +``` + +
+ +| library_version | source_id | +|-----------------|------------| +| v{{ site.currentduckdbversion }} | {{ site.currentduckdbhash }} | + +## Platform + +The platform information consists of the operating system, system architecture, and, optionally, the compiler. +The platform is used when [installing extensions]({% link docs/archive/1.0/extensions/working_with_extensions.md %}#platforms). +To retrieve the platform, use the following `PRAGMA`: + +```sql +PRAGMA platform; +``` + +On macOS, running on Apple Silicon architecture, the result is: + +| platform | +|-----------| +| osx_arm64 | + +On Windows, running on an AMD64 architecture, the platform is `windows_amd64`. +On CentOS 7, running on the AMD64 architecture, the platform is `linux_amd64_gcc4`. +On Ubuntu 22.04, running on the ARM64 architecture, the platform is `linux_arm64`. + +## Extensions + +To get a list of DuckDB extension and their status (e.g., `loaded`, `installed`), use the [`duckdb_extensions()` function]({% link docs/archive/1.0/extensions/overview.md %}#listing-extensions): + +```sql +SELECT * +FROM duckdb_extensions(); +``` + +## Meta Table Functions + +DuckDB has the following built-in table functions to obtain metadata about available catalog objects: + +* [`duckdb_columns()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_columns): columns +* [`duckdb_constraints()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_constraints): constraints +* [`duckdb_databases()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_databases): lists the databases that are accessible from within the current DuckDB process +* [`duckdb_dependencies()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_dependencies): dependencies between objects +* [`duckdb_extensions()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_extensions): extensions +* [`duckdb_functions()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_functions): functions +* [`duckdb_indexes()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_indexes): secondary indexes +* [`duckdb_keywords()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_keywords): DuckDB's keywords and reserved words +* [`duckdb_optimizers()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_optimizers): the available optimization rules in the DuckDB instance +* [`duckdb_schemas()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_schemas): schemas +* [`duckdb_sequences()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_sequences): sequences +* [`duckdb_settings()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_settings): settings +* [`duckdb_tables()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_tables): base tables +* [`duckdb_types()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_types): data types +* [`duckdb_views()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_views): views +* [`duckdb_temporary_files()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_temporary_files): the temporary files DuckDB has written to disk, to offload data from memory \ No newline at end of file diff --git a/docs/archive/1.0/guides/meta/explain.md b/docs/archive/1.0/guides/meta/explain.md new file mode 100644 index 00000000000..ec459484abd --- /dev/null +++ 
b/docs/archive/1.0/guides/meta/explain.md @@ -0,0 +1,115 @@ +--- +layout: docu +title: 'EXPLAIN: Inspect Query Plans' +--- + +In order to view the query plan of a query, prepend `EXPLAIN` to a query. + +```sql +EXPLAIN SELECT * FROM tbl; +``` + +By default only the final physical plan is shown. In order to see the unoptimized and optimized logical plans, change the `explain_output` setting: + +```sql +SET explain_output = 'all'; +``` + +Below is an example of running `EXPLAIN` on [`Q13`](https://github.com/duckdb/duckdb/blob/main/extension/tpch/dbgen/queries/q13.sql) of the [TPC-H benchmark]({% link docs/archive/1.0/extensions/tpch.md %}) on the scale factor 1 data set. + +```sql +EXPLAIN + SELECT + c_count, + count(*) AS custdist + FROM ( + SELECT + c_custkey, + count(o_orderkey) + FROM + customer + LEFT OUTER JOIN orders ON c_custkey = o_custkey + AND o_comment NOT LIKE '%special%requests%' + GROUP BY c_custkey + ) AS c_orders (c_custkey, c_count) + GROUP BY + c_count + ORDER BY + custdist DESC, + c_count DESC; +``` + +```text +┌─────────────────────────────┐ +│┌───────────────────────────┐│ +││ Physical Plan ││ +│└───────────────────────────┘│ +└─────────────────────────────┘ +┌───────────────────────────┐ +│ ORDER_BY │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ ORDERS: │ +│ count_star() DESC │ +│ c_orders.c_count DESC │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HASH_GROUP_BY │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ #0 │ +│ count_star() │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ c_count │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ count(o_orderkey) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HASH_GROUP_BY │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ #0 │ +│ count(#1) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ c_custkey │ +│ o_orderkey │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HASH_JOIN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ RIGHT │ +│ o_custkey = c_custkey ├──────────────┐ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ +│ EC: 300000 │ │ +└─────────────┬─────────────┘ │ +┌─────────────┴─────────────┐┌─────────────┴─────────────┐ +│ FILTER ││ SEQ_SCAN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ (o_comment !~~ '%special ││ customer │ +│ %requests%') ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ c_custkey │ +│ EC: 300000 ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ ││ EC: 150000 │ +└─────────────┬─────────────┘└───────────────────────────┘ +┌─────────────┴─────────────┐ +│ SEQ_SCAN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ orders │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ o_custkey │ +│ o_comment │ +│ o_orderkey │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ EC: 1500000 │ +└───────────────────────────┘ +``` + +## See Also + +For more information, see the [Profiling page]({% link docs/archive/1.0/dev/profiling.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/meta/explain_analyze.md b/docs/archive/1.0/guides/meta/explain_analyze.md new file mode 100644 index 00000000000..b7ba8f0e27c --- /dev/null +++ b/docs/archive/1.0/guides/meta/explain_analyze.md @@ -0,0 +1,152 @@ +--- +layout: docu +title: 'EXPLAIN ANALYZE: Profile Queries' +--- + +In order to profile a query, prepend `EXPLAIN ANALYZE` to a query. + +```sql +EXPLAIN ANALYZE SELECT * FROM tbl; +``` + +The query plan will be pretty-printed to the screen using timings for every operator. 
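+
+If you prefer to capture these measurements in a file rather than reading them off the screen, DuckDB's profiling settings can redirect the output. The following is a minimal sketch (the file name and setting values are illustrative; the full set of options is documented on the Profiling page linked below):
+
+```sql
+-- Sketch: write per-query profiling information to a file instead of the screen.
+PRAGMA enable_profiling = 'json';
+PRAGMA profiling_output = 'profile.json';
+SELECT count(*) FROM tbl;   -- the profile of this query is written to profile.json
+PRAGMA disable_profiling;
+```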
+ +Note that the **cumulative** wall-clock time that is spent on every operator is shown. When multiple threads are processing the query in parallel, the total processing time of the query may be lower than the sum of all the times spent on the individual operators. + +Below is an example of running `EXPLAIN ANALYZE` on [`Q13`](https://github.com/duckdb/duckdb/blob/main/extension/tpch/dbgen/queries/q13.sql) of the [TPC-H benchmark]({% link docs/archive/1.0/extensions/tpch.md %}) on the scale factor 1 data set. + +```sql +EXPLAIN ANALYZE + SELECT + c_count, + count(*) AS custdist + FROM ( + SELECT + c_custkey, + count(o_orderkey) + FROM + customer + LEFT OUTER JOIN orders ON c_custkey = o_custkey + AND o_comment NOT LIKE '%special%requests%' + GROUP BY c_custkey + ) AS c_orders (c_custkey, c_count) + GROUP BY + c_count + ORDER BY + custdist DESC, + c_count DESC; +``` + +```text +┌─────────────────────────────────────┐ +│┌───────────────────────────────────┐│ +││ Total Time: 0.0487s ││ +│└───────────────────────────────────┘│ +└─────────────────────────────────────┘ +┌───────────────────────────┐ +│ RESULT_COLLECTOR │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 0 │ +│ (0.00s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ EXPLAIN_ANALYZE │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 0 │ +│ (0.00s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ ORDER_BY │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ ORDERS: │ +│ count_star() DESC │ +│ c_orders.c_count DESC │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 42 │ +│ (0.00s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HASH_GROUP_BY │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ #0 │ +│ count_star() │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 42 │ +│ (0.00s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ c_count │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 150000 │ +│ (0.00s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ count(o_orderkey) │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 150000 │ +│ (0.00s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HASH_GROUP_BY │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ #0 │ +│ count(#1) │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 150000 │ +│ (0.09s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ PROJECTION │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ c_custkey │ +│ o_orderkey │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 1534302 │ +│ (0.00s) │ +└─────────────┬─────────────┘ +┌─────────────┴─────────────┐ +│ HASH_JOIN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ RIGHT │ +│ o_custkey = c_custkey │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ├──────────────┐ +│ EC: 300000 │ │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ +│ 1534302 │ │ +│ (0.08s) │ │ +└─────────────┬─────────────┘ │ +┌─────────────┴─────────────┐┌─────────────┴─────────────┐ +│ FILTER ││ SEQ_SCAN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ (o_comment !~~ '%special ││ customer │ +│ %requests%') ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ c_custkey │ +│ EC: 300000 ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ EC: 150000 │ +│ 1484298 ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ (0.10s) ││ 150000 │ +│ ││ (0.00s) │ +└─────────────┬─────────────┘└───────────────────────────┘ +┌─────────────┴─────────────┐ +│ SEQ_SCAN │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ orders │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ o_custkey │ +│ o_comment │ +│ o_orderkey │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ EC: 1500000 │ +│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ +│ 1500000 │ +│ (0.01s) │ +└───────────────────────────┘ +``` + +## See Also + +For more information, see the [Profiling 
page]({% link docs/archive/1.0/dev/profiling.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/meta/list_tables.md b/docs/archive/1.0/guides/meta/list_tables.md new file mode 100644 index 00000000000..f57bf63ae15 --- /dev/null +++ b/docs/archive/1.0/guides/meta/list_tables.md @@ -0,0 +1,39 @@ +--- +layout: docu +title: List Tables +--- + +The `SHOW TABLES` command can be used to obtain a list of all tables within the selected schema. + +```sql +CREATE TABLE tbl (i INTEGER); +SHOW TABLES; +``` + +
+ +| name | +|------| +| tbl | + +`SHOW` or `SHOW ALL TABLES` can be used to obtain a list of all tables within **all** attached databases and schemas. + +```sql +CREATE TABLE tbl (i INTEGER); +CREATE SCHEMA s1; +CREATE TABLE s1.tbl (v VARCHAR); +SHOW ALL TABLES; +``` + +
+ +| database | schema | table_name | column_names | column_types | temporary | +|----------|--------|------------|--------------|--------------|-----------| +| memory | main | tbl | [i] | [INTEGER] | false | +| memory | s1 | tbl | [v] | [VARCHAR] | false | + +To view the schema of an individual table, use the [`DESCRIBE` command]({% link docs/archive/1.0/guides/meta/describe.md %}). + +## See Also + +The SQL-standard [`information_schema`]({% link docs/archive/1.0/sql/meta/information_schema.md %}) views are also defined. Moreover, DuckDB defines `sqlite_master` and many [PostgreSQL system catalog tables](https://www.postgresql.org/docs/16/catalogs.html) for compatibility with SQLite and PostgreSQL respectively. \ No newline at end of file diff --git a/docs/archive/1.0/guides/meta/summarize.md b/docs/archive/1.0/guides/meta/summarize.md new file mode 100644 index 00000000000..750134ba2d0 --- /dev/null +++ b/docs/archive/1.0/guides/meta/summarize.md @@ -0,0 +1,70 @@ +--- +layout: docu +title: Summarize +--- + +The `SUMMARIZE` command can be used to easily compute a number of aggregates over a table or a query. +The `SUMMARIZE` command launches a query that computes a number of aggregates over all columns (`min`, `max`, `approx_unique`, `avg`, `std`, `q25`, `q50`, `q75`, `count`), and return these along the column name, column type, and the percentage of `NULL` values in the column. + +## Usage + +In order to summarize the contents of a table, use `SUMMARIZE` followed by the table name. + +```sql +SUMMARIZE tbl; +``` + +In order to summarize a query, prepend `SUMMARIZE` to a query. + +```sql +SUMMARIZE SELECT * FROM tbl; +``` + +## Example + +Below is an example of `SUMMARIZE` on the `lineitem` table of TPC-H `SF1` table, generated using the [`tpch` extension]({% link docs/archive/1.0/extensions/tpch.md %}). 
+ +```sql +INSTALL tpch; +LOAD tpch; +CALL dbgen(sf = 1); +``` + +```sql +SUMMARIZE lineitem; +``` + +| column_name | column_type | min | max | approx_unique | avg | std | q25 | q50 | q75 | count | null_percentage | +|-----------------|---------------|-------------|---------------------|---------------|---------------------|----------------------|---------|---------|---------|---------|-----------------| +| l_orderkey | INTEGER | 1 | 6000000 | 1508227 | 3000279.604204982 | 1732187.8734803519 | 1509447 | 2989869 | 4485232 | 6001215 | 0.0% | +| l_partkey | INTEGER | 1 | 200000 | 202598 | 100017.98932999402 | 57735.69082650496 | 49913 | 99992 | 150039 | 6001215 | 0.0% | +| l_suppkey | INTEGER | 1 | 10000 | 10061 | 5000.602606138924 | 2886.9619987306114 | 2501 | 4999 | 7500 | 6001215 | 0.0% | +| l_linenumber | INTEGER | 1 | 7 | 7 | 3.0005757167506912 | 1.7324314036519328 | 2 | 3 | 4 | 6001215 | 0.0% | +| l_quantity | DECIMAL(15,2) | 1.00 | 50.00 | 50 | 25.507967136654827 | 14.426262537016918 | 13 | 26 | 38 | 6001215 | 0.0% | +| l_extendedprice | DECIMAL(15,2) | 901.00 | 104949.50 | 923139 | 38255.138484656854 | 23300.43871096221 | 18756 | 36724 | 55159 | 6001215 | 0.0% | +| l_discount | DECIMAL(15,2) | 0.00 | 0.10 | 11 | 0.04999943011540163 | 0.03161985510812596 | 0 | 0 | 0 | 6001215 | 0.0% | +| l_tax | DECIMAL(15,2) | 0.00 | 0.08 | 9 | 0.04001350893110812 | 0.025816551798842728 | 0 | 0 | 0 | 6001215 | 0.0% | +| l_returnflag | VARCHAR | A | R | 3 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | +| l_linestatus | VARCHAR | F | O | 2 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | +| l_shipdate | DATE | 1992-01-02 | 1998-12-01 | 2516 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | +| l_commitdate | DATE | 1992-01-31 | 1998-10-31 | 2460 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | +| l_receiptdate | DATE | 1992-01-04 | 1998-12-31 | 2549 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | +| l_shipinstruct | VARCHAR | COLLECT COD | TAKE BACK RETURN | 4 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | +| l_shipmode | VARCHAR | AIR | TRUCK | 7 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | +| l_comment | VARCHAR | Tiresias | zzle? furiously iro | 3558599 | NULL | NULL | NULL | NULL | NULL | 6001215 | 0.0% | + +## Using `SUMMARIZE` in a Subquery + +`SUMMARIZE` can be used a subquery. This allows creating a table from the summary, for example: + +```sql +CREATE TABLE tbl_summary AS SELECT * FROM (SUMMARIZE tbl); +``` + +## Summarizing Remote Tables + +It is possible to summarize remote tables via the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) using the `SUMMARIZE TABLE` statement. For example: + +```sql +SUMMARIZE TABLE 'https://blobs.duckdb.org/data/Star_Trek-Season_1.csv'; +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/cloudflare_r2_import.md b/docs/archive/1.0/guides/network_cloud_storage/cloudflare_r2_import.md new file mode 100644 index 00000000000..34a57cf774f --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/cloudflare_r2_import.md @@ -0,0 +1,33 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/cloudflare_r2_import +title: Cloudflare R2 Import +--- + +## Prerequisites + +For Cloudflare R2, the [S3 Compatibility API](https://developers.cloudflare.com/r2/api/s3/api/) allows you to use DuckDB's S3 support to read and write from R2 buckets. 
+ +This requires the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}), which can be installed using the `INSTALL` SQL command. This only needs to be run once. + +## Credentials and Configuration + +You will need to [generate an S3 auth token](https://developers.cloudflare.com/r2/api/s3/tokens/) and create an `R2` secret in DuckDB: + +```sql +CREATE SECRET ( + TYPE R2, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', + ACCOUNT_ID 'your-account-id-here' -- your 33 character hexadecimal account ID +); +``` + +## Querying + +After setting up the R2 credentials, you can query the R2 data using DuckDB's built-in methods, such as `read_csv` or `read_parquet`: + +```sql +SELECT * FROM read_parquet('r2://⟨r2_bucket_name⟩/⟨file⟩'); +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/duckdb_over_https_or_s3.md b/docs/archive/1.0/guides/network_cloud_storage/duckdb_over_https_or_s3.md new file mode 100644 index 00000000000..d12e6e52ba8 --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/duckdb_over_https_or_s3.md @@ -0,0 +1,57 @@ +--- +layout: docu +title: Attach to a DuckDB Database over HTTPS or S3 +--- + +You can establish a read-only connection to a DuckDB instance via HTTPS or the S3 API. + +## Prerequisites + +This guide requires the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}), which can be installed using the `INSTALL httpfs` SQL command. This only needs to be run once. + +## Attaching to a Database over HTTPS + +To connect to a DuckDB database via HTTPS, use the [`ATTACH` statement]({% link docs/archive/1.0/sql/statements/attach.md %}) as follows: + +```sql +LOAD httpfs; +ATTACH 'https://blobs.duckdb.org/databases/stations.duckdb' AS stations_db (READ_ONLY); +``` + +Then, the database can be queried using: + +```sql +SELECT count(*) AS num_stations +FROM stations_db.stations; +``` + +| num_stations | +|-------------:| +| 578 | + +## Attaching to a Database over the S3 API + +To connect to a DuckDB database via the S3 API, [configure the authentication]({% link docs/archive/1.0/guides/network_cloud_storage/s3_import.md %}#credentials-and-configuration) for your bucket (if required). +Then, use the [`ATTACH` statement]({% link docs/archive/1.0/sql/statements/attach.md %}) as follows: + +```sql +LOAD httpfs; +ATTACH 's3://duckdb-blobs/databases/stations.duckdb' AS stations_db (READ_ONLY); +``` + +The database can be queried using: + +```sql +SELECT count(*) AS num_stations +FROM stations_db.stations; +``` + +| num_stations | +|-------------:| +| 578 | + +> Connecting to S3-compatible APIs such as the [Google Cloud Storage (`gs://`)]({% link docs/archive/1.0/guides/network_cloud_storage/gcs_import.md %}#attaching-to-a-database) is also supported. + +## Limitations + +* Only read-only connections are allowed, writing the database via the HTTPS protocol or the S3 API is not possible. 
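+
+As a closing usage sketch: once you are done querying, an attached remote database can be removed from the current session with `DETACH`:
+
+```sql
+-- Detach the read-only remote database attached earlier in this guide.
+DETACH stations_db;
+```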
\ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/gcs_import.md b/docs/archive/1.0/guides/network_cloud_storage/gcs_import.md new file mode 100644 index 00000000000..21a6299963b --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/gcs_import.md @@ -0,0 +1,43 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/gcs_import +title: Google Cloud Storage Import +--- + +## Prerequisites + +The Google Cloud Storage (GCS) can be used via the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}). +This can be installed with the `INSTALL httpfs` SQL command. This only needs to be run once. + +## Credentials and Configuration + +You need to create [HMAC keys](https://console.cloud.google.com/storage/settings;tab=interoperability) and declare them: + +```sql +CREATE SECRET ( + TYPE GCS, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' +); +``` + +## Querying + +After setting up the GCS credentials, you can query the GCS data using: + +```sql +SELECT * +FROM read_parquet('gs://⟨gcs_bucket⟩/⟨file.parquet⟩'); +``` + +## Attaching to a Database + +You can [attach to a database file]({% link docs/archive/1.0/guides/network_cloud_storage/duckdb_over_https_or_s3.md %}) in read-only mode: + +```sql +LOAD httpfs; +ATTACH 'gs://⟨gcs_bucket⟩/⟨file.duckdb⟩' AS ⟨duckdb_database⟩ (READ_ONLY); +``` + +> Databases in Google Cloud Storage can only be attached in read-only mode. \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/http_import.md b/docs/archive/1.0/guides/network_cloud_storage/http_import.md new file mode 100644 index 00000000000..f7341d149bb --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/http_import.md @@ -0,0 +1,42 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/http_import +title: HTTP Parquet Import +--- + +To load a Parquet file over HTTP(S), the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) is required. This can be installed using the `INSTALL` SQL command. This only needs to be run once. 
+ +```sql +INSTALL httpfs; +``` + +To load the `httpfs` extension for usage, use the `LOAD` SQL command: + +```sql +LOAD httpfs; +``` + +After the `httpfs` extension is set up, Parquet files can be read over `http(s)`: + +```sql +SELECT * FROM read_parquet('https://⟨domain⟩/path/to/file.parquet'); +``` + +For example: + +```sql +SELECT * FROM read_parquet('https://duckdb.org/data/prices.parquet'); +``` + +The function `read_parquet` can be omitted if the URL ends with `.parquet`: + +```sql +SELECT * FROM read_parquet('https://duckdb.org/data/holdings.parquet'); +``` + +Moreover, the `read_parquet` function itself can also be omitted thanks to DuckDB's [replacement scan mechanism]({% link docs/archive/1.0/api/c/replacement_scans.md %}): + +```sql +SELECT * FROM 'https://duckdb.org/data/holdings.parquet'; +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/overview.md b/docs/archive/1.0/guides/network_cloud_storage/overview.md new file mode 100644 index 00000000000..ee719aa15fe --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/overview.md @@ -0,0 +1,4 @@ +--- +layout: docu +title: Cloud Storage +--- \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/s3_export.md b/docs/archive/1.0/guides/network_cloud_storage/s3_export.md new file mode 100644 index 00000000000..0fc16ab35d0 --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/s3_export.md @@ -0,0 +1,62 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/s3_export +title: S3 Parquet Export +--- + +To write a Parquet file to S3, the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) is required. This can be installed using the `INSTALL` SQL command. This only needs to be run once. + +```sql +INSTALL httpfs; +``` + +To load the `httpfs` extension for usage, use the `LOAD` SQL command: + +```sql +LOAD httpfs; +``` + +After loading the `httpfs` extension, set up the credentials to write data. Note that the `region` parameter should match the region of the bucket you want to access. + +```sql +CREATE SECRET ( + TYPE S3, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', + REGION 'us-east-1' +); +``` + +> Tip If you get an IO Error (`Connection error for HTTP HEAD`), configure the endpoint explicitly via `ENDPOINT 's3.⟨your-region⟩.amazonaws.com'`. + +Alternatively, use the [`aws` extension]({% link docs/archive/1.0/extensions/aws.md %}) to retrieve the credentials automatically: + +```sql +CREATE SECRET ( + TYPE S3, + PROVIDER CREDENTIAL_CHAIN +); +``` + +After the `httpfs` extension is set up and the S3 credentials are correctly configured, Parquet files can be written to S3 using the following command: + +```sql +COPY ⟨table_name⟩ TO 's3://bucket/file.parquet'; +``` + +Similarly, Google Cloud Storage (GCS) is supported through the Interoperability API. 
You need to create [HMAC keys](https://console.cloud.google.com/storage/settings;tab=interoperability) and provide the credentials as follows: + +```sql +CREATE SECRET ( + TYPE GCS, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' +); +``` + +After setting up the GCS credentials, you can export using: + +```sql +COPY ⟨table_name⟩ TO 'gs://gcs_bucket/file.parquet'; +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/s3_express_one.md b/docs/archive/1.0/guides/network_cloud_storage/s3_express_one.md new file mode 100644 index 00000000000..8cee09fe754 --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/s3_express_one.md @@ -0,0 +1,80 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/s3_express_one +title: S3 Express One +--- + +In late 2023, AWS [announced](https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-s3-express-one-zone-storage-class/) the [S3 Express One Zone](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-one-zone.html), a high-speed variant of traditional S3 buckets. +DuckDB can read S3 Express One buckets using the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}). + +## Credentials and Configuration + +The configuration of S3 Express One buckets is similar to [regular S3 buckets]({% link docs/archive/1.0/guides/network_cloud_storage/s3_import.md %}) with one exception: +we have to specify the endpoint according to the following pattern: + +```text +s3express-⟨availability zone⟩.⟨region⟩.amazonaws.com +``` + +where the `⟨availability zone⟩` (e.g., `use-az5`) can be obtained from the S3 Express One bucket's configuration page and the `⟨region⟩` is the AWS region (e.g., `us-east-1`). + +For example, to allow DuckDB to use an S3 Express One bucket, configure the [Secrets manager]({% link docs/archive/1.0/sql/statements/create_secret.md %}) as follows: + +```sql +CREATE SECRET ( + TYPE S3, + REGION 'us-east-1', + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', + ENDPOINT 's3express-use1-az5.us-east-1.amazonaws.com' +); +``` + +## Instance Location + +For best performance, make sure that the EC2 instance is in the same availability zone as the S3 Express One bucket you are querying. To determine the mapping between zone names and zone IDs, use the `aws ec2 describe-availability-zones` command. + +* Zone name to zone ID mapping: + + ```batch + aws ec2 describe-availability-zones --output json \ + | jq -r '.AvailabilityZones[] | select(.ZoneName == "us-east-1f") | .ZoneId' + ``` + + ```text + use1-az5 + ``` + +* Zone ID to zone name mapping: + + ```batch + aws ec2 describe-availability-zones --output json \ + | jq -r '.AvailabilityZones[] | select(.ZoneId == "use1-az5") | .ZoneName' + ``` + + ```text + us-east-1f + ``` + +## Querying + +You can query the S3 Express One bucket as any other S3 bucket: + +```sql +SELECT * +FROM 's3://express-bucket-name--use1-az5--x-s3/my-file.parquet'; +``` + +## Performance + +We ran two experiments on a `c7gd.12xlarge` instance using the [LDBC SF300 Comments `creationDate` Parquet file](https://blobs.duckdb.org/data/ldbc-sf300-comments-creationDate.parquet) file (also used in the [microbenchmarks of the performance guide]({% link docs/archive/1.0/guides/performance/benchmarks.md %}#data-sets)). + +
+ +| Experiment | File size | Runtime | +|:-----|--:|--:| +| Loading only from Parquet | 4.1 GB | 3.5s | +| Creating local table from Parquet | 4.1 GB | 5.1s | + +The “loading only” variant is running the load as part of an [`EXPLAIN ANALYZE`]({% link docs/archive/1.0/guides/meta/explain_analyze.md %}) statement to measure the runtime without account creating a local table, while the “creating local table” variant uses [`CREATE TABLE ... AS SELECT`]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas) to create a persistent table on the local disk. \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/s3_iceberg_import.md b/docs/archive/1.0/guides/network_cloud_storage/s3_iceberg_import.md new file mode 100644 index 00000000000..a7b1f4cfac3 --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/s3_iceberg_import.md @@ -0,0 +1,60 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/s3_iceberg_import +selected: S3 Iceberg Import +title: S3 Iceberg Import +--- + +## Prerequisites + +To load an Iceberg file from S3, both the [`httpfs`]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) and [`iceberg`]({% link docs/archive/1.0/extensions/iceberg.md %}) extensions are required. They can be installed using the `INSTALL` SQL command. The extensions only need to be installed once. + +```sql +INSTALL httpfs; +INSTALL iceberg; +``` + +To load the extensions for usage, use the `LOAD` command: + +```sql +LOAD httpfs; +LOAD iceberg; +``` + +## Credentials + +After loading the extensions, set up the credentials and S3 region to read data. You may either use an access key and secret, or a token. + +```sql +CREATE SECRET ( + TYPE S3, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', + REGION 'us-east-1' +); +``` + +Alternatively, use the [`aws` extension]({% link docs/archive/1.0/extensions/aws.md %}) to retrieve the credentials automatically: + +```sql +CREATE SECRET ( + TYPE S3, + PROVIDER CREDENTIAL_CHAIN +); +``` + +## Loading Iceberg Tables from S3 + +After the extensions are set up and the S3 credentials are correctly configured, Iceberg table can be read from S3 using the following command: + +```sql +SELECT * +FROM iceberg_scan('s3://⟨bucket⟩/⟨iceberg-table-folder⟩/metadata/⟨id⟩.metadata.json'); +``` + +Note that you need to link directly to the manifest file. Otherwise you'll get an error like this: + +```console +Error: IO Error: Cannot open file "s3://⟨bucket⟩/⟨iceberg-table-folder⟩/metadata/version-hint.text": No such file or directory +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/network_cloud_storage/s3_import.md b/docs/archive/1.0/guides/network_cloud_storage/s3_import.md new file mode 100644 index 00000000000..9d525edeaf7 --- /dev/null +++ b/docs/archive/1.0/guides/network_cloud_storage/s3_import.md @@ -0,0 +1,57 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/s3_import +title: S3 Parquet Import +--- + +## Prerequisites + +To load a Parquet file from S3, the [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) is required. This can be installed using the `INSTALL` SQL command. This only needs to be run once. 
+ +```sql +INSTALL httpfs; +``` + +To load the `httpfs` extension for usage, use the `LOAD` SQL command: + +```sql +LOAD httpfs; +``` + +## Credentials and Configuration + +After loading the `httpfs` extension, set up the credentials and S3 region to read data: + +```sql +CREATE SECRET ( + TYPE S3, + KEY_ID 'AKIAIOSFODNN7EXAMPLE', + SECRET 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY', + REGION 'us-east-1' +); +``` + +> Tip If you get an IO Error (`Connection error for HTTP HEAD`), configure the endpoint explicitly via `ENDPOINT 's3.⟨your-region⟩.amazonaws.com'`. + +Alternatively, use the [`aws` extension]({% link docs/archive/1.0/extensions/aws.md %}) to retrieve the credentials automatically: + +```sql +CREATE SECRET ( + TYPE S3, + PROVIDER CREDENTIAL_CHAIN +); +``` + +## Querying + +After the `httpfs` extension is set up and the S3 configuration is set correctly, Parquet files can be read from S3 using the following command: + +```sql +SELECT * FROM read_parquet('s3://⟨bucket⟩/⟨file⟩'); +``` + +## Google Cloud Storage (GCS) and Cloudflare R2 + +DuckDB can also handle [Google Cloud Storage (GCS)]({% link docs/archive/1.0/guides/network_cloud_storage/gcs_import.md %}) and [Cloudflare R2]({% link docs/archive/1.0/guides/network_cloud_storage/cloudflare_r2_import.md %}) via the S3 API. +See the relevant guides for details. \ No newline at end of file diff --git a/docs/archive/1.0/guides/odbc/general.md b/docs/archive/1.0/guides/odbc/general.md new file mode 100644 index 00000000000..af1971ff30f --- /dev/null +++ b/docs/archive/1.0/guides/odbc/general.md @@ -0,0 +1,353 @@ +--- +layout: docu +title: 'ODBC 101: A Duck Themed Guide to ODBC' +--- + +## What is ODBC? + +[ODBC](https://learn.microsoft.com/en-us/sql/odbc/microsoft-open-database-connectivity-odbc?view=sql-server-ver16) which stands for Open Database Connectivity, is a standard that allows different programs to talk to different databases including, of course, **DuckDB** 🦆. This makes it easier to build programs that work with many different databases, which saves time as developers don't have to write custom code to connect to each database. Instead, they can use the standardized ODBC interface, which reduces development time and costs, and programs are easier to maintain. However, ODBC can be slower than other methods of connecting to a database, such as using a native driver, as it adds an extra layer of abstraction between the application and the database. Furthermore, because DuckDB is column-based and ODBC is row-based, there can be some inefficiencies when using ODBC with DuckDB. + +> There are links throughout this page to the official [Microsoft ODBC documentation](https://learn.microsoft.com/en-us/sql/odbc/reference/odbc-programmer-s-reference?view=sql-server-ver16), which is a great resource for learning more about ODBC. + +## General Concepts + +* [Handles](#handles) +* [Connecting](#connecting) +* [Error Handling and Diagnostics](#error-handling-and-diagnostics) +* [Buffers and Binding](#buffers-and-binding) + +### Handles + +A [handle](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/handles?view=sql-server-ver16) is a pointer to a specific ODBC object which is used to interact with the database. There are several different types of handles, each with a different purpose, these are the environment handle, the connection handle, the statement handle, and the descriptor handle. 
Handles are allocated using the [`SQLAllocHandle`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqlallochandle-function?view=sql-server-ver16) which takes as input the type of handle to allocate, and a pointer to the handle, the driver then creates a new handle of the specified type which it returns to the application. + +The DuckDB ODBC driver has the following handle types. + +#### Environment + +
+ +| **Handle name** | [Environment](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/environment-handles?view=sql-server-ver16) | +| **Type name** | `SQL_HANDLE_ENV` | +| **Description** | Manages the environment settings for ODBC operations, and provides a global context in which to access data. | +| **Use case** | Initializing ODBC, managing driver behavior, resource allocation | +| **Additional information** | Must be [allocated](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/allocating-the-environment-handle?view=sql-server-ver16) once per application upon starting, and freed at the end. | + +#### Connection + +
+ +| **Handle name** | [Connection](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/connection-handles?view=sql-server-ver16) | +| **Type name** | `SQL_HANDLE_DBC` | +| **Description** | Represents a connection to a data source. Used to establish, manage, and terminate connections. Defines both the driver and the data source to use within the driver. | +| **Use case** | Establishing a connection to a database, managing the connection state | +| **Additional information** | Multiple connection handles can be [created](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/allocating-a-connection-handle-odbc?view=sql-server-ver16) as needed, allowing simultaneous connections to multiple data sources. *Note:* Allocating a connection handle does not establish a connection, but must be allocated first, and then used once the connection has been established. | + +#### Statement + +
+ +| **Handle name** | [Statement](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/statement-handles?view=sql-server-ver16) +| **Type name** | `SQL_HANDLE_STMT` +| **Description** | Handles the execution of SQL statements, as well as the returned result sets. +| **Use case** | Executing SQL queries, fetching result sets, managing statement options. +| **Additional information** | To facilitate the execution of concurrent queries, multiple handles can be [allocated](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/allocating-a-statement-handle-odbc?view=sql-server-ver16) per connection. + +#### Descriptor + +
+ +| **Handle name** | [Descriptor](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/descriptor-handles?view=sql-server-ver16) +| **Type name** | `SQL_HANDLE_DESC` +| **Description** | Describes the attributes of a data structure or parameter, and allows the application to specify the structure of data to be bound/retrieved. +| **Use case** | Describing table structures, result sets, binding columns to application buffers +| **Additional information** | Used in situations where data structures need to be explicitly defined, for example during parameter binding or result set fetching. They are automatically allocated when a statement is allocated, but can also be allocated explicitly. + +### Connecting + +The first step is to connect to the data source so that the application can perform database operations. First the application must allocate an environment handle, and then a connection handle. The connection handle is then used to connect to the data source. There are two functions which can be used to connect to a data source, [`SQLDriverConnect`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqldriverconnect-function?view=sql-server-ver16) and [`SQLConnect`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqlconnect-function?view=sql-server-ver16). The former is used to connect to a data source using a connection string, while the latter is used to connect to a data source using a DSN. + +#### Connection String + +A [connection string](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/connection-strings?view=sql-server-ver16) is a string which contains the information needed to connect to a data source. It is formatted as a semicolon separated list of key-value pairs, however DuckDB currently only utilizes the DSN and ignores the rest of the parameters. + +#### DSN + +A DSN (_Data Source Name_) is a string that identifies a database. It can be a file path, URL, or a database name. For example: `C:\Users\me\duckdb.db` and `DuckDB` are both valid DSNs. More information on DSNs can be found on the [“Choosing a Data Source or Driver” page of the SQL Server documentation](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/choosing-a-data-source-or-driver?view=sql-server-ver16). + +### Error Handling and Diagnostics + +All functions in ODBC return a code which represents the success or failure of the function. This allows for easy error handling, as the application can simply check the return code of each function call to determine if it was successful. When unsuccessful, the application can then use the [`SQLGetDiagRec`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqlgetdiagrec-function?view=sql-server-ver16) function to retrieve the error information. The following table defines the [return codes](https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/return-codes-odbc?view=sql-server-ver16): + +| Return code | Description | +|----------|---------------------| +| `SQL_SUCCESS` | The function completed successfully. | +| `SQL_SUCCESS_WITH_INFO` | The function completed successfully, but additional information is available, including a warning | +| `SQL_ERROR` | The function failed. 
| +| `SQL_INVALID_HANDLE` | The handle provided was invalid, indicating a programming error, i.e., when a handle is not allocated before it is used, or is the wrong type | +| `SQL_NO_DATA` | The function completed successfully, but no more data is available | +| `SQL_NEED_DATA` | More data is needed, such as when a parameter data is sent at execution time, or additional connection information is required. | +| `SQL_STILL_EXECUTING` | A function that was asynchronously executed is still executing. | + +### Buffers and Binding + +A buffer is a block of memory used to store data. Buffers are used to store data retrieved from the database, or to send data to the database. Buffers are allocated by the application, and then bound to a column in a result set, or a parameter in a query, using the [`SQLBindCol`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqlbindcol-function?view=sql-server-ver16) and [`SQLBindParameter`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqlbindparameter-function?view=sql-server-ver16) functions. When the application fetches a row from the result set, or executes a query, the data is stored in the buffer. When the application sends a query to the database, the data in the buffer is sent to the database. + +## Setting up an Application + +The following is a step-by-step guide to setting up an application that uses ODBC to connect to a database, execute a query, and fetch the results in `C++`. + +> To install the driver as well as anything else you will need follow these [instructions]({% link docs/archive/1.0/api/odbc/overview.md %}). + +### 1. Include the SQL Header Files + +The first step is to include the SQL header files: + +```cpp +#include +#include +``` + +These files contain the definitions of the ODBC functions, as well as the data types used by ODBC. In order to be able to use these header files you have to have the `unixodbc` package installed: + +On macOS: + +```bash +brew install unixodbc +``` + +On Ubuntu and Debian: + +```bash +sudo apt-get install -y unixodbc-dev +``` + +On Fedora, CentOS, and Red Hat: + +```bash +sudo yum install -y unixODBC-devel +``` + +Remember to include the header file location in your `CFLAGS`. + +For `MAKEFILE`: + +```make +CFLAGS=-I/usr/local/include +# or +CFLAGS=-/opt/homebrew/Cellar/unixodbc/2.3.11/include +``` + +For `CMAKE`: + +```cmake +include_directories(/usr/local/include) +# or +include_directories(/opt/homebrew/Cellar/unixodbc/2.3.11/include) +``` + +You also have to link the library in your `CMAKE` or `MAKEFILE`. +For `CMAKE`: + +```cmake +target_link_libraries(ODBC_application /path/to/duckdb_odbc/libduckdb_odbc.dylib) +``` + +For `MAKEFILE`: + +```make +LDLIBS=-L/path/to/duckdb_odbc/libduckdb_odbc.dylib +``` + +### 2. Define the ODBC Handles and Connect to the Database + +#### 2.a. Connecting with SQLConnect + +Then set up the ODBC handles, allocate them, and connect to the database. First the environment handle is allocated, then the environment is set to ODBC version 3, then the connection handle is allocated, and finally the connection is made to the database. The following code snippet shows how to do this: + +```cpp +SQLHANDLE env; +SQLHANDLE dbc; + +SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env); + +SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (void*)SQL_OV_ODBC3, 0); + +SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc); + +std::string dsn = "DSN=duckdbmemory"; +SQLConnect(dbc, (SQLCHAR*)dsn.c_str(), SQL_NTS, NULL, 0, NULL, 0); + +std::cout << "Connected!" 
<< std::endl; +``` + +#### 2.b. Connecting with SQLDriverConnect + +Alternatively, you can connect to the ODBC driver using [`SQLDriverConnect`](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqldriverconnect-function?view=sql-server-ver16). +`SQLDriverConnect` accepts a connection string in which you can configure the database using any of the available [DuckDB configuration options]({% link docs/archive/1.0/configuration/overview.md %}). + +```cpp +SQLHANDLE env; +SQLHANDLE dbc; + +SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env); + +SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (void*)SQL_OV_ODBC3, 0); + +SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc); + +SQLCHAR str[1024]; +SQLSMALLINT strl; +std::string dsn = "DSN=DuckDB;allow_unsigned_extensions=true;access_mode=READ_ONLY" +SQLDriverConnect(dbc, nullptr, (SQLCHAR*)dsn.c_str(), SQL_NTS, str, sizeof(str), &strl, SQL_DRIVER_COMPLETE) + +std::cout << "Connected!" << std::endl; +``` + +### 3. Adding a Query + +Now that the application is set up, we can add a query to it. First, we need to allocate a statement handle: + +```cpp +SQLHANDLE stmt; +SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt); +``` + +Then we can execute a query: + +```cpp +SQLExecDirect(stmt, (SQLCHAR*)"SELECT * FROM integers", SQL_NTS); +``` + +### 4. Fetching Results + +Now that we have executed a query, we can fetch the results. First, we need to bind the columns in the result set to buffers: + +```cpp +SQLLEN int_val; +SQLLEN null_val; +SQLBindCol(stmt, 1, SQL_C_SLONG, &int_val, 0, &null_val); +``` + +Then we can fetch the results: + +```cpp +SQLFetch(stmt); +``` + +### 5. Go Wild + +Now that we have the results, we can do whatever we want with them. For example, we can print them: + +```cpp +std::cout << "Value: " << int_val << std::endl; +``` + +or do any other processing we want. As well as executing more queries and doing any thing else we want to do with the database such as inserting, updating, or deleting data. + +### 6. Free the Handles and Disconnecting + +Finally, we need to free the handles and disconnect from the database. First, we need to free the statement handle: + +```cpp +SQLFreeHandle(SQL_HANDLE_STMT, stmt); +``` + +Then we need to disconnect from the database: + +```cpp +SQLDisconnect(dbc); +``` + +And finally, we need to free the connection handle and the environment handle: + +```cpp +SQLFreeHandle(SQL_HANDLE_DBC, dbc); +SQLFreeHandle(SQL_HANDLE_ENV, env); +``` + +Freeing the connection and environment handles can only be done after the connection to the database has been closed. Trying to free them before disconnecting from the database will result in an error. + +## Sample Application + +The following is a sample application that includes a `cpp` file that connects to the database, executes a query, fetches the results, and prints them. It also disconnects from the database and frees the handles, and includes a function to check the return value of ODBC functions. It also includes a `CMakeLists.txt` file that can be used to build the application. 
+ +### Sample `.cpp` file + +```cpp +#include +#include +#include + +void check_ret(SQLRETURN ret, std::string msg) { + if (ret != SQL_SUCCESS && ret != SQL_SUCCESS_WITH_INFO) { + std::cout << ret << ": " << msg << " failed" << std::endl; + exit(1); + } + if (ret == SQL_SUCCESS_WITH_INFO) { + std::cout << ret << ": " << msg << " succeeded with info" << std::endl; + } +} + +int main() { + SQLHANDLE env; + SQLHANDLE dbc; + SQLRETURN ret; + + ret = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env); + check_ret(ret, "SQLAllocHandle(env)"); + + ret = SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (void*)SQL_OV_ODBC3, 0); + check_ret(ret, "SQLSetEnvAttr"); + + ret = SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc); + check_ret(ret, "SQLAllocHandle(dbc)"); + + std::string dsn = "DSN=duckdbmemory"; + ret = SQLConnect(dbc, (SQLCHAR*)dsn.c_str(), SQL_NTS, NULL, 0, NULL, 0); + check_ret(ret, "SQLConnect"); + + std::cout << "Connected!" << std::endl; + + SQLHANDLE stmt; + ret = SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt); + check_ret(ret, "SQLAllocHandle(stmt)"); + + ret = SQLExecDirect(stmt, (SQLCHAR*)"SELECT * FROM integers", SQL_NTS); + check_ret(ret, "SQLExecDirect(SELECT * FROM integers)"); + + SQLLEN int_val; + SQLLEN null_val; + ret = SQLBindCol(stmt, 1, SQL_C_SLONG, &int_val, 0, &null_val); + check_ret(ret, "SQLBindCol"); + + ret = SQLFetch(stmt); + check_ret(ret, "SQLFetch"); + + std::cout << "Value: " << int_val << std::endl; + + ret = SQLFreeHandle(SQL_HANDLE_STMT, stmt); + check_ret(ret, "SQLFreeHandle(stmt)"); + + ret = SQLDisconnect(dbc); + check_ret(ret, "SQLDisconnect"); + + ret = SQLFreeHandle(SQL_HANDLE_DBC, dbc); + check_ret(ret, "SQLFreeHandle(dbc)"); + + ret = SQLFreeHandle(SQL_HANDLE_ENV, env); + check_ret(ret, "SQLFreeHandle(env)"); +} +``` + +### Sample `CMakelists.txt` file + +```cmake +cmake_minimum_required(VERSION 3.25) +project(ODBC_Tester_App) + +set(CMAKE_CXX_STANDARD 17) +include_directories(/opt/homebrew/Cellar/unixodbc/2.3.11/include) + +add_executable(ODBC_Tester_App main.cpp) +target_link_libraries(ODBC_Tester_App /duckdb_odbc/libduckdb_odbc.dylib) +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/offline-copy.md b/docs/archive/1.0/guides/offline-copy.md new file mode 100644 index 00000000000..a56dfa8af44 --- /dev/null +++ b/docs/archive/1.0/guides/offline-copy.md @@ -0,0 +1,16 @@ +--- +layout: docu +title: Browse Offline +--- + +You can browse the DuckDB documentation offline in two formats: + +* As a [single PDF file](/duckdb-docs.pdf) (approx. 4 MB) + +* As a [website packaged in a single ZIP file](/duckdb-docs.zip) (approx. 50 MB). To browse the website locally, decompress the package, navigate to the `duckdb-docs` directory, and run: + + ```bash + python -m http.server + ``` + + Then, connect to . \ No newline at end of file diff --git a/docs/archive/1.0/guides/overview.md b/docs/archive/1.0/guides/overview.md new file mode 100644 index 00000000000..2a400e87260 --- /dev/null +++ b/docs/archive/1.0/guides/overview.md @@ -0,0 +1,124 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides +- /docs/archive/1.0/guides/index +title: Guides +--- + +The guides section contains compact how-to guides that are focused on achieving a single goal. For an API references and examples, see the rest of the documentation. + +Note that there are many tools using DuckDB, which are not covered in the official guides. To find a list of these tools, check out the [Awesome DuckDB repository](https://github.com/davidgasquez/awesome-duckdb). 
+ +> Tip For a short introductory tutorial, check out the [“Analyzing Railway Traffic in the Netherlands”]({% post_url 2024-05-31-analyzing-railway-traffic-in-the-netherlands %}) tutorial. + +## Data Import and Export + +* [Data import overview]({% link docs/archive/1.0/guides/file_formats/overview.md %}) + +### CSV Files + +* [How to load a CSV file into a table]({% link docs/archive/1.0/guides/file_formats/csv_import.md %}) +* [How to export a table to a CSV file]({% link docs/archive/1.0/guides/file_formats/csv_export.md %}) + +### Parquet Files + +* [How to load a Parquet file into a table]({% link docs/archive/1.0/guides/file_formats/parquet_import.md %}) +* [How to export a table to a Parquet file]({% link docs/archive/1.0/guides/file_formats/parquet_export.md %}) +* [How to run a query directly on a Parquet file]({% link docs/archive/1.0/guides/file_formats/query_parquet.md %}) + +### HTTP(S), S3 and GCP + +* [How to load a Parquet file directly from HTTP(S)]({% link docs/archive/1.0/guides/network_cloud_storage/http_import.md %}) +* [How to load a Parquet file directly from S3]({% link docs/archive/1.0/guides/network_cloud_storage/s3_import.md %}) +* [How to export a Parquet file to S3]({% link docs/archive/1.0/guides/network_cloud_storage/s3_export.md %}) +* [How to load a Parquet file from S3 Express One]({% link docs/archive/1.0/guides/network_cloud_storage/s3_express_one.md %}) +* [How to load a Parquet file directly from GCS]({% link docs/archive/1.0/guides/network_cloud_storage/gcs_import.md %}) +* [How to load a Parquet file directly from Cloudflare R2]({% link docs/archive/1.0/guides/network_cloud_storage/cloudflare_r2_import.md %}) +* [How to load an Iceberg table directly from S3]({% link docs/archive/1.0/guides/network_cloud_storage/s3_iceberg_import.md %}) + +### JSON Files + +* [How to load a JSON file into a table]({% link docs/archive/1.0/guides/file_formats/json_import.md %}) +* [How to export a table to a JSON file]({% link docs/archive/1.0/guides/file_formats/json_export.md %}) + +### Excel Files with the Spatial Extension + +* [How to load an Excel file into a table]({% link docs/archive/1.0/guides/file_formats/excel_import.md %}) +* [How to export a table to an Excel file]({% link docs/archive/1.0/guides/file_formats/excel_export.md %}) + +### Querying Other Database Systems + +* [How to directly query a MySQL database]({% link docs/archive/1.0/guides/database_integration/mysql.md %}) +* [How to directly query a PostgreSQL database]({% link docs/archive/1.0/guides/database_integration/postgres.md %}) +* [How to directly query a SQLite database]({% link docs/archive/1.0/guides/database_integration/sqlite.md %}) + +### Directly Reading Files + +* [How to directly read a binary file]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_blob) +* [How to directly read a text file]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_text) + +## Performance + +* [My workload is slow (troubleshooting guide)]({% link docs/archive/1.0/guides/performance/my_workload_is_slow.md %}) +* [How to design the schema for optimal performance]({% link docs/archive/1.0/guides/performance/schema.md %}) +* [What is the ideal hardware environment for DuckDB]({% link docs/archive/1.0/guides/performance/environment.md %}) +* [What performance implications do Parquet files and (compressed) CSV files have]({% link docs/archive/1.0/guides/performance/file_formats.md %}) +* [How to tune workloads]({% link docs/archive/1.0/guides/performance/how_to_tune_workloads.md 
%}) +* [Benchmarks]({% link docs/archive/1.0/guides/performance/benchmarks.md %}) + +## Meta Queries + +* [How to list all tables]({% link docs/archive/1.0/guides/meta/list_tables.md %}) +* [How to view the schema of the result of a query]({% link docs/archive/1.0/guides/meta/describe.md %}) +* [How to quickly get a feel for a dataset using summarize]({% link docs/archive/1.0/guides/meta/summarize.md %}) +* [How to view the query plan of a query]({% link docs/archive/1.0/guides/meta/explain.md %}) +* [How to profile a query]({% link docs/archive/1.0/guides/meta/explain_analyze.md %}) + +## ODBC + +* [How to set up an ODBC application (and more!)]({% link docs/archive/1.0/guides/odbc/general.md %}) + +## Python Client + +* [How to install the Python client]({% link docs/archive/1.0/guides/python/install.md %}) +* [How to execute SQL queries]({% link docs/archive/1.0/guides/python/execute_sql.md %}) +* [How to easily query DuckDB in Jupyter Notebooks]({% link docs/archive/1.0/guides/python/jupyter.md %}) +* [How to use Multiple Python Threads with DuckDB]({% link docs/archive/1.0/guides/python/multiple_threads.md %}) +* [How to use fsspec filesystems with DuckDB]({% link docs/archive/1.0/guides/python/filesystems.md %}) + +### Pandas + +* [How to execute SQL on a Pandas DataFrame]({% link docs/archive/1.0/guides/python/sql_on_pandas.md %}) +* [How to create a table from a Pandas DataFrame]({% link docs/archive/1.0/guides/python/import_pandas.md %}) +* [How to export data to a Pandas DataFrame]({% link docs/archive/1.0/guides/python/export_pandas.md %}) + +### Apache Arrow + +* [How to execute SQL on Apache Arrow]({% link docs/archive/1.0/guides/python/sql_on_arrow.md %}) +* [How to create a DuckDB table from Apache Arrow]({% link docs/archive/1.0/guides/python/import_arrow.md %}) +* [How to export data to Apache Arrow]({% link docs/archive/1.0/guides/python/export_arrow.md %}) + +### Relational API + +* [How to query Pandas DataFrames with the Relational API]({% link docs/archive/1.0/guides/python/relational_api_pandas.md %}) + +### Python Library Integrations + +* [How to use Ibis to query DuckDB with or without SQL]({% link docs/archive/1.0/guides/python/ibis.md %}) +* [How to use DuckDB with Polars DataFrames via Apache Arrow]({% link docs/archive/1.0/guides/python/polars.md %}) + +## SQL Features + +* [Friendly SQL]({% link docs/archive/1.0/sql/dialect/friendly_sql.md %}) +* [As-of join]({% link docs/archive/1.0/guides/sql_features/asof_join.md %}) +* [Full-text search]({% link docs/archive/1.0/guides/sql_features/full_text_search.md %}) + +## SQL Editors and IDEs + +* [How to set up the DBeaver SQL IDE]({% link docs/archive/1.0/guides/sql_editors/dbeaver.md %}) + +## Data Viewers + +* [How to visualize DuckDB databases with Tableau]({% link docs/archive/1.0/guides/data_viewers/tableau.md %}) +* [How to draw command-line plots with DuckDB and YouPlot]({% link docs/archive/1.0/guides/data_viewers/youplot.md %}) \ No newline at end of file diff --git a/docs/archive/1.0/guides/performance/benchmarks.md b/docs/archive/1.0/guides/performance/benchmarks.md new file mode 100644 index 00000000000..caaf1218da9 --- /dev/null +++ b/docs/archive/1.0/guides/performance/benchmarks.md @@ -0,0 +1,29 @@ +--- +layout: docu +title: Benchmarks +--- + +For several of the recommendations in our performance guide, we use microbenchmarks to back up our claims. 
For these benchmarks, we use data sets from the [TPC-H benchmark]({% link docs/archive/1.0/extensions/tpch.md %}) and the [LDBC Social Network Benchmark’s BI workload](https://github.com/ldbc/ldbc_snb_bi/blob/main/snb-bi-pre-generated-data-sets.md#compressed-csvs-in-the-composite-merged-fk-format).
+
+## Data Sets
+
+Some of the microbenchmarks use the [LDBC BI SF300 data set's Comment table](https://blobs.duckdb.org/data/ldbc-sf300-comments.tar.zst) (20 GB `.tar.zst` archive, 21 GB when decompressed into `.csv.gz` files),
+while others use the same table's [`creationDate` column](https://blobs.duckdb.org/data/ldbc-sf300-comments-creationDate.parquet) (4 GB `.parquet` file).
+
+The TPC data sets used in the benchmark are generated with the DuckDB [tpch extension]({% link docs/archive/1.0/extensions/tpch.md %}).
+
+## A Note on Benchmarks
+
+Running [fair benchmarks is difficult](https://hannes.muehleisen.org/publications/DBTEST2018-performance-testing.pdf), especially when performing system-to-system comparisons.
+When running benchmarks on DuckDB, please make sure you are using the latest version (preferably the [nightly build]({% link docs/archive/1.0/installation/index.html %}?version=main)).
+If in doubt about your benchmark results, feel free to contact us at `gabor@duckdb.org`.
+
+## Disclaimer on Benchmarks
+
+Note that the benchmark results presented in this guide do not constitute official TPC or LDBC benchmark results. Instead, they merely use the data sets and some of the queries provided by the TPC-H and the LDBC BI benchmark frameworks, and omit other parts of the workloads such as updates.
\ No newline at end of file
diff --git a/docs/archive/1.0/guides/performance/environment.md b/docs/archive/1.0/guides/performance/environment.md
new file mode 100644
index 00000000000..bc512a7eed1
--- /dev/null
+++ b/docs/archive/1.0/guides/performance/environment.md
@@ -0,0 +1,34 @@
+---
+layout: docu
+title: Environment
+---
+
+The environment where DuckDB is run has an obvious impact on performance. This page focuses on the effects of the hardware configuration and the operating system used.
+
+## Hardware Configuration
+
+### CPU and Memory
+
+As a rule of thumb, DuckDB requires a **minimum** of 125 MB of memory per thread.
+For example, if you use 8 threads, you need at least 1 GB of memory.
+For ideal performance, aggregation-heavy workloads require approx. 5 GB memory per thread and join-heavy workloads require approximately 10 GB memory per thread.
+
+> Bestpractice Aim for 5-10 GB memory per thread.
+
+> Tip If you have a limited amount of memory, try to [limit the number of threads]({% link docs/archive/1.0/configuration/pragmas.md %}#threads), e.g., by issuing `SET threads = 4;`.
+
+### Disk
+
+DuckDB is capable of operating both as an in-memory and as a disk-based database system. In both cases, it can spill to disk to process larger-than-memory workloads (a.k.a. out-of-core processing), for which a fast disk is highly beneficial. However, if the workload fits in memory, the disk speed only has a limited effect on performance.
+
+In general, network-based storage will result in slower DuckDB workloads than using local disks.
+This includes network disks such as [NFS](https://en.wikipedia.org/wiki/Network_File_System),
+network drives such as [SMB](https://en.wikipedia.org/wiki/Server_Message_Block) and [Samba](https://en.wikipedia.org/wiki/Samba_(software)),
+and network-backed cloud disks such as [AWS EBS](https://aws.amazon.com/ebs/).
+However, different network disks can have vastly varying IO performance, ranging from very slow to almost as fast as local. Therefore, for optimal performance, only use network disks that can provide high IO performance. + +> Bestpractice Fast disks are important if your workload is larger than memory and/or fast data loading is important. Only use network-backed disks if they guarantee high IO. + +## Operating System + +We recommend using the latest stable version of operating systems: macOS, Windows, and Linux are all well-tested and DuckDB can run on them with high performance. Among Linux distributions, we recommended using Ubuntu Linux LTS due to its stability and the fact that most of DuckDB’s Linux test suite jobs run on Ubuntu workers. \ No newline at end of file diff --git a/docs/archive/1.0/guides/performance/file_formats.md b/docs/archive/1.0/guides/performance/file_formats.md new file mode 100644 index 00000000000..9130b26bc0c --- /dev/null +++ b/docs/archive/1.0/guides/performance/file_formats.md @@ -0,0 +1,105 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/performance/file-formats +title: File Formats +--- + +## Handling Parquet Files + +DuckDB has advanced support for Parquet files, which includes [directly querying Parquet files]({% post_url 2021-06-25-querying-parquet %}). +When deciding on whether to query these files directly or to first load them to the database, you need to consider several factors. + +### Reasons for Querying Parquet Files + +**Availability of basic statistics:** Parquet files use a columnar storage format and contain basic statistics such as [zonemaps]({% link docs/archive/1.0/guides/performance/indexing.md %}#zonemaps). Thanks to these features, DuckDB can leverage optimizations such as projection and filter pushdown on Parquet files. Therefore, workloads that combine projection, filtering, and aggregation tend to perform quite well when run on Parquet files. + +**Storage considerations:** Loading the data from Parquet files will require approximately the same amount of space for the DuckDB database file. Therefore, if the available disk space is constrained, it is worth running the queries directly on Parquet files. + +### Reasons against Querying Parquet Files + +**Lack of advanced statistics:** The DuckDB database format has the [hyperloglog statistics](https://en.wikipedia.org/wiki/HyperLogLog) that Parquet files do not have. These improve the accuracy of cardinality estimates, and are especially important if the queries contain a large number of join operators. + +**Tip.** If you find that DuckDB produces a suboptimal join order on Parquet files, try loading the Parquet files to DuckDB tables. The improved statistics likely help obtain a better join order. + +**Repeated queries:** If you plan to run multiple queries on the same data set, it is worth loading the data into DuckDB. The queries will always be somewhat faster, which over time amortizes the initial load time. + +**High decompression times:** Some Parquet files are compressed using heavyweight compression algorithms such as gzip. In these cases, querying the Parquet files will necessitate an expensive decompression time every time the file is accessed. Meanwhile, lightweight compression methods like Snappy, LZ4, and zstd, are faster to decompress. You may use the [`parquet_metadata` function]({% link docs/archive/1.0/data/parquet/metadata.md %}#parquet-metadata) to find out the compression algorithm used. 
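+For example, a query along the following lines lists the compression codec used for each column, which makes it easy to spot heavyweight codecs such as gzip. This is a minimal sketch; the file name is a placeholder:
+
+```sql
+-- 'my_file.parquet' is a hypothetical file name; substitute the file you want to inspect
+SELECT DISTINCT path_in_schema, compression
+FROM parquet_metadata('my_file.parquet');
+```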
+ +#### Microbenchmark: Running TPC-H on a DuckDB Database vs. Parquet + +The queries on the [TPC-H benchmark]({% link docs/archive/1.0/extensions/tpch.md %}) run approximately 1.1-5.0x slower on Parquet files than on a DuckDB database. + +> Bestpractice If you have the storage space available, and have a join-heavy workload and/or plan to run many queries on the same dataset, load the Parquet files into the database first. The compression algorithm and the row group sizes in the Parquet files have a large effect on performance: study these using the [`parquet_metadata` function]({% link docs/archive/1.0/data/parquet/metadata.md %}#parquet-metadata). + +### The Effect of Row Group Sizes + +DuckDB works best on Parquet files with row groups of 100K-1M rows each. The reason for this is that DuckDB can only [parallelize over row groups]({% link docs/archive/1.0/guides/performance/how_to_tune_workloads.md %}#parallelism-multi-core-processing) – so if a Parquet file has a single giant row group it can only be processed by a single thread. You can use the [`parquet_metadata` function]({% link docs/archive/1.0/data/parquet/metadata.md %}#parquet-metadata) to figure out how many row groups a Parquet file has. When writing Parquet files, use the [`row_group_size`]({% link docs/archive/1.0/sql/statements/copy.md %}#parquet-options) option. + +#### Microbenchmark: Running Aggregation Query at Different Row Group Sizes + +We run a simple aggregation query over Parquet files using different row group sizes, selected between 960 and 1,966,080. The results are as follows. + +
+ +| Row group size | Execution time | +|---------------:|---------------:| +| 960 | 8.77s | +| 1920 | 8.95s | +| 3840 | 4.33s | +| 7680 | 2.35s | +| 15360 | 1.58s | +| 30720 | 1.17s | +| 61440 | 0.94s | +| 122880 | 0.87s | +| 245760 | 0.93s | +| 491520 | 0.95s | +| 983040 | 0.97s | +| 1966080 | 0.88s | + +The results show that row group sizes <5,000 have a strongly detrimental effect, making runtimes more than 5-10x larger than ideally-sized row groups, while row group sizes between 5,000 and 20,000 are still 1.5-2.5x off from best performance. Above row group size of 100,000, the differences are small: the gap is about 10% between the best and the worst runtime. + +### Parquet File Sizes + +DuckDB can also parallelize across multiple Parquet files. It is advisable to have at least as many total row groups across all files as there are CPU threads. For example, with a machine having 10 threads, both 10 files with 1 row group or 1 file with 10 row groups will achieve full parallelism. It is also beneficial to keep the size of individual Parquet files moderate. + +> Bestpractice The ideal range is between 100 MB and 10 GB per individual Parquet file. + +### Hive Partitioning for Filter Pushdown + +When querying many files with filter conditions, performance can be improved by using a [Hive-format folder structure]({% link docs/archive/1.0/data/partitioning/hive_partitioning.md %}) to partition the data along the columns used in the filter condition. DuckDB will only need to read the folders and files that meet the filter criteria. This can be especially helpful when querying remote files. + +### More Tips on Reading and Writing Parquet Files + +For tips on reading and writing Parquet files, see the [Parquet Tips page]({% link docs/archive/1.0/data/parquet/tips.md %}). + +## Loading CSV Files + +CSV files are often distributed in compressed format such as GZIP archives (`.csv.gz`). DuckDB can decompress these files on the fly. In fact, this is typically faster than decompressing the files first and loading them due to reduced IO. + +
+
+| Method | Load Time |
+|---|--:|
+| Load from GZIP-compressed CSV files (`.csv.gz`) | 107.1s |
+| Decompressing (using parallel `gunzip`) and loading from decompressed CSV files | 121.3s |
+
+### Loading Many Small CSV Files
+
+The [CSV reader]({% link docs/archive/1.0/data/csv/overview.md %}) runs the [CSV sniffer]({% post_url 2023-10-27-csv-sniffer %}) on all files. For many small files, this may cause an unnecessarily high overhead.
+A potential optimization to speed this up is to turn the sniffer off. Assuming that all files have the same CSV dialect and column names/types, get the sniffer options as follows:
+
+```sql
+.mode line
+SELECT Prompt FROM sniff_csv('part-0001.csv');
+```
+
+```text
+Prompt = FROM read_csv('file_path.csv', auto_detect=false, delim=',', quote='"', escape='"', new_line='\n', skip=0, header=true, columns={'hello': 'BIGINT', 'world': 'VARCHAR'});
+```
+
+Then, you can adjust the `read_csv` command by, e.g., applying [filename expansion (globbing)]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#globbing), and run it with the rest of the options detected by the sniffer:
+
+```sql
+FROM read_csv('part-*.csv', auto_detect=false, delim=',', quote='"', escape='"', new_line='\n', skip=0, header=true, columns={'hello': 'BIGINT', 'world': 'VARCHAR'});
+```
\ No newline at end of file
diff --git a/docs/archive/1.0/guides/performance/how_to_tune_workloads.md b/docs/archive/1.0/guides/performance/how_to_tune_workloads.md
new file mode 100644
index 00000000000..301a4f1ed4e
--- /dev/null
+++ b/docs/archive/1.0/guides/performance/how_to_tune_workloads.md
@@ -0,0 +1,139 @@
+---
+layout: docu
+redirect_from:
+- /docs/archive/1.0/guides/performance/how-to-tune-workloads
+title: Tuning Workloads
+---
+
+## Parallelism (Multi-Core Processing)
+
+### The Effect of Row Groups on Parallelism
+
+DuckDB parallelizes the workload based on _[row groups]({% link docs/archive/1.0/internals/storage.md %}#row-groups),_ i.e., groups of rows that are stored together at the storage level.
+A row group in DuckDB's database format consists of max. 122,880 rows.
+Parallelism starts at the level of row groups; therefore, for a query to run on _k_ threads, it needs to scan at least _k_ * 122,880 rows.
+
+### Too Many Threads
+
+Note that in certain cases DuckDB may launch _too many threads_ (e.g., due to HyperThreading), which can lead to slowdowns. In these cases, it’s worth manually limiting the number of threads using [`SET threads = X`]({% link docs/archive/1.0/configuration/pragmas.md %}#threads).
+
+## Larger-Than-Memory Workloads (Out-of-Core Processing)
+
+A key strength of DuckDB is support for larger-than-memory workloads, i.e., it is able to process data sets that are larger than the available system memory (also known as _out-of-core processing_).
+It can also run queries where the intermediate results cannot fit into memory.
+This section explains the prerequisites, scope, and known limitations of larger-than-memory processing in DuckDB.
+
+### Spilling to Disk
+
+Larger-than-memory workloads are supported by spilling to disk.
+With the default configuration, DuckDB creates the `⟨database_file_name⟩.tmp` temporary directory (in persistent mode) or the `.tmp` directory (in in-memory mode).
This directory can be changed using the [`temp_directory` configuration option]({% link docs/archive/1.0/configuration/pragmas.md %}#temp-directory-for-spilling-data-to-disk), e.g.:
+
+```sql
+SET temp_directory = '/path/to/temp_dir.tmp/';
+```
+
+### Blocking Operators
+
+Some operators cannot output a single row until the last row of their input has been seen.
+These are called _blocking operators_ as they require their entire input to be buffered,
+and are the most memory-intensive operators in relational database systems.
+The main blocking operators are the following:
+
+* _sorting:_ [`ORDER BY`]({% link docs/archive/1.0/sql/query_syntax/orderby.md %}),
+* _grouping:_ [`GROUP BY`]({% link docs/archive/1.0/sql/query_syntax/groupby.md %}),
+* _windowing:_ [`OVER ... (PARTITION BY ... ORDER BY ...)`]({% link docs/archive/1.0/sql/functions/window_functions.md %}),
+* _joining:_ [`JOIN`]({% link docs/archive/1.0/sql/query_syntax/from.md %}#joins).
+
+DuckDB supports larger-than-memory processing for all of these operators.
+
+### Limitations
+
+DuckDB strives to always complete workloads even if they are larger than memory.
+That said, there are some limitations at the moment:
+
+* If multiple blocking operators appear in the same query, DuckDB may still throw an out-of-memory exception due to the complex interplay of these operators.
+* Some [aggregate functions]({% link docs/archive/1.0/sql/functions/aggregates.md %}), such as `list()` and `string_agg()`, do not support offloading to disk.
+* [Aggregate functions that use sorting]({% link docs/archive/1.0/sql/functions/aggregates.md %}#order-by-clause-in-aggregate-functions) are holistic, i.e., they need all inputs before the aggregation can start. As DuckDB cannot yet offload some complex intermediate aggregate states to disk, these functions can cause an out-of-memory exception when run on large data sets.
+* The `PIVOT` operation [internally uses the `list()` function]({% link docs/archive/1.0/sql/statements/pivot.md %}#internals), therefore it is subject to the same limitation.
+
+## Profiling
+
+If your queries are not performing as well as expected, it’s worth studying their query plans:
+
+* Use [`EXPLAIN`]({% link docs/archive/1.0/guides/meta/explain.md %}) to print the physical query plan without running the query.
+* Use [`EXPLAIN ANALYZE`]({% link docs/archive/1.0/guides/meta/explain_analyze.md %}) to run and profile the query. This will show the CPU time that each step in the query takes. Note that due to multi-threading, the sum of the individual times will be larger than the total query processing time.
+
+Query plans can point to the root of performance issues. A few general directions:
+
+* Avoid nested loop joins in favor of hash joins.
+* A scan that does not include a filter pushdown for a filter condition that is later applied performs unnecessary IO. Try rewriting the query to apply a pushdown.
+* Bad join orders where the cardinality of an operator explodes to billions of tuples should be avoided at all costs.
+
+## Prepared Statements
+
+[Prepared statements]({% link docs/archive/1.0/sql/query_syntax/prepared_statements.md %}) can improve performance when running the same query many times, but with different parameters. When a statement is prepared, it completes several of the initial portions of the query execution process (parsing, planning, etc.) and caches their output. When it is executed, those steps can be skipped, improving performance.
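+As a minimal sketch (the `my_table` table and its `score` column are hypothetical), a statement can be prepared once and then executed repeatedly with different parameter values:
+
+```sql
+-- parsing and planning happen once, at PREPARE time
+PREPARE count_above AS
+    SELECT count(*) FROM my_table WHERE score > $1;
+
+-- each EXECUTE skips those steps and only substitutes the parameter
+EXECUTE count_above(10);
+EXECUTE count_above(100);
+```
+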
This is beneficial mostly for repeatedly running small queries (with a runtime of < 100ms) with different sets of parameters. + +Note that it is not a primary design goal for DuckDB to quickly execute many small queries concurrently. Rather, it is optimized for running larger, less frequent queries. + +## Querying Remote Files + +DuckDB uses synchronous IO when reading remote files. This means that each DuckDB thread can make at most one HTTP request at a time. If a query must make many small requests over the network, increasing DuckDB's [`threads` setting]({% link docs/archive/1.0/configuration/pragmas.md %}#threads) to larger than the total number of CPU cores (approx. 2-5 times CPU cores) can improve parallelism and performance. + +### Avoid Reading Unnecessary Data + +The main bottleneck in workloads reading remote files is likely to be the IO. This means that minimizing the unnecessarily read data can be highly beneficial. + +Some basic SQL tricks can help with this: + +* Avoid `SELECT *`. Instead, only select columns that are actually used. DuckDB will try to only download the data it actually needs. +* Apply filters on remote parquet files when possible. DuckDB can use these filters to reduce the amount of data that is scanned. +* Either [sort]({% link docs/archive/1.0/sql/query_syntax/orderby.md %}) or [partition]({% link docs/archive/1.0/data/partitioning/partitioned_writes.md %}) data by columns that are regularly used for filters: this increases the effectiveness of the filters in reducing IO. + +To inspect how much remote data is transferred for a query, [`EXPLAIN ANALYZE`]({% link docs/archive/1.0/guides/meta/explain_analyze.md %}) can be used to print out the total number of requests and total data transferred for queries on remote files. + +### Avoid Reading Data More Than Once + +DuckDB does not cache data from remote files automatically. This means that running a query on a remote file twice will download the required data twice. So if data needs to be accessed multiple times, storing it locally can make sense. To illustrate this, lets look at an example: + +Consider the following queries: + +```sql +SELECT col_a + col_b FROM 's3://bucket/file.parquet' WHERE col_a > 10; +SELECT col_a * col_b FROM 's3://bucket/file.parquet' WHERE col_a > 10; +``` + +These queries download the columns `col_a` and `col_b` from `s3://bucket/file.parquet` twice. Now consider the following queries: + +```sql +CREATE TABLE local_copy_of_file AS + SELECT col_a, col_b FROM 's3://bucket/file.parquet' WHERE col_a > 10; + +SELECT col_a + col_b FROM local_copy_of_file; +SELECT col_a * col_b FROM local_copy_of_file; +``` + +Here DuckDB will first copy `col_a` and `col_b` from `s3://bucket/file.parquet` into a local table, and then query the local in-memory columns twice. Note also that the filter `WHERE col_a > 10` is also now applied only once. + +An important side note needs to be made here though. The first two queries are fully streaming, with only a small memory footprint, whereas the second requires full materialization of columns `col_a` and `col_b`. This means that in some rare cases (e.g., with a high-speed network, but with very limited memory available) it could actually be beneficial to download the data twice. + +## Best Practices for Using Connections + +DuckDB will perform best when reusing the same database connection many times. Disconnecting and reconnecting on every query will incur some overhead, which can reduce performance when running many small queries. 
DuckDB also caches some data and metadata in memory, and that cache is lost when the last open connection is closed. Frequently, a single connection will work best, but a connection pool may also be used. + +Using multiple connections can parallelize some operations, although it is typically not necessary. DuckDB does attempt to parallelize as much as possible within each individual query, but it is not possible to parallelize in all cases. Making multiple connections can process more operations concurrently. This can be more helpful if DuckDB is not CPU limited, but instead bottlenecked by another resource like network transfer speed. + +## The `preserve_insertion_order` Option + +When importing or exporting data sets (from/to the Parquet or CSV formats), which are much larger than the available memory, an out of memory error may occur: + +```console +Error: Out of Memory Error: failed to allocate data of size ... (.../... used) +``` + +In these cases, consider setting the [`preserve_insertion_order` configuration option]({% link docs/archive/1.0/configuration/overview.md %}) to `false`: + +```sql +SET preserve_insertion_order = false; +``` + +This allows the systems to re-order any results that do not contain `ORDER BY` clauses, potentially reducing memory usage. \ No newline at end of file diff --git a/docs/archive/1.0/guides/performance/import.md b/docs/archive/1.0/guides/performance/import.md new file mode 100644 index 00000000000..3d39fccb559 --- /dev/null +++ b/docs/archive/1.0/guides/performance/import.md @@ -0,0 +1,23 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/guides/import/overview +title: Data Import +--- + +## Recommended Import Methods + +When importing data from other systems to DuckDB, there are several considerations to take into account. +We recommend importing using the following order: + +1. For systems which are supported by a DuckDB scanner extension, it's preferable to use the scanner. DuckDB currently offers scanners for [MySQL]({% link docs/archive/1.0/guides/database_integration/mysql.md %}), [PostgreSQL]({% link docs/archive/1.0/guides/database_integration/postgres.md %}), and [SQLite]({% link docs/archive/1.0/guides/database_integration/sqlite.md %}). +2. If there is a bulk export feature in the data source system, export the data to Parquet or CSV format, then load it using DuckDB's [Parquet]({% link docs/archive/1.0/guides/file_formats/parquet_import.md %}) or [CSV loader]({% link docs/archive/1.0/guides/file_formats/csv_import.md %}). +3. If the approaches above are not applicable, consider using the DuckDB [appender]({% link docs/archive/1.0/data/appender.md %}), currently available in the C, C++, Go, Java, and Rust APIs. +4. If the data source system supports Apache Arrow and the data transfer is a recurring task, consider using the DuckDB [Arrow]({% link docs/archive/1.0/extensions/arrow.md %}) extension. + +## Methods to Avoid + +If possible, avoid looping row-by-row (tuple-at-a-time) in favor of bulk operations. +Performing row-by-row inserts (even with prepared statements) is detrimental to performance and will result in slow load times. + +> Bestpractice Unless your data is small (<100k rows), avoid using inserts in loops. 
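+
+As a rough sketch of the difference (the `people` table and the `people.csv` file are hypothetical), a single bulk statement replaces a loop of per-row inserts:
+
+```sql
+-- avoid: one INSERT per row, issued from application code in a loop
+-- INSERT INTO people VALUES (1, 'Alice');
+-- INSERT INTO people VALUES (2, 'Bob');
+-- ... repeated for every row
+
+-- prefer: a single bulk load from a file
+CREATE TABLE people AS
+    SELECT * FROM read_csv('people.csv');
+```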
\ No newline at end of file
diff --git a/docs/archive/1.0/guides/performance/indexing.md b/docs/archive/1.0/guides/performance/indexing.md
new file mode 100644
index 00000000000..98bdd9979be
--- /dev/null
+++ b/docs/archive/1.0/guides/performance/indexing.md
@@ -0,0 +1,63 @@
+---
+layout: docu
+title: Indexing
+---
+
+DuckDB has two types of indexes: zonemaps and ART indexes.
+
+## Zonemaps
+
+DuckDB automatically creates [zonemaps](https://en.wikipedia.org/wiki/Block_Range_Index) (also known as min-max indexes) for the columns of all [general-purpose data types]({% link docs/archive/1.0/sql/data_types/overview.md %}#general-purpose-data-types). These indexes are used for predicate pushdown into scan operators and computing aggregations. This means that if a filter criterion (like `WHERE column1 = 123`) is in use, DuckDB can skip any row group whose min-max range does not contain that filter value (e.g., a block with a min-max range of 1000 to 2000 will be omitted when comparing for `= 123` or `< 400`).
+
+### The Effect of Ordering on Zonemaps
+
+The more ordered the data within a column, the more useful the zonemap indexes will be. For example, in the worst case, a column could contain a random number on every row. DuckDB will be unlikely to be able to skip any row groups. The best case of ordered data commonly arises with `DATETIME` columns. If specific columns will be queried with selective filters, it is best to pre-order data by those columns when inserting it. Even an imperfect ordering will still be helpful.
+
+### Microbenchmark: The Effect of Ordering
+
+As an example, let’s repeat the [microbenchmark for timestamps]({% link docs/archive/1.0/guides/performance/schema.md %}#microbenchmark-using-timestamps) with a timestamp column that is sorted in ascending order vs. an unordered one.
+
+<div
+
+| Column type | Ordered | Storage size | Query time |
+|---|---|--:|--:|
+| `DATETIME` | yes | 1.3 GB | 0.6 s |
+| `DATETIME` | no | 3.3 GB | 0.9 s |
+
+The results show that simply keeping the column order allows for improved compression, yielding a 2.5x smaller storage size.
+It also allows the computation to be 1.5x faster.
+
+### Ordered Integers
+
+Another practical way to exploit ordering is to use the `INTEGER` type with automatic increments rather than `UUID` for columns that will be queried using selective filters. `UUID`s will likely be inserted in a random order, so many row groups in the table will need to be scanned to find a specific `UUID` value, while an ordered `INTEGER` column will allow all row groups to be skipped except the one that contains the value.
+
+## ART Indexes
+
+DuckDB allows defining [Adaptive Radix Tree (ART) indexes](https://db.in.tum.de/~leis/papers/ART.pdf) in two ways.
+First, such an index is created implicitly for columns with `PRIMARY KEY`, `FOREIGN KEY`, and `UNIQUE` [constraints]({% link docs/archive/1.0/guides/performance/schema.md %}#constraints).
+Second, explicitly running the [`CREATE INDEX`]({% link docs/archive/1.0/sql/indexes.md %}) statement creates an ART index on the target column(s).
+
+The tradeoffs of having an ART index on a column are as follows:
+
+1. It enables efficient constraint checking upon changes (inserts, updates, and deletes) for non-bulky changes.
+2. Having an ART index makes changes to the affected column(s) slower compared to non-indexed performance. That is because of index maintenance for these operations.
+
+Regarding query performance, an ART index has the following effects:
+
+1. It speeds up point queries and other highly selective queries using the indexed column(s), where the filtering condition returns approx. 0.1% of all rows or fewer. When in doubt, use [`EXPLAIN`]({% link docs/archive/1.0/guides/meta/explain.md %}) to verify that your query plan uses the index scan.
+2. An ART index has no effect on the performance of join, aggregation, and sorting queries.
+
+Indexes are serialized to disk and deserialized lazily, i.e., when the database is reopened, operations using the index will only load the required parts of the index. Therefore, having an index will not cause any slowdowns when opening an existing database.
+
+> Bestpractice We recommend following these guidelines:
+>
+> * Only use primary keys, foreign keys, or unique constraints if these are necessary for enforcing constraints on your data.
+> * Do not define explicit indexes unless you have highly selective queries.
+> * If you define an ART index, do so after bulk loading the data to the table. Adding an index prior to loading, either explicitly or via primary/foreign keys, is [detrimental to load performance]({% link docs/archive/1.0/guides/performance/schema.md %}#microbenchmark-the-effect-of-primary-keys).
+
\ No newline at end of file
diff --git a/docs/archive/1.0/guides/performance/my_workload_is_slow.md b/docs/archive/1.0/guides/performance/my_workload_is_slow.md
new file mode 100644
index 00000000000..4785369e904
--- /dev/null
+++ b/docs/archive/1.0/guides/performance/my_workload_is_slow.md
@@ -0,0 +1,19 @@
+---
+layout: docu
+redirect_from:
+- /docs/archive/1.0/guides/performance/my-workload-is-slow
+title: My Workload Is Slow
+---
+
+If you find that your workload in DuckDB is slow, we recommend performing the following checks. More detailed instructions are linked for each point.
+
+1. Do you have enough memory?
DuckDB works best if you have [5-10 GB memory per CPU core]({% link docs/archive/1.0/guides/performance/environment.md %}#cpu-and-memory).
+1. Are you using a fast disk? Network-attached disks can cause the workload to slow down, especially for [larger-than-memory workloads]({% link docs/archive/1.0/guides/performance/environment.md %}#disk).
+1. Are you using indexes or constraints (primary key, unique, etc.)? If possible, try [disabling them]({% link docs/archive/1.0/guides/performance/schema.md %}#indexing), which boosts load and update performance.
+1. Are you using the correct types? For example, [use `TIMESTAMP` to encode datetime values]({% link docs/archive/1.0/guides/performance/schema.md %}#types).
+1. Are you reading from Parquet files? If so, do they have [row group sizes between 100k and 1M]({% link docs/archive/1.0/guides/performance/file_formats.md %}#the-effect-of-row-group-sizes) and file sizes between 100 MB and 10 GB?
+1. Does the query plan look right? Study it with [`EXPLAIN`]({% link docs/archive/1.0/guides/performance/how_to_tune_workloads.md %}#profiling).
+1. Is the workload running [in parallel]({% link docs/archive/1.0/guides/performance/how_to_tune_workloads.md %}#parallelism)? Use `htop` or the operating system's task manager to observe this.
+1. Is DuckDB using too many threads? Try [limiting the number of threads]({% link docs/archive/1.0/guides/performance/how_to_tune_workloads.md %}#parallelism-multi-core-processing).
+
+Are you aware of other common issues? If so, please click the _Report content issue_ link below and describe them along with their workarounds.
\ No newline at end of file
diff --git a/docs/archive/1.0/guides/performance/overview.md b/docs/archive/1.0/guides/performance/overview.md
new file mode 100644
index 00000000000..af0bf72225f
--- /dev/null
+++ b/docs/archive/1.0/guides/performance/overview.md
@@ -0,0 +1,10 @@
+---
+layout: docu
+redirect_from:
+- /docs/archive/1.0/guides/performance
+title: Performance Guide
+---
+
+DuckDB aims to automatically achieve high performance by using well-chosen default configurations and having a forgiving architecture. Of course, there are still opportunities for tuning the system for specific workloads. The Performance Guide's pages contain guidelines and tips for achieving good performance when loading and processing data with DuckDB.
+
+The guides include several microbenchmarks. You may find details about these on the [Benchmarks page]({% link docs/archive/1.0/guides/performance/benchmarks.md %}).
\ No newline at end of file
diff --git a/docs/archive/1.0/guides/performance/schema.md b/docs/archive/1.0/guides/performance/schema.md
new file mode 100644
index 00000000000..ed7eef84432
--- /dev/null
+++ b/docs/archive/1.0/guides/performance/schema.md
@@ -0,0 +1,86 @@
+---
+layout: docu
+title: Schema
+---
+
+## Types
+
+It is important to use the correct type for encoding columns (e.g., `BIGINT`, `DATE`, `DATETIME`). While it is always possible to use string types (`VARCHAR`, etc.) to encode more specific values, this is not recommended. Strings use more space and are slower to process in operations such as filtering, join, and aggregation.
+
+When loading CSV files, you may leverage the CSV reader's [auto-detection mechanism]({% link docs/archive/1.0/data/csv/auto_detection.md %}) to get the correct types for CSV inputs.
+
+If you run in a memory-constrained environment, using smaller data types (e.g., `TINYINT`) can reduce the amount of memory and disk space required to complete a query.
DuckDB’s [bitpacking compression]({% post_url 2022-10-28-lightweight-compression %}#bit-packing) means small values stored in larger data types will not take up larger sizes on disk, but they will take up more memory during processing.
+
+> Bestpractice Use the most restrictive types possible when creating columns. Avoid using strings for encoding more specific data items.
+
+### Microbenchmark: Using Timestamps
+
+We illustrate the difference in aggregation speed using the [`creationDate` column of the LDBC Comment table on scale factor 300](https://blobs.duckdb.org/data/ldbc-sf300-comments-creationDate.parquet). This table has approx. 554 million unordered timestamp values. We run a simple aggregation query that returns the average day of the month from the timestamps in two configurations.
+
+First, we use the `DATETIME` type to encode the values and run the query using the [`extract` datetime function]({% link docs/archive/1.0/sql/functions/timestamp.md %}):
+
+```sql
+SELECT avg(extract('day' FROM creationDate)) FROM Comment;
+```
+
+Second, we use the `VARCHAR` type and use string operations:
+
+```sql
+SELECT avg(CAST(creationDate[9:10] AS INTEGER)) FROM Comment;
+```
+
+The results of the microbenchmark are as follows:
+
+<div
+
+| Column type | Storage size | Query time |
+| ----------- | -----------: | ---------: |
+| `DATETIME`  | 3.3 GB       | 0.9 s      |
+| `VARCHAR`   | 5.2 GB       | 3.9 s      |
+
+The results show that using the `DATETIME` type yields smaller storage sizes and faster processing.
+
+### Microbenchmark: Joining on Strings
+
+We illustrate the difference caused by joining on different types by computing a self-join on the [LDBC Comment table at scale factor 100](https://blobs.duckdb.org/data/ldbc-sf100-comments.tar.zst). The table has 64-bit integer identifiers used as the `id` attribute of each row. We perform the following join operation:
+
+```sql
+SELECT count(*) AS count
+FROM Comment c1
+JOIN Comment c2 ON c1.ParentCommentId = c2.id;
+```
+
+In the first experiment, we use the correct (most restrictive) types, i.e., both the `id` and the `ParentCommentId` columns are defined as `BIGINT`.
+In the second experiment, we define all columns with the `VARCHAR` type.
+While the results of the queries are the same for both experiments, their runtimes vary significantly.
+The results below show that joining on `BIGINT` columns is approx. 1.8× faster than performing the same join on `VARCHAR`-typed columns encoding the same value.
+
+<div
+ +| Join column payload type | Join column schema type | Example value | Query time | +| ------------------------ | ----------------------- | ---------------------------------------- | ---------: | +| `BIGINT` | `BIGINT` | `70368755640078` | 1.2 s | +| `BIGINT` | `VARCHAR` | `'70368755640078'` | 2.1 s | + +> Bestpractice Avoid representing numeric values as strings, especially if you intend to perform e.g., join operations on them. + +## Constraints + +DuckDB allows defining [constraints]({% link docs/archive/1.0/sql/constraints.md %}) such as `UNIQUE`, `PRIMARY KEY`, and `FOREIGN KEY`. These constraints can be beneficial for ensuring data integrity but they have a negative effect on load performance as they necessitate building indexes and performing checks. Moreover, they _very rarely improve the performance of queries_ as DuckDB does not rely on these indexes for join and aggregation operators (see [indexing]({% link docs/archive/1.0/guides/performance/indexing.md %}) for more details). + +> Bestpractice Do not define constraints unless your goal is to ensure data integrity. + +### Microbenchmark: The Effect of Primary Keys + +We illustrate the effect of using primary keys with the [LDBC Comment table at scale factor 300](https://blobs.duckdb.org/data/ldbc-sf300-comments.tar.zst). This table has approx. 554 million entries. We first create the schema without a primary key, then load the data. In the second experiment, we create the schema with a primary key, then load the data. In both cases, we take the data from `.csv.gz` files, and measure the time required to perform the loading. + +
+ +| Operation | Execution time | +| ------------------------ | -------------: | +| Load without primary key | 92.2 s | +| Load with primary key | 286.8 s | + +In this case, primary keys will only have a (small) positive effect on highly selective queries such as when filtering on a single identifier. They do not have an effect on join and aggregation operators. + +> Bestpractice For best bulk load performance, avoid defining primary key constraints if possible. \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/execute_sql.md b/docs/archive/1.0/guides/python/execute_sql.md new file mode 100644 index 00000000000..260c8f7f729 --- /dev/null +++ b/docs/archive/1.0/guides/python/execute_sql.md @@ -0,0 +1,46 @@ +--- +layout: docu +title: Executing SQL in Python +--- + +SQL queries can be executed using the `duckdb.sql` function. + +```python +import duckdb + +duckdb.sql("SELECT 42").show() +``` + +By default this will create a relation object. The result can be converted to various formats using the result conversion functions. For example, the `fetchall` method can be used to convert the result to Python objects. + +```python +results = duckdb.sql("SELECT 42").fetchall() +print(results) +``` + +```text +[(42,)] +``` + +Several other result objects exist. For example, you can use `df` to convert the result to a Pandas DataFrame. + +```python +results = duckdb.sql("SELECT 42").df() +print(results) +``` + +```text + 42 + 0 42 +``` + +By default, a global in-memory connection will be used. Any data stored in files will be lost after shutting down the program. A connection to a persistent database can be created using the `connect` function. + +After connecting, SQL queries can be executed using the `sql` command. + +```python +con = duckdb.connect("file.db") +con.sql("CREATE TABLE integers (i INTEGER)") +con.sql("INSERT INTO integers VALUES (42)") +con.sql("SELECT * FROM integers").show() +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/export_arrow.md b/docs/archive/1.0/guides/python/export_arrow.md new file mode 100644 index 00000000000..721af2b87a2 --- /dev/null +++ b/docs/archive/1.0/guides/python/export_arrow.md @@ -0,0 +1,65 @@ +--- +layout: docu +title: Export to Apache Arrow +--- + +All results of a query can be exported to an [Apache Arrow Table](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html) using the `arrow` function. Alternatively, results can be returned as a [RecordBatchReader](https://arrow.apache.org/docs/python/generated/pyarrow.ipc.RecordBatchStreamReader.html) using the `fetch_record_batch` function and results can be read one batch at a time. In addition, relations built using DuckDB's [Relational API]({% link docs/archive/1.0/guides/python/relational_api_pandas.md %}) can also be exported. 
+ +## Export to an Arrow Table + +```python +import duckdb +import pyarrow as pa + +my_arrow_table = pa.Table.from_pydict({'i': [1, 2, 3, 4], + 'j': ["one", "two", "three", "four"]}) + +# query the Apache Arrow Table "my_arrow_table" and return as an Arrow Table +results = duckdb.sql("SELECT * FROM my_arrow_table").arrow() +``` + +## Export as a RecordBatchReader + +```python +import duckdb +import pyarrow as pa + +my_arrow_table = pa.Table.from_pydict({'i': [1, 2, 3, 4], + 'j': ["one", "two", "three", "four"]}) + +# query the Apache Arrow Table "my_arrow_table" and return as an Arrow RecordBatchReader +chunk_size = 1_000_000 +results = duckdb.sql("SELECT * FROM my_arrow_table").fetch_record_batch(chunk_size) + +# Loop through the results. A StopIteration exception is thrown when the RecordBatchReader is empty +while True: + try: + # Process a single chunk here (just printing as an example) + print(results.read_next_batch().to_pandas()) + except StopIteration: + print('Already fetched all batches') + break +``` + +## Export from Relational API + +Arrow objects can also be exported from the Relational API. A relation can be converted to an Arrow table using the `arrow` or `to_arrow_table` functions, or a record batch using `record_batch`. +A result can be exported to an Arrow table with `arrow` or the alias `fetch_arrow_table`, or to a RecordBatchReader using `fetch_arrow_reader`. + +```python +import duckdb + +# connect to an in-memory database +con = duckdb.connect() + +con.execute('CREATE TABLE integers (i integer)') +con.execute('INSERT INTO integers VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (NULL)') + +# Create a relation from the table and export the entire relation as Arrow +rel = con.table("integers") +relation_as_arrow = rel.arrow() # or .to_arrow_table() + +# Or, calculate a result using that relation and export that result to Arrow +res = rel.aggregate("sum(i)").execute() +result_as_arrow = res.arrow() # or fetch_arrow_table() +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/export_numpy.md b/docs/archive/1.0/guides/python/export_numpy.md new file mode 100644 index 00000000000..342c922f41a --- /dev/null +++ b/docs/archive/1.0/guides/python/export_numpy.md @@ -0,0 +1,34 @@ +--- +layout: docu +title: Export to Numpy +--- + +The result of a query can be converted to a Numpy array using the `fetchnumpy()` function. For example: + +```python +import duckdb +import numpy as np + +my_arr = duckdb.sql("SELECT unnest([1, 2, 3]) AS x, 5.0 AS y").fetchnumpy() +my_arr +``` + +```text +{'x': array([1, 2, 3], dtype=int32), 'y': masked_array(data=[5.0, 5.0, 5.0], + mask=[False, False, False], + fill_value=1e+20)} +``` + +Then, the array can be processed using Numpy functions, e.g.: + +```python +np.sum(my_arr['x']) +``` + +```text +6 +``` + +## See Also + +DuckDB also supports [importing from Numpy]({% link docs/archive/1.0/guides/python/import_numpy.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/export_pandas.md b/docs/archive/1.0/guides/python/export_pandas.md new file mode 100644 index 00000000000..e73c4f4b11e --- /dev/null +++ b/docs/archive/1.0/guides/python/export_pandas.md @@ -0,0 +1,23 @@ +--- +layout: docu +title: Export to Pandas +--- + +The result of a query can be converted to a [Pandas](https://pandas.pydata.org/) DataFrame using the `df()` function. 
+ +```python +import duckdb + +# read the result of an arbitrary SQL query to a Pandas DataFrame +results = duckdb.sql("SELECT 42").df() +results +``` + +```text + 42 +0 42 +``` + +## See Also + +DuckDB also supports [importing from Pandas]({% link docs/archive/1.0/guides/python/import_pandas.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/filesystems.md b/docs/archive/1.0/guides/python/filesystems.md new file mode 100644 index 00000000000..5e0f1815b5d --- /dev/null +++ b/docs/archive/1.0/guides/python/filesystems.md @@ -0,0 +1,31 @@ +--- +layout: docu +title: Using fsspec Filesystems +--- + +DuckDB support for [`fsspec`](https://filesystem-spec.readthedocs.io) filesystems allows querying data in filesystems that DuckDB's [`httpfs` extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) does not support. `fsspec` has a large number of [inbuilt filesystems](https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations), and there are also many [external implementations](https://filesystem-spec.readthedocs.io/en/latest/api.html#other-known-implementations). This capability is only available in DuckDB's Python client because `fsspec` is a Python library, while the `httpfs` extension is available in many DuckDB clients. + +## Example + +The following is an example of using `fsspec` to query a file in Google Cloud Storage (instead of using their S3-compatible API). + +Firstly, you must install `duckdb` and `fsspec`, and a filesystem interface of your choice. + +```bash +pip install duckdb fsspec gcsfs +``` + +Then, you can register whichever filesystem you'd like to query: + +```python +import duckdb +from fsspec import filesystem + +# this line will throw an exception if the appropriate filesystem interface is not installed +duckdb.register_filesystem(filesystem('gcs')) + +duckdb.sql("SELECT * FROM read_csv('gcs:///bucket/file.csv')") +``` + +> These filesystems are not implemented in C++, hence, their performance may not be comparable to the ones provided by the `httpfs` extension. +> It is also worth noting that as they are third-party libraries, they may contain bugs that are beyond our control. \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/ibis.md b/docs/archive/1.0/guides/python/ibis.md new file mode 100644 index 00000000000..0ebd2860609 --- /dev/null +++ b/docs/archive/1.0/guides/python/ibis.md @@ -0,0 +1,679 @@ +--- +layout: docu +title: Integration with Ibis +--- + +[Ibis](https://ibis-project.org) is a Python dataframe library that supports 20+ backends, with DuckDB as the default. Ibis with DuckDB provides a Pythonic interface for SQL with great performance. + +## Installation + +You can pip install Ibis with the DuckDB backend: + +```bash +pip install 'ibis-framework[duckdb,examples]' # examples is only required to access the sample data Ibis provides +``` + +or use conda: + +```bash +conda install ibis-framework +``` + +or use mamba: + +```bash +mamba install ibis-framework +``` + +## Create a Database File + +Ibis can work with several file types, but at its core, it connects to existing databases and interacts with the data there. You can get started with your own DuckDB databases or create a new one with example data. 
+ +```python +import ibis + +con = ibis.connect("duckdb://penguins.ddb") +con.create_table( + "penguins", ibis.examples.penguins.fetch().to_pyarrow(), overwrite = True +) +``` + +```python +# Output: +DatabaseTable: penguins + species string + island string + bill_length_mm float64 + bill_depth_mm float64 + flipper_length_mm int64 + body_mass_g int64 + sex string + year int64 +``` + +You can now see the example dataset copied over to the database: + +```python +# reconnect to the persisted database (dropping temp tables) +con = ibis.connect("duckdb://penguins.ddb") +con.list_tables() +``` + +```python +# Output: +['penguins'] +``` + +There's one table, called `penguins`. We can ask Ibis to give us an object that we can interact with. + +```python +penguins = con.table("penguins") +penguins +``` + +```text +# Output: +DatabaseTable: penguins + species string + island string + bill_length_mm float64 + bill_depth_mm float64 + flipper_length_mm int64 + body_mass_g int64 + sex string + year int64 +``` + +Ibis is lazily evaluated, so instead of seeing the data, we see the schema of the table. To peek at the data, we can call `head` and then `to_pandas` to get the first few rows of the table as a pandas DataFrame. + +```python +penguins.head().to_pandas() +``` + +```text + species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year +0 Adelie Torgersen 39.1 18.7 181.0 3750.0 male 2007 +1 Adelie Torgersen 39.5 17.4 186.0 3800.0 female 2007 +2 Adelie Torgersen 40.3 18.0 195.0 3250.0 female 2007 +3 Adelie Torgersen NaN NaN NaN NaN None 2007 +4 Adelie Torgersen 36.7 19.3 193.0 3450.0 female 2007 +``` + +`to_pandas` takes the existing lazy table expression and evaluates it. If we leave it off, you'll see the Ibis representation of the table expression that `to_pandas` will evaluate (when you're ready!). + +```python +penguins.head() +``` + +```python +# Output: +r0 := DatabaseTable: penguins + species string + island string + bill_length_mm float64 + bill_depth_mm float64 + flipper_length_mm int64 + body_mass_g int64 + sex string + year int64 + +Limit[r0, n=5] +``` + +Ibis returns results as a pandas DataFrame using `to_pandas`, but isn't using pandas to perform any of the computation. The query is executed by DuckDB. Only when `to_pandas` is called does Ibis then pull back the results and convert them into a DataFrame. + +## Interactive Mode + +For the rest of this intro, we'll turn on interactive mode, which partially executes queries to give users a preview of the results. There is a small difference in the way the output is formatted, but otherwise this is the same as calling `to_pandas` on the table expression with a limit of 10 result rows returned. 
+ +```python +ibis.options.interactive = True +penguins.head() +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ +┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ +│ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ +├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ +│ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │ +│ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │ +│ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │ +│ Adelie │ Torgersen │ nan │ nan │ NULL │ NULL │ NULL │ 2007 │ +│ Adelie │ Torgersen │ 36.7 │ 19.3 │ 193 │ 3450 │ female │ 2007 │ +└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ +``` + +## Common Operations + +Ibis has a collection of useful table methods to manipulate and query the data in a table. + +### filter + +`filter` allows you to select rows based on a condition or set of conditions. + +We can filter so we only have penguins of the species Adelie: + +```python +penguins.filter(penguins.species == "Gentoo") +``` + +```text +┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ +┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ +┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ +│ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ +├─────────┼────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ +│ Gentoo │ Biscoe │ 46.1 │ 13.2 │ 211 │ 4500 │ female │ 2007 │ +│ Gentoo │ Biscoe │ 50.0 │ 16.3 │ 230 │ 5700 │ male │ 2007 │ +│ Gentoo │ Biscoe │ 48.7 │ 14.1 │ 210 │ 4450 │ female │ 2007 │ +│ Gentoo │ Biscoe │ 50.0 │ 15.2 │ 218 │ 5700 │ male │ 2007 │ +│ Gentoo │ Biscoe │ 47.6 │ 14.5 │ 215 │ 5400 │ male │ 2007 │ +│ Gentoo │ Biscoe │ 46.5 │ 13.5 │ 210 │ 4550 │ female │ 2007 │ +│ Gentoo │ Biscoe │ 45.4 │ 14.6 │ 211 │ 4800 │ female │ 2007 │ +│ Gentoo │ Biscoe │ 46.7 │ 15.3 │ 219 │ 5200 │ male │ 2007 │ +│ Gentoo │ Biscoe │ 43.3 │ 13.4 │ 209 │ 4400 │ female │ 2007 │ +│ Gentoo │ Biscoe │ 46.8 │ 15.4 │ 215 │ 5150 │ male │ 2007 │ +│ … │ … │ … │ … │ … │ … │ … │ … │ +└─────────┴────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ +``` + +Or filter for Gentoo penguins that have a body mass larger than 6 kg. 
+ +```python +penguins.filter((penguins.species == "Gentoo") & (penguins.body_mass_g > 6000)) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓ +┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ +┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩ +│ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ +├─────────┼────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┤ +│ Gentoo │ Biscoe │ 49.2 │ 15.2 │ 221 │ 6300 │ male │ 2007 │ +│ Gentoo │ Biscoe │ 59.6 │ 17.0 │ 230 │ 6050 │ male │ 2007 │ +└─────────┴────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┘ +``` + +You can use any boolean comparison in a filter (although if you try to do something like use `<` on a string, Ibis will yell at you). + +### select + +Your data analysis might not require all the columns present in a given table. `select` lets you pick out only those columns that you want to work with. + +To select a column you can use the name of the column as a string: + +```python +penguins.select("species", "island", "year").limit(3) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ +┃ species ┃ island ┃ year ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ +│ string │ string │ int64 │ +├─────────┼───────────┼───────┤ +│ Adelie │ Torgersen │ 2007 │ +│ Adelie │ Torgersen │ 2007 │ +│ Adelie │ Torgersen │ 2007 │ +│ … │ … │ … │ +└─────────┴───────────┴───────┘ +``` + +Or you can use column objects directly (this can be convenient when paired with tab-completion): + +```python +penguins.select(penguins.species, penguins.island, penguins.year).limit(3) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ +┃ species ┃ island ┃ year ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ +│ string │ string │ int64 │ +├─────────┼───────────┼───────┤ +│ Adelie │ Torgersen │ 2007 │ +│ Adelie │ Torgersen │ 2007 │ +│ Adelie │ Torgersen │ 2007 │ +│ … │ … │ … │ +└─────────┴───────────┴───────┘ +``` + +Or you can mix-and-match: + +```python +penguins.select("species", "island", penguins.year).limit(3) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓ +┃ species ┃ island ┃ year ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩ +│ string │ string │ int64 │ +├─────────┼───────────┼───────┤ +│ Adelie │ Torgersen │ 2007 │ +│ Adelie │ Torgersen │ 2007 │ +│ Adelie │ Torgersen │ 2007 │ +│ … │ … │ … │ +└─────────┴───────────┴───────┘ +``` + +### mutate + +`mutate` lets you add new columns to your table, derived from the values of existing columns. 
+ +```python +penguins.mutate(bill_length_cm=penguins.bill_length_mm / 10) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ bill_length_cm ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━┩ +│ string │ string │ float64 │ float64 │ int64 │ int64 │ string │ int64 │ float64 │ +├─────────┼───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼────────────────┤ +│ Adelie │ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ male │ 2007 │ 3.91 │ +│ Adelie │ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ female │ 2007 │ 3.95 │ +│ Adelie │ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ female │ 2007 │ 4.03 │ +│ Adelie │ Torgersen │ nan │ nan │ NULL │ NULL │ NULL │ 2007 │ nan │ +│ Adelie │ Torgersen │ 36.7 │ 19.3 │ 193 │ 3450 │ female │ 2007 │ 3.67 │ +│ Adelie │ Torgersen │ 39.3 │ 20.6 │ 190 │ 3650 │ male │ 2007 │ 3.93 │ +│ Adelie │ Torgersen │ 38.9 │ 17.8 │ 181 │ 3625 │ female │ 2007 │ 3.89 │ +│ Adelie │ Torgersen │ 39.2 │ 19.6 │ 195 │ 4675 │ male │ 2007 │ 3.92 │ +│ Adelie │ Torgersen │ 34.1 │ 18.1 │ 193 │ 3475 │ NULL │ 2007 │ 3.41 │ +│ Adelie │ Torgersen │ 42.0 │ 20.2 │ 190 │ 4250 │ NULL │ 2007 │ 4.20 │ +│ … │ … │ … │ … │ … │ … │ … │ … │ … │ +└─────────┴───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴────────────────┘ +``` + +Notice that the table is a little too wide to display all the columns now (depending on your screen-size). `bill_length` is now present in millimeters _and_ centimeters. Use a `select` to trim down the number of columns we're looking at. + +```python +penguins.mutate(bill_length_cm=penguins.bill_length_mm / 10).select( + "species", + "island", + "bill_depth_mm", + "flipper_length_mm", + "body_mass_g", + "sex", + "year", + "bill_length_cm", +) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ bill_length_cm ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━┩ +│ string │ string │ float64 │ int64 │ int64 │ string │ int64 │ float64 │ +├─────────┼───────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼────────────────┤ +│ Adelie │ Torgersen │ 18.7 │ 181 │ 3750 │ male │ 2007 │ 3.91 │ +│ Adelie │ Torgersen │ 17.4 │ 186 │ 3800 │ female │ 2007 │ 3.95 │ +│ Adelie │ Torgersen │ 18.0 │ 195 │ 3250 │ female │ 2007 │ 4.03 │ +│ Adelie │ Torgersen │ nan │ NULL │ NULL │ NULL │ 2007 │ nan │ +│ Adelie │ Torgersen │ 19.3 │ 193 │ 3450 │ female │ 2007 │ 3.67 │ +│ Adelie │ Torgersen │ 20.6 │ 190 │ 3650 │ male │ 2007 │ 3.93 │ +│ Adelie │ Torgersen │ 17.8 │ 181 │ 3625 │ female │ 2007 │ 3.89 │ +│ Adelie │ Torgersen │ 19.6 │ 195 │ 4675 │ male │ 2007 │ 3.92 │ +│ Adelie │ Torgersen │ 18.1 │ 193 │ 3475 │ NULL │ 2007 │ 3.41 │ +│ Adelie │ Torgersen │ 20.2 │ 190 │ 4250 │ NULL │ 2007 │ 4.20 │ +│ … │ … │ … │ … │ … │ … │ … │ … │ +└─────────┴───────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴────────────────┘ +``` + +### selectors + +Typing out _all_ of the column names _except_ one is a little annoying. Instead of doing that again, we can use a `selector` to quickly select or deselect groups of columns. 
+ +```python +import ibis.selectors as s + +penguins.mutate(bill_length_cm=penguins.bill_length_mm / 10).select( + ~s.matches("bill_length_mm") + # match every column except `bill_length_mm` +) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ sex ┃ year ┃ bill_length_cm ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━┩ +│ string │ string │ float64 │ int64 │ int64 │ string │ int64 │ float64 │ +├─────────┼───────────┼───────────────┼───────────────────┼─────────────┼────────┼───────┼────────────────┤ +│ Adelie │ Torgersen │ 18.7 │ 181 │ 3750 │ male │ 2007 │ 3.91 │ +│ Adelie │ Torgersen │ 17.4 │ 186 │ 3800 │ female │ 2007 │ 3.95 │ +│ Adelie │ Torgersen │ 18.0 │ 195 │ 3250 │ female │ 2007 │ 4.03 │ +│ Adelie │ Torgersen │ nan │ NULL │ NULL │ NULL │ 2007 │ nan │ +│ Adelie │ Torgersen │ 19.3 │ 193 │ 3450 │ female │ 2007 │ 3.67 │ +│ Adelie │ Torgersen │ 20.6 │ 190 │ 3650 │ male │ 2007 │ 3.93 │ +│ Adelie │ Torgersen │ 17.8 │ 181 │ 3625 │ female │ 2007 │ 3.89 │ +│ Adelie │ Torgersen │ 19.6 │ 195 │ 4675 │ male │ 2007 │ 3.92 │ +│ Adelie │ Torgersen │ 18.1 │ 193 │ 3475 │ NULL │ 2007 │ 3.41 │ +│ Adelie │ Torgersen │ 20.2 │ 190 │ 4250 │ NULL │ 2007 │ 4.20 │ +│ … │ … │ … │ … │ … │ … │ … │ … │ +└─────────┴───────────┴───────────────┴───────────────────┴─────────────┴────────┴───────┴────────────────┘ +``` + +You can also use a `selector` alongside a column name. + +```python +penguins.select("island", s.numeric()) +``` + +```text +┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━┓ +┃ island ┃ bill_length_mm ┃ bill_depth_mm ┃ flipper_length_mm ┃ body_mass_g ┃ year ┃ +┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━┩ +│ string │ float64 │ float64 │ int64 │ int64 │ int64 │ +├───────────┼────────────────┼───────────────┼───────────────────┼─────────────┼───────┤ +│ Torgersen │ 39.1 │ 18.7 │ 181 │ 3750 │ 2007 │ +│ Torgersen │ 39.5 │ 17.4 │ 186 │ 3800 │ 2007 │ +│ Torgersen │ 40.3 │ 18.0 │ 195 │ 3250 │ 2007 │ +│ Torgersen │ nan │ nan │ NULL │ NULL │ 2007 │ +│ Torgersen │ 36.7 │ 19.3 │ 193 │ 3450 │ 2007 │ +│ Torgersen │ 39.3 │ 20.6 │ 190 │ 3650 │ 2007 │ +│ Torgersen │ 38.9 │ 17.8 │ 181 │ 3625 │ 2007 │ +│ Torgersen │ 39.2 │ 19.6 │ 195 │ 4675 │ 2007 │ +│ Torgersen │ 34.1 │ 18.1 │ 193 │ 3475 │ 2007 │ +│ Torgersen │ 42.0 │ 20.2 │ 190 │ 4250 │ 2007 │ +│ … │ … │ … │ … │ … │ … │ +└───────────┴────────────────┴───────────────┴───────────────────┴─────────────┴───────┘ +``` + +You can read more about [`selectors`](https://ibis-project.org/reference/selectors/) in the docs! + +### `order_by` + +`order_by` arranges the values of one or more columns in ascending or descending order. 
+ +By default, `ibis` sorts in ascending order: + +```python +penguins.order_by(penguins.flipper_length_mm).select( + "species", "island", "flipper_length_mm" +) +``` + +```text +┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ flipper_length_mm ┃ +┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩ +│ string │ string │ int64 │ +├───────────┼───────────┼───────────────────┤ +│ Adelie │ Biscoe │ 172 │ +│ Adelie │ Biscoe │ 174 │ +│ Adelie │ Torgersen │ 176 │ +│ Adelie │ Dream │ 178 │ +│ Adelie │ Dream │ 178 │ +│ Adelie │ Dream │ 178 │ +│ Chinstrap │ Dream │ 178 │ +│ Adelie │ Dream │ 179 │ +│ Adelie │ Torgersen │ 180 │ +│ Adelie │ Biscoe │ 180 │ +│ … │ … │ … │ +└───────────┴───────────┴───────────────────┘ +``` + +You can sort in descending order using the `desc` method of a column: + +```python +penguins.order_by(penguins.flipper_length_mm.desc()).select( + "species", "island", "flipper_length_mm" +) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ flipper_length_mm ┃ +┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩ +│ string │ string │ int64 │ +├─────────┼────────┼───────────────────┤ +│ Gentoo │ Biscoe │ 231 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 229 │ +│ Gentoo │ Biscoe │ 229 │ +│ … │ … │ … │ +└─────────┴────────┴───────────────────┘ +``` + +Or you can use `ibis.desc` + +```python +penguins.order_by(ibis.desc("flipper_length_mm")).select( + "species", "island", "flipper_length_mm" +) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ flipper_length_mm ┃ +┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩ +│ string │ string │ int64 │ +├─────────┼────────┼───────────────────┤ +│ Gentoo │ Biscoe │ 231 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 230 │ +│ Gentoo │ Biscoe │ 229 │ +│ Gentoo │ Biscoe │ 229 │ +│ … │ … │ … │ +└─────────┴────────┴───────────────────┘ +``` + +### aggregate + +Ibis has several aggregate functions available to help summarize data. + +`mean`, `max`, `min`, `count`, `sum` (the list goes on). + +To aggregate an entire column, call the corresponding method on that column. + +```python +penguins.flipper_length_mm.mean() +``` + +```python +# Output: +200.91520467836258 +``` + +You can compute multiple aggregates at once using the `aggregate` method: + +```python +penguins.aggregate([penguins.flipper_length_mm.mean(), penguins.bill_depth_mm.max()]) +``` + +```text +┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓ +┃ Mean(flipper_length_mm) ┃ Max(bill_depth_mm) ┃ +┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩ +│ float64 │ float64 │ +├─────────────────────────┼────────────────────┤ +│ 200.915205 │ 21.5 │ +└─────────────────────────┴────────────────────┘ +``` + +But `aggregate` _really_ shines when it's paired with `group_by`. + +### `group_by` + +`group_by` creates groupings of rows that have the same value for one or more columns. + +But it doesn't do much on its own -- you can pair it with `aggregate` to get a result. + +```python +penguins.group_by("species").aggregate() +``` + +```text +┏━━━━━━━━━━━┓ +┃ species ┃ +┡━━━━━━━━━━━┩ +│ string │ +├───────────┤ +│ Adelie │ +│ Gentoo │ +│ Chinstrap │ +└───────────┘ +``` + +We grouped by the `species` column and handed it an “empty” aggregate command. 
The result of that is a column of the unique values in the `species` column. + +If we add a second column to the `group_by`, we'll get each unique pairing of the values in those columns. + +```python +penguins.group_by(["species", "island"]).aggregate() +``` + +```text +┏━━━━━━━━━━━┳━━━━━━━━━━━┓ +┃ species ┃ island ┃ +┡━━━━━━━━━━━╇━━━━━━━━━━━┩ +│ string │ string │ +├───────────┼───────────┤ +│ Adelie │ Torgersen │ +│ Adelie │ Biscoe │ +│ Adelie │ Dream │ +│ Gentoo │ Biscoe │ +│ Chinstrap │ Dream │ +└───────────┴───────────┘ +``` + +Now, if we add an aggregation function to that, we start to really open things up. + +```python +penguins.group_by(["species", "island"]).aggregate(penguins.bill_length_mm.mean()) +``` + +```text +┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ Mean(bill_length_mm) ┃ +┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩ +│ string │ string │ float64 │ +├───────────┼───────────┼──────────────────────┤ +│ Adelie │ Torgersen │ 38.950980 │ +│ Adelie │ Biscoe │ 38.975000 │ +│ Adelie │ Dream │ 38.501786 │ +│ Gentoo │ Biscoe │ 47.504878 │ +│ Chinstrap │ Dream │ 48.833824 │ +└───────────┴───────────┴──────────────────────┘ +``` + +By adding that `mean` to the `aggregate`, we now have a concise way to calculate aggregates over each of the distinct groups in the `group_by`. And we can calculate as many aggregates as we need. + +```python +penguins.group_by(["species", "island"]).aggregate( + [penguins.bill_length_mm.mean(), penguins.flipper_length_mm.max()] +) +``` + +```text +┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ Mean(bill_length_mm) ┃ Max(flipper_length_mm) ┃ +┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩ +│ string │ string │ float64 │ int64 │ +├───────────┼───────────┼──────────────────────┼────────────────────────┤ +│ Adelie │ Torgersen │ 38.950980 │ 210 │ +│ Adelie │ Biscoe │ 38.975000 │ 203 │ +│ Adelie │ Dream │ 38.501786 │ 208 │ +│ Gentoo │ Biscoe │ 47.504878 │ 231 │ +│ Chinstrap │ Dream │ 48.833824 │ 212 │ +└───────────┴───────────┴──────────────────────┴────────────────────────┘ +``` + +If we need more specific groups, we can add to the `group_by`. + +```python +penguins.group_by(["species", "island", "sex"]).aggregate( + [penguins.bill_length_mm.mean(), penguins.flipper_length_mm.max()] +) +``` + +```text +┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓ +┃ species ┃ island ┃ sex ┃ Mean(bill_length_mm) ┃ Max(flipper_length_mm) ┃ +┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩ +│ string │ string │ string │ float64 │ int64 │ +├─────────┼───────────┼────────┼──────────────────────┼────────────────────────┤ +│ Adelie │ Torgersen │ male │ 40.586957 │ 210 │ +│ Adelie │ Torgersen │ female │ 37.554167 │ 196 │ +│ Adelie │ Torgersen │ NULL │ 37.925000 │ 193 │ +│ Adelie │ Biscoe │ female │ 37.359091 │ 199 │ +│ Adelie │ Biscoe │ male │ 40.590909 │ 203 │ +│ Adelie │ Dream │ female │ 36.911111 │ 202 │ +│ Adelie │ Dream │ male │ 40.071429 │ 208 │ +│ Adelie │ Dream │ NULL │ 37.500000 │ 179 │ +│ Gentoo │ Biscoe │ female │ 45.563793 │ 222 │ +│ Gentoo │ Biscoe │ male │ 49.473770 │ 231 │ +│ … │ … │ … │ … │ … │ +└─────────┴───────────┴────────┴──────────────────────┴────────────────────────┘ +``` + +## Chaining It All Together + +We've already chained some Ibis calls together. We used `mutate` to create a new column and then `select` to only view a subset of the new table. 
We were just chaining `group_by` with `aggregate`. + +There's nothing stopping us from putting all of these concepts together to ask questions of the data. + +How about: + +* What was the largest female penguin (by body mass) on each island in the year 2008? + +```python +penguins.filter((penguins.sex == "female") & (penguins.year == 2008)).group_by( + ["island"] +).aggregate(penguins.body_mass_g.max()) +``` + +```text +┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓ +┃ island ┃ Max(body_mass_g) ┃ +┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩ +│ string │ int64 │ +├───────────┼──────────────────┤ +│ Biscoe │ 5200 │ +│ Torgersen │ 3800 │ +│ Dream │ 3900 │ +└───────────┴──────────────────┘ +``` + +* What about the largest male penguin (by body mass) on each island for each year of data collection? + +```python +penguins.filter(penguins.sex == "male").group_by(["island", "year"]).aggregate( + penguins.body_mass_g.max().name("max_body_mass") +).order_by(["year", "max_body_mass"]) +``` + +```text +┏━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━┓ +┃ island ┃ year ┃ max_body_mass ┃ +┡━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━┩ +│ string │ int64 │ int64 │ +├───────────┼───────┼───────────────┤ +│ Dream │ 2007 │ 4650 │ +│ Torgersen │ 2007 │ 4675 │ +│ Biscoe │ 2007 │ 6300 │ +│ Torgersen │ 2008 │ 4700 │ +│ Dream │ 2008 │ 4800 │ +│ Biscoe │ 2008 │ 6000 │ +│ Torgersen │ 2009 │ 4300 │ +│ Dream │ 2009 │ 4475 │ +│ Biscoe │ 2009 │ 6000 │ +└───────────┴───────┴───────────────┘ +``` + +## Learn More + +That's all for this quick-start guide. If you want to learn more, check out the [Ibis documentation](https://ibis-project.org). \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/import_arrow.md b/docs/archive/1.0/guides/python/import_arrow.md new file mode 100644 index 00000000000..1f5eb484fc2 --- /dev/null +++ b/docs/archive/1.0/guides/python/import_arrow.md @@ -0,0 +1,20 @@ +--- +layout: docu +title: Import from Apache Arrow +--- + +`CREATE TABLE AS` and `INSERT INTO` can be used to create a table from any query. We can then create tables or insert into existing tables by referring to referring to the Apache Arrow object in the query. This example imports from an [Arrow Table](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html), but DuckDB can query different Apache Arrow formats as seen in the [SQL on Arrow guide]({% link docs/archive/1.0/guides/python/sql_on_arrow.md %}). + +```python +import duckdb +import pyarrow as pa + +# connect to an in-memory database +my_arrow = pa.Table.from_pydict({'a': [42]}) + +# create the table "my_table" from the DataFrame "my_arrow" +duckdb.sql("CREATE TABLE my_table AS SELECT * FROM my_arrow") + +# insert into the table "my_table" from the DataFrame "my_arrow" +duckdb.sql("INSERT INTO my_table SELECT * FROM my_arrow") +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/import_numpy.md b/docs/archive/1.0/guides/python/import_numpy.md new file mode 100644 index 00000000000..d307154270c --- /dev/null +++ b/docs/archive/1.0/guides/python/import_numpy.md @@ -0,0 +1,32 @@ +--- +layout: docu +title: Import from Numpy +--- + +It is possible to query Numpy arrays from DuckDB. +There is no need to register the arrays manually – +DuckDB can find them in the Python process by name thanks to [replacement scans]({% link docs/archive/1.0/guides/glossary.md %}#replacement-scan). 
+For example: + +```python +import duckdb +import numpy as np + +my_arr = np.array([(1, 9.0), (2, 8.0), (3, 7.0)]) + +duckdb.sql("SELECT * FROM my_arr") +``` + +```text +┌─────────┬─────────┬─────────┐ +│ column0 │ column1 │ column2 │ +│ double │ double │ double │ +├─────────┼─────────┼─────────┤ +│ 1.0 │ 2.0 │ 3.0 │ +│ 9.0 │ 8.0 │ 7.0 │ +└─────────┴─────────┴─────────┘ +``` + +## See Also + +DuckDB also supports [exporting to Numpy]({% link docs/archive/1.0/guides/python/export_numpy.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/import_pandas.md b/docs/archive/1.0/guides/python/import_pandas.md new file mode 100644 index 00000000000..0136d99de05 --- /dev/null +++ b/docs/archive/1.0/guides/python/import_pandas.md @@ -0,0 +1,34 @@ +--- +layout: docu +title: Import from Pandas +--- + +[`CREATE TABLE ... AS`]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas) and [`INSERT INTO`]({% link docs/archive/1.0/sql/statements/insert.md %}) can be used to create a table from any query. +We can then create tables or insert into existing tables by referring to referring to the [Pandas](https://pandas.pydata.org/) DataFrame in the query. +There is no need to register the DataFrames manually – +DuckDB can find them in the Python process by name thanks to [replacement scans]({% link docs/archive/1.0/guides/glossary.md %}#replacement-scan). + +```python +import duckdb +import pandas + +# Create a Pandas dataframe +my_df = pandas.DataFrame.from_dict({'a': [42]}) + +# create the table "my_table" from the DataFrame "my_df" +# Note: duckdb.sql connects to the default in-memory database connection +duckdb.sql("CREATE TABLE my_table AS SELECT * FROM my_df") + +# insert into the table "my_table" from the DataFrame "my_df" +duckdb.sql("INSERT INTO my_table SELECT * FROM my_df") +``` + +If the order of columns is different or not all columns are present in the DataFrame, use [`INSERT INTO ... BY NAME`]({% link docs/archive/1.0/sql/statements/insert.md %}#insert-into--by-name): + +```python +duckdb.sql("INSERT INTO my_table BY NAME SELECT * FROM my_df") +``` + +## See Also + +DuckDB also supports [exporting to Pandas]({% link docs/archive/1.0/guides/python/export_pandas.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/install.md b/docs/archive/1.0/guides/python/install.md new file mode 100644 index 00000000000..d1c52f6d2ae --- /dev/null +++ b/docs/archive/1.0/guides/python/install.md @@ -0,0 +1,30 @@ +--- +layout: docu +title: Installing the Python Client +--- + +## Installing via Pip + +The latest release of the Python client can be installed using `pip`. + +```bash +pip install duckdb +``` + +The pre-release Python client can be installed using `--pre`. + +```bash +pip install duckdb --upgrade --pre +``` + +## Installing from Source + +The latest Python client can be installed from source from the [`tools/pythonpkg` directory in the DuckDB GitHub repository](https://github.com/duckdb/duckdb/tree/main/tools/pythonpkg). + +```batch +BUILD_PYTHON=1 GEN=ninja make +cd tools/pythonpkg +python setup.py install +``` + +For detailed instructions on how to compile DuckDB from source, see the [Building guide]({% link docs/archive/1.0/dev/building/build_instructions.md %}). 
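Either way, a quick way to confirm the client is importable and working is a minimal check like the following (a sketch; the exact version string depends on the release you installed):

```python
import duckdb

# Print the installed client version and run a trivial query
print(duckdb.__version__)
print(duckdb.sql("SELECT 42 AS answer").fetchall())
```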
\ No newline at end of file diff --git a/docs/archive/1.0/guides/python/jupyter.md b/docs/archive/1.0/guides/python/jupyter.md new file mode 100644 index 00000000000..d00a9e662a8 --- /dev/null +++ b/docs/archive/1.0/guides/python/jupyter.md @@ -0,0 +1,201 @@ +--- +layout: docu +title: Jupyter Notebooks +--- + +DuckDB's Python client can be used directly in Jupyter notebooks with no additional configuration if desired. +However, additional libraries can be used to simplify SQL query development. +This guide will describe how to utilize those additional libraries. +See other guides in the Python section for how to use DuckDB and Python together. + +In this example, we use the [JupySQL](https://github.com/ploomber/jupysql) package. + +This example workflow is also available as a [Google Colab notebook](https://colab.research.google.com/drive/1eOA2FYHqEfZWLYssbUxdIpSL3PFxWVjk?usp=sharing). + +## Library Installation + +Four additional libraries improve the DuckDB experience in Jupyter notebooks. + +1. [jupysql](https://github.com/ploomber/jupysql): Convert a Jupyter code cell into a SQL cell +2. [Pandas](https://github.com/pandas-dev/pandas): Clean table visualizations and compatibility with other analysis +3. [matplotlib](https://github.com/matplotlib/matplotlib): Plotting with Python +4. [duckdb-engine (DuckDB SQLAlchemy driver)](https://github.com/Mause/duckdb_engine): Used by SQLAlchemy to connect to DuckDB (optional) + +Run these `pip install` commands from the command line if Jupyter Notebook is not yet installed. Otherwise, see Google Colab link above for an in-notebook example: + +```bash +pip install duckdb +``` + +Install Jupyter Notebook + +```bash +pip install notebook +``` + +Or JupyterLab: + +```bash +pip install jupyterlab +``` + +Install supporting libraries: + +```bash +pip install jupysql pandas matplotlib duckdb-engine +``` + +## Library Import and Configuration + +Open a Jupyter Notebook and import the relevant libraries. + +### Connecting to DuckDB Natively + +To connect to DuckDB, run: + +```python +import duckdb +import pandas as pd + +%load_ext sql +conn = duckdb.connect() +%sql conn --alias duckdb +``` + +### Connecting to DuckDB via SQLAlchemy Using `duckdb_engine` + +Alternatively, you can connect to DuckDB via SQLAlchemy using `duckdb_engine`. See the [performance and feature differences](https://jupysql.ploomber.io/en/latest/tutorials/duckdb-native-sqlalchemy.html). + +```python +import duckdb +import pandas as pd +# No need to import duckdb_engine +# jupysql will auto-detect the driver needed based on the connection string! + +# Import jupysql Jupyter extension to create SQL cells +%load_ext sql +``` + +Set configurations on jupysql to directly output data to Pandas and to simplify the output that is printed to the notebook. + +```python +%config SqlMagic.autopandas = True +%config SqlMagic.feedback = False +%config SqlMagic.displaycon = False +``` + +Connect jupysql to DuckDB using a SQLAlchemy-style connection string. 
+Either connect to a new [in-memory DuckDB]({% link docs/archive/1.0/api/python/dbapi.md %}#in-memory-connection), the [default connection]({% link docs/archive/1.0/api/python/dbapi.md %}#default-connection) or a file backed database: + +```sql +%sql duckdb:///:memory: +``` + +```sql +%sql duckdb:///:default: +``` + +```sql +%sql duckdb:///path/to/file.db +``` + +> The `%sql` command and `duckdb.sql` share the same [default connection]({% link docs/archive/1.0/api/python/dbapi.md %}) if you provide `duckdb:///:default:` as the SQLAlchemy connection string. + +## Querying DuckDB + +Single line SQL queries can be run using `%sql` at the start of a line. Query results will be displayed as a Pandas DataFrame. + +```sql +%sql SELECT 'Off and flying!' AS a_duckdb_column; +``` + +An entire Jupyter cell can be used as a SQL cell by placing `%%sql` at the start of the cell. Query results will be displayed as a Pandas DataFrame. + +```sql +%%sql +SELECT + schema_name, + function_name +FROM duckdb_functions() +ORDER BY ALL DESC +LIMIT 5; +``` + +To store the query results in a Python variable, use `<<` as an assignment operator. +This can be used with both the `%sql` and `%%sql` Jupyter magics. + +```sql +%sql res << SELECT 'Off and flying!' AS a_duckdb_column; +``` + +If the `%config SqlMagic.autopandas = True` option is set, the variable is a Pandas dataframe, otherwise, it is a `ResultSet` that can be converted to Pandas with the `DataFrame()` function. + +## Querying Pandas Dataframes + +DuckDB is able to find and query any dataframe stored as a variable in the Jupyter notebook. + +```python +input_df = pd.DataFrame.from_dict({"i": [1, 2, 3], + "j": ["one", "two", "three"]}) +``` + +The dataframe being queried can be specified just like any other table in the `FROM` clause. + +```sql +%sql output_df << SELECT sum(i) AS total_i FROM input_df; +``` + +## Visualizing DuckDB Data + +The most common way to plot datasets in Python is to load them using Pandas and then use matplotlib or seaborn for plotting. +This approach requires loading all data into memory which is highly inefficient. +The plotting module in JupySQL runs computations in the SQL engine. +This delegates memory management to the engine and ensures that intermediate computations do not keep eating up memory, efficiently plotting massive datasets. + +### Install and Load DuckDB httpfs Extension + +DuckDB's [httpfs extension]({% link docs/archive/1.0/extensions/httpfs/overview.md %}) allows Parquet and CSV files to be queried remotely over http. +These examples query a Parquet file that contains historical taxi data from NYC. +Using the Parquet format allows DuckDB to only pull the rows and columns into memory that are needed rather than downloading the entire file. +DuckDB can be used to process local [Parquet files]({% link docs/archive/1.0/data/parquet/overview.md %}) as well, which may be desirable if querying the entire Parquet file, or running multiple queries that require large subsets of the file. + +```sql +%%sql +INSTALL httpfs; +LOAD httpfs; +``` + +### Boxplot & Histogram + +To create a boxplot, call `%sqlplot boxplot`, passing the name of the table and the column to plot. +In this case, the name of the table is the URL of the remotely stored Parquet file. 
+ +```python +%sqlplot boxplot --table https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2021-01.parquet --column trip_distance +``` + +![Boxplot of the trip_distance column](/images/trip-distance-boxplot.png) + +Now, create a query that filters by the 90th percentile. +Note the use of the `--save`, and `--no-execute` functions. +This tells JupySQL to store the query, but skips execution. It will be referenced in the next plotting call. + +```sql +%%sql --save short_trips --no-execute +SELECT * +FROM 'https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2021-01.parquet' +WHERE trip_distance < 6.3 +``` + +To create a histogram, call `%sqlplot histogram` and pass the name of the table, the column to plot, and the number of bins. +This uses `--with short-trips` so JupySQL uses the query defined previously and therefore only plots a subset of the data. + +```python +%sqlplot histogram --table short_trips --column trip_distance --bins 10 --with short_trips +``` + +![Histogram of the trip_distance column](/images/trip-distance-histogram.png) + +## Summary + +You now have the ability to alternate between SQL and Pandas in a simple and highly performant way! You can plot massive datasets directly through the engine (avoiding both the download of the entire file and loading all of it into Pandas in memory). Dataframes can be read as tables in SQL, and SQL results can be output into Dataframes. Happy analyzing! \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/multiple_threads.md b/docs/archive/1.0/guides/python/multiple_threads.md new file mode 100644 index 00000000000..2745da3d7d8 --- /dev/null +++ b/docs/archive/1.0/guides/python/multiple_threads.md @@ -0,0 +1,115 @@ +--- +layout: docu +title: Multiple Python Threads +--- + +This page demonstrates how to simultaneously insert into and read from a DuckDB database across multiple Python threads. +This could be useful in scenarios where new data is flowing in and an analysis should be periodically re-run. +Note that this is all within a single Python process (see the [FAQ]({% link faq.md %}) for details on DuckDB concurrency). +Feel free to follow along in this [Google Colab notebook](https://colab.research.google.com/drive/190NB2m-LIfDcMamCY5lIzaD2OTMnYclB?usp=sharing). + +## Setup + +First, import DuckDB and several modules from the Python standard library. +Note: if using Pandas, add `import pandas` at the top of the script as well (as it must be imported prior to the multi-threading). +Then connect to a file-backed DuckDB database and create an example table to store inserted data. +This table will track the name of the thread that completed the insert and automatically insert the timestamp when that insert occurred using the [`DEFAULT` expression]({% link docs/archive/1.0/sql/statements/create_table.md %}#syntax). + +```python +import duckdb +from threading import Thread, current_thread +import random + +duckdb_con = duckdb.connect('my_peristent_db.duckdb') +# Use connect without parameters for an in-memory database +# duckdb_con = duckdb.connect() +duckdb_con.execute(""" + CREATE OR REPLACE TABLE my_inserts ( + thread_name VARCHAR, + insert_time TIMESTAMP DEFAULT current_timestamp + ) +""") +``` + +## Reader and Writer Functions + +Next, define functions to be executed by the writer and reader threads. +Each thread must use the `.cursor()` method to create a thread-local connection to the same DuckDB file based on the original connection. +This approach also works with in-memory DuckDB databases. 
+ +```python +def write_from_thread(duckdb_con): + # Create a DuckDB connection specifically for this thread + local_con = duckdb_con.cursor() + # Insert a row with the name of the thread. insert_time is auto-generated. + thread_name = str(current_thread().name) + result = local_con.execute(""" + INSERT INTO my_inserts (thread_name) + VALUES (?) + """, (thread_name,)).fetchall() + +def read_from_thread(duckdb_con): + # Create a DuckDB connection specifically for this thread + local_con = duckdb_con.cursor() + # Query the current row count + thread_name = str(current_thread().name) + results = local_con.execute(""" + SELECT + ? AS thread_name, + count(*) AS row_counter, + current_timestamp + FROM my_inserts + """, (thread_name,)).fetchall() + print(results) +``` + +## Create Threads + +We define how many writers and readers to use, and define a list to track all of the threads that will be created. +Then, create first writer and then reader threads. +Next, shuffle them so that they will be kicked off in a random order to simulate simultaneous writers and readers. +Note that the threads have not yet been executed, only defined. + +```python +write_thread_count = 50 +read_thread_count = 5 +threads = [] + +# Create multiple writer and reader threads (in the same process) +# Pass in the same connection as an argument +for i in range(write_thread_count): + threads.append(Thread(target = write_from_thread, + args = (duckdb_con,), + name = 'write_thread_' + str(i))) + +for j in range(read_thread_count): + threads.append(Thread(target = read_from_thread, + args = (duckdb_con,), + name = 'read_thread_' + str(j))) + +# Shuffle the threads to simulate a mix of readers and writers +random.seed(6) # Set the seed to ensure consistent results when testing +random.shuffle(threads) +``` + +## Run Threads and Show Results + +Now, kick off all threads to run in parallel, then wait for all of them to finish before printing out the results. +Note that the timestamps of readers and writers are interspersed as expected due to the randomization. + +```python +# Kick off all threads in parallel +for thread in threads: + thread.start() + +# Ensure all threads complete before printing final results +for thread in threads: + thread.join() + +print(duckdb_con.execute(""" + SELECT * + FROM my_inserts + ORDER BY + insert_time +""").df()) +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/polars.md b/docs/archive/1.0/guides/python/polars.md new file mode 100644 index 00000000000..e740f475a78 --- /dev/null +++ b/docs/archive/1.0/guides/python/polars.md @@ -0,0 +1,61 @@ +--- +layout: docu +title: Integration with Polars +--- + +[Polars](https://github.com/pola-rs/polars) is a DataFrames library built in Rust with bindings for Python and Node.js. It uses [Apache Arrow's columnar format](https://arrow.apache.org/docs/format/Columnar.html) as its memory model. DuckDB can read Polars DataFrames and convert query results to Polars DataFrames. It does this internally using the efficient Apache Arrow integration. Note that the `pyarrow` library must be installed for the integration to work. + +## Installation + +```bash +pip install -U duckdb 'polars[pyarrow]' +``` + +## Polars to DuckDB + +DuckDB can natively query Polars DataFrames by referring to the name of Polars DataFrames as they exist in the current scope. 
+ +```python +import duckdb +import polars as pl + +df = pl.DataFrame( + { + "A": [1, 2, 3, 4, 5], + "fruits": ["banana", "banana", "apple", "apple", "banana"], + "B": [5, 4, 3, 2, 1], + "cars": ["beetle", "audi", "beetle", "beetle", "beetle"], + } +) +duckdb.sql("SELECT * FROM df").show() +``` + +## DuckDB to Polars + +DuckDB can output results as Polars DataFrames using the `.pl()` result-conversion method. + +```python +df = duckdb.sql(""" + SELECT 1 AS id, 'banana' AS fruit + UNION ALL + SELECT 2, 'apple' + UNION ALL + SELECT 3, 'mango'""" +).pl() +print(df) +``` + +```text +shape: (3, 2) +┌─────┬────────┐ +│ id ┆ fruit │ +│ --- ┆ --- │ +│ i32 ┆ str │ +╞═════╪════════╡ +│ 1 ┆ banana │ +│ 2 ┆ apple │ +│ 3 ┆ mango │ +└─────┴────────┘ +``` + +To learn more about Polars, feel free to explore their [Python API Reference](https://pola-rs.github.io/polars/py-polars/html/reference/index.html). \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/relational_api_pandas.md b/docs/archive/1.0/guides/python/relational_api_pandas.md new file mode 100644 index 00000000000..d92aac48555 --- /dev/null +++ b/docs/archive/1.0/guides/python/relational_api_pandas.md @@ -0,0 +1,32 @@ +--- +layout: docu +title: Relational API on Pandas +--- + +DuckDB offers a relational API that can be used to chain together query operations. These are lazily evaluated so that DuckDB can optimize their execution. These operators can act on Pandas DataFrames, DuckDB tables or views (which can point to any underlying storage format that DuckDB can read, such as CSV or Parquet files, etc.). Here we show a simple example of reading from a Pandas DataFrame and returning a DataFrame. + +```python +import duckdb +import pandas + +# connect to an in-memory database +con = duckdb.connect() + +input_df = pandas.DataFrame.from_dict({'i': [1, 2, 3, 4], + 'j': ["one", "two", "three", "four"]}) + +# create a DuckDB relation from a dataframe +rel = con.from_df(input_df) + +# chain together relational operators (this is a lazy operation, so the operations are not yet executed) +# equivalent to: SELECT i, j, i*2 AS two_i FROM input_df WHERE i >= 2 ORDER BY i DESC LIMIT 2 +transformed_rel = rel.filter('i >= 2').project('i, j, i*2 as two_i').order('i desc').limit(2) + +# trigger execution by requesting .df() of the relation +# .df() could have been added to the end of the chain above - it was separated for clarity +output_df = transformed_rel.df() +``` + +Relational operators can also be used to group rows, aggregate, find distinct combinations of values, join, union, and more. They are also able to directly insert results into a DuckDB table or write to a CSV. + +Please see [these additional examples](https://github.com/duckdb/duckdb/blob/main/examples/python/duckdb-python.py) and [the available relational methods on the `DuckDBPyRelation` class]({% link docs/archive/1.0/api/python/reference/index.md %}#duckdb.DuckDBPyRelation). \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/sql_on_arrow.md b/docs/archive/1.0/guides/python/sql_on_arrow.md new file mode 100644 index 00000000000..578e3641c32 --- /dev/null +++ b/docs/archive/1.0/guides/python/sql_on_arrow.md @@ -0,0 +1,113 @@ +--- +layout: docu +title: SQL on Apache Arrow +--- + +DuckDB can query multiple different types of Apache Arrow objects. + +## Apache Arrow Tables + +[Arrow Tables](https://arrow.apache.org/docs/python/generated/pyarrow.Table.html) stored in local variables can be queried as if they are regular tables within DuckDB. 
+ +```python +import duckdb +import pyarrow as pa + +# connect to an in-memory database +con = duckdb.connect() + +my_arrow_table = pa.Table.from_pydict({'i': [1, 2, 3, 4], + 'j': ["one", "two", "three", "four"]}) + +# query the Apache Arrow Table "my_arrow_table" and return as an Arrow Table +results = con.execute("SELECT * FROM my_arrow_table WHERE i = 2").arrow() +``` + +## Apache Arrow Datasets + +[Arrow Datasets](https://arrow.apache.org/docs/python/dataset.html) stored as variables can also be queried as if they were regular tables. +Datasets are useful to point towards directories of Parquet files to analyze large datasets. +DuckDB will push column selections and row filters down into the dataset scan operation so that only the necessary data is pulled into memory. + +```python +import duckdb +import pyarrow as pa +import tempfile +import pathlib +import pyarrow.parquet as pq +import pyarrow.dataset as ds + +# connect to an in-memory database +con = duckdb.connect() + +my_arrow_table = pa.Table.from_pydict({'i': [1, 2, 3, 4], + 'j': ["one", "two", "three", "four"]}) + +# create example Parquet files and save in a folder +base_path = pathlib.Path(tempfile.gettempdir()) +(base_path / "parquet_folder").mkdir(exist_ok = True) +pq.write_to_dataset(my_arrow_table, str(base_path / "parquet_folder")) + +# link to Parquet files using an Arrow Dataset +my_arrow_dataset = ds.dataset(str(base_path / 'parquet_folder/')) + +# query the Apache Arrow Dataset "my_arrow_dataset" and return as an Arrow Table +results = con.execute("SELECT * FROM my_arrow_dataset WHERE i = 2").arrow() +``` + +## Apache Arrow Scanners + +[Arrow Scanners](https://arrow.apache.org/docs/python/generated/pyarrow.dataset.Scanner.html) stored as variables can also be queried as if they were regular tables. Scanners read over a dataset and select specific columns or apply row-wise filtering. This is similar to how DuckDB pushes column selections and filters down into an Arrow Dataset, but using Arrow compute operations instead. Arrow can use asynchronous IO to quickly access files. + +```python +import duckdb +import pyarrow as pa +import tempfile +import pathlib +import pyarrow.parquet as pq +import pyarrow.dataset as ds +import pyarrow.compute as pc + +# connect to an in-memory database +con = duckdb.connect() + +my_arrow_table = pa.Table.from_pydict({'i': [1, 2, 3, 4], + 'j': ["one", "two", "three", "four"]}) + +# create example Parquet files and save in a folder +base_path = pathlib.Path(tempfile.gettempdir()) +(base_path / "parquet_folder").mkdir(exist_ok = True) +pq.write_to_dataset(my_arrow_table, str(base_path / "parquet_folder")) + +# link to Parquet files using an Arrow Dataset +my_arrow_dataset = ds.dataset(str(base_path / 'parquet_folder/')) + +# define the filter to be applied while scanning +# equivalent to "WHERE i = 2" +scanner_filter = (pc.field("i") == pc.scalar(2)) + +arrow_scanner = ds.Scanner.from_dataset(my_arrow_dataset, filter = scanner_filter) + +# query the Apache Arrow scanner "arrow_scanner" and return as an Arrow Table +results = con.execute("SELECT * FROM arrow_scanner").arrow() +``` + +## Apache Arrow RecordBatchReaders + +[Arrow RecordBatchReaders](https://arrow.apache.org/docs/python/generated/pyarrow.RecordBatchReader.html) are a reader for Arrow's streaming binary format and can also be queried directly as if they were tables. This streaming format is useful when sending Arrow data for tasks like interprocess communication or communicating between language runtimes. 
+ +```python +import duckdb +import pyarrow as pa + +# connect to an in-memory database +con = duckdb.connect() + +my_recordbatch = pa.RecordBatch.from_pydict({'i': [1, 2, 3, 4], + 'j': ["one", "two", "three", "four"]}) + +my_recordbatchreader = pa.ipc.RecordBatchReader.from_batches(my_recordbatch.schema, [my_recordbatch]) + +# query the Apache Arrow RecordBatchReader "my_recordbatchreader" and return as an Arrow Table +results = con.execute("SELECT * FROM my_recordbatchreader WHERE i = 2").arrow() +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/python/sql_on_pandas.md b/docs/archive/1.0/guides/python/sql_on_pandas.md new file mode 100644 index 00000000000..ddbb7ae81e1 --- /dev/null +++ b/docs/archive/1.0/guides/python/sql_on_pandas.md @@ -0,0 +1,20 @@ +--- +layout: docu +title: SQL on Pandas +--- + +Pandas DataFrames stored in local variables can be queried as if they are regular tables within DuckDB. + +```python +import duckdb +import pandas + +# Create a Pandas dataframe +my_df = pandas.DataFrame.from_dict({'a': [42]}) + +# query the Pandas DataFrame "my_df" +# Note: duckdb.sql connects to the default in-memory database connection +results = duckdb.sql("SELECT * FROM my_df").df() +``` + +The seamless integration of Pandas DataFrames to DuckDB SQL queries is allowed by [replacement scans]({% link docs/archive/1.0/api/c/replacement_scans.md %}), which replace instances of accessing the `my_df` table (which does not exist in DuckDB) with a table function that reads the `my_df` dataframe. \ No newline at end of file diff --git a/docs/archive/1.0/guides/snippets/create_synthetic_data.md b/docs/archive/1.0/guides/snippets/create_synthetic_data.md new file mode 100644 index 00000000000..76f5cb5dcae --- /dev/null +++ b/docs/archive/1.0/guides/snippets/create_synthetic_data.md @@ -0,0 +1,57 @@ +--- +layout: docu +title: Create Synthetic Data +--- + +DuckDB allows you to quickly generate synthetic data sets. 
To do so, you may use: + +* [range functions]({% link docs/archive/1.0/sql/functions/nested.md %}#range-functions) +* hash functions, e.g., + [`hash`]({% link docs/archive/1.0/sql/functions/utility.md %}#hashvalue), + [`md5`]({% link docs/archive/1.0/sql/functions/utility.md %}#md5string), + [`sha256`]({% link docs/archive/1.0/sql/functions/utility.md %}#sha256value) +* the [Faker Python package](https://faker.readthedocs.io/) via the [Python function API]({% link docs/archive/1.0/api/python/function.md %}) +* using [cross products (Cartesian products)]({% link docs/archive/1.0/sql/query_syntax/from.md %}#cross-product-joins-cartesian-product) + +For example: + +```python +import duckdb + +from duckdb.typing import * +from faker import Faker + +def random_date(): + fake = Faker() + return fake.date_between() + +duckdb.create_function("random_date", random_date, [], DATE, type="native", side_effects=True) +res = duckdb.sql(""" + SELECT hash(i * 10 + j) AS id, random_date() AS creationDate, IF (j % 2, true, false) + FROM generate_series(1, 5) s(i) + CROSS JOIN generate_series(1, 2) t(j) + """) +res.show() +``` + +This generates the following: + +```text +┌──────────────────────┬──────────────┬─────────┐ +│ id │ creationDate │ flag │ +│ uint64 │ date │ boolean │ +├──────────────────────┼──────────────┼─────────┤ +│ 6770051751173734325 │ 2019-11-05 │ true │ +│ 16510940941872865459 │ 2002-08-03 │ true │ +│ 13285076694688170502 │ 1998-11-27 │ true │ +│ 11757770452869451863 │ 1998-07-03 │ true │ +│ 2064835973596856015 │ 2010-09-06 │ true │ +│ 17776805813723356275 │ 2020-12-26 │ false │ +│ 13540103502347468651 │ 1998-03-21 │ false │ +│ 4800297459639118879 │ 2015-06-12 │ false │ +│ 7199933130570745587 │ 2005-04-13 │ false │ +│ 18103378254596719331 │ 2014-09-15 │ false │ +├──────────────────────┴──────────────┴─────────┤ +│ 10 rows 3 columns │ +└───────────────────────────────────────────────┘ +``` \ No newline at end of file diff --git a/docs/archive/1.0/guides/sql_editors/dbeaver.md b/docs/archive/1.0/guides/sql_editors/dbeaver.md new file mode 100644 index 00000000000..e5c40c6f532 --- /dev/null +++ b/docs/archive/1.0/guides/sql_editors/dbeaver.md @@ -0,0 +1,65 @@ +--- +layout: docu +title: DBeaver SQL IDE +--- + +[DBeaver](https://dbeaver.io/) is a powerful and popular desktop sql editor and integrated development environment (IDE). It has both an open source and enterprise version. It is useful for visually inspecting the available tables in DuckDB and for quickly building complex queries. DuckDB's [JDBC connector](https://search.maven.org/artifact/org.duckdb/duckdb_jdbc) allows DBeaver to query DuckDB files, and by extension, any other files that DuckDB can access (like [Parquet files]({% link docs/archive/1.0/guides/file_formats/query_parquet.md %})). + +## Installing DBeaver + +1. Install DBeaver using the download links and instructions found at their [download page](https://dbeaver.io/download/). + +2. Open DBeaver and create a new connection. Either click on the “New Database Connection” button or go to Database > New Database Connection in the menu bar. + + DBeaver New Database Connection + DBeaver New Database Connection Menu + +3. Search for DuckDB, select it, and click Next. + + DBeaver Select Database Driver + +4. Enter the path or browse to the DuckDB database file you wish to query. To use an in-memory DuckDB (useful primarily if just interested in querying Parquet files, or for testing) enter `:memory:` as the path. + + DBeaver Set Path + +5. Click “Test Connection”. 
This will then prompt you to install the DuckDB JDBC driver. If you are not prompted, see alternative driver installation instructions below. + + DBeaver Test Connection + +6. Click “Download” to download DuckDB's JDBC driver from Maven. Once download is complete, click “OK”, then click “Finish”. +* Note: If you are in a corporate environment or behind a firewall, before clicking download, click the “Download Configuration” link to configure your proxy settings. + + DBeaver Download Driver Files + +7. You should now see a database connection to your DuckDB database in the left hand “Database Navigator” pane. Expand it to see the tables and views in your database. Right click on that connection and create a new SQL script. + + DBeaver New SQL Script + +8. Write some SQL and click the “Execute” button. + + DBeaver Execute Query + +9. Now you're ready to fly with DuckDB and DBeaver! + + DBeaver Query Results + +## Alternative Driver Installation + +1. If not prompted to install the DuckDB driver when testing your connection, return to the “Connect to a database” dialog and click “Edit Driver Settings”. + + DBeaver Edit Driver Settings + +2. (Alternate) You may also access the driver settings menu by returning to the main DBeaver window and clicking Database > Driver Manager in the menu bar. Then select DuckDB, then click Edit. + + DBeaver Driver Manager + DBeaver Driver Manager Edit + +3. Go to the “Libraries” tab, then click on the DuckDB driver and click “Download/Update”. If you do not see the DuckDB driver, first click on “Reset to Defaults”. + + DBeaver Edit Driver + +4. Click “Download” to download DuckDB's JDBC driver from Maven. Once download is complete, click “OK”, then return to the main DBeaver window and continue with step 7 above. + + * Note: If you are in a corporate environment or behind a firewall, before clicking download, click the “Download Configuration” link to configure your proxy settings. + + DBeaver Download Driver Files 2 \ No newline at end of file diff --git a/docs/archive/1.0/guides/sql_features/asof_join.md b/docs/archive/1.0/guides/sql_features/asof_join.md new file mode 100644 index 00000000000..93b3d402cd4 --- /dev/null +++ b/docs/archive/1.0/guides/sql_features/asof_join.md @@ -0,0 +1,174 @@ +--- +layout: docu +title: AsOf Join +--- + +## What is an AsOf Join? + +Time series data is not always perfectly aligned. +Clocks may be slightly off, or there may be a delay between cause and effect. +This can make connecting two sets of ordered data challenging. +AsOf joins are a tool for solving this and other similar problems. + +One of the problems that AsOf joins are used to solve is +finding the value of a varying property at a specific point in time. +This use case is so common that it is where the name came from: + +_Give me the value of the property **as of this time**_. + +More generally, however, AsOf joins embody some common temporal analytic semantics, +which can be cumbersome and slow to implement in standard SQL. + +## Portfolio Example Data Set + +Let's start with a concrete example. +Suppose we have a table of stock [`prices`](/data/prices.csv) with timestamps: + +
+ +| ticker | when | price | +| :----- | :--- | ----: | +| APPL | 2001-01-01 00:00:00 | 1 | +| APPL | 2001-01-01 00:01:00 | 2 | +| APPL | 2001-01-01 00:02:00 | 3 | +| MSFT | 2001-01-01 00:00:00 | 1 | +| MSFT | 2001-01-01 00:01:00 | 2 | +| MSFT | 2001-01-01 00:02:00 | 3 | +| GOOG | 2001-01-01 00:00:00 | 1 | +| GOOG | 2001-01-01 00:01:00 | 2 | +| GOOG | 2001-01-01 00:02:00 | 3 | + +We have another table containing portfolio [`holdings`](/data/holdings.csv) at various points in time: + +
+ +| ticker | when | shares | +| :----- | :--- | -----: | +| APPL | 2000-12-31 23:59:30 | 5.16 | +| APPL | 2001-01-01 00:00:30 | 2.94 | +| APPL | 2001-01-01 00:01:30 | 24.13 | +| GOOG | 2000-12-31 23:59:30 | 9.33 | +| GOOG | 2001-01-01 00:00:30 | 23.45 | +| GOOG | 2001-01-01 00:01:30 | 10.58 | +| DATA | 2000-12-31 23:59:30 | 6.65 | +| DATA | 2001-01-01 00:00:30 | 17.95 | +| DATA | 2001-01-01 00:01:30 | 18.37 | + +To load these tables to DuckDB, run: + +```sql +CREATE TABLE prices AS FROM 'https://duckdb.org/data/prices.csv'; +CREATE TABLE holdings AS FROM 'https://duckdb.org/data/holdings.csv'; +``` + +## Inner AsOf Joins + +We can compute the value of each holding at that point in time by finding +the most recent price before the holding's timestamp by using an AsOf Join: + +```sql +SELECT h.ticker, h.when, price * shares AS value +FROM holdings h +ASOF JOIN prices p + ON h.ticker = p.ticker + AND h.when >= p.when; +``` + +This attaches the value of the holding at that time to each row: + +
+ +| ticker | when | value | +| :----- | :--- | ----: | +| APPL | 2001-01-01 00:00:30 | 2.94 | +| APPL | 2001-01-01 00:01:30 | 48.26 | +| GOOG | 2001-01-01 00:00:30 | 23.45 | +| GOOG | 2001-01-01 00:01:30 | 21.16 | + +It essentially executes a function defined by looking up nearby values in the `prices` table. +Note also that missing `ticker` values do not have a match and don't appear in the output. + +## Outer AsOf Joins + +Because AsOf produces at most one match from the right hand side, +the left side table will not grow as a result of the join, +but it could shrink if there are missing times on the right. +To handle this situation, you can use an *outer* AsOf Join: + +```sql +SELECT h.ticker, h.when, price * shares AS value +FROM holdings h +ASOF LEFT JOIN prices p + ON h.ticker = p.ticker + AND h.when >= p.when +ORDER BY ALL; +``` + +As you might expect, this will produce `NULL` prices and values instead of dropping left side rows +when there is no ticker or the time is before the prices begin. + +
+ +| ticker | when | value | +| :----- | :--- | ----: | +| APPL | 2000-12-31 23:59:30 | | +| APPL | 2001-01-01 00:00:30 | 2.94 | +| APPL | 2001-01-01 00:01:30 | 48.26 | +| GOOG | 2000-12-31 23:59:30 | | +| GOOG | 2001-01-01 00:00:30 | 23.45 | +| GOOG | 2001-01-01 00:01:30 | 21.16 | +| DATA | 2000-12-31 23:59:30 | | +| DATA | 2001-01-01 00:00:30 | | +| DATA | 2001-01-01 00:01:30 | | + +## AsOf Joins with the `USING` Keyword + +So far we have been explicit about specifying the conditions for AsOf, +but SQL also has a simplified join condition syntax +for the common case where the column names are the same in both tables. +This syntax uses the `USING` keyword to list the fields that should be compared for equality. +AsOf also supports this syntax, but with two restrictions: + +* The last field is the inequality +* The inequality is `>=` (the most common case) + +Our first query can then be written as: + +```sql +SELECT ticker, h.when, price * shares AS value +FROM holdings h +ASOF JOIN prices p USING (ticker, "when"); +``` + +### Clarification on Column Selection with `USING` in ASOF Joins + +When you use the `USING` keyword in a join, the columns specified in the `USING` clause are merged in the result set. This means that if you run: + +```sql +SELECT * +FROM holdings h +ASOF JOIN prices p USING (ticker, "when"); +``` + +You will get back only the columns `h.ticker, h.when, h.shares, p.price`. The columns `ticker` and `when` will appear only once, with `ticker` +and `when` coming from the left table (holdings). + +This behavior is fine for the `ticker` column because the value is the same in both tables. However, for the `when` column, the values might +differ between the two tables due to the `>=` condition used in the AsOf join. The AsOf join is designed to match each row in the left +table (`holdings`) with the nearest preceding row in the right table (`prices`) based on the `when` column. + +If you want to retrieve the `when` column from both tables to see both timestamps, you need to list the columns explicitly rather than +relying on `*`, like so: + +```sql +SELECT h.ticker, h.when AS holdings_when, p.when AS prices_when, h.shares, p.price +FROM holdings h +ASOF JOIN prices p USING (ticker, "when"); +``` + +This ensures that you get the complete information from both tables, avoiding any potential confusion caused by the default behavior of +the `USING` keyword. + +## See Also + +For implementation details, see the [blog post “DuckDB's AsOf joins: Fuzzy Temporal Lookups”](/2023/09/15/asof-joins-fuzzy-temporal-lookups). \ No newline at end of file diff --git a/docs/archive/1.0/guides/sql_features/full_text_search.md b/docs/archive/1.0/guides/sql_features/full_text_search.md new file mode 100644 index 00000000000..e838e79d0e6 --- /dev/null +++ b/docs/archive/1.0/guides/sql_features/full_text_search.md @@ -0,0 +1,77 @@ +--- +layout: docu +title: Full-Text Search +--- + +DuckDB supports full-text search via the [`fts` extension]({% link docs/archive/1.0/extensions/full_text_search.md %}). +A full-text index allows for a query to quickly search for all occurrences of individual words within longer text strings. + +## Example: Shakespeare Corpus + +Here's an example of building a full-text index of Shakespeare's plays. 
+ +```sql +CREATE TABLE corpus AS + SELECT * FROM 'https://blobs.duckdb.org/data/shakespeare.parquet'; +``` + +```sql +DESCRIBE corpus; +``` + +| column_name | column_type | null | key | default | extra | +|-------------|-------------|------|------|---------|-------| +| line_id | VARCHAR | YES | NULL | NULL | NULL | +| play_name | VARCHAR | YES | NULL | NULL | NULL | +| line_number | VARCHAR | YES | NULL | NULL | NULL | +| speaker | VARCHAR | YES | NULL | NULL | NULL | +| text_entry | VARCHAR | YES | NULL | NULL | NULL | + +The text of each line is in `text_entry`, and a unique key for each line is in `line_id`. + +## Creating a Full-Text Search Index + +First, we create the index, specifying the table name, the unique id column, and the column(s) to index. We will just index the single column `text_entry`, which contains the text of the lines in the play. + +```sql +PRAGMA create_fts_index('corpus', 'line_id', 'text_entry'); +``` + +The table is now ready to query using the [Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) ranking function. Rows with no match return a null score. + +What does Shakespeare say about butter? + +```sql +SELECT + fts_main_corpus.match_bm25(line_id, 'butter') AS score, + line_id, play_name, speaker, text_entry +FROM corpus +WHERE score IS NOT NULL +ORDER BY score DESC; +``` + +| score | line_id | play_name | speaker | text_entry | +|-------------------:|-------------|--------------------------|--------------|----------------------------------------------------| +| 4.427313429798464 | H4/2.4.494 | Henry IV | Carrier | As fat as butter. | +| 3.836270302568675 | H4/1.2.21 | Henry IV | FALSTAFF | prologue to an egg and butter. | +| 3.836270302568675 | H4/2.1.55 | Henry IV | Chamberlain | They are up already, and call for eggs and butter; | +| 3.3844488405497115 | H4/4.2.21 | Henry IV | FALSTAFF | toasts-and-butter, with hearts in their bellies no | +| 3.3844488405497115 | H4/4.2.62 | Henry IV | PRINCE HENRY | already made thee butter. But tell me, Jack, whose | +| 3.3844488405497115 | AWW/4.1.40 | Alls well that ends well | PAROLLES | butter-womans mouth and buy myself another of | +| 3.3844488405497115 | AYLI/3.2.93 | As you like it | TOUCHSTONE | right butter-womens rank to market. | +| 3.3844488405497115 | KL/2.4.132 | King Lear | Fool | kindness to his horse, buttered his hay. | +| 3.0278411214953107 | AWW/5.2.9 | Alls well that ends well | Clown | henceforth eat no fish of fortunes buttering. | +| 3.0278411214953107 | MWW/2.2.260 | Merry Wives of Windsor | FALSTAFF | Hang him, mechanical salt-butter rogue! I will | +| 3.0278411214953107 | MWW/2.2.284 | Merry Wives of Windsor | FORD | rather trust a Fleming with my butter, Parson Hugh | +| 3.0278411214953107 | MWW/3.5.7 | Merry Wives of Windsor | FALSTAFF | Ill have my brains taen out and buttered, and give | +| 3.0278411214953107 | MWW/3.5.102 | Merry Wives of Windsor | FALSTAFF | to heat as butter; a man of continual dissolution | +| 2.739219044070792 | H4/2.4.115 | Henry IV | PRINCE HENRY | Didst thou never see Titan kiss a dish of butter? | + +Unlike standard indexes, full-text indexes don't auto-update as the underlying data is changed, so you need to `PRAGMA drop_fts_index(my_fts_index)` and recreate it when appropriate. 
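For the index built above, a rebuild could look as follows (a sketch, assuming the index is addressed by the name of the indexed table, as in the `create_fts_index` call):

```sql
-- drop the stale index, then rebuild it after the corpus table has changed
PRAGMA drop_fts_index('corpus');
PRAGMA create_fts_index('corpus', 'line_id', 'text_entry');
```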
+ +## Note on Generating the Corpus Table + +For more details, see the [“Generating a Shakespeare corpus for full-text searching from JSON” blog post](https://duckdb.blogspot.com/2023/04/generating-shakespeare-corpus-for-full.html) +* The Columns are: line_id, play_name, line_number, speaker, text_entry. +* We need a unique key for each row in order for full-text searching to work. +* The line_id `KL/2.4.132` means King Lear, Act 2, Scene 4, Line 132. \ No newline at end of file diff --git a/docs/archive/1.0/index.md b/docs/archive/1.0/index.md new file mode 100644 index 00000000000..e927bf4d972 --- /dev/null +++ b/docs/archive/1.0/index.md @@ -0,0 +1,22 @@ +--- +layout: docu +title: Documentation +--- + +Welcome to the DuckDB documentation! + +* [DuckDB connection overview]({% link docs/archive/1.0/connect/overview.md %}) +* Client APIs + * [CLI (command line interface)]({% link docs/archive/1.0/api/cli/overview.md %}) + * [Java]({% link docs/archive/1.0/api/java.md %}) + * [Python]({% link docs/archive/1.0/api/python/overview.md %}) + * [R]({% link docs/archive/1.0/api/r.md %}) + * [WebAssembly]({% link docs/archive/1.0/api/wasm/overview.md %}) + * see all [client APIs]({% link docs/archive/1.0/api/overview.md %}) +* SQL + * [Introduction]({% link docs/archive/1.0/sql/introduction.md %}) + * [Statements]({% link docs/archive/1.0/sql/statements/overview.md %}) +* [Guides]({% link docs/archive/1.0/guides/overview.md %}) +* [Installation]({% link docs/archive/1.0/installation/index.html %}) + +You can also [browse the DuckDB documentation offline]({% link docs/archive/1.0/guides/offline-copy.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/installation/index.html b/docs/archive/1.0/installation/index.html new file mode 100644 index 00000000000..be26b86d1e9 --- /dev/null +++ b/docs/archive/1.0/installation/index.html @@ -0,0 +1,110 @@ +--- +body_class: installation +excerpt: DuckDB installation page +layout: docu +title: DuckDB Installation +--- +

This page contains installation options for DuckDB. We recommend using the stable release, 1.0.0.

Binaries are available for major programming languages and platforms. If there are no pre-packaged binaries available, consider building DuckDB from source.

[Interactive installation selector. Version: 1.0.0 (stable release) or Nightly build (bleeding edge). Environment: Command line, Python, R, Java, Node.js, Rust, Go, C/C++, or ODBC. Platform: Windows, macOS, or Linux (the package is available for all platforms). Download method: Package manager or Direct download. Architecture: x86_64 or arm64 (the package supports both architectures). The selected combination determines the Installation command and Usage example panels.]
\ No newline at end of file diff --git a/docs/archive/1.0/internals/overview.md b/docs/archive/1.0/internals/overview.md new file mode 100644 index 00000000000..f1b02f3ff3e --- /dev/null +++ b/docs/archive/1.0/internals/overview.md @@ -0,0 +1,80 @@ +--- +layout: docu +redirect_from: +- /internals/overview +title: Overview of DuckDB Internals +--- + +On this page is a brief description of the internals of the DuckDB engine. + +## Parser + +The parser converts a query string into the following tokens: + +* [`SQLStatement`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/parser/sql_statement.hpp) +* [`QueryNode`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/parser/query_node.hpp) +* [`TableRef`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/parser/tableref.hpp) +* [`ParsedExpression`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/parser/parsed_expression.hpp) + +The parser is not aware of the catalog or any other aspect of the database. It will not throw errors if tables do not exist, and will not resolve **any** types of columns yet. It only transforms a query string into a set of tokens as specified. + +### ParsedExpression + +The ParsedExpression represents an expression within a SQL statement. This can be e.g., a reference to a column, an addition operator or a constant value. The type of the ParsedExpression indicates what it represents, e.g., a comparison is represented as a [`ComparisonExpression`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/parser/expression/comparison_expression.hpp). + +ParsedExpressions do **not** have types, except for nodes with explicit types such as `CAST` statements. The types for expressions are resolved in the Binder, not in the Parser. + +### TableRef + +The TableRef represents any table source. This can be a reference to a base table, but it can also be a join, a table-producing function or a subquery. + +### QueryNode + +The QueryNode represents either (1) a `SELECT` statement, or (2) a set operation (i.e. `UNION`, `INTERSECT` or `DIFFERENCE`). + +### SQL Statement + +The SQLStatement represents a complete SQL statement. The type of the SQL Statement represents what kind of statement it is (e.g., `StatementType::SELECT` represents a `SELECT` statement). A single SQL string can be transformed into multiple SQL statements in case the original query string contains multiple queries. + +## Binder + +The binder converts all nodes into their **bound** equivalents. In the binder phase: + +* The tables and columns are resolved using the catalog +* Types are resolved +* Aggregate/window functions are extracted + +The following conversions happen: + +* SQLStatement → [`BoundStatement`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/planner/bound_statement.hpp) +* QueryNode → [`BoundQueryNode`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/planner/bound_query_node.hpp) +* TableRef → [`BoundTableRef`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/planner/bound_tableref.hpp) +* ParsedExpression → [`Expression`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/planner/expression.hpp) + +## Logical Planner + +The logical planner creates [`LogicalOperator`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/planner/logical_operator.hpp) nodes from the bound statements. In this phase, the actual logical query tree is created. 
+ +## Optimizer + +After the logical planner has created the logical query tree, the optimizers are run over that query tree to create an optimized query plan. The following query optimizers are run: + +* **Expression Rewriter**: Simplifies expressions, performs constant folding +* **Filter Pushdown**: Pushes filters down into the query plan and duplicates filters over equivalency sets. Also prunes subtrees that are guaranteed to be empty (because of filters that statically evaluate to false). +* **Join Order Optimizer**: Reorders joins using dynamic programming. Specifically, the `DPccp` algorithm from the paper [Dynamic Programming Strikes Back](https://15721.courses.cs.cmu.edu/spring2017/papers/14-optimizer1/p539-moerkotte.pdf) is used. +* **Common Sub Expressions**: Extracts common subexpressions from projection and filter nodes to prevent unnecessary duplicate execution. +* **In Clause Rewriter**: Rewrites large static IN clauses to a MARK join or INNER join. + +## Column Binding Resolver + +The column binding resolver converts logical [`BoundColumnRefExpresion`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/planner/expression/bound_columnref_expression.hpp) nodes that refer to a column of a specific table into [`BoundReferenceExpression`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/planner/expression/bound_reference_expression.hpp) nodes that refer to a specific index into the DataChunks that are passed around in the execution engine. + +## Physical Plan Generator + +The physical plan generator converts the resulting logical operator tree into a [`PhysicalOperator`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/execution/physical_operator.hpp) tree. + +## Execution + +In the execution phase, the physical operators are executed to produce the query result. +DuckDB uses a push-based vectorized model, where [`DataChunks`](https://github.com/duckdb/duckdb/blob/main/src/include/duckdb/common/types/data_chunk.hpp) are pushed through the operator tree. +For more information, see the talk [Push-Based Execution in DuckDB](https://www.youtube.com/watch?v=1kDrPgRUuEI). \ No newline at end of file diff --git a/docs/archive/1.0/internals/storage.md b/docs/archive/1.0/internals/storage.md new file mode 100644 index 00000000000..8d38950de53 --- /dev/null +++ b/docs/archive/1.0/internals/storage.md @@ -0,0 +1,138 @@ +--- +layout: docu +redirect_from: +- /internals/storage +title: Storage +--- + +## Compatibility + +### Backward Compatibility + +_Backward compatibility_ refers to the ability of a newer DuckDB version to read storage files created by an older DuckDB version. Version 0.10 is the first release of DuckDB that supports backward compatibility in the storage format. DuckDB v0.10 can read and operate on files created by the previous DuckDB version – DuckDB v0.9. + +For future DuckDB versions, our goal is to ensure that any DuckDB version released **after** can read files created by previous versions, starting from this release. We want to ensure that the file format is fully backward compatible. This allows you to keep data stored in DuckDB files around and guarantees that you will be able to read the files without having to worry about which version the file was written with or having to convert files between versions. + +### Forward Compatibility + +_Forward compatibility_ refers to the ability of an older DuckDB version to read storage files produced by a newer DuckDB version. 
DuckDB v0.9 is [**partially** forward compatible with DuckDB v0.10]({% post_url 2024-02-13-announcing-duckdb-0100 %}#forward-compatibility). Certain files created by DuckDB v0.10 can be read by DuckDB v0.9. + +Forward compatibility is provided on a **best effort** basis. While stability of the storage format is important – there are still many improvements and innovations that we want to make to the storage format in the future. As such, forward compatibility may be (partially) broken on occasion. + +## How to Move Between Storage Formats + +When you update DuckDB and open an old database file, you might encounter an error message about incompatible storage formats, pointing to this page. +To move your database(s) to newer format you only need the older and the newer DuckDB executable. + +Open your database file with the older DuckDB and run the SQL statement `EXPORT DATABASE 'tmp'`. This allows you to save the whole state of the current database in use inside folder `tmp`. +The content of the `tmp` folder will be overridden, so choose an empty/non yet existing location. Then, start the newer DuckDB and execute `IMPORT DATABASE 'tmp'` (pointing to the previously populated folder) to load the database, which can be then saved to the file you pointed DuckDB to. + +A bash one-liner (to be adapted with the file names and executable locations) is: + +```bash +/older/version/duckdb mydata.db -c "EXPORT DATABASE 'tmp'" && /newer/duckdb mydata.new.db -c "IMPORT DATABASE 'tmp'" +``` + +After this, `mydata.db` will remain in the old format, `mydata.new.db` will contain the same data but in a format accessible by the more recent DuckDB version, and the folder `tmp` will hold the same data in a universal format as different files. + +Check [`EXPORT` documentation]({% link docs/archive/1.0/sql/statements/export.md %}) for more details on the syntax. + +## Storage Header + +DuckDB files start with a `uint64_t` which contains a checksum for the main header, followed by four magic bytes (`DUCK`), followed by the storage version number in a `uint64_t`. + +```bash +hexdump -n 20 -C mydata.db +``` + +```text +00000000 01 d0 e2 63 9c 13 39 3e 44 55 43 4b 2b 00 00 00 |...c..9>DUCK+...| +00000010 00 00 00 00 |....| +00000014 +``` + +A simple example of reading the storage version using Python is below. + +```python +import struct + +pattern = struct.Struct('<8x4sQ') + +with open('test/sql/storage_version/storage_version.db', 'rb') as fh: + print(pattern.unpack(fh.read(pattern.size))) +``` + +## Storage Version Table + +For changes in each given release, check out the [change log](https://github.com/duckdb/duckdb/releases) on GitHub. +To see the commits that changed each storage version, see the [commit log](https://github.com/duckdb/duckdb/commits/main/src/storage/storage_info.cpp). + +
+ +| Storage version | DuckDB version(s) | +|----------------:|---------------------------------| +| 64 | v0.9.x, v0.10.x, v1.0.0, v1.1.x | +| 51 | v0.8.x | +| 43 | v0.7.x | +| 39 | v0.6.x | +| 38 | v0.5.x | +| 33 | v0.3.3, v0.3.4, v0.4.0 | +| 31 | v0.3.2 | +| 27 | v0.3.1 | +| 25 | v0.3.0 | +| 21 | v0.2.9 | +| 18 | v0.2.8 | +| 17 | v0.2.7 | +| 15 | v0.2.6 | +| 13 | v0.2.5 | +| 11 | v0.2.4 | +| 6 | v0.2.3 | +| 4 | v0.2.2 | +| 1 | v0.2.1 and prior | + +## Compression + +DuckDB uses [lightweight compression]({% post_url 2022-10-28-lightweight-compression %}). +Note that compression is only applied to persistent databases and is **not applied to in-memory instances**. + +### Compression Algorithms + +The compression algorithms supported by DuckDB include the following: + +* [Constant Encoding]({% post_url 2022-10-28-lightweight-compression %}#constant-encoding) +* [Run-Length Encoding (RLE)]({% post_url 2022-10-28-lightweight-compression %}#run-length-encoding-rle) +* [Bit Packing]({% post_url 2022-10-28-lightweight-compression %}#bit-packing) +* [Frame of Reference (FOR)]({% post_url 2022-10-28-lightweight-compression %}#frame-of-reference) +* [Dictionary Encoding]({% post_url 2022-10-28-lightweight-compression %}#dictionary-encoding) +* [Fast Static Symbol Table (FSST)]({% post_url 2022-10-28-lightweight-compression %}#fsst) – [VLDB 2020 paper](https://www.vldb.org/pvldb/vol13/p2649-boncz.pdf) +* [Adaptive Lossless Floating-Point Compression (ALP)]({% post_url 2024-02-13-announcing-duckdb-0100 %}#adaptive-lossless-floating-point-compression-alp) – [SIGMOD 2024 paper](https://ir.cwi.nl/pub/33334/33334.pdf) +* [Chimp]({% post_url 2022-10-28-lightweight-compression %}#chimp--patas) – [VLDB 2022 paper](https://www.vldb.org/pvldb/vol15/p3058-liakos.pdf) +* [Patas]({% post_url 2022-11-14-announcing-duckdb-060 %}#compression-improvements) + +## Disk Usage + +The disk usage of DuckDB's format depends on a number of factors, including the data type and the data distribution, the compression methods used, etc. +As a rough approximation, loading 100 GB of uncompressed CSV files into a DuckDB database file will require 25 GB of disk space, while loading 100 GB of Parquet files will require 120 GB of disk space. + +## Row Groups + +DuckDB's storage format stores the data in _row groups,_ i.e., horizontal partitions of the data. +This concept is equivalent to [Parquet's row groups](https://parquet.apache.org/docs/concepts/). +Several features in DuckDB, including [parallelism]({% link docs/archive/1.0/guides/performance/how_to_tune_workloads.md %}) and [compression]({% post_url 2022-10-28-lightweight-compression %}) are based on row groups. + +## Troubleshooting + +### Error Message When Opening an Incompatible Database File + +When opening a database file that has been written by a different DuckDB version from the one you are using, the following error message may occur: + +```console +Error: unable to open database "...": Serialization Error: Failed to deserialize: ... +``` + +The message implies that the database file was created with a newer DuckDB version and uses features that are backward incompatible with the DuckDB version used to read the file. + +There are two potential workarounds: + +1. Update your DuckDB version to the latest stable version. +2. Open the database with the latest version of DuckDB, export it to a standard format (e.g., Parquet), then import it using to any version of DuckDB. 
See the [`EXPORT/IMPORT DATABASE` statements]({% link docs/archive/1.0/sql/statements/export.md %}) for details. \ No newline at end of file diff --git a/docs/archive/1.0/internals/vector.md b/docs/archive/1.0/internals/vector.md new file mode 100644 index 00000000000..2f6362bf5a3 --- /dev/null +++ b/docs/archive/1.0/internals/vector.md @@ -0,0 +1,141 @@ +--- +layout: docu +redirect_from: +- /internals/vector +title: Execution Format +--- + +`Vector` is the container format used to store in-memory data during execution. +`DataChunk` is a collection of Vectors, used for instance to represent a column list in a `PhysicalProjection` operator. + +## Data Flow + +DuckDB uses a vectorized query execution model. +All operators in DuckDB are optimized to work on Vectors of a fixed size. + +This fixed size is commonly referred to in the code as `STANDARD_VECTOR_SIZE`. +The default `STANDARD_VECTOR_SIZE` is 2048 tuples. + +## Vector Format + +Vectors logically represent arrays that contain data of a single type. DuckDB supports different *vector formats*, which allow the system to store the same logical data with a different *physical representation*. This allows for a more compressed representation, and potentially allows for compressed execution throughout the system. Below the list of supported vector formats is shown. + +### Flat Vectors + +Flat vectors are physically stored as a contiguous array, this is the standard uncompressed vector format. +For flat vectors the logical and physical representations are identical. + +Flat Vector example + +### Constant Vectors + +Constant vectors are physically stored as a single constant value. + +Constant Vector example + +Constant vectors are useful when data elements are repeated – for example, when representing the result of a constant expression in a function call, the constant vector allows us to only store the value once. + +```sql +SELECT lst || 'duckdb' +FROM range(1000) tbl(lst); +``` + +Since `duckdb` is a string literal, the value of the literal is the same for every row. In a flat vector, we would have to duplicate the literal 'duckdb' once for every row. The constant vector allows us to only store the literal once. + +Constant vectors are also emitted by the storage when decompressing from constant compression. + +### Dictionary Vectors + +Dictionary vectors are physically stored as a child vector, and a selection vector that contains indexes into the child vector. + +Dictionary Vector example + +Dictionary vectors are emitted by the storage when decompressing from dictionary + +Just like constant vectors, dictionary vectors are also emitted by the storage. +When deserializing a dictionary compressed column segment, we store this in a dictionary vector so we can keep the data compressed during query execution. + +### Sequence Vectors + +Sequence vectors are physically stored as an offset and an increment value. + +Sequence Vector example + +Sequence vectors are useful for efficiently storing incremental sequences. They are generally emitted for row identifiers. + +### Unified Vector Format + +These properties of the different vector formats are great for optimization purposes, for example you can imagine the scenario where all the parameters to a function are constant, we can just compute the result once and emit a constant vector. +But writing specialized code for every combination of vector types for every function is unfeasible due to the combinatorial explosion of possibilities. 
+ +Instead of doing this, whenever you want to generically use a vector regardless of the type, the UnifiedVectorFormat can be used. +This format essentially acts as a generic view over the contents of the Vector. Every type of Vector can convert to this format. + +## Complex Types + +### String Vectors + +To efficiently store strings, we make use of our `string_t` class. + +```cpp +struct string_t { + union { + struct { + uint32_t length; + char prefix[4]; + char *ptr; + } pointer; + struct { + uint32_t length; + char inlined[12]; + } inlined; + } value; +}; +``` + +Short strings (`<= 12 bytes`) are inlined into the structure, while larger strings are stored with a pointer to the data in the auxiliary string buffer. The length is used throughout the functions to avoid having to call `strlen` and having to continuously check for null-pointers. The prefix is used for comparisons as an early out (when the prefix does not match, we know the strings are not equal and don't need to chase any pointers). + +### List Vectors + +List vectors are stored as a series of *list entries* together with a child Vector. The child vector contains the *values* that are present in the list, and the list entries specify how each individual list is constructed. + +```cpp +struct list_entry_t { + idx_t offset; + idx_t length; +}; +``` + +The offset refers to the start row in the child Vector, the length keeps track of the size of the list of this row. + +List vectors can be stored recursively. For nested list vectors, the child of a list vector is again a list vector. + +For example, consider this mock representation of a Vector of type `BIGINT[][]`: + +```json +{ + "type": "list", + "data": "list_entry_t", + "child": { + "type": "list", + "data": "list_entry_t", + "child": { + "type": "bigint", + "data": "int64_t" + } + } +} +``` + +### Struct Vectors + +Struct vectors store a list of child vectors. The number and types of the child vectors is defined by the schema of the struct. + +### Map Vectors + +Internally map vectors are stored as a `LIST[STRUCT(key KEY_TYPE, value VALUE_TYPE)]`. + +### Union Vectors + +Internally `UNION` utilizes the same structure as a `STRUCT`. +The first “child” is always occupied by the Tag Vector of the `UNION`, which records for each row which of the `UNION`'s types apply to that row. \ No newline at end of file diff --git a/docs/archive/1.0/operations_manual/footprint_of_duckdb/files_created_by_duckdb.md b/docs/archive/1.0/operations_manual/footprint_of_duckdb/files_created_by_duckdb.md new file mode 100644 index 00000000000..883c11fc75c --- /dev/null +++ b/docs/archive/1.0/operations_manual/footprint_of_duckdb/files_created_by_duckdb.md @@ -0,0 +1,31 @@ +--- +layout: docu +title: Files Created by DuckDB +--- + +DuckDB creates several files and directories on disk. This page lists both the global and the local ones. + +## Global Files and Directories + +DuckDB creates the following global files and directories in the user's home directory (denoted with `~`): + +| Location | Description | Shared between versions | Shared between clients | +|-------|-------------------|--|--| +| `~/.duckdbrc` | The content of this file is executed when starting the [DuckDB CLI client]({% link docs/archive/1.0/api/cli/overview.md %}). The commands can be both [dot command]({% link docs/archive/1.0/api/cli/dot_commands.md %}) and SQL statements. The naming of this file follows the `~/.bashrc` and `~/.zshrc` “run commands” files. 
| Yes | Only used by CLI | +| `~/.duckdb_history` | History file, similar to `~/.bash_history` and `~/.zsh_history`. Used by the [DuckDB CLI client]({% link docs/archive/1.0/api/cli/overview.md %}). | Yes | Only used by CLI | +| `~/.duckdb/extensions` | Binaries of installed [extensions]({% link docs/archive/1.0/extensions/overview.md %}). | No | Yes | +| `~/.duckdb/stored_secrets` | [Persistent secrets]({% link docs/archive/1.0/configuration/secrets_manager.md %}#persistent-secrets) created by the [Secrets manager]({% link docs/archive/1.0/configuration/secrets_manager.md %}). | Yes | Yes | + +## Local Files and Directories + +DuckDB creates the following files and directories in the working directory (for in-memory connections) or relative to the database file (for persistent connections): + +| Name | Description | Example | +|-------|-------------------|---| +| `⟨database_filename⟩` | Database file. Only created in on-disk mode. The file can have any extension with typical extensions being `.duckdb`, `.db`, and `.ddb`. | `weather.duckdb` | +| `.tmp/` | Temporary directory. Only created in in-memory mode. | `.tmp/` | +| `⟨database_filename⟩.tmp/` | Temporary directory. Only created in on-disk mode. | `weather.tmp/` | +| `⟨database_filename⟩.wal` | [Write-ahead log](https://en.wikipedia.org/wiki/Write-ahead_logging) file. If DuckDB exits normally, the WAL file is deleted upon exit. If DuckDB crashes, the WAL file is required to recover data. | `weather.wal` | + +If you are working in a Git repository and would like to disable tracking these files by Git, +see the instructions on using [`.gitignore` for DuckDB]({% link docs/archive/1.0/operations_manual/footprint_of_duckdb/gitignore_for_duckdb.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/operations_manual/footprint_of_duckdb/gitignore_for_duckdb.md b/docs/archive/1.0/operations_manual/footprint_of_duckdb/gitignore_for_duckdb.md new file mode 100644 index 00000000000..b0371fb78e6 --- /dev/null +++ b/docs/archive/1.0/operations_manual/footprint_of_duckdb/gitignore_for_duckdb.md @@ -0,0 +1,32 @@ +--- +layout: docu +title: Gitignore for DuckDB +--- + +If you work in a Git repository, you may want to configure your [Gitignore](https://git-scm.com/docs/gitignore) to disable tracking [files created by DuckDB]({% link docs/archive/1.0/operations_manual/footprint_of_duckdb/files_created_by_duckdb.md %}). +These potentially include the DuckDB database, write ahead log, temporary files. + +## Sample Gitignore Files + +In the following, we present sample Gitignore configuration snippets for DuckDB. + +### Ignore Temporary Files but Keep Database + +This configuration is useful if you would like to keep the database file in the version control system: + +```text +*.wal +*.tmp/ +``` + +### Ignore Database and Temporary Files + +If you would like to ignore both the database and the temporary files, extend the Gitignore file to include the database file. +The exact Gitignore configuration to achieve this depends on the extension you use for your DuckDB databases (`.duckdb`, `.db`, `.ddb`, etc.). 
+For example, if your DuckDB files use the `.duckdb` extension, add the following lines to your `.gitignore` file: + +```text +*.duckdb* +*.wal +*.tmp/ +``` \ No newline at end of file diff --git a/docs/archive/1.0/operations_manual/limits.md b/docs/archive/1.0/operations_manual/limits.md new file mode 100644 index 00000000000..25b66c77cd1 --- /dev/null +++ b/docs/archive/1.0/operations_manual/limits.md @@ -0,0 +1,16 @@ +--- +layout: docu +title: Limits +--- + +This page contains DuckDB's built-in limit values. + +| Limit | Default value | Configuration option | +|---|---|---| +| Array size | 100000 | - | +| BLOB size | 4 GB | - | +| Expression depth | 1000 | [`max_expression_depth`]({% link docs/archive/1.0/configuration/overview.md %}) | +| Memory allocation for a vector | 128 GB | - | +| Memory use | 80% of RAM | [`memory_limit`]({% link docs/archive/1.0/configuration/overview.md %}) | +| String size | 4 GB | - | +| Temporary directory size | unlimited | [`max_temp_directory_size`]({% link docs/archive/1.0/configuration/overview.md %}) | \ No newline at end of file diff --git a/docs/archive/1.0/operations_manual/non-deterministic_behavior.md b/docs/archive/1.0/operations_manual/non-deterministic_behavior.md new file mode 100644 index 00000000000..7b2aa00ecf3 --- /dev/null +++ b/docs/archive/1.0/operations_manual/non-deterministic_behavior.md @@ -0,0 +1,88 @@ +--- +layout: docu +title: Non-Deterministic Behavior +--- + +Several operators in DuckDB exhibit non-deterministic behavior. +Most notably, SQL uses set semantics, which allows results to be returned in a different order. +DuckDB exploits this to improve performance, particularly when performing multi-threaded query execution. +Other factors, such as using different compilers, operating systems, and hardware architectures, can also cause changes in ordering. +This page documents the cases where non-determinism is an _expected behavior_. +If you would like to make your queries determinisic, see the [“Working Around Non-Determinism” section](#working-around-non-determinism). + +## Set Semantics + +One of the most common sources of non-determinism is the set semantics used by SQL. +E.g., if you run the following query repeatedly, you may get two different results: + +```sql +SELECT * +FROM ( + SELECT 'A' AS x + UNION + SELECT 'B' AS x +); +``` + +Both results `A`, `B` and `B`, `A` are correct. + +## Different Results on Different Platforms: `array_distinct` + +The `array_distinct` function may return results [in a different order on different platforms](https://github.com/duckdb/duckdb/issues/13746): + +```sql +SELECT array_distinct(['A', 'A', 'B', NULL, NULL]) AS arr; +``` + +For this query, both `[A, B]` and `[B, A]` are valid results. + +## Floating-Point Aggregate Operations with Multi-Threading + +Floating-point inaccuracies may produce different results when run in a multi-threaded configurations: +For example, [`stddev` and `corr` may produce non-deterministic results](https://github.com/duckdb/duckdb/issues/13763): + +```sql +CREATE TABLE tbl AS + SELECT 'ABCDEFG'[floor(random() * 7 + 1)::INT] AS s, 3.7 AS x, i AS y + FROM range(1, 1_000_000) r(i); + +SELECT s, stddev(x) AS standard_deviation, corr(x, y) AS correlation FROM tbl +GROUP BY s +ORDER BY s; +``` + +The expected standard deviations and correlations from this query are 0 for all values of `s`. +However, when executed on multiple threads, the query may return small numbers (`0 <= z < 10e-16`) due to floating-point inaccuracies. 
+ +## Working Around Non-Determinism + +For the majority of use cases, non-determinism is not causing any issues. +However, there are some cases where deterministic results are desirable. +In these cases, try the following workarounds: + +1. Limit the number of threads to prevent non-determinism introduced by multi-threading. + + ```sql + SET threads = 1; + ``` + +2. Enforce ordering. For example, you can use the [`ORDER BY ALL` clause]({% link docs/archive/1.0/sql/query_syntax/orderby.md %}#order-by-all): + + ```sql + SELECT * + FROM ( + SELECT 'A' AS x + UNION + SELECT 'B' AS x + ) + ORDER BY ALL; + ``` + + You can also sort lists using [`list_sort`]({% link docs/archive/1.0/sql/functions/list.md %}#list_sortlist) + + ```sql + SELECT list_sort(array_distinct(['A', 'A', 'B', NULL, NULL])) AS i + ORDER BY i; + ``` + + It's also possible to introduce a [deterministic shuffling]({% post_url 2024-08-19-duckdb-tricks-part-1 %}#shuffling-data). \ No newline at end of file diff --git a/docs/archive/1.0/operations_manual/overview.md b/docs/archive/1.0/operations_manual/overview.md new file mode 100644 index 00000000000..1c9477f64b0 --- /dev/null +++ b/docs/archive/1.0/operations_manual/overview.md @@ -0,0 +1,10 @@ +--- +layout: docu +title: Overview +--- + +We designed DuckDB to be easy to deploy and operate. We believe that most users do not need to consult the pages of the operations manual. +However, there are certain setups – e.g., when DuckDB is running in mission-critical infrastructure – where we would like to offer advice on how to configure DuckDB. +The operations manual contains advice for these cases and also offers convenient configuration snippets such as Gitignore files. + +For advice on getting the best performance from DuckDB, see also the [Performance Guide]({% link docs/archive/1.0/guides/performance/overview.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/operations_manual/securing_duckdb/overview.md b/docs/archive/1.0/operations_manual/securing_duckdb/overview.md new file mode 100644 index 00000000000..66d6e922bc3 --- /dev/null +++ b/docs/archive/1.0/operations_manual/securing_duckdb/overview.md @@ -0,0 +1,92 @@ +--- +layout: docu +title: Securing DuckDB +--- + +DuckDB is quite powerful, which can be problematic, especially if untrusted SQL queries are run, e.g., from public-facing user inputs. +This page lists some options to restrict the potential fallout from malicious SQL queries. + +The approach to securing DuckDB varies depending on your use case, environment, and potential attack models. +Therefore, consider the security-related configuration options carefully, especially when working with confidential data sets. + +## Reporting Vulnerabilities + +If you discover a potential vulnerability, please [report it confidentially via GitHub](https://github.com/duckdb/duckdb/security/advisories/new). + +## Disabling File Access + +DuckDB can list directories and read arbitrary files via its CSV parser’s [`read_csv` function]({% link docs/archive/1.0/data/csv/overview.md %}) or read text via the [`read_text` function]({% link docs/archive/1.0/sql/functions/char.md %}#read_textsource). For example: + +```sql +SELECT * +FROM read_csv('/etc/passwd', sep = ':'); +``` + +This can be disabled either by disabling external access altogether (`enable_external_access`) or disabling individual file systems. 
For example: + +```sql +SET disabled_filesystems = 'LocalFileSystem'; +``` + +## Secrets + +[Secrets]({% link docs/archive/1.0/configuration/secrets_manager.md %}) are used to manage credentials to log into third party services like AWS or Azure. DuckDB can show a list of secrets using the `duckdb_secrets()` table function. This will redact any sensitive information such as security keys by default. The `allow_unredacted_secrets` option can be set to show all information contained within a security key. It is recommended not to turn on this option if you are running untrusted SQL input. + +Queries can access the secrets defined in the Secrets Manager. For example, if there is a secret defined to authenticate with a user, who has write privileges to a given AWS S3 bucket, queries may write to that bucket. This is applicable for both persistent and temporary secrets. + +[Persistent secrets]({% link docs/archive/1.0/configuration/secrets_manager.md %}#persistent-secrets) are stored in unencrypted binary format on the disk. These have the same permissions as SSH keys, `600`, i.e., only user who is running the DuckDB (parent) process can read and write them. + +## Locking Configurations + +Security-related configuration settings generally lock themselves for safety reasons. For example, while we can disable [community extensions]({% link docs/archive/1.0/extensions/community_extensions.md %}) using the `SET allow_community_extensions = false`, we cannot re-enable them again after the fact without restarting the database. Trying to do so will result in an error: + +```console +Invalid Input Error: Cannot upgrade allow_community_extensions setting while database is running +``` + +This prevents untrusted SQL input from re-enabling settings that were explicitly disabled for security reasons. + +Nevertheless, many configuration settings do not disable themselves, such as the resource constraints. If you allow users to run SQL statements unrestricted on your own hardware, it is recommended that you lock the configuration after your own configuration has finished using the following command: + +```sql +SET lock_configuration = true; +``` + +This prevents any configuration settings from being modified from that point onwards. + +## Constrain Resource Usage + +DuckDB can use quite a lot of CPU, RAM, and disk space. To avoid denial of service attacks, these resources can be limited. + +The number of CPU threads that DuckDB can use can be set using, for example: + +```sql +SET threads = 4; +``` + +Where 4 is the number of allowed threads. + +The maximum amount of memory (RAM) can also be limited, for example: + +```sql +SET memory_limit = '4GB'; +``` + +The size of the temporary file directory can be limited with: + +```sql +SET max_temp_directory_size = '4GB'; +``` + +## Extensions + +DuckDB has a powerful extension mechanism, which have the same privileges as the user running DuckDB's (parent) process. +This introduces security considerations. Therefore, we recommend reviewing the configuration options for [securing extensions]({% link docs/archive/1.0/operations_manual/securing_duckdb/securing_extensions.md %}). 
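Putting the options from this page together, a minimal lock-down sketch for a process that executes untrusted SQL could look as follows; the resource limits are placeholder values to adapt to your environment:

```sql
SET enable_external_access = false;      -- block reading and writing external files from SQL
SET allow_community_extensions = false;  -- allow only core extensions
SET threads = 4;                         -- cap CPU usage
SET memory_limit = '4GB';                -- cap RAM usage
SET max_temp_directory_size = '4GB';     -- cap temporary disk usage
SET lock_configuration = true;           -- run last: freezes all of the settings above
```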
+ +## Generic Solutions + +Securing DuckDB can also be supported via proven means, for example: + +* Scoping user privileges via [`chroot`](https://en.wikipedia.org/wiki/Chroot), relying on the operating system +* Containerization, e.g., Docker and Podman +* Running DuckDB in WebAssembly \ No newline at end of file diff --git a/docs/archive/1.0/operations_manual/securing_duckdb/securing_extensions.md b/docs/archive/1.0/operations_manual/securing_duckdb/securing_extensions.md new file mode 100644 index 00000000000..74b2734755b --- /dev/null +++ b/docs/archive/1.0/operations_manual/securing_duckdb/securing_extensions.md @@ -0,0 +1,78 @@ +--- +layout: docu +title: Securing Extensions +--- + +DuckDB has a powerful extension mechanism, which have the same privileges as the user running DuckDB's (parent) process. +This introduces security considerations. Therefore, we recommend reviewing the configuration options listed on this page and setting them according to your attack models. + +## DuckDB Signature Checks + +DuckDB extensions are checked on every load using the signature of the binaries. +There are currently three categories of extensions: + +* Signed with a `core` key. Only extensions vetted by the core DuckDB team are signed with these keys. +* Signed with a `community` key. These are open-source extensions distributed via the [DuckDB Community Extensions repository](https://community-extensions.duckdb.org/). +* Unsigned. + +## Overview of Security Levels for Extensions + +DuckDB offers the following security levels for extensions. + +| Usable extensions | Description | Configuration | +|-----|---|---| +| core | Extensions can only be installed from the `core` repository. | `SET allow_community_extensions = false` | +| core and community | Extensions can only be installed from the `core` and `community` repositories. | This is the default security level. | +| any extension incl. unsigned | Any extensions can be installed. | `SET allow_unsigned_extensions = true` | + +Security-related configuration settings [lock themselves]({% link docs/archive/1.0/operations_manual/securing_duckdb/overview.md %}#locking-configurations), i.e., it is only possible to restrict capabilities in the current process. + +For example, attempting the following configuration changes will result in an error: + +```sql +SET allow_community_extensions = false; +SET allow_community_extensions = true; +``` + +```console +Invalid Input Error: Cannot upgrade allow_community_extensions setting while database is running +``` + +## Community Extensions + +DuckDB has a [Community Extensions repository]({% link docs/archive/1.0/extensions/community_extensions.md %}), which allows convenient installation of third-party extensions. +Community extension repositories like pip or npm are essentially enabling remote code execution by design. This is less dramatic than it sounds. For better or worse, we are quite used to piping random scripts from the web into our shells, and routinely install a staggering amount of transitive dependencies without thinking twice. Some repositories like CRAN enforce a human inspection at some point, but that’s no guarantee for anything either. + +We’ve studied several different approaches to community extension repositories and have picked what we think is a sensible approach: we do not attempt to review the submissions, but require that the *source code of extensions is available*. We do take over the complete build, sign and distribution process. 
Note that this is a step up from pip and npm that allow uploading arbitrary binaries but a step down from reviewing everything manually. We allow users to [report malicious extensions](https://github.com/duckdb/community-extensions/security/advisories/new) and show adoption statistics like GitHub stars and download count. Because we manage the repository, we can remove problematic extensions from distribution quickly.
+
+Despite this, installing and loading DuckDB extensions from the community extension repository will execute code written by third party developers, and therefore *can* be dangerous. A malicious developer could create and register a harmless-looking DuckDB extension that steals your crypto coins. If you’re running a web service that executes untrusted SQL from users with DuckDB, it is probably a good idea to disable community extension installation and loading entirely. This can be done like so:
+
+```sql
+SET allow_community_extensions = false;
+```
+
+## Disabling Autoinstalling and Autoloading Known Extensions
+
+By default, DuckDB automatically installs and loads known extensions.
+
+To disable autoinstalling known extensions, run:
+
+```sql
+SET autoinstall_known_extensions = false;
+```
+
+To disable autoloading known extensions, run:
+
+```sql
+SET autoload_known_extensions = false;
+```
+
+To lock this configuration, use the [`lock_configuration` option]({% link docs/archive/1.0/operations_manual/securing_duckdb/overview.md %}#locking-configurations):
+
+```sql
+SET lock_configuration = true;
+```
+
+## Always Require Signed Extensions
+
+By default, DuckDB requires extensions to be either signed as core extensions (created by the DuckDB developers) or community extensions (created by third-party developers but distributed by the DuckDB developers). The `allow_unsigned_extensions` setting can be enabled on start-up to allow running extensions that are not signed at all. While useful for extension development, enabling this setting will allow DuckDB to load any extensions, which means more care must be taken to ensure malicious extensions are not loaded.
\ No newline at end of file
diff --git a/docs/archive/1.0/search.md b/docs/archive/1.0/search.md
new file mode 100644
index 00000000000..039d11a7cf4
--- /dev/null
+++ b/docs/archive/1.0/search.md
@@ -0,0 +1,19 @@
+---
+layout: docu
+title: Search
+---
+
+ + + \ No newline at end of file diff --git a/docs/archive/1.0/sitemap.md b/docs/archive/1.0/sitemap.md new file mode 100644 index 00000000000..a25eb59285f --- /dev/null +++ b/docs/archive/1.0/sitemap.md @@ -0,0 +1,6 @@ +--- +layout: docu +title: Sitemap +--- + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/constraints.md b/docs/archive/1.0/sql/constraints.md new file mode 100644 index 00000000000..b27b6b88f0e --- /dev/null +++ b/docs/archive/1.0/sql/constraints.md @@ -0,0 +1,112 @@ +--- +layout: docu +railroad: statements/constraints.js +title: Constraints +--- + +In SQL, constraints can be specified for tables. Constraints enforce certain properties over data that is inserted into a table. Constraints can be specified along with the schema of the table as part of the [`CREATE TABLE` statement]({% link docs/archive/1.0/sql/statements/create_table.md %}). In certain cases, constraints can also be added to a table using the [`ALTER TABLE` statement]({% link docs/archive/1.0/sql/statements/alter_table.md %}), but this is not currently supported for all constraints. + +> Warning Constraints have a strong impact on performance: they slow down loading and updates but speed up certain queries. Please consult the [Performance Guide]({% link docs/archive/1.0/guides/performance/schema.md %}#constraints) for details. + +## Syntax + +
+ +## Check Constraint + +Check constraints allow you to specify an arbitrary boolean expression. Any columns that *do not* satisfy this expression violate the constraint. For example, we could enforce that the `name` column does not contain spaces using the following `CHECK` constraint. + +```sql +CREATE TABLE students (name VARCHAR CHECK (NOT contains(name, ' '))); +INSERT INTO students VALUES ('this name contains spaces'); +``` + +```console +Constraint Error: CHECK constraint failed: students +``` + +## Not Null Constraint + +A not-null constraint specifies that the column cannot contain any `NULL` values. By default, all columns in tables are nullable. Adding `NOT NULL` to a column definition enforces that a column cannot contain `NULL` values. + +```sql +CREATE TABLE students (name VARCHAR NOT NULL); +INSERT INTO students VALUES (NULL); +``` + +```console +Constraint Error: NOT NULL constraint failed: students.name +``` + +## Primary Key and Unique Constraint + +Primary key or unique constraints define a column, or set of columns, that are a unique identifier for a row in the table. The constraint enforces that the specified columns are *unique* within a table, i.e., that at most one row contains the given values for the set of columns. + +```sql +CREATE TABLE students (id INTEGER PRIMARY KEY, name VARCHAR); +INSERT INTO students VALUES (1, 'Student 1'); +INSERT INTO students VALUES (1, 'Student 2'); +``` + +```console +Constraint Error: Duplicate key "id: 1" violates primary key constraint +``` + +```sql +CREATE TABLE students (id INTEGER, name VARCHAR, PRIMARY KEY (id, name)); +INSERT INTO students VALUES (1, 'Student 1'); +INSERT INTO students VALUES (1, 'Student 2'); +INSERT INTO students VALUES (1, 'Student 1'); +``` + +```console +Constraint Error: Duplicate key "id: 1, name: Student 1" violates primary key constraint +``` + +In order to enforce this property efficiently, an [ART index is automatically created]({% link docs/archive/1.0/sql/indexes.md %}) for every primary key or unique constraint that is defined in the table. + +Primary key constraints and unique constraints are identical except for two points: + +* A table can only have one primary key constraint defined, but many unique constraints +* A primary key constraint also enforces the keys to not be `NULL`. + +```sql +CREATE TABLE students(id INTEGER PRIMARY KEY, name VARCHAR, email VARCHAR UNIQUE); +INSERT INTO students VALUES (1, 'Student 1', 'student1@uni.com'); +INSERT INTO students values (2, 'Student 2', 'student1@uni.com'); +``` + +```console +Constraint Error: Duplicate key "email: student1@uni.com" violates unique constraint. +``` + +```sql +INSERT INTO students(id, name) VALUES (3, 'Student 3'); +INSERT INTO students(name, email) VALUES ('Student 3', 'student3@uni.com'); +``` + +```console +Constraint Error: NOT NULL constraint failed: students.id +``` + +> Warning Indexes have certain limitations that might result in constraints being evaluated too eagerly, leading to constraint errors such as `violates primary key constraint` and `violates unique constraint`. See the [indexes section for more details]({% link docs/archive/1.0/sql/indexes.md %}#index-limitations). + +## Foreign Keys + +Foreign keys define a column, or set of columns, that refer to a primary key or unique constraint from *another* table. The constraint enforces that the key exists in the other table. 
+ +```sql +CREATE TABLE students (id INTEGER PRIMARY KEY, name VARCHAR); +CREATE TABLE exams (exam_id INTEGER REFERENCES students(id), grade INTEGER); +INSERT INTO students VALUES (1, 'Student 1'); +INSERT INTO exams VALUES (1, 10); +INSERT INTO exams VALUES (2, 10); +``` + +```console +Constraint Error: Violates foreign key constraint because key "id: 2" does not exist in the referenced table +``` + +In order to enforce this property efficiently, an [ART index is automatically created]({% link docs/archive/1.0/sql/indexes.md %}) for every foreign key constraint that is defined in the table. + +> Warning Indexes have certain limitations that might result in constraints being evaluated too eagerly, leading to constraint errors such as `violates primary key constraint` and `violates unique constraint`. See the [indexes section for more details]({% link docs/archive/1.0/sql/indexes.md %}#index-limitations). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/array.md b/docs/archive/1.0/sql/data_types/array.md new file mode 100644 index 00000000000..0493f22aae3 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/array.md @@ -0,0 +1,123 @@ +--- +layout: docu +title: Array Type +--- + +An `ARRAY` column stores fixed-sized arrays. All fields in the column must have the same length and the same underlying type. Arrays are typically used to store arrays of numbers, but can contain any uniform data type, including `ARRAY`, [`LIST`]({% link docs/archive/1.0/sql/data_types/list.md %}) and [`STRUCT`]({% link docs/archive/1.0/sql/data_types/struct.md %}) types. + +Arrays can be used to store vectors such as [word embeddings](https://en.wikipedia.org/wiki/Word_embedding) or image embeddings. + +To store variable-length lists, use the [`LIST` type]({% link docs/archive/1.0/sql/data_types/list.md %}). See the [data types overview]({% link docs/archive/1.0/sql/data_types/overview.md %}) for a comparison between nested data types. + +> The `ARRAY` type in PostgreSQL allows variable-length fields. DuckDB's `ARRAY` type is fixed-length. + +## Creating Arrays + +Arrays can be created using the [`array_value(expr, ...)`]({% link docs/archive/1.0/sql/functions/nested.md %}#list-functions) function. + +Construct with the `array_value` function: + +```sql +SELECT array_value(1, 2, 3); +``` + +You can always implicitly cast an array to a list (and use list functions, like `list_extract`, `[i]`): + +```sql +SELECT array_value(1, 2, 3)[2]; +``` + +You can cast from a list to an array, but the dimensions have to match up!: + +```sql +SELECT [3, 2, 1]::INTEGER[3]; +``` + +Arrays can be nested: + +```sql +SELECT array_value(array_value(1, 2), array_value(3, 4), array_value(5, 6)); +``` + +Arrays can store structs: + +```sql +SELECT array_value({'a': 1, 'b': 2}, {'a': 3, 'b': 4}); +``` + +## Defining an Array Field + +Arrays can be created using the `⟨TYPE_NAME⟩[⟨LENGTH⟩]` syntax. For example, to create an array field for 3 integers, run: + +```sql +CREATE TABLE array_table (id INTEGER, arr INTEGER[3]); +INSERT INTO array_table VALUES (10, [1, 2, 3]), (20, [4, 5, 6]); +``` + +## Retrieving Values from Arrays + +Retrieving one or more values from an array can be accomplished using brackets and slicing notation, or through [list functions]({% link docs/archive/1.0/sql/functions/list.md %}#list-functions) like `list_extract` and `array_extract`. Using the example in [Defining an Array Field](#defining-an-array-field). 
+ +The following queries for extracting the second element of an array are equivalent: + +```sql +SELECT id, arr[1] AS element FROM array_table; +SELECT id, list_extract(arr, 1) AS element FROM array_table; +SELECT id, array_extract(arr, 1) AS element FROM array_table; +``` + +| id | element | +|---:|--------:| +| 10 | 1 | +| 20 | 4 | + +Using the slicing notation returns a `LIST`: + +```sql +SELECT id, arr[1:2] AS elements FROM array_table; +``` + +| id | elements | +|---:|----------| +| 10 | [1, 2] | +| 20 | [4, 5] | + +## Functions + +All [`LIST` functions]({% link docs/archive/1.0/sql/functions/nested.md %}#list-functions) work with the `ARRAY` type. Additionally, several `ARRAY`-native functions are also supported. +See the [`ARRAY` functions]({% link docs/archive/1.0/sql/functions/array.md %}#array-native-functions). + +## Examples + +Create sample data: + +```sql +CREATE TABLE x (i INTEGER, v FLOAT[3]); +CREATE TABLE y (i INTEGER, v FLOAT[3]); +INSERT INTO x VALUES (1, array_value(1.0::FLOAT, 2.0::FLOAT, 3.0::FLOAT)); +INSERT INTO y VALUES (1, array_value(2.0::FLOAT, 3.0::FLOAT, 4.0::FLOAT)); +``` + +Compute cross product: + +```sql +SELECT array_cross_product(x.v, y.v) +FROM x, y +WHERE x.i = y.i; +``` + +Compute cosine similarity: + +```sql +SELECT array_cosine_similarity(x.v, y.v) +FROM x, y +WHERE x.i = y.i; +``` + +## Ordering + +The ordering of `ARRAY` instances is defined using a lexicographical order. `NULL` values compare greater than all other values and are considered equal to each other. + +## See Also + +For more functions, see [List Functions]({% link docs/archive/1.0/sql/functions/nested.md %}#list-functions). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/bitstring.md b/docs/archive/1.0/sql/data_types/bitstring.md new file mode 100644 index 00000000000..522eed99f20 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/bitstring.md @@ -0,0 +1,56 @@ +--- +blurb: The bitstring type are strings of 1s and 0s. +layout: docu +title: Bitstring Type +--- + +
+ +| Name | Aliases | Description | +|:---|:---|:---| +| `BITSTRING` | `BIT` | variable-length strings of 1s and 0s | + +Bitstrings are strings of 1s and 0s. The bit type data is of variable length. A bitstring value requires 1 byte for each group of 8 bits, plus a fixed amount to store some metadata. + +By default bitstrings will not be padded with zeroes. +Bitstrings can be very large, having the same size restrictions as `BLOB`s. + +## Creating a Bitstring + +A string encoding a bitstring can be cast to a `BITSTRING`: + +```sql +SELECT '101010'::BITSTRING AS b; +``` + +
+ +| b | +|--------| +| 101010 | + +Create a `BITSTRING` with predefined length is possible with the `bitstring` function. The resulting bitstring will be left-padded with zeroes. + +```sql +SELECT bitstring('0101011', 12) AS b; +``` + +| b | +|--------------| +| 000000101011 | + +Numeric values (integer and float values) can also be converted to a `BITSTRING` via casting. For example: + +```sql +SELECT 123::BITSTRING AS b; +``` + +
+ +| b | +|----------------------------------| +| 00000000000000000000000001111011 | + +## Functions + +See [Bitstring Functions]({% link docs/archive/1.0/sql/functions/bitstring.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/blob.md b/docs/archive/1.0/sql/data_types/blob.md new file mode 100644 index 00000000000..b07c376702c --- /dev/null +++ b/docs/archive/1.0/sql/data_types/blob.md @@ -0,0 +1,38 @@ +--- +blurb: The blob (Binary Large OBject) type represents an arbitrary binary object stored + in the database system. +layout: docu +title: Blob Type +--- + +
+ +| Name | Aliases | Description | +|:---|:---|:---| +| `BLOB` | `BYTEA`, `BINARY`, `VARBINARY` | variable-length binary data | + +The blob (**B**inary **L**arge **OB**ject) type represents an arbitrary binary object stored in the database system. The blob type can contain any type of binary data with no restrictions. What the actual bytes represent is opaque to the database system. + +Create a `BLOB` value with a single byte (170): + +```sql +SELECT '\xAA'::BLOB; +``` + +Create a `BLOB` value with three bytes (170, 171, 172): + +```sql +SELECT '\xAA\xAB\xAC'::BLOB; +``` + +Create a `BLOB` value with two bytes (65, 66): + +```sql +SELECT 'AB'::BLOB; +``` + +Blobs are typically used to store non-textual objects that the database does not provide explicit support for, such as images. While blobs can hold objects up to 4 GB in size, typically it is not recommended to store very large objects within the database system. In many situations it is better to store the large file on the file system, and store the path to the file in the database system in a `VARCHAR` field. + +## Functions + +See [Blob Functions]({% link docs/archive/1.0/sql/functions/blob.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/boolean.md b/docs/archive/1.0/sql/data_types/boolean.md new file mode 100644 index 00000000000..38ee01ed206 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/boolean.md @@ -0,0 +1,68 @@ +--- +blurb: The BOOLEAN type represents a statement of truth (“true” or “false”). +layout: docu +title: Boolean Type +--- + +
+ +| Name | Aliases | Description | +|:---|:---|:---| +| `BOOLEAN` | `BOOL` | logical boolean (`true`/`false`) | + +The `BOOLEAN` type represents a statement of truth (“true” or “false”). In SQL, the `BOOLEAN` field can also have a third state “unknown” which is represented by the SQL `NULL` value. + +Select the three possible values of a `BOOLEAN` column: + +```sql +SELECT true, false, NULL::BOOLEAN; +``` + +Boolean values can be explicitly created using the literals `true` and `false`. However, they are most often created as a result of comparisons or conjunctions. For example, the comparison `i > 10` results in a boolean value. Boolean values can be used in the `WHERE` and `HAVING` clauses of a SQL statement to filter out tuples from the result. In this case, tuples for which the predicate evaluates to `true` will pass the filter, and tuples for which the predicate evaluates to `false` or `NULL` will be filtered out. Consider the following example: + +Create a table with the values 5, 15 and `NULL`: + +```sql +CREATE TABLE integers (i INTEGER); +INSERT INTO integers VALUES (5), (15), (NULL); +``` + +Select all entries where `i > 10`: + +```sql +SELECT * FROM integers WHERE i > 10; +``` + +In this case 5 and `NULL` are filtered out (`5 > 10` is `false` and `NULL > 10` is `NULL`): + +| i | +|---:| +| 15 | + +## Conjunctions + +The `AND`/`OR` conjunctions can be used to combine boolean values. + +Below is the truth table for the `AND` conjunction (i.e., `x AND y`). + +
+ +| X | X AND true | X AND false | X AND NULL | +|-------|-------|-------|-------| +| true | true | false | NULL | +| false | false | false | false | +| NULL | NULL | false | NULL | + +Below is the truth table for the `OR` conjunction (i.e., `x OR y`). + +
+ +| X | X OR true | X OR false | X OR NULL | +|-------|------|-------|------| +| true | true | true | true | +| false | true | false | NULL | +| NULL | true | NULL | NULL | + +## Expressions + +See [Logical Operators]({% link docs/archive/1.0/sql/expressions/logical_operators.md %}) and [Comparison Operators]({% link docs/archive/1.0/sql/expressions/comparison_operators.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/date.md b/docs/archive/1.0/sql/data_types/date.md new file mode 100644 index 00000000000..6678d3db507 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/date.md @@ -0,0 +1,47 @@ +--- +blurb: A date specifies a combination of year, month and day. +layout: docu +title: Date Types +--- + +

| Name | Aliases | Description |
|:-------|:--------|:--------------------------------|
| `DATE` | | calendar date (year, month, day) |

A date specifies a combination of year, month and day. DuckDB follows the SQL standard's lead by counting dates exclusively in the Gregorian calendar, even for years before that calendar was in use. Dates can be created using the `DATE` keyword, where the data must be formatted according to the ISO 8601 format (`YYYY-MM-DD`).

```sql
SELECT DATE '1992-09-20';
```

## Special Values

There are also three special date values that can be used on input:

+ +| Input string | Description | +|:-------------|:----------------------------------| +| epoch | 1970-01-01 (Unix system day zero) | +| infinity | later than all other dates | +| -infinity | earlier than all other dates | + +The values `infinity` and `-infinity` are specially represented inside the system and will be displayed unchanged; +but `epoch` is simply a notational shorthand that will be converted to the date value when read. + +```sql +SELECT + '-infinity'::DATE AS negative, + 'epoch'::DATE AS epoch, + 'infinity'::DATE AS positive; +``` + +| negative | epoch | positive | +|-----------|------------|----------| +| -infinity | 1970-01-01 | infinity | + +## Functions + +See [Date Functions]({% link docs/archive/1.0/sql/functions/date.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/enum.md b/docs/archive/1.0/sql/data_types/enum.md new file mode 100644 index 00000000000..37ef046c5f1 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/enum.md @@ -0,0 +1,228 @@ +--- +blurb: The Enum type represents a dictionary data structure with all possible unique + values of a column. +layout: docu +title: Enum Data Type +--- + +
+ +| Name | Description | +|:--|:-----| +| enum | Dictionary Encoding representing all possible string values of a column. | + +The enum type represents a dictionary data structure with all possible unique values of a column. For example, a column storing the days of the week can be an enum holding all possible days. Enums are particularly interesting for string columns with low cardinality (i.e., fewer distinct values). This is because the column only stores a numerical reference to the string in the enum dictionary, resulting in immense savings in disk storage and faster query performance. + +## Enum Definition + +Enum types are created from either a hardcoded set of values or from a select statement that returns a single column of `VARCHAR`s. The set of values in the select statement will be deduplicated, but if the enum is created from a hardcoded set there may not be any duplicates. + +Create enum using hardcoded values: + +```sql +CREATE TYPE ⟨enum_name⟩ AS ENUM ([⟨value_1⟩, ⟨value_2⟩,...]); +``` + +Create enum using a `SELECT` statement that returns a single column of `VARCHAR`s: + +```sql +CREATE TYPE ⟨enum_name⟩ AS ENUM (select_expression⟩); +``` + +For example: + +Creates new user defined type 'mood' as an enum: + +```sql +CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy'); +``` + +This will fail since the `mood` type already exists: + +```sql +CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy', 'anxious'); +``` + +This will fail since enums cannot hold `NULL` values: + +```sql +CREATE TYPE breed AS ENUM ('maltese', NULL); +``` + +This will fail since enum values must be unique: + +```sql +CREATE TYPE breed AS ENUM ('maltese', 'maltese'); +``` + +Create an enum from a select statement. First create an example table of values: + +```sql +CREATE TABLE my_inputs AS + SELECT 'duck' AS my_varchar UNION ALL + SELECT 'duck' AS my_varchar UNION ALL + SELECT 'goose' AS my_varchar; +``` + +Create an enum using the unique string values in the `my_varchar` column: + +```sql +CREATE TYPE birds AS ENUM (SELECT my_varchar FROM my_inputs); +``` + +Show the available values in the `birds` enum using the `enum_range` function: + +```sql +SELECT enum_range(NULL::birds) AS my_enum_range; +``` + +
+ +| my_enum_range | +|-----------------| +| `[duck, goose]` | + +## Enum Usage + +After an enum has been created, it can be used anywhere a standard built-in type is used. For example, we can create a table with a column that references the enum. + +Creates a table `person`, with attributes `name` (string type) and `current_mood` (mood type): + +```sql +CREATE TABLE person ( + name TEXT, + current_mood mood +); +``` + +Inserts tuples in the `person` table: + +```sql +INSERT INTO person +VALUES ('Pedro', 'happy'), ('Mark', NULL), ('Pagliacci', 'sad'), ('Mr. Mackey', 'ok'); +``` + +The following query will fail since the mood type does not have `quackity-quack` value. + +```sql +INSERT INTO person +VALUES ('Hannes', 'quackity-quack'); +``` + +The string `sad` is cast to the type `mood`, returning a numerical reference value. +This makes the comparison a numerical comparison instead of a string comparison. + +```sql +SELECT * +FROM person +WHERE current_mood = 'sad'; +``` + +| name | current_mood | +|-----------|--------------| +| Pagliacci | sad | + +If you are importing data from a file, you can create an enum for a `VARCHAR` column before importing. +Given this, the following subquery selects automatically selects only distinct values: + +```sql +CREATE TYPE mood AS ENUM (SELECT mood FROM 'path/to/file.csv'); +``` + +Then you can create a table with the enum type and import using any data import statement: + +```sql +CREATE TABLE person (name TEXT, current_mood mood); +COPY person FROM 'path/to/file.csv'; +``` + +## Enums vs. Strings + +DuckDB enums are automatically cast to `VARCHAR` types whenever necessary. This characteristic allows for enum columns to be used in any `VARCHAR` function. In addition, it also allows for comparisons between different enum columns, or an enum and a `VARCHAR` column. + +For example: + +Regexp_matches is a function that takes a VARCHAR, hence current_mood is cast to VARCHAR: + +```sql +SELECT regexp_matches(current_mood, '.*a.*') AS contains_a +FROM person; +``` + +| contains_a | +|:-----------| +| true | +| NULL | +| true | +| false | + +Create a new mood and table: + +```sql +CREATE TYPE new_mood AS ENUM ('happy', 'anxious'); +CREATE TABLE person_2 ( + name text, + current_mood mood, + future_mood new_mood, + past_mood VARCHAR +); +``` + +Since the `current_mood` and `future_mood` columns are constructed on different enum types, DuckDB will cast both enums to strings and perform a string comparison: + +```sql +SELECT * +FROM person_2 +WHERE current_mood = future_mood; +``` + +When comparing the `past_mood` column (string), DuckDB will cast the `current_mood` enum to `VARCHAR` and perform a string comparison: + +```sql +SELECT * +FROM person_2 +WHERE current_mood = past_mood; +``` + +## Enum Removal + +Enum types are stored in the catalog, and a catalog dependency is added to each table that uses them. It is possible to drop an enum from the catalog using the following command: + +```sql +DROP TYPE ⟨enum_name⟩; +``` + +Currently, it is possible to drop enums that are used in tables without affecting the tables. + +> Warning This behavior of the enum removal feature is subject to change. In future releases, it is expected that any dependent columns must be removed before dropping the enum, or the enum must be dropped with the additional `CASCADE` parameter. + +## Comparison of Enums + +Enum values are compared according to their order in the enum's definition. 
For example: + +```sql +CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy'); +``` + +```sql +SELECT 'sad'::mood < 'ok'::mood AS comp; +``` + +| comp | +|-----:| +| true | + +```sql +SELECT unnest(['ok'::mood, 'happy'::mood, 'sad'::mood]) AS m +ORDER BY m; +``` + +| m | +|-------| +| sad | +| ok | +| happy | + +## Functions + +See [Enum Functions]({% link docs/archive/1.0/sql/functions/enum.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/interval.md b/docs/archive/1.0/sql/data_types/interval.md new file mode 100644 index 00000000000..8afbbe8f59f --- /dev/null +++ b/docs/archive/1.0/sql/data_types/interval.md @@ -0,0 +1,130 @@ +--- +blurb: Intervals represent periods of time measured in months, days, microseconds, + or a combination thereof. +layout: docu +title: Interval Type +--- + +`INTERVAL`s represent periods of time that can be added to or subtracted from `DATE`, `TIMESTAMP`, `TIMESTAMPTZ`, or `TIME` values. + +
+ +| Name | Description | +|:---|:---| +| `INTERVAL` | Period of time | + +An `INTERVAL` can be constructed by providing amounts together with units. +Units that aren't *months*, *days*, or *microseconds* are converted to equivalent amounts in the next smaller of these three basis units. + +```sql +SELECT + INTERVAL 1 YEAR, -- single unit using YEAR keyword; stored as 12 months + INTERVAL (random() * 10) YEAR, -- parentheses necessary for variable amounts; + -- stored as integer number of months + INTERVAL '1 month 1 day', -- string type necessary for multiple units; stored as (1 month, 1 day) + '16 months'::INTERVAL, -- string cast supported; stored as 16 months + '48:00:00'::INTERVAL, -- HH::MM::SS string supported; stored as (48 * 60 * 60 * 1e6 microseconds) +; +``` +> Warning Decimal values can be used in strings but are rounded to integers. +> ```sql +> SELECT INTERVAL '1.5' YEARS; +> -- Returns 12 months; equivalent to `to_years(CAST(trunc(1.5) AS INTEGER))` +> ``` +> For more precision, use a more granular unit; e.g., `18 MONTHS` instead of `'1.5' YEARS`. + +Three basis units are necessary because a month does not correspond to a fixed amount of days (February has fewer days than March) and a day doesn't correspond to a fixed amount of microseconds. +The division into components makes the `INTERVAL` class suitable for adding or subtracting specific time units to a date. For example, we can generate a table with the first day of every month using the following SQL query: + +```sql +SELECT DATE '2000-01-01' + INTERVAL (i) MONTH +FROM range(12) t(i); +``` + +When `INTERVAL`s are deconstructed via the `datepart` function, the *months* component is additionally split into years and months, and the *microseconds* component is split into hours, minutes, and microseconds. The *days* component is not split into additional units. To demonstrate this, the following query generates an `INTERVAL` called `period` by summing random amounts of the three basis units. It then extracts the aforementioned six parts from `period`, adds them back together, and confirms that the result is always equal to the original `period`. + +```sql +SELECT + period = list_reduce( + [INTERVAL (datepart(part, period) || part) FOR part IN + ['year', 'month', 'day', 'hour', 'minute', 'microsecond'] + ], + (i1, i2) -> i1 + i2 + ) -- always true +FROM ( + VALUES ( + INTERVAL (random() * 123_456_789_123) MICROSECONDS + + INTERVAL (random() * 12_345) DAYS + + INTERVAL (random() * 12_345) MONTHS + ) +) _(period); +``` + +> Warning The *microseconds* component is split only into hours, minutes, and microseconds, rather than hours, minutes, *seconds*, and microseconds. + +Additionally, the amounts of centuries, decades, quarters, seconds, and milliseconds in an `INTERVAL`, rounded down to the nearest integer, can be extracted via the `datepart` function. However, these components are not required to reassemble the original `INTERVAL`. In fact, if the previous query additionally extracted decades or seconds, then the sum of extracted parts would generally be larger than the original `period` since this would double count the months and microseconds components, respectively. + +> All units use 0-based indexing, except for quarters, which use 1-based indexing. 
+ +For example: + +```sql +SELECT + datepart('decade', INTERVAL 12 YEARS), -- returns 1 + datepart('year', INTERVAL 12 YEARS), -- returns 12 + datepart('second', INTERVAL 1_234 MILLISECONDS), -- returns 1 + datepart('microsecond', INTERVAL 1_234 MILLISECONDS), -- returns 1_234_000 +``` + +## Arithmetic with Timestamps, Dates and Intervals + +`INTERVAL`s can be added to and subtracted from `TIMESTAMP`s, `TIMESTAMPTZ`s, `DATE`s, and `TIME`s using the `+` and `-` operators. + +```sql +SELECT + DATE '2000-01-01' + INTERVAL 1 YEAR, + TIMESTAMP '2000-01-01 01:33:30' - INTERVAL '1 month 13 hours', + TIME '02:00:00' - INTERVAL '3 days 23 hours', -- wraps; equals TIME '03:00:00' +; +``` + +Conversely, subtracting two `TIMESTAMP`s or two `TIMESTAMPTZ`s from one another creates an `INTERVAL` describing the difference between the timestamps with only the *days and microseconds* components. For example: + +```sql +SELECT + TIMESTAMP '2000-02-06 12:00:00' - TIMESTAMP '2000-01-01 11:00:00', -- 36 days 1 hour + TIMESTAMP '2000-02-01' + (TIMESTAMP '2000-02-01' - TIMESTAMP '2000-01-01'), -- '2000-03-03', NOT '2000-03-01' +; +``` + +Subtracting two `DATE`s from one another does not create an `INTERVAL` but rather returns the number of days between the given dates as integer value. + +> Warning Extracting a component of the `INTERVAL` difference between two `TIMESTAMP`s is not equivalent to computing the number of partition boundaries between the two `TIMESTAMP`s for the corresponding unit, as computed by the `datediff` function: +> ```sql +> SELECT +> datediff('day', TIMESTAMP '2020-01-01 01:00:00', TIMESTAMP '2020-01-02 00:00:00'), -- 1 +> datepart('day', TIMESTAMP '2020-01-02 00:00:00' - TIMESTAMP '2020-01-01 01:00:00'), -- 0 +> ; +> ``` + +## Equality and Comparison + +For equality and ordering comparisons only, the total number of microseconds in an `INTERVAL` is computed by converting the days basis unit to `24 * 60 * 60 * 1e6` microseconds and the months basis unit to 30 days, or `30 * 24 * 60 * 60 * 1e6` microseconds. + +As a result, `INTERVAL`s can compare equal even when they are functionally different, and the ordering of `INTERVAL`s is not always preserved when they are added to dates or timestamps. + +For example: + +* `INTERVAL 30 DAYS = INTERVAL 1 MONTH` +* but `DATE '2020-01-01' + INTERVAL 30 DAYS != DATE '2020-01-01' + INTERVAL 1 MONTH`. + +and + +* `INTERVAL '30 days 12 hours' > INTERVAL 1 MONTH` +* but `DATE '2020-01-01' + INTERVAL '30 days 12 hours' < DATE '2020-01-01' + INTERVAL 1 MONTH`. + +## Functions + +See the [Date Part Functions page]({% link docs/archive/1.0/sql/functions/datepart.md %}) for a list of available date parts for use with an `INTERVAL`. + +See the [Interval Operators page]({% link docs/archive/1.0/sql/functions/interval.md %}) for functions that operate on intervals. \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/list.md b/docs/archive/1.0/sql/data_types/list.md new file mode 100644 index 00000000000..227d20711d4 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/list.md @@ -0,0 +1,133 @@ +--- +layout: docu +title: List Type +--- + +A `LIST` column encodes lists of values. Fields in the column can have values with different lengths, but they must all have the same underlying type. `LIST`s are typically used to store arrays of numbers, but can contain any uniform data type, including other `LIST`s and `STRUCT`s. + +`LIST`s are similar to PostgreSQL's `ARRAY` type. 
DuckDB uses the `LIST` terminology, but some [array functions]({% link docs/archive/1.0/sql/functions/nested.md %}#list-functions) are provided for PostgreSQL compatibility. + +See the [data types overview]({% link docs/archive/1.0/sql/data_types/overview.md %}) for a comparison between nested data types. + +> For storing fixed-length lists, DuckDB uses the [`ARRAY` type]({% link docs/archive/1.0/sql/data_types/array.md %}). + +## Creating Lists + +Lists can be created using the [`list_value(expr, ...)`]({% link docs/archive/1.0/sql/functions/nested.md %}#list-functions) function or the equivalent bracket notation `[expr, ...]`. The expressions can be constants or arbitrary expressions. To create a list from a table column, use the [`list`]({% link docs/archive/1.0/sql/functions/aggregates.md %}#general-aggregate-functions) aggregate function. + +List of integers: + +```sql +SELECT [1, 2, 3]; +``` + +List of strings with a `NULL` value: + +```sql +SELECT ['duck', 'goose', NULL, 'heron']; +``` + +List of lists with `NULL` values: + +```sql +SELECT [['duck', 'goose', 'heron'], NULL, ['frog', 'toad'], []]; +``` + +Create a list with the list_value function: + +```sql +SELECT list_value(1, 2, 3); +``` + +Create a table with an `INTEGER` list column and a `VARCHAR` list column: + +```sql +CREATE TABLE list_table (int_list INTEGER[], varchar_list VARCHAR[]); +``` + +## Retrieving from Lists + +Retrieving one or more values from a list can be accomplished using brackets and slicing notation, or through [list functions]({% link docs/archive/1.0/sql/functions/nested.md %}#list-functions) like `list_extract`. Multiple equivalent functions are provided as aliases for compatibility with systems that refer to lists as arrays. For example, the function `array_slice`. + +> We wrap the list creation in parenthesis so that it happens first. +> This is only needed in our basic examples here, not when working with a list column. +> For example, this can't be parsed: `SELECT ['a', 'b', 'c'][1]`. + +
+ +| Example | Result | +|:-----------------------------------------|:-----------| +| SELECT (['a', 'b', 'c'])[3] | 'c' | +| SELECT (['a', 'b', 'c'])[-1] | 'c' | +| SELECT (['a', 'b', 'c'])[2 + 1] | 'c' | +| SELECT list_extract(['a', 'b', 'c'], 3) | 'c' | +| SELECT (['a', 'b', 'c'])[1:2] | ['a', 'b'] | +| SELECT (['a', 'b', 'c'])[:2] | ['a', 'b'] | +| SELECT (['a', 'b', 'c'])[-2:] | ['b', 'c'] | +| SELECT list_slice(['a', 'b', 'c'], 2, 3) | ['b', 'c'] | + +## Comparison and Ordering + +The `LIST` type can be compared using all the [comparison operators]({% link docs/archive/1.0/sql/expressions/comparison_operators.md %}). +These comparisons can be used in [logical expressions]({% link docs/archive/1.0/sql/expressions/logical_operators.md %}) +such as `WHERE` and `HAVING` clauses, and return [`BOOLEAN` values]({% link docs/archive/1.0/sql/data_types/boolean.md %}). + +The `LIST` ordering is defined positionally using the following rules, where `min_len = min(len(l1), len(l2))`. + +* **Equality.** `l1` and `l2` are equal, if for each `i` in `[1, min_len]`: `l1[i] = l2[i]`. +* **Less Than**. For the first index `i` in `[1, min_len]` where `l1[i] != l2[i]`: + If `l1[i] < l2[i]`, `l1` is less than `l2`. + +`NULL` values are compared following PostgreSQL's semantics. +Lower nesting levels are used for tie-breaking. + +Here are some queries returning `true` for the comparison. + +```sql +SELECT [1, 2] < [1, 3] AS result; +``` + +```sql +SELECT [[1], [2, 4, 5]] < [[2]] AS result; +``` + +```sql +SELECT [ ] < [1] AS result; +``` + +These queries return `false`. + +```sql +SELECT [ ] < [ ] AS result; +``` + +```sql +SELECT [1, 2] < [1] AS result; +``` + +These queries return `NULL`. + +```sql +SELECT [1, 2] < [1, NULL, 4] AS result; +``` + +## Updating Lists + +Updates on lists are internally represented as an insert and a delete operation. +Therefore, updating list values may lead to a duplicate key error on primary/unique keys. +See the following example: + +```sql +CREATE TABLE tbl (id INTEGER PRIMARY KEY, lst INTEGER[], comment VARCHAR); +INSERT INTO tbl VALUES (1, [12, 34], 'asd'); +UPDATE tbl SET lst = [56, 78] WHERE id = 1; +``` + +```console +Constraint Error: Duplicate key "id: 1" violates primary key constraint. +If this is an unexpected constraint violation please double check with the known index limitations section in our documentation (https://duckdb.org/docs/sql/indexes). +``` + +## Functions + +See [Nested Functions]({% link docs/archive/1.0/sql/functions/nested.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/literal_types.md b/docs/archive/1.0/sql/data_types/literal_types.md new file mode 100644 index 00000000000..c61d6a146d0 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/literal_types.md @@ -0,0 +1,186 @@ +--- +layout: docu +title: Literal Types +--- + +DuckDB has special literal types for representing `NULL`, integer and string literals in queries. These have their own binding and conversion rules. + +> Prior to DuckDB version 0.10.0, integer and string literals behaved identically to the `INTEGER` and `VARCHAR` types. + +## Null Literals + +The `NULL` literal is denoted with the keyword `NULL`. The `NULL` literal can be implicitly converted to any other type. + +## Integer Literals + +Integer literals are denoted as a sequence of one or more digits. At runtime, these result in values of the `INTEGER_LITERAL` type. 
`INTEGER_LITERAL` types can be implicitly converted to any [integer type]({% link docs/archive/1.0/sql/data_types/numeric.md %}#integer-types) in which the value fits. For example, the integer literal `42` can be implicitly converted to a `TINYINT`, but the integer literal `1000` cannot be. + +## Other Numeric Literals + +Non-integer numeric literals can be denoted with decimal notation, using the period character (`.`) to separate the integer part and the decimal part of the number. +Either the integer part or the decimal part may be omitted: + +```sql +SELECT 1.5; -- 1.5 +SELECT .50; -- 0.5 +SELECT 2.; -- 2.0 +``` + +Non-integer numeric literals can also be denoted using [_E notation_](https://en.wikipedia.org/wiki/Scientific_notation#E_notation). In E notation, an integer or decimal literal is followed by and exponential part, which is denoted by `e` or `E`, followed by a literal integer indicating the exponent. +The exponential part indicates that the preceding value should be multiplied by 10 raised to the power of the exponent: + +```sql +SELECT 1e2; -- 100 +SELECT 6.02214e23; -- Avogadro's constant +SELECT 1e-10; -- 1 ångström +``` + +## Underscores in Numeric Literals + +DuckDB's SQL dialect allows using the underscore character `_` in numeric literals as an optional separator. The rules for using underscores are as follows: + +* Underscores are allowed in integer, decimal, hexadecimal and binary notation. +* Underscores can not be the first or last character in a literal. +* Underscores have to have an integer/numeric part on either side of them, i.e., there can not be multiple underscores in a row and not immediately before/after a decimal or exponent. + +Examples: + +```sql +SELECT 100_000_000; -- 100000000 +SELECT '0xFF_FF'::INTEGER; -- 65535 +SELECT 1_2.1_2E0_1; -- 121.2 +SELECT '0b0_1_0_1'::INTEGER; -- 5 +``` + +## String Literals + +String literals are delimited using single quotes (`'`, apostrophe) and result in `STRING_LITERAL` values. +Note that double quotes (`"`) cannot be used as string delimiter character: instead, double quotes are used to delimit [quoted identifiers]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#identifiers). + +### Implicit String Literal Concatenation + +Consecutive single-quoted string literals separated only by whitespace that contains at least one newline are implicitly concatenated: + +```sql +SELECT 'Hello' + ' ' + 'World' AS greeting; +``` + +is equivalent to: + +```sql +SELECT 'Hello' + || ' ' + || 'World' AS greeting; +``` + +They both return the following result: + +| greeting | +|-------------| +| Hello World | + +Note that implicit concatenation only works if there is at least one newline between the literals. Using adjacent string literals separated by whitespace without a newline results in a syntax error: + +```sql +SELECT 'Hello' ' ' 'World' AS greeting; +``` + +```console +Parser Error: syntax error at or near "' '" +LINE 1: SELECT 'Hello' ' ' 'World' as greeting; +``` + +Also note that implicit concatenation only works with single-quoted string literals, and does not work with other kinds of string values. + +### Implicit string conversion + +`STRING_LITERAL` instances can be implicitly converted to _any_ other type. + +For example, we can compare string literals with dates: + +```sql +SELECT d > '1992-01-01' AS result FROM (VALUES (DATE '1992-01-01')) t(d); +``` + +| result | +|:-------| +| false | + +However, we cannot compare `VARCHAR` values with dates. 
+ +```sql +SELECT d > '1992-01-01'::VARCHAR FROM (VALUES (DATE '1992-01-01')) t(d); +``` + +```console +Binder Error: Cannot compare values of type DATE and type VARCHAR - an explicit cast is required +``` + +### Escape String Literals + +To escape a single quote (apostrophe) character in a string literal, use `''`. For example, `SELECT '''' AS s` returns `'`. + +To include special characters such as newline, use `E` escape the string. Both the uppercase (`E'...'`) and lowercase variants (`e'...'`) work. + +```sql +SELECT E'Hello\nworld' AS msg; +``` + +Or: + +```sql +SELECT e'Hello\nworld' AS msg; +``` + + + +```text +┌──────────────┐ +│ msg │ +│ varchar │ +├──────────────┤ +│ Hello\nworld │ +└──────────────┘ +``` + +The following backslash escape sequences are supported: + +| Escape sequence | Name | ASCII code | +|:--|:--|--:| +| `\b` | backspace | 8 | +| `\f` | form feed | 12 | +| `\n` | newline | 10 | +| `\r` | carriage return | 13 | +| `\t` | tab | 9 | + +### Dollar-Quoted String Literals + +DuckDB supports dollar-quoted string literals, which are surrounded by double-dollar symbols (`$$`): + +```sql +SELECT $$Hello +world$$ AS msg; +``` + + + +```text +┌──────────────┐ +│ msg │ +│ varchar │ +├──────────────┤ +│ Hello\nworld │ +└──────────────┘ +``` + +```sql +SELECT $$The price is $9.95$$ AS msg; +``` + +| msg | +|--------------------| +| The price is $9.95 | + +[Implicit concatenation](#implicit-string-literal-concatenation) only works for single-quoted string literals, not with dollar-quoted ones. \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/map.md b/docs/archive/1.0/sql/data_types/map.md new file mode 100644 index 00000000000..755975bd43d --- /dev/null +++ b/docs/archive/1.0/sql/data_types/map.md @@ -0,0 +1,96 @@ +--- +layout: docu +title: Map Type +--- + +`MAP`s are similar to `STRUCT`s in that they are an ordered list of “entries” where a key maps to a value. However, `MAP`s do not need to have the same keys present for each row, and thus are suitable for other use cases. `MAP`s are useful when the schema is unknown beforehand or when the schema varies per row; their flexibility is a key differentiator. + +`MAP`s must have a single type for all keys, and a single type for all values. Keys and values can be any type, and the type of the keys does not need to match the type of the values (Ex: a `MAP` of `VARCHAR` to `INT` is valid). `MAP`s may not have duplicate keys. `MAP`s return an empty list if a key is not found rather than throwing an error as structs do. + +In contrast, `STRUCT`s must have string keys, but each key may have a value of a different type. See the [data types overview]({% link docs/archive/1.0/sql/data_types/overview.md %}) for a comparison between nested data types. + +To construct a `MAP`, use the bracket syntax preceded by the `MAP` keyword. + +## Creating Maps + +A map with `VARCHAR` keys and `INTEGER` values. This returns `{key1=10, key2=20, key3=30}`: + +```sql +SELECT MAP {'key1': 10, 'key2': 20, 'key3': 30}; +``` + +Alternatively use the map_from_entries function. This returns `{key1=10, key2=20, key3=30}`: + +```sql +SELECT map_from_entries([('key1', 10), ('key2', 20), ('key3', 30)]); +``` + +A map can be also created using two lists: keys and values. This returns `{key1=10, key2=20, key3=30}`: + +```sql +SELECT MAP(['key1', 'key2', 'key3'], [10, 20, 30]); +``` + +A map can also use INTEGER keys and NUMERIC values. 
This returns `{1=42.001, 5=-32.100}`: + +```sql +SELECT MAP {1: 42.001, 5: -32.1}; +``` + +Keys and/or values can also be nested types. This returns `{[a, b]=[1.1, 2.2], [c, d]=[3.3, 4.4]}`: + +```sql +SELECT MAP {['a', 'b']: [1.1, 2.2], ['c', 'd']: [3.3, 4.4]}; +``` + +Create a table with a map column that has INTEGER keys and DOUBLE values: + +```sql +CREATE TABLE tbl (col MAP(INTEGER, DOUBLE)); +``` + +## Retrieving from Maps + +`MAP`s use bracket notation for retrieving values. Selecting from a `MAP` returns a `LIST` rather than an individual value, with an empty `LIST` meaning that the key was not found. + +Use bracket notation to retrieve a list containing the value at a key's location. This returns `[5]`. Note that the expression in bracket notation must match the type of the map's key: + +```sql +SELECT MAP {'key1': 5, 'key2': 43}['key1']; +``` + +To retrieve the underlying value, use list selection syntax to grab the first element. This returns `5`: + +```sql +SELECT MAP {'key1': 5, 'key2': 43}['key1'][1]; +``` + +If the element is not in the map, an empty list will be returned. This returns `[]`. Note that the expression in bracket notation must match the type of the map's key else an error is returned: + +```sql +SELECT MAP {'key1': 5, 'key2': 43}['key3']; +``` + +The element_at function can also be used to retrieve a map value. This returns `[5]`: + +```sql +SELECT element_at(MAP {'key1': 5, 'key2': 43}, 'key1'); +``` + +## Comparison Operators + +Nested types can be compared using all the [comparison operators]({% link docs/archive/1.0/sql/expressions/comparison_operators.md %}). +These comparisons can be used in [logical expressions]({% link docs/archive/1.0/sql/expressions/logical_operators.md %}) +for both `WHERE` and `HAVING` clauses, as well as for creating [Boolean values]({% link docs/archive/1.0/sql/data_types/boolean.md %}). + +The ordering is defined positionally in the same way that words can be ordered in a dictionary. +`NULL` values compare greater than all other values and are considered equal to each other. + +At the top level, `NULL` nested values obey standard SQL `NULL` comparison rules: +comparing a `NULL` nested value to a non-`NULL` nested value produces a `NULL` result. +Comparing nested value _members_, however, uses the internal nested value rules for `NULL`s, +and a `NULL` nested value member will compare above a non-`NULL` nested value member. + +## Functions + +See [Nested Functions]({% link docs/archive/1.0/sql/functions/nested.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/nulls.md b/docs/archive/1.0/sql/data_types/nulls.md new file mode 100644 index 00000000000..d8daa44b43c --- /dev/null +++ b/docs/archive/1.0/sql/data_types/nulls.md @@ -0,0 +1,131 @@ +--- +blurb: The NULL value represents a missing value. +layout: docu +title: NULL Values +--- + +`NULL` values are special values that are used to represent missing data in SQL. Columns of any type can contain `NULL` values. Logically, a `NULL` value can be seen as “the value of this field is unknown”. + +A `NULL` value can be inserted to any field that does not have the `NOT NULL` qualifier: + +```sql +CREATE TABLE integers (i INTEGER); +INSERT INTO integers VALUES (NULL); +``` + +`NULL` values have special semantics in many parts of the query as well as in many functions: + +> Any comparison with a `NULL` value returns `NULL`, including `NULL = NULL`. 
+ +You can use `IS NOT DISTINCT FROM` to perform an equality comparison where `NULL` values compare equal to each other. Use `IS (NOT) NULL` to check if a value is `NULL`. + +```sql +SELECT NULL = NULL; +``` + +```text +NULL +``` + +```sql +SELECT NULL IS NOT DISTINCT FROM NULL; +``` + +```text +true +``` + +```sql +SELECT NULL IS NULL; +``` + +```text +true +``` + +## NULL and Functions + +A function that has input argument as `NULL` **usually** returns `NULL`. + +```sql +SELECT cos(NULL); +``` + +```text +NULL +``` + +The `coalesce` function is an exception to this: it takes any number of arguments, and returns for each row the first argument that is not `NULL`. If all arguments are `NULL`, `coalesce` also returns `NULL`. + +```sql +SELECT coalesce(NULL, NULL, 1); +``` + +```text +1 +``` + +```sql +SELECT coalesce(10, 20); +``` + +```text +10 +``` + +```sql +SELECT coalesce(NULL, NULL); +``` + +```text +NULL +``` + +The `ifnull` function is a two-argument version of `coalesce`. + +```sql +SELECT ifnull(NULL, 'default_string'); +``` + +```text +default_string +``` + +```sql +SELECT ifnull(1, 'default_string'); +``` + +```text +1 +``` + +## `NULL` and Conjunctions + +`NULL` values have special semantics in `AND`/`OR` conjunctions. For the ternary logic truth tables, see the [Boolean Type documentation]({% link docs/archive/1.0/sql/data_types/boolean.md %}). + +## `NULL` and Aggregate Functions + +`NULL` values are ignored in most aggregate functions. + +Aggregate functions that do not ignore `NULL` values include: `first`, `last`, `list`, and `array_agg`. To exclude `NULL` values from those aggregate functions, the [`FILTER` clause]({% link docs/archive/1.0/sql/query_syntax/filter.md %}) can be used. + +```sql +CREATE TABLE integers (i INTEGER); +INSERT INTO integers VALUES (1), (10), (NULL); +``` + +```sql +SELECT min(i) FROM integers; +``` + +```text +1 +``` + +```sql +SELECT max(i) FROM integers; +``` + +```text +10 +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/numeric.md b/docs/archive/1.0/sql/data_types/numeric.md new file mode 100644 index 00000000000..567e5bd271e --- /dev/null +++ b/docs/archive/1.0/sql/data_types/numeric.md @@ -0,0 +1,114 @@ +--- +blurb: Numeric types are used to store numbers, and come in different shapes and sizes. +layout: docu +title: Numeric Types +--- + +## Integer Types + +The types `TINYINT`, `SMALLINT`, `INTEGER`, `BIGINT` and `HUGEINT` store whole numbers, that is, numbers without fractional components, of various ranges. Attempts to store values outside of the allowed range will result in an error. +The types `UTINYINT`, `USMALLINT`, `UINTEGER`, `UBIGINT` and `UHUGEINT` store whole unsigned numbers. Attempts to store negative numbers or values outside of the allowed range will result in an error. + +| Name | Aliases | Min | Max | +|:--|:--|----:|----:| +| `TINYINT` | `INT1` | -128 | 127 | +| `SMALLINT` | `INT2`, `SHORT` | -32768 | 32767 | +| `INTEGER` | `INT4`, `INT`, `SIGNED` | -2147483648 | 2147483647 | +| `BIGINT` | `INT8`, `LONG` | -9223372036854775808 | 9223372036854775807 | +| `HUGEINT` | - | -170141183460469231731687303715884105728 | 170141183460469231731687303715884105727 | +| `UTINYINT` | - | 0 | 255 | +| `USMALLINT` | -| 0 | 65535 | +| `UINTEGER` | - | 0 | 4294967295 | +| `UBIGINT` | - | 0 | 18446744073709551615 | +| `UHUGEINT` | - | 0 | 340282366920938463463374607431768211455 | + +The type integer is the common choice, as it offers the best balance between range, storage size, and performance. 
The `SMALLINT` type is generally only used if disk space is at a premium. The `BIGINT` and `HUGEINT` types are designed to be used when the range of the integer type is insufficient.

## Fixed-Point Decimals

The data type `DECIMAL(WIDTH, SCALE)` (also available under the alias `NUMERIC(WIDTH, SCALE)`) represents an exact fixed-point decimal value. When creating a value of type `DECIMAL`, the `WIDTH` and `SCALE` can be specified to define the size of decimal values that can be held in the field. The `WIDTH` field determines how many digits can be held in total, and the `SCALE` determines the number of digits after the decimal point. For example, the type `DECIMAL(3, 2)` can fit the value `1.23`, but cannot fit the value `12.3` or the value `1.234`. If no `WIDTH` and `SCALE` are specified, the type defaults to `DECIMAL(18, 3)`.

Internally, decimals are represented as integers depending on their specified width.

+ +| Width | Internal | Size (bytes) | +|:---|:---|---:| +| 1-4 | `INT16` | 2 | +| 5-9 | `INT32` | 4 | +| 10-18 | `INT64` | 8 | +| 19-38 | `INT128` | 16 | + +Performance can be impacted by using too large decimals when not required. In particular decimal values with a width above 19 are slow, as arithmetic involving the `INT128` type is much more expensive than operations involving the `INT32` or `INT64` types. It is therefore recommended to stick with a width of `18` or below, unless there is a good reason for why this is insufficient. + +## Floating-Point Types + +The data types `FLOAT` and `DOUBLE` precision are variable-precision numeric types. In practice, these types are usually implementations of IEEE Standard 754 for Binary Floating-Point Arithmetic (single and double precision, respectively), to the extent that the underlying processor, operating system, and compiler support it. + +

| Name | Aliases | Description |
|:--|:--|:--------|
| `FLOAT` | `FLOAT4`, `REAL` | single precision floating-point number (4 bytes) |
| `DOUBLE` | `FLOAT8` | double precision floating-point number (8 bytes) |

Like for fixed-point data types, conversion from literals or casts from other data types to floating-point types stores inputs that cannot be represented exactly as approximations. However, it can be harder to predict what inputs are affected by this. For example, it is not surprising that `1.3::DECIMAL(1, 0) - 0.7::DECIMAL(1, 0) != 0.6::DECIMAL(1, 0)` but it may be surprising that `1.3::FLOAT - 0.7::FLOAT != 0.6::FLOAT`.

Additionally, whereas multiplication, addition, and subtraction of fixed-point decimal data types is exact, these operations are only approximate on floating-point binary data types.

For more complex mathematical operations, however, floating-point arithmetic is used internally and more precise results can be obtained if intermediate steps are _not_ cast to fixed-point formats of the same width as in- and outputs. For example, `(10::FLOAT / 3::FLOAT)::FLOAT * 3 = 10` whereas `(10::DECIMAL(18, 3) / 3::DECIMAL(18, 3))::DECIMAL(18, 3) * 3 = 9.999`.

In general, we advise that:

* If you require exact storage of numbers with a known number of decimal digits and require exact additions, subtractions, and multiplications (such as for monetary amounts), use the [`DECIMAL` data type](#fixed-point-decimals) or its `NUMERIC` alias instead.
* If you want to do fast or complicated calculations, the floating-point data types may be more appropriate. However, if you use the results for anything important, you should evaluate your implementation carefully for corner cases (ranges, infinities, underflows, invalid operations) that may be handled differently from what you expect, and you should familiarize yourself with common floating-point pitfalls. The article [“What Every Computer Scientist Should Know About Floating-Point Arithmetic” by David Goldberg](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) and [the floating point series on Bruce Dawson's blog](https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/) provide excellent starting points.

On most platforms, the `FLOAT` type has a range of at least 1E-37 to 1E+37 with a precision of at least 6 decimal digits. The `DOUBLE` type typically has a range of around 1E-307 to 1E+308 with a precision of at least 15 digits. Positive numbers outside of these ranges (and negative numbers outside the mirrored ranges) may cause errors on some platforms but will usually be converted to zero or infinity, respectively.

In addition to ordinary numeric values, the floating-point types have several special values representing IEEE 754 special values:

* `Infinity`: infinity
* `-Infinity`: negative infinity
* `NaN`: not a number

> On a machine whose floating-point arithmetic does not follow IEEE 754, these values will probably not work as expected.

When writing these values as constants in a SQL command, you must put quotes around them, for example:

```sql
UPDATE tbl
SET x = '-Infinity';
```

On input, these strings are recognized in a case-insensitive manner.

### Floating-Point Arithmetic

DuckDB and PostgreSQL handle division by zero on floating-point values differently from the [IEEE Standard for Floating-Point Arithmetic (IEEE 754)](https://en.wikipedia.org/wiki/IEEE_754).
To show the differences, run the following SQL queries: + +```sql +SELECT 1.0 / 0.0 AS x; +SELECT 0.0 / 0.0 AS x; +SELECT -1.0 / 0.0 AS x; +``` + +
+ +| Expression | DuckDB | IEEE 754 | +| :--------- | -----: | --------: | +| 1.0 / 0.0 | NULL | Infinity | +| 0.0 / 0.0 | NULL | Nan | +| -1.0 / 0.0 | NULL | -Infinity | + +To see the differences between DuckDB and PostgreSQL, see the [PostgreSQL Compatibility page]({% link docs/archive/1.0/sql/dialect/postgresql_compatibility.md %}). + +## Universally Unique Identifiers (`UUID`s) + +DuckDB supports universally unique identifiers (UUIDs) through the `UUID` type. These use 128 bits and are represented internally as `HUGEINT` values. +When printed, they are shown with hexadecimal characters, separated by dashes as follows: `⟨8 characters⟩-⟨4 characters⟩-⟨4 characters⟩-⟨4 characters⟩-⟨12 characters⟩` (using 36 characters in total). For example, `4ac7a9e9-607c-4c8a-84f3-843f0191e3fd` is a valid UUID. + +To generate a new UUID, use the [`uuid()` utility function]({% link docs/archive/1.0/sql/functions/utility.md %}#utility-functions). + +## Functions + +See [Numeric Functions and Operators]({% link docs/archive/1.0/sql/functions/numeric.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/overview.md b/docs/archive/1.0/sql/data_types/overview.md new file mode 100644 index 00000000000..9067b84b999 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/overview.md @@ -0,0 +1,96 @@ +--- +blurb: The table below shows all the built-in general-purpose data types. +layout: docu +title: Data Types +--- + +## General-Purpose Data Types + +The table below shows all the built-in general-purpose data types. The alternatives listed in the aliases column can be used to refer to these types as well, however, note that the aliases are not part of the SQL standard and hence might not be accepted by other database engines. + +| Name | Aliases | Description | +|:--|:--|:----| +| `BIGINT` | `INT8`, `LONG` | signed eight-byte integer | +| `BIT` | `BITSTRING` | string of 1s and 0s | +| `BLOB` | `BYTEA`, `BINARY,` `VARBINARY` | variable-length binary data | +| `BOOLEAN` | `BOOL`, `LOGICAL` | logical boolean (true/false) | +| `DATE` | | calendar date (year, month day) | +| `DECIMAL(prec, scale)` | `NUMERIC(prec, scale)` | fixed-precision number with the given width (precision) and scale, defaults to `prec = 18` and `scale = 3` | +| `DOUBLE` | `FLOAT8`, | double precision floating-point number (8 bytes) | +| `FLOAT` | `FLOAT4`, `REAL` | single precision floating-point number (4 bytes)| +| `HUGEINT` | | signed sixteen-byte integer| +| `INTEGER` | `INT4`, `INT`, `SIGNED` | signed four-byte integer | +| `INTERVAL` | | date / time delta | +| `JSON` | | JSON object (via the [`json` extension]({% link docs/archive/1.0/extensions/json.md %})) | +| `SMALLINT` | `INT2`, `SHORT` | signed two-byte integer| +| `TIME` | | time of day (no time zone) | +| `TIMESTAMP WITH TIME ZONE` | `TIMESTAMPTZ` | combination of time and date that uses the current time zone | +| `TIMESTAMP` | `DATETIME` | combination of time and date | +| `TINYINT` | `INT1` | signed one-byte integer| +| `UBIGINT` | | unsigned eight-byte integer | +| `UHUGEINT` | | unsigned sixteen-byte integer | +| `UINTEGER` | | unsigned four-byte integer | +| `USMALLINT` | | unsigned two-byte integer | +| `UTINYINT` | | unsigned one-byte integer | +| `UUID` | | UUID data type | +| `VARCHAR` | `CHAR`, `BPCHAR`, `TEXT`, `STRING` | variable-length character string | + +Implicit and explicit typecasting is possible between numerous types, see the [Typecasting]({% link docs/archive/1.0/sql/data_types/typecasting.md %}) page for details. 
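
As a minimal sketch of the difference (the [Typecasting]({% link docs/archive/1.0/sql/data_types/typecasting.md %}) page remains the authoritative reference): an explicit cast is requested with `CAST` or the `::` shorthand, while an implicit cast is applied automatically when an expression requires it.

```sql
-- Explicit casts: CAST(...) and the :: shorthand request a conversion directly
SELECT CAST(42 AS VARCHAR) AS s, 42::DOUBLE AS d;

-- Implicit cast: the INTEGER value 42 is converted automatically so that it
-- can be concatenated with the VARCHAR value
SELECT 'answer: ' || 42 AS msg;
```
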
+ +## Nested / Composite Types + +DuckDB supports five nested data types: `ARRAY`, `LIST`, `MAP`, `STRUCT`, and `UNION`. Each supports different use cases and has a different structure. + +| Name | Description | Rules when used in a column | Build from values | Define in DDL/CREATE | +|:-|:---|:---|:--|:--| +| [`ARRAY`]({% link docs/archive/1.0/sql/data_types/array.md %}) | An ordered, fixed-length sequence of data values of the same type. | Each row must have the same data type within each instance of the `ARRAY` and the same number of elements. | `[1, 2, 3]` | `INTEGER[3]` | +| [`LIST`]({% link docs/archive/1.0/sql/data_types/list.md %}) | An ordered sequence of data values of the same type. | Each row must have the same data type within each instance of the `LIST`, but can have any number of elements. | `[1, 2, 3]` | `INTEGER[]` | +| [`MAP`]({% link docs/archive/1.0/sql/data_types/map.md %}) | A dictionary of multiple named values, each key having the same type and each value having the same type. Keys and values can be any type and can be different types from one another. | Rows may have different keys. | `map([1, 2], ['a', 'b'])` | `MAP(INTEGER, VARCHAR)` | +| [`STRUCT`]({% link docs/archive/1.0/sql/data_types/struct.md %}) | A dictionary of multiple named values, where each key is a string, but the value can be a different type for each key. | Each row must have the same keys. | `{'i': 42, 'j': 'a'}` | `STRUCT(i INTEGER, j VARCHAR)` | +| [`UNION`]({% link docs/archive/1.0/sql/data_types/union.md %}) | A union of multiple alternative data types, storing one of them in each value at a time. A union also contains a discriminator “tag” value to inspect and access the currently set member type. | Rows may be set to different member types of the union. | `union_value(num := 2)` | `UNION(num INTEGER, text VARCHAR)` | + +### Updating Values of Nested Types + +When performing _updates_ on values of nested types, DuckDB performs a _delete_ operation followed by an _insert_ operation. +When used in a table with ART indexes (either via explicit indexes or primary keys/unique constraints), this can lead to [unexpected constraint violations]({% link docs/archive/1.0/sql/indexes.md %}#over-eager-unique-constraint-checking). +For example: + +```sql +CREATE TABLE students (id INTEGER PRIMARY KEY, name VARCHAR); +INSERT INTO students VALUES (1, 'Student 1'); + +UPDATE tbl + SET j = [2] + WHERE i = 1; +``` + +```console +Constraint Error: Duplicate key "i: 1" violates primary key constraint. +If this is an unexpected constraint violation please double check with the known index limitations section in our documentation (https://duckdb.org/docs/sql/indexes). +``` + +## Nesting + +`ARRAY`, `LIST`, `MAP`, `STRUCT`, and `UNION` types can be arbitrarily nested to any depth, so long as the type rules are observed. + +Struct with `LIST`s: + +```sql +SELECT {'birds': ['duck', 'goose', 'heron'], 'aliens': NULL, 'amphibians': ['frog', 'toad']}; +``` + +Struct with list of `MAP`s: + +```sql +SELECT {'test': [MAP([1, 5], [42.1, 45]), MAP([1, 5], [42.1, 45])]}; +``` + +A list of `UNION`s: + +```sql +SELECT [union_value(num := 2), union_value(str := 'ABC')::UNION(str VARCHAR, num INTEGER)]; +``` + +## Performance Implications + +The choice of data types can have a strong effect on performance. Please consult the [Performance Guide]({% link docs/archive/1.0/guides/performance/schema.md %}) for details. 
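
As a small, illustrative sketch of the kind of choice discussed there (the `sales` table below is hypothetical): a low-cardinality `VARCHAR` column can often be declared as an [enum]({% link docs/archive/1.0/sql/data_types/enum.md %}) instead, and a `DECIMAL` column narrower than 19 digits keeps its arithmetic on 64-bit integers.

```sql
-- Hypothetical schema that picks compact types deliberately
CREATE TYPE weekday AS ENUM ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun');
CREATE TABLE sales (
    sale_day weekday,        -- stored as a numerical reference into the enum dictionary
    amount   DECIMAL(18, 3)  -- width 18 stays within a 64-bit internal representation
);
```
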
\ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/struct.md b/docs/archive/1.0/sql/data_types/struct.md new file mode 100644 index 00000000000..5b3b208acce --- /dev/null +++ b/docs/archive/1.0/sql/data_types/struct.md @@ -0,0 +1,293 @@ +--- +layout: docu +title: Struct Data Type +--- + +Conceptually, a `STRUCT` column contains an ordered list of columns called “entries”. The entries are referenced by name using strings. This document refers to those entry names as keys. Each row in the `STRUCT` column must have the same keys. The names of the struct entries are part of the *schema*. Each row in a `STRUCT` column must have the same layout. The names of the struct entries are case-insensitive. + +`STRUCT`s are typically used to nest multiple columns into a single column, and the nested column can be of any type, including other `STRUCT`s and `LIST`s. + +`STRUCT`s are similar to PostgreSQL's `ROW` type. The key difference is that DuckDB `STRUCT`s require the same keys in each row of a `STRUCT` column. This allows DuckDB to provide significantly improved performance by fully utilizing its vectorized execution engine, and also enforces type consistency for improved correctness. DuckDB includes a `row` function as a special way to produce a `STRUCT`, but does not have a `ROW` data type. See an example below and the [nested functions docs]({% link docs/archive/1.0/sql/functions/nested.md %}#struct-functions) for details. + +`STRUCT`s have a fixed schema. It is not possible to change the schema of a `STRUCT` using `UPDATE` operations. + +See the [data types overview]({% link docs/archive/1.0/sql/data_types/overview.md %}) for a comparison between nested data types. + +### Creating Structs + +Structs can be created using the [`struct_pack(name := expr, ...)`]({% link docs/archive/1.0/sql/functions/nested.md %}#struct-functions) function, the equivalent array notation `{'name': expr, ...}`, using a row variable, or using the `row` function. + +Create a struct using the `struct_pack` function. Note the lack of single quotes around the keys and the use of the `:=` operator: + +```sql +SELECT struct_pack(key1 := 'value1', key2 := 42) AS s; +``` + +Create a struct using the array notation: + +```sql +SELECT {'key1': 'value1', 'key2': 42} AS s; +``` + +Create a struct using a row variable: + +```sql +SELECT d AS s FROM (SELECT 'value1' AS key1, 42 AS key2) d; +``` + +Create a struct of integers: + +```sql +SELECT {'x': 1, 'y': 2, 'z': 3} AS s; +``` + +Create a struct of strings with a `NULL` value: + +```sql +SELECT {'yes': 'duck', 'maybe': 'goose', 'huh': NULL, 'no': 'heron'} AS s; +``` + +Create a struct with a different type for each key: + +```sql +SELECT {'key1': 'string', 'key2': 1, 'key3': 12.345} AS s; +``` + +Create a struct of structs with `NULL` values: + +```sql +SELECT { + 'birds': {'yes': 'duck', 'maybe': 'goose', 'huh': NULL, 'no': 'heron'}, + 'aliens': NULL, + 'amphibians': {'yes': 'frog', 'maybe': 'salamander', 'huh': 'dragon', 'no': 'toad'} + } AS s; +``` + +### Adding Field(s)/Value(s) to Structs + +Add to a struct of integers: + +```sql +SELECT struct_insert({'a': 1, 'b': 2, 'c': 3}, d := 4) AS s; +``` + +### Retrieving from Structs + +Retrieving a value from a struct can be accomplished using dot notation, bracket notation, or through [struct functions]({% link docs/archive/1.0/sql/functions/nested.md %}#struct-functions) like `struct_extract`. + +Use dot notation to retrieve the value at a key's location. 
In the following query, the subquery generates a struct column `a`, which we then query with `a.x`. + +```sql +SELECT a.x FROM (SELECT {'x': 1, 'y': 2, 'z': 3} AS a); +``` + +If a key contains a space, simply wrap it in double quotes (`"`). + +```sql +SELECT a."x space" FROM (SELECT {'x space': 1, 'y': 2, 'z': 3} AS a); +``` + +Bracket notation may also be used. Note that this uses single quotes (`'`) since the goal is to specify a certain string key and only constant expressions may be used inside the brackets (no expressions): + +```sql +SELECT a['x space'] FROM (SELECT {'x space': 1, 'y': 2, 'z': 3} AS a); +``` + +The struct_extract function is also equivalent. This returns 1: + +```sql +SELECT struct_extract({'x space': 1, 'y': 2, 'z': 3}, 'x space'); +``` + +#### `STRUCT.*` + +Rather than retrieving a single key from a struct, star notation (`*`) can be used to retrieve all keys from a struct as separate columns. +This is particularly useful when a prior operation creates a struct of unknown shape, or if a query must handle any potential struct keys. + +All keys within a struct can be returned as separate columns using `*`: + +```sql +SELECT a.* +FROM (SELECT {'x': 1, 'y': 2, 'z': 3} AS a); +``` + +
+ +| x | y | z | +|:---|:---|:---| +| 1 | 2 | 3 | + +### Dot Notation Order of Operations + +Referring to structs with dot notation can be ambiguous with referring to schemas and tables. In general, DuckDB looks for columns first, then for struct keys within columns. DuckDB resolves references in these orders, using the first match to occur: + +#### No Dots + +```sql +SELECT part1 +FROM tbl; +``` + +1. `part1` is a column + +#### One Dot + +```sql +SELECT part1.part2 +FROM tbl; +``` + +1. `part1` is a table, `part2` is a column +2. `part1` is a column, `part2` is a property of that column + +#### Two (or More) Dots + +```sql +SELECT part1.part2.part3 +FROM tbl; +``` + +1. `part1` is a schema, `part2` is a table, `part3` is a column +2. `part1` is a table, `part2` is a column, `part3` is a property of that column +3. `part1` is a column, `part2` is a property of that column, `part3` is a property of that column + +Any extra parts (e.g., `.part4.part5`, etc.) are always treated as properties + +### Creating Structs with the `row` Function + +The `row` function can be used to automatically convert multiple columns to a single struct column. +When using `row` the keys will be empty strings allowing for easy insertion into a table with a struct column. +Columns, however, cannot be initialized with the `row` function, and must be explicitly named. +For example, inserting values into a struct column using the `row` function: + +```sql +CREATE TABLE t1 (s STRUCT(v VARCHAR, i INTEGER)); +INSERT INTO t1 VALUES (row('a', 42)); +SELECT * FROM t1; +``` + +The table will contain a single entry: + +```sql +{'v': a, 'i': 42} +``` + +The following produces the same result as above: + +```sql +CREATE TABLE t1 AS ( + SELECT row('a', 42)::STRUCT(v VARCHAR, i INTEGER) +); +``` + +Initializing a struct column with the `row` function will fail: + +```sql +CREATE TABLE t2 AS SELECT row('a'); +``` + +```console +Invalid Input Error: A table cannot be created from an unnamed struct +``` + +When casting structs, the names of fields have to match. Therefore, the following query will fail: + +```sql +SELECT a::STRUCT(y INTEGER) AS b +FROM + (SELECT {'x': 42} AS a); +``` + +```console +Mismatch Type Error: Type STRUCT(x INTEGER) does not match with STRUCT(y INTEGER). Cannot cast STRUCTs - element "x" in source struct was not found in target struct +``` + +A workaround for this is to use [`struct_pack`](#creating-structs) instead: + +```sql +SELECT struct_pack(y := a.x) AS b +FROM + (SELECT {'x': 42} AS a); +``` + +> This behavior was introduced in DuckDB v0.9.0. Previously, this query ran successfully and returned struct `{'y': 42}` as column `b`. + +The `row` function can be used to return unnamed structs. For example: + +```sql +SELECT row(x, x + 1, y) FROM (SELECT 1 AS x, 'a' AS y) AS s; +``` + +This produces `(1, 2, a)`. + +If using multiple expressions when creating a struct, the `row` function is optional. The following query returns the same result as the previous one: + +```sql +SELECT (x, x + 1, y) AS s FROM (SELECT 1 AS x, 'a' AS y); +``` + +## Comparison and Ordering + +The `STRUCT` type can be compared using all the [comparison operators]({% link docs/archive/1.0/sql/expressions/comparison_operators.md %}). +These comparisons can be used in [logical expressions]({% link docs/archive/1.0/sql/expressions/logical_operators.md %}) +such as `WHERE` and `HAVING` clauses, and return [`BOOLEAN` values]({% link docs/archive/1.0/sql/data_types/boolean.md %}). 
+ +For comparisons, the keys of a `STRUCT` have a fixed positional order, from left to right. +Comparisons behave the same as row comparisons, therefore, matching keys must be at identical positions. + +Specifically, for any `STRUCT` comparison, the following rules apply: + +* **Equality.** `s1` and `s2` are equal, if all respective values are equal. +* **Less Than**. For the first index `i` where `s1.value[i] != s2.value[i]`: +If `s1.value[i] < s2.value[i]`, `s1` is less than `s2`. + +`NULL` values are compared following PostgreSQL's semantics. +Lower nesting levels are used for tie-breaking. + +Here are some queries returning `true` for the comparison. + +```sql +SELECT {'k1': 2, 'k2': 3} < {'k1': 2, 'k2': 4} AS result; +``` + +```sql +SELECT {'k1': 'hello'} < {'k1': 'world'} AS result; +``` + +These queries return `false`. + +```sql +SELECT {'k2': 4, 'k1': 3} < {'k2': 2, 'k1': 4} AS result; +``` + +```sql +SELECT {'k1': [4, 3]} < {'k1': [3, 6, 7]} AS result; +``` + +These queries return `NULL`. + +```sql +SELECT {'k1': 2, 'k2': 3} < {'k1': 2, 'k2': NULL} AS result; +``` + +This query returns a `Binder Error` because the keys do not match positionally. + +```sql +SELECT {'k1': 2, 'k2': 3} < {'k2': 2, 'k1': 4} AS result; +``` + +```console +Binder Error: Cannot compare values of type STRUCT(k1 INTEGER, k2 INTEGER) +and type STRUCT(k2 INTEGER, k1 INTEGER) - an explicit cast is required +``` + +> Up to DuckDB 0.10.1, nested `NULL` values were compared as follows. +> At the top level, nested `NULL` values obey standard SQL `NULL` comparison rules: +> comparing a nested `NULL` value to a nested non-`NULL` value produces a `NULL` result. +> Comparing nested value _entries_, however, uses the internal nested value rules for `NULL`s, +> and a nested `NULL` value entry will compare above a nested non-`NULL` value entry. +> DuckDB 0.10.2 introduced a breaking change in semantics as described above. + +## Functions + +See [Nested Functions]({% link docs/archive/1.0/sql/functions/nested.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/text.md b/docs/archive/1.0/sql/data_types/text.md new file mode 100644 index 00000000000..043453e0107 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/text.md @@ -0,0 +1,61 @@ +--- +blurb: In DuckDB, strings can be stored in the VARCHAR field. +layout: docu +title: Text Types +--- + +In DuckDB, strings can be stored in the `VARCHAR` field. +The field allows storage of Unicode characters. Internally, the data is encoded as UTF-8. + +
+
+| Name | Aliases | Description |
+|:---|:---|:---|
+| `VARCHAR` | `CHAR`, `BPCHAR`, `STRING`, `TEXT` | Variable-length character string |
+| `VARCHAR(n)` | `STRING(n)`, `TEXT(n)` | Variable-length character string. The maximum length _n_ has no effect and is only provided for compatibility. |
+
+## Specifying a Length Limit
+
+Specifying the length for the `VARCHAR`, `STRING`, and `TEXT` types is not required and has no effect on the system. Specifying the length will not improve performance or reduce storage space of the strings in the database. These variants are supported for compatibility reasons with other systems that do require a length to be specified for strings.
+
+If you wish to restrict the number of characters in a `VARCHAR` column for data integrity reasons, the `CHECK` constraint should be used, for example:
+
+```sql
+CREATE TABLE strings (
+    val VARCHAR CHECK (length(val) <= 10) -- val has a maximum length of 10
+);
+```
+
+The `VARCHAR` field allows storage of Unicode characters. Internally, the data is encoded as UTF-8.
+
+## Text Type Values
+
+Values of the text type are character strings, also known as string values or simply strings. At runtime, string values are constructed in one of the following ways:
+
+* referencing columns whose declared or implied type is the text data type
+* [string literals]({% link docs/archive/1.0/sql/data_types/literal_types.md %}#string-literals)
+* [casting]({% link docs/archive/1.0/sql/expressions/cast.md %}#explicit-casting) expressions to a text type
+* applying a [string operator]({% link docs/archive/1.0/sql/functions/char.md %}#text-functions-and-operators), or invoking a function that returns a text type value
+
+## Strings with Special Characters
+
+To use special characters in a string, use [escape string literals]({% link docs/archive/1.0/sql/data_types/literal_types.md %}#escape-string-literals) or [dollar-quoted string literals]({% link docs/archive/1.0/sql/data_types/literal_types.md %}#dollar-quoted-string-literals). Alternatively, you can use concatenation and the [`chr` character function]({% link docs/archive/1.0/sql/functions/char.md %}):
+
+```sql
+SELECT 'Hello' || chr(10) || 'world' AS msg;
+```
+
+```text
+┌──────────────┐
+│     msg      │
+│   varchar    │
+├──────────────┤
+│ Hello\nworld │
+└──────────────┘
+```
+
+## Functions
+
+See [Character Functions]({% link docs/archive/1.0/sql/functions/char.md %}) and [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}).
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/data_types/time.md b/docs/archive/1.0/sql/data_types/time.md
new file mode 100644
index 00000000000..db4e22de76c
--- /dev/null
+++ b/docs/archive/1.0/sql/data_types/time.md
@@ -0,0 +1,51 @@
+---
+blurb: A time instance represents the time of a day (hour, minute, second, microsecond).
+layout: docu
+title: Time Types
+---
+
+The `TIME` and `TIMETZ` types specify the hour, minute, second, and microsecond of a day.
+ +| Name | Aliases | Description | +| :------- | :----------------------- | :------------------------------ | +| `TIME` | `TIME WITHOUT TIME ZONE` | time of day (ignores time zone) | +| `TIMETZ` | `TIME WITH TIME ZONE` | time of day (uses time zone) | + +Instances can be created using the type names as a keyword, where the data must be formatted according to the ISO 8601 format (`hh:mm:ss[.zzzzzz][+-TT[:tt]]`). + +```sql +SELECT TIME '1992-09-20 11:30:00.123456'; +``` + +```text +11:30:00.123456 +``` + +```sql +SELECT TIMETZ '1992-09-20 11:30:00.123456'; +``` + +```text +11:30:00.123456+00 +``` + +```sql +SELECT TIMETZ '1992-09-20 11:30:00.123456-02:00'; +``` + +```text +13:30:00.123456+00 +``` + +```sql +SELECT TIMETZ '1992-09-20 11:30:00.123456+05:30'; +``` + +```text +06:00:00.123456+00 +``` + +> Warning The `TIME` type should only be used in rare cases, where the date part of the timestamp can be disregarded. +> Most applications should use the [`TIMESTAMP` types]({% link docs/archive/1.0/sql/data_types/timestamp.md %}) to represent their timestamps. \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/timestamp.md b/docs/archive/1.0/sql/data_types/timestamp.md new file mode 100644 index 00000000000..4ffb9f93799 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/timestamp.md @@ -0,0 +1,190 @@ +--- +blurb: A timestamp specifies a combination of a date (year, month, day) and a time + (hour, minute, second, microsecond). +layout: docu +title: Timestamp Types +--- + +Timestamps represent points in absolute time, usually called *instants*. +DuckDB represents instants as the number of microseconds (µs) since `1970-01-01 00:00:00+00`. + +## Timestamp Types + +| Name | Aliases | Description | +|:---|:---|:---| +| `TIMESTAMP_NS` | | timestamp with nanosecond precision (ignores time zone) | +| `TIMESTAMP` | `DATETIME` | timestamp with microsecond precision (ignores time zone) | +| `TIMESTAMP_MS` | | timestamp with millisecond precision (ignores time zone) | +| `TIMESTAMP_S` | | timestamp with second precision (ignores time zone) | +| `TIMESTAMPTZ` | `TIMESTAMP WITH TIME ZONE` | timestamp (uses time zone) | + +A timestamp specifies a combination of [`DATE`]({% link docs/archive/1.0/sql/data_types/date.md %}) (year, month, day) and a [`TIME`]({% link docs/archive/1.0/sql/data_types/time.md %}) (hour, minute, second, microsecond). Timestamps can be created using the `TIMESTAMP` keyword, where the data must be formatted according to the ISO 8601 format (`YYYY-MM-DD hh:mm:ss[.zzzzzz][+-TT[:tt]]`). Decimal places beyond the targeted sub-second precision are ignored. + +> Warning When defining timestamps using a `TIMESTAMP_NS` literal, the decimal places beyond _microseconds_ are ignored. Note that the `TIMESTAMP_NS` type is able to hold nanoseconds when created e.g., via the ingestion of Parquet files. 
+ +```sql +SELECT TIMESTAMP_NS '1992-09-20 11:30:00.123456789'; +``` + +```text +1992-09-20 11:30:00.123456 +``` + +```sql +SELECT TIMESTAMP '1992-09-20 11:30:00.123456789'; +``` + +```text +1992-09-20 11:30:00.123456 +``` + +```sql +SELECT DATETIME '1992-09-20 11:30:00.123456789'; +``` + +```text +1992-09-20 11:30:00.123456 +``` + +```sql +SELECT TIMESTAMP_MS '1992-09-20 11:30:00.123456789'; +``` + +```text +1992-09-20 11:30:00.123 +``` + +```sql +SELECT TIMESTAMP_S '1992-09-20 11:30:00.123456789'; +``` + +```text +1992-09-20 11:30:00 +``` + +```sql +SELECT TIMESTAMPTZ '1992-09-20 11:30:00.123456789'; +``` + +```text +1992-09-20 11:30:00.123456+00 +``` + +```sql +SELECT TIMESTAMP WITH TIME ZONE '1992-09-20 11:30:00.123456789'; +``` + +```text +1992-09-20 11:30:00.123456+00 +``` + +## Special Values + +There are also three special date values that can be used on input: + +
+ +| Input string | Valid types | Description | +|:-------------|:---------------------------|:-----------------------------------------------| +| `epoch` | `TIMESTAMP`, `TIMESTAMPTZ` | 1970-01-01 00:00:00+00 (Unix system time zero) | +| `infinity` | `TIMESTAMP`, `TIMESTAMPTZ` | later than all other time stamps | +| `-infinity` | `TIMESTAMP`, `TIMESTAMPTZ` | earlier than all other time stamps | + +The values `infinity` and `-infinity` are specially represented inside the system and will be displayed unchanged; +but `epoch` is simply a notational shorthand that will be converted to the time stamp value when read. + +```sql +SELECT '-infinity'::TIMESTAMP, 'epoch'::TIMESTAMP, 'infinity'::TIMESTAMP; +``` + +
+ +| Negative | Epoch | Positive | +|:----------|:--------------------|:---------| +| -infinity | 1970-01-01 00:00:00 | infinity | + +## Functions + +See [Timestamp Functions]({% link docs/archive/1.0/sql/functions/timestamp.md %}). + +## Time Zones + +The `TIMESTAMPTZ` type can be binned into calendar and clock bins using a suitable extension. +The built-in [ICU extension]({% link docs/archive/1.0/extensions/icu.md %}) implements all the binning and arithmetic functions using the +[International Components for Unicode](https://icu.unicode.org) time zone and calendar functions. + +To set the time zone to use, first load the ICU extension. The ICU extension comes pre-bundled with several DuckDB clients (including Python, R, JDBC, and ODBC), so this step can be skipped in those cases. In other cases you might first need to install and load the ICU extension. + +```sql +INSTALL icu; +LOAD icu; +``` + +Next, use the `SET TimeZone` command: + +```sql +SET TimeZone = 'America/Los_Angeles'; +``` + +Time binning operations for `TIMESTAMPTZ` will then be implemented using the given time zone. + +A list of available time zones can be pulled from the `pg_timezone_names()` table function: + +```sql +SELECT + name, + abbrev, + utc_offset +FROM pg_timezone_names() +ORDER BY + name; +``` + +You can also find a reference table of [available time zones]({% link docs/archive/1.0/sql/data_types/timezones.md %}). + +## Calendars + +The [ICU extension]({% link docs/archive/1.0/extensions/icu.md %}) also supports non-Gregorian calendars using the `SET Calendar` command. +Note that the `INSTALL` and `LOAD` steps are only required if the DuckDB client does not bundle the ICU extension. + +```sql +INSTALL icu; +LOAD icu; +SET Calendar = 'japanese'; +``` + +Time binning operations for `TIMESTAMPTZ` will then be implemented using the given calendar. +In this example, the `era` part will now report the Japanese imperial era number. + +A list of available calendars can be pulled from the `icu_calendar_names()` table function: + +```sql +SELECT name +FROM icu_calendar_names() +ORDER BY 1; +``` + +## Settings + +The current value of the `TimeZone` and `Calendar` settings are determined by ICU when it starts up. +They can be queried from in the `duckdb_settings()` table function: + +```sql +SELECT * +FROM duckdb_settings() +WHERE name = 'TimeZone'; +``` + +| name | value | description | input_type | +|----------|------------------|-----------------------|------------| +| TimeZone | Europe/Amsterdam | The current time zone | VARCHAR | + +```sql +SELECT * +FROM duckdb_settings() +WHERE name = 'Calendar'; +``` + +| name | value | description | input_type | +|----------|-----------|----------------------|------------| +| Calendar | gregorian | The current calendar | VARCHAR | \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/timezones.md b/docs/archive/1.0/sql/data_types/timezones.md new file mode 100644 index 00000000000..e81a9a738c6 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/timezones.md @@ -0,0 +1,655 @@ +--- +blurb: A reference list for Time Zones +layout: docu +title: Time Zone Reference List +--- + +An up-to-date version of this list can be pulled from the `pg_timezone_names()` table function: + +```sql +SELECT name, abbrev +FROM pg_timezone_names() +ORDER BY name; +``` + +
+ +| name | abbrev | +|----------------------------------|----------------------------------| +| ACT | ACT | +| AET | AET | +| AGT | AGT | +| ART | ART | +| AST | AST | +| Africa/Abidjan | Iceland | +| Africa/Accra | Iceland | +| Africa/Addis_Ababa | EAT | +| Africa/Algiers | Africa/Algiers | +| Africa/Asmara | EAT | +| Africa/Asmera | EAT | +| Africa/Bamako | Iceland | +| Africa/Bangui | Africa/Bangui | +| Africa/Banjul | Iceland | +| Africa/Bissau | Africa/Bissau | +| Africa/Blantyre | CAT | +| Africa/Brazzaville | Africa/Brazzaville | +| Africa/Bujumbura | CAT | +| Africa/Cairo | ART | +| Africa/Casablanca | Africa/Casablanca | +| Africa/Ceuta | Africa/Ceuta | +| Africa/Conakry | Iceland | +| Africa/Dakar | Iceland | +| Africa/Dar_es_Salaam | EAT | +| Africa/Djibouti | EAT | +| Africa/Douala | Africa/Douala | +| Africa/El_Aaiun | Africa/El_Aaiun | +| Africa/Freetown | Iceland | +| Africa/Gaborone | CAT | +| Africa/Harare | CAT | +| Africa/Johannesburg | Africa/Johannesburg | +| Africa/Juba | Africa/Juba | +| Africa/Kampala | EAT | +| Africa/Khartoum | Africa/Khartoum | +| Africa/Kigali | CAT | +| Africa/Kinshasa | Africa/Kinshasa | +| Africa/Lagos | Africa/Lagos | +| Africa/Libreville | Africa/Libreville | +| Africa/Lome | Iceland | +| Africa/Luanda | Africa/Luanda | +| Africa/Lubumbashi | CAT | +| Africa/Lusaka | CAT | +| Africa/Malabo | Africa/Malabo | +| Africa/Maputo | CAT | +| Africa/Maseru | Africa/Maseru | +| Africa/Mbabane | Africa/Mbabane | +| Africa/Mogadishu | EAT | +| Africa/Monrovia | Africa/Monrovia | +| Africa/Nairobi | EAT | +| Africa/Ndjamena | Africa/Ndjamena | +| Africa/Niamey | Africa/Niamey | +| Africa/Nouakchott | Iceland | +| Africa/Ouagadougou | Iceland | +| Africa/Porto-Novo | Africa/Porto-Novo | +| Africa/Sao_Tome | Africa/Sao_Tome | +| Africa/Timbuktu | Iceland | +| Africa/Tripoli | Libya | +| Africa/Tunis | Africa/Tunis | +| Africa/Windhoek | Africa/Windhoek | +| America/Adak | America/Adak | +| America/Anchorage | AST | +| America/Anguilla | PRT | +| America/Antigua | PRT | +| America/Araguaina | America/Araguaina | +| America/Argentina/Buenos_Aires | AGT | +| America/Argentina/Catamarca | America/Argentina/Catamarca | +| America/Argentina/ComodRivadavia | America/Argentina/ComodRivadavia | +| America/Argentina/Cordoba | America/Argentina/Cordoba | +| America/Argentina/Jujuy | America/Argentina/Jujuy | +| America/Argentina/La_Rioja | America/Argentina/La_Rioja | +| America/Argentina/Mendoza | America/Argentina/Mendoza | +| America/Argentina/Rio_Gallegos | America/Argentina/Rio_Gallegos | +| America/Argentina/Salta | America/Argentina/Salta | +| America/Argentina/San_Juan | America/Argentina/San_Juan | +| America/Argentina/San_Luis | America/Argentina/San_Luis | +| America/Argentina/Tucuman | America/Argentina/Tucuman | +| America/Argentina/Ushuaia | America/Argentina/Ushuaia | +| America/Aruba | PRT | +| America/Asuncion | America/Asuncion | +| America/Atikokan | America/Atikokan | +| America/Atka | America/Atka | +| America/Bahia | America/Bahia | +| America/Bahia_Banderas | America/Bahia_Banderas | +| America/Barbados | America/Barbados | +| America/Belem | America/Belem | +| America/Belize | America/Belize | +| America/Blanc-Sablon | PRT | +| America/Boa_Vista | America/Boa_Vista | +| America/Bogota | America/Bogota | +| America/Boise | America/Boise | +| America/Buenos_Aires | AGT | +| America/Cambridge_Bay | America/Cambridge_Bay | +| America/Campo_Grande | America/Campo_Grande | +| America/Cancun | America/Cancun | +| America/Caracas | America/Caracas 
| +| America/Catamarca | America/Catamarca | +| America/Cayenne | America/Cayenne | +| America/Cayman | America/Cayman | +| America/Chicago | CST | +| America/Chihuahua | America/Chihuahua | +| America/Ciudad_Juarez | America/Ciudad_Juarez | +| America/Coral_Harbour | America/Coral_Harbour | +| America/Cordoba | America/Cordoba | +| America/Costa_Rica | America/Costa_Rica | +| America/Creston | PNT | +| America/Cuiaba | America/Cuiaba | +| America/Curacao | PRT | +| America/Danmarkshavn | America/Danmarkshavn | +| America/Dawson | America/Dawson | +| America/Dawson_Creek | America/Dawson_Creek | +| America/Denver | Navajo | +| America/Detroit | America/Detroit | +| America/Dominica | PRT | +| America/Edmonton | America/Edmonton | +| America/Eirunepe | America/Eirunepe | +| America/El_Salvador | America/El_Salvador | +| America/Ensenada | America/Ensenada | +| America/Fort_Nelson | America/Fort_Nelson | +| America/Fort_Wayne | IET | +| America/Fortaleza | America/Fortaleza | +| America/Glace_Bay | America/Glace_Bay | +| America/Godthab | America/Godthab | +| America/Goose_Bay | America/Goose_Bay | +| America/Grand_Turk | America/Grand_Turk | +| America/Grenada | PRT | +| America/Guadeloupe | PRT | +| America/Guatemala | America/Guatemala | +| America/Guayaquil | America/Guayaquil | +| America/Guyana | America/Guyana | +| America/Halifax | America/Halifax | +| America/Havana | Cuba | +| America/Hermosillo | America/Hermosillo | +| America/Indiana/Indianapolis | IET | +| America/Indiana/Knox | America/Indiana/Knox | +| America/Indiana/Marengo | America/Indiana/Marengo | +| America/Indiana/Petersburg | America/Indiana/Petersburg | +| America/Indiana/Tell_City | America/Indiana/Tell_City | +| America/Indiana/Vevay | America/Indiana/Vevay | +| America/Indiana/Vincennes | America/Indiana/Vincennes | +| America/Indiana/Winamac | America/Indiana/Winamac | +| America/Indianapolis | IET | +| America/Inuvik | America/Inuvik | +| America/Iqaluit | America/Iqaluit | +| America/Jamaica | Jamaica | +| America/Jujuy | America/Jujuy | +| America/Juneau | America/Juneau | +| America/Kentucky/Louisville | America/Kentucky/Louisville | +| America/Kentucky/Monticello | America/Kentucky/Monticello | +| America/Knox_IN | America/Knox_IN | +| America/Kralendijk | PRT | +| America/La_Paz | America/La_Paz | +| America/Lima | America/Lima | +| America/Los_Angeles | PST | +| America/Louisville | America/Louisville | +| America/Lower_Princes | PRT | +| America/Maceio | America/Maceio | +| America/Managua | America/Managua | +| America/Manaus | America/Manaus | +| America/Marigot | PRT | +| America/Martinique | America/Martinique | +| America/Matamoros | America/Matamoros | +| America/Mazatlan | America/Mazatlan | +| America/Mendoza | America/Mendoza | +| America/Menominee | America/Menominee | +| America/Merida | America/Merida | +| America/Metlakatla | America/Metlakatla | +| America/Mexico_City | America/Mexico_City | +| America/Miquelon | America/Miquelon | +| America/Moncton | America/Moncton | +| America/Monterrey | America/Monterrey | +| America/Montevideo | America/Montevideo | +| America/Montreal | America/Montreal | +| America/Montserrat | PRT | +| America/Nassau | America/Nassau | +| America/New_York | America/New_York | +| America/Nipigon | America/Nipigon | +| America/Nome | America/Nome | +| America/Noronha | America/Noronha | +| America/North_Dakota/Beulah | America/North_Dakota/Beulah | +| America/North_Dakota/Center | America/North_Dakota/Center | +| America/North_Dakota/New_Salem | 
America/North_Dakota/New_Salem | +| America/Nuuk | America/Nuuk | +| America/Ojinaga | America/Ojinaga | +| America/Panama | America/Panama | +| America/Pangnirtung | America/Pangnirtung | +| America/Paramaribo | America/Paramaribo | +| America/Phoenix | PNT | +| America/Port-au-Prince | America/Port-au-Prince | +| America/Port_of_Spain | PRT | +| America/Porto_Acre | America/Porto_Acre | +| America/Porto_Velho | America/Porto_Velho | +| America/Puerto_Rico | PRT | +| America/Punta_Arenas | America/Punta_Arenas | +| America/Rainy_River | America/Rainy_River | +| America/Rankin_Inlet | America/Rankin_Inlet | +| America/Recife | America/Recife | +| America/Regina | America/Regina | +| America/Resolute | America/Resolute | +| America/Rio_Branco | America/Rio_Branco | +| America/Rosario | America/Rosario | +| America/Santa_Isabel | America/Santa_Isabel | +| America/Santarem | America/Santarem | +| America/Santiago | America/Santiago | +| America/Santo_Domingo | America/Santo_Domingo | +| America/Sao_Paulo | BET | +| America/Scoresbysund | America/Scoresbysund | +| America/Shiprock | Navajo | +| America/Sitka | America/Sitka | +| America/St_Barthelemy | PRT | +| America/St_Johns | CNT | +| America/St_Kitts | PRT | +| America/St_Lucia | PRT | +| America/St_Thomas | PRT | +| America/St_Vincent | PRT | +| America/Swift_Current | America/Swift_Current | +| America/Tegucigalpa | America/Tegucigalpa | +| America/Thule | America/Thule | +| America/Thunder_Bay | America/Thunder_Bay | +| America/Tijuana | America/Tijuana | +| America/Toronto | America/Toronto | +| America/Tortola | PRT | +| America/Vancouver | America/Vancouver | +| America/Virgin | PRT | +| America/Whitehorse | America/Whitehorse | +| America/Winnipeg | America/Winnipeg | +| America/Yakutat | America/Yakutat | +| America/Yellowknife | America/Yellowknife | +| Antarctica/Casey | Antarctica/Casey | +| Antarctica/Davis | Antarctica/Davis | +| Antarctica/DumontDUrville | Antarctica/DumontDUrville | +| Antarctica/Macquarie | Antarctica/Macquarie | +| Antarctica/Mawson | Antarctica/Mawson | +| Antarctica/McMurdo | NZ | +| Antarctica/Palmer | Antarctica/Palmer | +| Antarctica/Rothera | Antarctica/Rothera | +| Antarctica/South_Pole | NZ | +| Antarctica/Syowa | Antarctica/Syowa | +| Antarctica/Troll | Antarctica/Troll | +| Antarctica/Vostok | Antarctica/Vostok | +| Arctic/Longyearbyen | Arctic/Longyearbyen | +| Asia/Aden | Asia/Aden | +| Asia/Almaty | Asia/Almaty | +| Asia/Amman | Asia/Amman | +| Asia/Anadyr | Asia/Anadyr | +| Asia/Aqtau | Asia/Aqtau | +| Asia/Aqtobe | Asia/Aqtobe | +| Asia/Ashgabat | Asia/Ashgabat | +| Asia/Ashkhabad | Asia/Ashkhabad | +| Asia/Atyrau | Asia/Atyrau | +| Asia/Baghdad | Asia/Baghdad | +| Asia/Bahrain | Asia/Bahrain | +| Asia/Baku | Asia/Baku | +| Asia/Bangkok | Asia/Bangkok | +| Asia/Barnaul | Asia/Barnaul | +| Asia/Beirut | Asia/Beirut | +| Asia/Bishkek | Asia/Bishkek | +| Asia/Brunei | Asia/Brunei | +| Asia/Calcutta | IST | +| Asia/Chita | Asia/Chita | +| Asia/Choibalsan | Asia/Choibalsan | +| Asia/Chongqing | CTT | +| Asia/Chungking | CTT | +| Asia/Colombo | Asia/Colombo | +| Asia/Dacca | BST | +| Asia/Damascus | Asia/Damascus | +| Asia/Dhaka | BST | +| Asia/Dili | Asia/Dili | +| Asia/Dubai | Asia/Dubai | +| Asia/Dushanbe | Asia/Dushanbe | +| Asia/Famagusta | Asia/Famagusta | +| Asia/Gaza | Asia/Gaza | +| Asia/Harbin | CTT | +| Asia/Hebron | Asia/Hebron | +| Asia/Ho_Chi_Minh | VST | +| Asia/Hong_Kong | Hongkong | +| Asia/Hovd | Asia/Hovd | +| Asia/Irkutsk | Asia/Irkutsk | +| Asia/Istanbul | Turkey | +| 
Asia/Jakarta | Asia/Jakarta | +| Asia/Jayapura | Asia/Jayapura | +| Asia/Jerusalem | Israel | +| Asia/Kabul | Asia/Kabul | +| Asia/Kamchatka | Asia/Kamchatka | +| Asia/Karachi | PLT | +| Asia/Kashgar | Asia/Kashgar | +| Asia/Kathmandu | Asia/Kathmandu | +| Asia/Katmandu | Asia/Katmandu | +| Asia/Khandyga | Asia/Khandyga | +| Asia/Kolkata | IST | +| Asia/Krasnoyarsk | Asia/Krasnoyarsk | +| Asia/Kuala_Lumpur | Singapore | +| Asia/Kuching | Asia/Kuching | +| Asia/Kuwait | Asia/Kuwait | +| Asia/Macao | Asia/Macao | +| Asia/Macau | Asia/Macau | +| Asia/Magadan | Asia/Magadan | +| Asia/Makassar | Asia/Makassar | +| Asia/Manila | Asia/Manila | +| Asia/Muscat | Asia/Muscat | +| Asia/Nicosia | Asia/Nicosia | +| Asia/Novokuznetsk | Asia/Novokuznetsk | +| Asia/Novosibirsk | Asia/Novosibirsk | +| Asia/Omsk | Asia/Omsk | +| Asia/Oral | Asia/Oral | +| Asia/Phnom_Penh | Asia/Phnom_Penh | +| Asia/Pontianak | Asia/Pontianak | +| Asia/Pyongyang | Asia/Pyongyang | +| Asia/Qatar | Asia/Qatar | +| Asia/Qostanay | Asia/Qostanay | +| Asia/Qyzylorda | Asia/Qyzylorda | +| Asia/Rangoon | Asia/Rangoon | +| Asia/Riyadh | Asia/Riyadh | +| Asia/Saigon | VST | +| Asia/Sakhalin | Asia/Sakhalin | +| Asia/Samarkand | Asia/Samarkand | +| Asia/Seoul | ROK | +| Asia/Shanghai | CTT | +| Asia/Singapore | Singapore | +| Asia/Srednekolymsk | Asia/Srednekolymsk | +| Asia/Taipei | ROC | +| Asia/Tashkent | Asia/Tashkent | +| Asia/Tbilisi | Asia/Tbilisi | +| Asia/Tehran | Iran | +| Asia/Tel_Aviv | Israel | +| Asia/Thimbu | Asia/Thimbu | +| Asia/Thimphu | Asia/Thimphu | +| Asia/Tokyo | JST | +| Asia/Tomsk | Asia/Tomsk | +| Asia/Ujung_Pandang | Asia/Ujung_Pandang | +| Asia/Ulaanbaatar | Asia/Ulaanbaatar | +| Asia/Ulan_Bator | Asia/Ulan_Bator | +| Asia/Urumqi | Asia/Urumqi | +| Asia/Ust-Nera | Asia/Ust-Nera | +| Asia/Vientiane | Asia/Vientiane | +| Asia/Vladivostok | Asia/Vladivostok | +| Asia/Yakutsk | Asia/Yakutsk | +| Asia/Yangon | Asia/Yangon | +| Asia/Yekaterinburg | Asia/Yekaterinburg | +| Asia/Yerevan | NET | +| Atlantic/Azores | Atlantic/Azores | +| Atlantic/Bermuda | Atlantic/Bermuda | +| Atlantic/Canary | Atlantic/Canary | +| Atlantic/Cape_Verde | Atlantic/Cape_Verde | +| Atlantic/Faeroe | Atlantic/Faeroe | +| Atlantic/Faroe | Atlantic/Faroe | +| Atlantic/Jan_Mayen | Atlantic/Jan_Mayen | +| Atlantic/Madeira | Atlantic/Madeira | +| Atlantic/Reykjavik | Iceland | +| Atlantic/South_Georgia | Atlantic/South_Georgia | +| Atlantic/St_Helena | Iceland | +| Atlantic/Stanley | Atlantic/Stanley | +| Australia/ACT | AET | +| Australia/Adelaide | Australia/Adelaide | +| Australia/Brisbane | Australia/Brisbane | +| Australia/Broken_Hill | Australia/Broken_Hill | +| Australia/Canberra | AET | +| Australia/Currie | Australia/Currie | +| Australia/Darwin | ACT | +| Australia/Eucla | Australia/Eucla | +| Australia/Hobart | Australia/Hobart | +| Australia/LHI | Australia/LHI | +| Australia/Lindeman | Australia/Lindeman | +| Australia/Lord_Howe | Australia/Lord_Howe | +| Australia/Melbourne | Australia/Melbourne | +| Australia/NSW | AET | +| Australia/North | ACT | +| Australia/Perth | Australia/Perth | +| Australia/Queensland | Australia/Queensland | +| Australia/South | Australia/South | +| Australia/Sydney | AET | +| Australia/Tasmania | Australia/Tasmania | +| Australia/Victoria | Australia/Victoria | +| Australia/West | Australia/West | +| Australia/Yancowinna | Australia/Yancowinna | +| BET | BET | +| BST | BST | +| Brazil/Acre | Brazil/Acre | +| Brazil/DeNoronha | Brazil/DeNoronha | +| Brazil/East | BET | +| Brazil/West | Brazil/West | +| 
CAT | CAT | +| CET | CET | +| CNT | CNT | +| CST | CST | +| CST6CDT | CST6CDT | +| CTT | CTT | +| Canada/Atlantic | Canada/Atlantic | +| Canada/Central | Canada/Central | +| Canada/East-Saskatchewan | Canada/East-Saskatchewan | +| Canada/Eastern | Canada/Eastern | +| Canada/Mountain | Canada/Mountain | +| Canada/Newfoundland | CNT | +| Canada/Pacific | Canada/Pacific | +| Canada/Saskatchewan | Canada/Saskatchewan | +| Canada/Yukon | Canada/Yukon | +| Chile/Continental | Chile/Continental | +| Chile/EasterIsland | Chile/EasterIsland | +| Cuba | Cuba | +| EAT | EAT | +| ECT | ECT | +| EET | EET | +| EST | EST | +| EST5EDT | EST5EDT | +| Egypt | ART | +| Eire | Eire | +| Etc/GMT | GMT | +| Etc/GMT+0 | GMT | +| Etc/GMT+1 | Etc/GMT+1 | +| Etc/GMT+10 | Etc/GMT+10 | +| Etc/GMT+11 | Etc/GMT+11 | +| Etc/GMT+12 | Etc/GMT+12 | +| Etc/GMT+2 | Etc/GMT+2 | +| Etc/GMT+3 | Etc/GMT+3 | +| Etc/GMT+4 | Etc/GMT+4 | +| Etc/GMT+5 | Etc/GMT+5 | +| Etc/GMT+6 | Etc/GMT+6 | +| Etc/GMT+7 | Etc/GMT+7 | +| Etc/GMT+8 | Etc/GMT+8 | +| Etc/GMT+9 | Etc/GMT+9 | +| Etc/GMT-0 | GMT | +| Etc/GMT-1 | Etc/GMT-1 | +| Etc/GMT-10 | Etc/GMT-10 | +| Etc/GMT-11 | Etc/GMT-11 | +| Etc/GMT-12 | Etc/GMT-12 | +| Etc/GMT-13 | Etc/GMT-13 | +| Etc/GMT-14 | Etc/GMT-14 | +| Etc/GMT-2 | Etc/GMT-2 | +| Etc/GMT-3 | Etc/GMT-3 | +| Etc/GMT-4 | Etc/GMT-4 | +| Etc/GMT-5 | Etc/GMT-5 | +| Etc/GMT-6 | Etc/GMT-6 | +| Etc/GMT-7 | Etc/GMT-7 | +| Etc/GMT-8 | Etc/GMT-8 | +| Etc/GMT-9 | Etc/GMT-9 | +| Etc/GMT0 | GMT | +| Etc/Greenwich | GMT | +| Etc/UCT | UCT | +| Etc/UTC | UCT | +| Etc/Universal | UCT | +| Etc/Zulu | UCT | +| Europe/Amsterdam | Europe/Amsterdam | +| Europe/Andorra | Europe/Andorra | +| Europe/Astrakhan | Europe/Astrakhan | +| Europe/Athens | Europe/Athens | +| Europe/Belfast | GB | +| Europe/Belgrade | Europe/Belgrade | +| Europe/Berlin | Europe/Berlin | +| Europe/Bratislava | Europe/Bratislava | +| Europe/Brussels | Europe/Brussels | +| Europe/Bucharest | Europe/Bucharest | +| Europe/Budapest | Europe/Budapest | +| Europe/Busingen | Europe/Busingen | +| Europe/Chisinau | Europe/Chisinau | +| Europe/Copenhagen | Europe/Copenhagen | +| Europe/Dublin | Eire | +| Europe/Gibraltar | Europe/Gibraltar | +| Europe/Guernsey | GB | +| Europe/Helsinki | Europe/Helsinki | +| Europe/Isle_of_Man | GB | +| Europe/Istanbul | Turkey | +| Europe/Jersey | GB | +| Europe/Kaliningrad | Europe/Kaliningrad | +| Europe/Kiev | Europe/Kiev | +| Europe/Kirov | Europe/Kirov | +| Europe/Kyiv | Europe/Kyiv | +| Europe/Lisbon | Portugal | +| Europe/Ljubljana | Europe/Ljubljana | +| Europe/London | GB | +| Europe/Luxembourg | Europe/Luxembourg | +| Europe/Madrid | Europe/Madrid | +| Europe/Malta | Europe/Malta | +| Europe/Mariehamn | Europe/Mariehamn | +| Europe/Minsk | Europe/Minsk | +| Europe/Monaco | ECT | +| Europe/Moscow | W-SU | +| Europe/Nicosia | Europe/Nicosia | +| Europe/Oslo | Europe/Oslo | +| Europe/Paris | ECT | +| Europe/Podgorica | Europe/Podgorica | +| Europe/Prague | Europe/Prague | +| Europe/Riga | Europe/Riga | +| Europe/Rome | Europe/Rome | +| Europe/Samara | Europe/Samara | +| Europe/San_Marino | Europe/San_Marino | +| Europe/Sarajevo | Europe/Sarajevo | +| Europe/Saratov | Europe/Saratov | +| Europe/Simferopol | Europe/Simferopol | +| Europe/Skopje | Europe/Skopje | +| Europe/Sofia | Europe/Sofia | +| Europe/Stockholm | Europe/Stockholm | +| Europe/Tallinn | Europe/Tallinn | +| Europe/Tirane | Europe/Tirane | +| Europe/Tiraspol | Europe/Tiraspol | +| Europe/Ulyanovsk | Europe/Ulyanovsk | +| Europe/Uzhgorod | Europe/Uzhgorod | +| Europe/Vaduz | 
Europe/Vaduz | +| Europe/Vatican | Europe/Vatican | +| Europe/Vienna | Europe/Vienna | +| Europe/Vilnius | Europe/Vilnius | +| Europe/Volgograd | Europe/Volgograd | +| Europe/Warsaw | Poland | +| Europe/Zagreb | Europe/Zagreb | +| Europe/Zaporozhye | Europe/Zaporozhye | +| Europe/Zurich | Europe/Zurich | +| Factory | Factory | +| GB | GB | +| GB-Eire | GB | +| GMT | GMT | +| GMT+0 | GMT | +| GMT-0 | GMT | +| GMT0 | GMT | +| Greenwich | GMT | +| HST | HST | +| Hongkong | Hongkong | +| IET | IET | +| IST | IST | +| Iceland | Iceland | +| Indian/Antananarivo | EAT | +| Indian/Chagos | Indian/Chagos | +| Indian/Christmas | Indian/Christmas | +| Indian/Cocos | Indian/Cocos | +| Indian/Comoro | EAT | +| Indian/Kerguelen | Indian/Kerguelen | +| Indian/Mahe | Indian/Mahe | +| Indian/Maldives | Indian/Maldives | +| Indian/Mauritius | Indian/Mauritius | +| Indian/Mayotte | EAT | +| Indian/Reunion | Indian/Reunion | +| Iran | Iran | +| Israel | Israel | +| JST | JST | +| Jamaica | Jamaica | +| Japan | JST | +| Kwajalein | Kwajalein | +| Libya | Libya | +| MET | MET | +| MIT | MIT | +| MST | MST | +| MST7MDT | MST7MDT | +| Mexico/BajaNorte | Mexico/BajaNorte | +| Mexico/BajaSur | Mexico/BajaSur | +| Mexico/General | Mexico/General | +| NET | NET | +| NST | NZ | +| NZ | NZ | +| NZ-CHAT | NZ-CHAT | +| Navajo | Navajo | +| PLT | PLT | +| PNT | PNT | +| PRC | CTT | +| PRT | PRT | +| PST | PST | +| PST8PDT | PST8PDT | +| Pacific/Apia | MIT | +| Pacific/Auckland | NZ | +| Pacific/Bougainville | Pacific/Bougainville | +| Pacific/Chatham | NZ-CHAT | +| Pacific/Chuuk | Pacific/Chuuk | +| Pacific/Easter | Pacific/Easter | +| Pacific/Efate | Pacific/Efate | +| Pacific/Enderbury | Pacific/Enderbury | +| Pacific/Fakaofo | Pacific/Fakaofo | +| Pacific/Fiji | Pacific/Fiji | +| Pacific/Funafuti | Pacific/Funafuti | +| Pacific/Galapagos | Pacific/Galapagos | +| Pacific/Gambier | Pacific/Gambier | +| Pacific/Guadalcanal | SST | +| Pacific/Guam | Pacific/Guam | +| Pacific/Honolulu | Pacific/Honolulu | +| Pacific/Johnston | Pacific/Johnston | +| Pacific/Kanton | Pacific/Kanton | +| Pacific/Kiritimati | Pacific/Kiritimati | +| Pacific/Kosrae | Pacific/Kosrae | +| Pacific/Kwajalein | Kwajalein | +| Pacific/Majuro | Pacific/Majuro | +| Pacific/Marquesas | Pacific/Marquesas | +| Pacific/Midway | Pacific/Midway | +| Pacific/Nauru | Pacific/Nauru | +| Pacific/Niue | Pacific/Niue | +| Pacific/Norfolk | Pacific/Norfolk | +| Pacific/Noumea | Pacific/Noumea | +| Pacific/Pago_Pago | Pacific/Pago_Pago | +| Pacific/Palau | Pacific/Palau | +| Pacific/Pitcairn | Pacific/Pitcairn | +| Pacific/Pohnpei | SST | +| Pacific/Ponape | SST | +| Pacific/Port_Moresby | Pacific/Port_Moresby | +| Pacific/Rarotonga | Pacific/Rarotonga | +| Pacific/Saipan | Pacific/Saipan | +| Pacific/Samoa | Pacific/Samoa | +| Pacific/Tahiti | Pacific/Tahiti | +| Pacific/Tarawa | Pacific/Tarawa | +| Pacific/Tongatapu | Pacific/Tongatapu | +| Pacific/Truk | Pacific/Truk | +| Pacific/Wake | Pacific/Wake | +| Pacific/Wallis | Pacific/Wallis | +| Pacific/Yap | Pacific/Yap | +| Poland | Poland | +| Portugal | Portugal | +| ROC | ROC | +| ROK | ROK | +| SST | SST | +| Singapore | Singapore | +| SystemV/AST4 | SystemV/AST4 | +| SystemV/AST4ADT | SystemV/AST4ADT | +| SystemV/CST6 | SystemV/CST6 | +| SystemV/CST6CDT | SystemV/CST6CDT | +| SystemV/EST5 | SystemV/EST5 | +| SystemV/EST5EDT | SystemV/EST5EDT | +| SystemV/HST10 | SystemV/HST10 | +| SystemV/MST7 | SystemV/MST7 | +| SystemV/MST7MDT | SystemV/MST7MDT | +| SystemV/PST8 | SystemV/PST8 | +| SystemV/PST8PDT | 
SystemV/PST8PDT | +| SystemV/YST9 | SystemV/YST9 | +| SystemV/YST9YDT | SystemV/YST9YDT | +| Turkey | Turkey | +| UCT | UCT | +| US/Alaska | AST | +| US/Aleutian | US/Aleutian | +| US/Arizona | PNT | +| US/Central | CST | +| US/East-Indiana | IET | +| US/Eastern | US/Eastern | +| US/Hawaii | US/Hawaii | +| US/Indiana-Starke | US/Indiana-Starke | +| US/Michigan | US/Michigan | +| US/Mountain | Navajo | +| US/Pacific | PST | +| US/Pacific-New | PST | +| US/Samoa | US/Samoa | +| UTC | UCT | +| Universal | UCT | +| VST | VST | +| W-SU | W-SU | +| WET | WET | +| Zulu | UCT | \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/typecasting.md b/docs/archive/1.0/sql/data_types/typecasting.md new file mode 100644 index 00000000000..a36e016dd62 --- /dev/null +++ b/docs/archive/1.0/sql/data_types/typecasting.md @@ -0,0 +1,110 @@ +--- +layout: docu +title: Typecasting +--- + +Typecasting is an operation that converts a value in one particular data type to the closest corresponding value in another data type. +Like other SQL engines, DuckDB supports both implicit and explicit typecasting. + +## Explicit Casting + +Explicit typecasting is performed by using a `CAST` expression. For example, `CAST(col AS VARCHAR)` or `col::VARCHAR` explicitly cast the column `col` to `VARCHAR`. See the [cast page]({% link docs/archive/1.0/sql/expressions/cast.md %}) for more information. + +## Implicit Casting + +In many situations, the system will add casts by itself. This is called *implicit* casting. This happens for example when a function is called with an argument that does not match the type of the function, but can be casted to the desired type. + +Consider the function `sin(DOUBLE)`. This function takes as input argument a column of type `DOUBLE`, however, it can be called with an integer as well: `sin(1)`. The integer is converted into a double before being passed to the `sin` function. + +Implicit casts can only be added for a number of type combinations, and is generally only possible when the cast cannot fail. For example, an implicit cast can be added from `INTEGER` to `DOUBLE` – but not from `DOUBLE` to `INTEGER`. + +## Casting Operations Matrix + +Values of a particular data type cannot always be cast to any arbitrary target data type. The only exception is the `NULL` value – which can always be converted between types. +The following matrix describes which conversions are supported. +When implicit casting is allowed, it implies that explicit casting is also possible. + +![Typecasting matrix](/images/typecasting-matrix.png) + +Even though a casting operation is supported based on the source and target data type, it does not necessarily mean the cast operation will succeed at runtime. + +> Deprecated Prior to version 0.10.0, DuckDB allowed any type to be implicitly cast to `VARCHAR` during function binding. +> Version 0.10.0 introduced a [breaking change which no longer allows implicit casts to `VARCHAR`]({% post_url 2024-02-13-announcing-duckdb-0100 %}#breaking-sql-changes). +> The [`old_implicit_casting` configuration option]({% link docs/archive/1.0/configuration/pragmas.md %}#implicit-casting-to-varchar) setting can be used to revert to the old behavior. +> However, please note that this flag will be deprecated in the future. + +### Lossy Casts + +Casting operations that result in loss of precision are allowed. For example, it is possible to explicitly cast a numeric type with fractional digits like `DECIMAL`, `FLOAT` or `DOUBLE` to an integral type like `INTEGER`. 
The number will be rounded. + +```sql +SELECT CAST(3.5 AS INTEGER); +``` + +### Overflows + +Casting operations that would result in a value overflow throw an error. For example, the value `999` is too large to be represented by the `TINYINT` data type. Therefore, an attempt to cast that value to that type results in a runtime error: + +```sql +SELECT CAST(999 AS TINYINT); +``` + +```console +Conversion Error: Type INT32 with value 999 can't be cast because the value is out of range for the destination type INT8 +``` + +So even though the cast operation from `INTEGER` to `TINYINT` is supported, it is not possible for this particular value. [TRY_CAST]({% link docs/archive/1.0/sql/expressions/cast.md %}) can be used to convert the value into `NULL` instead of throwing an error. + +### Varchar + +The [`VARCHAR`]({% link docs/archive/1.0/sql/data_types/text.md %}) type acts as a univeral target: any arbitrary value of any arbitrary type can always be cast to the `VARCHAR` type. This type is also used for displaying values in the shell. + +```sql +SELECT CAST(42.5 AS VARCHAR); +``` + +Casting from `VARCHAR` to another data type is supported, but can raise an error at runtime if DuckDB cannot parse and convert the provided text to the target data type. + +```sql +SELECT CAST('NotANumber' AS INTEGER); +``` + +In general, casting to `VARCHAR` is a lossless operation and any type can be cast back to the original type after being converted into text. + +```sql +SELECT CAST(CAST([1, 2, 3] AS VARCHAR) AS INTEGER[]); +``` + +### Literal Types + +Integer literals (such as `42`) and string literals (such as `'string'`) have special implicit casting rules. See the [literal types page]({% link docs/archive/1.0/sql/data_types/literal_types.md %}) for more information. + +### Lists / Arrays + +Lists can be explicitly cast to other lists using the same casting rules. The cast is applied to the children of the list. For example, if we convert a `INTEGER[]` list to a `VARCHAR[]` list, the child `INTEGER` elements are individually cast to `VARCHAR` and a new list is constructed. + +```sql +SELECT CAST([1, 2, 3] AS VARCHAR[]); +``` + +### Arrays + +Arrays follow the same casting rules as lists. In addition, arrays can be implicitly cast to lists of the same type. For example, an `INTEGER[3]` array can be implicitly cast to an `INTEGER[]` list. + +### Structs + +Structs can be cast to other structs as long as the names of the child elements match. + +```sql +SELECT CAST({'a': 42} AS STRUCT(a VARCHAR)); +``` + +The names of the struct can also be in a different order. The fields of the struct will be reshuffled based on the names of the structs. + +```sql +SELECT CAST({'a': 42, 'b': 84} AS STRUCT(b VARCHAR, a VARCHAR)); +``` + +### Unions + +Union casting rules can be found on the [`UNION type page`]({% link docs/archive/1.0/sql/data_types/union.md %}#casting-to-unions). \ No newline at end of file diff --git a/docs/archive/1.0/sql/data_types/union.md b/docs/archive/1.0/sql/data_types/union.md new file mode 100644 index 00000000000..1302d14e60e --- /dev/null +++ b/docs/archive/1.0/sql/data_types/union.md @@ -0,0 +1,114 @@ +--- +layout: docu +title: Union Type +--- + +A `UNION` *type* (not to be confused with the SQL [`UNION` operator]({% link docs/archive/1.0/sql/query_syntax/setops.md %}#union-all-by-name)) is a nested type capable of holding one of multiple “alternative” values, much like the `union` in C. 
The main difference being that these `UNION` types are *tagged unions* and thus always carry a discriminator “tag” which signals which alternative it is currently holding, even if the inner value itself is null. `UNION` types are thus more similar to C++17's `std::variant`, Rust's `Enum` or the “sum type” present in most functional languages. + +`UNION` types must always have at least one member, and while they can contain multiple members of the same type, the tag names must be unique. `UNION` types can have at most 256 members. + +Under the hood, `UNION` types are implemented on top of `STRUCT` types, and simply keep the “tag” as the first entry. + +`UNION` values can be created with the [`union_value(tag := expr)`]({% link docs/archive/1.0/sql/functions/nested.md %}#union-functions) function or by [casting from a member type](#casting-to-unions). + +## Example + +Create a table with a `UNION` column: + +```sql +CREATE TABLE tbl1 (u UNION(num INTEGER, str VARCHAR)); +INSERT INTO tbl1 values (1), ('two'), (union_value(str := 'three')); +``` + +Any type can be implicitly cast to a `UNION` containing the type. Any `UNION` can also be implicitly cast to another `UNION` if the source `UNION` members are a subset of the target's (if the cast is unambiguous). + +`UNION` uses the member types' `VARCHAR` cast functions when casting to `VARCHAR`: + +```sql +SELECT u FROM tbl1; +``` + +| u | +|-------| +| 1 | +| two | +| three | + +Select all the `str` members: + +```sql +SELECT union_extract(u, 'str') AS str +FROM tbl1; +``` + +| str | +|-------| +| NULL | +| two | +| three | + +Alternatively, you can use 'dot syntax' similarly to [`STRUCT`s]({% link docs/archive/1.0/sql/data_types/struct.md %}). + +```sql +SELECT u.str +FROM tbl1; +``` + +| str | +|-------| +| NULL | +| two | +| three | + +Select the currently active tag from the `UNION` as an `ENUM`. + +```sql +SELECT union_tag(u) AS t +FROM tbl1; +``` + +| t | +|-----| +| num | +| str | +| str | + +## Union Casts + +Compared to other nested types, `UNION`s allow a set of implicit casts to facilitate unintrusive and natural usage when working with their members as “subtypes”. +However, these casts have been designed with two principles in mind, to avoid ambiguity and to avoid casts that could lead to loss of information. This prevents `UNION`s from being completely “transparent”, while still allowing `UNION` types to have a “supertype” relationship with their members. + +Thus `UNION` types can't be implicitly cast to any of their member types in general, since the information in the other members not matching the target type would be “lost”. If you want to coerce a `UNION` into one of its members, you should use the `union_extract` function explicitly instead. + +The only exception to this is when casting a `UNION` to `VARCHAR`, in which case the members will all use their corresponding `VARCHAR` casts. Since everything can be cast to `VARCHAR`, this is “safe” in a sense. + +### Casting to Unions + +A type can always be implicitly cast to a `UNION` if it can be implicitly cast to one of the `UNION` member types. + +* If there are multiple candidates, the built in implicit casting priority rules determine the target type. For example, a `FLOAT` → `UNION(i INTEGER, v VARCHAR)` cast will always cast the `FLOAT` to the `INTEGER` member before `VARCHAR`. +* If the cast still is ambiguous, i.e., there are multiple candidates with the same implicit casting priority, an error is raised. 
This usually happens when the `UNION` contains multiple members of the same type, e.g., a `FLOAT` → `UNION(i INTEGER, num INTEGER)` is always ambiguous. + +So how do we disambiguate if we want to create a `UNION` with multiple members of the same type? By using the `union_value` function, which takes a keyword argument specifying the tag. For example, `union_value(num := 2::INTEGER)` will create a `UNION` with a single member of type `INTEGER` with the tag `num`. This can then be used to disambiguate in an explicit (or implicit, read on below!) `UNION` to `UNION` cast, like `CAST(union_value(b := 2) AS UNION(a INTEGER, b INTEGER))`. + +### Casting between Unions + +`UNION` types can be cast between each other if the source type is a “subset” of the target type. In other words, all the tags in the source `UNION` must be present in the target `UNION`, and all the types of the matching tags must be implicitly castable between source and target. In essence, this means that `UNION` types are covariant with respect to their members. + +
+
+| Ok | Source | Target | Comments |
+|----|------------------------|------------------------|----------------------------------------|
+| ✅ | `UNION(a A, b B)` | `UNION(a A, b B, c C)` | |
+| ✅ | `UNION(a A, b B)` | `UNION(a A, b C)` | if `B` can be implicitly cast to `C` |
+| ❌ | `UNION(a A, b B, c C)` | `UNION(a A, b B)` | |
+| ❌ | `UNION(a A, b B)` | `UNION(a A, b C)` | if `B` can't be implicitly cast to `C` |
+| ❌ | `UNION(A, B, D)` | `UNION(A, B, C)` | |
+
+## Comparison and Sorting
+
+Since `UNION` types are implemented on top of `STRUCT` types internally, they can be used with all the comparison operators as well as in both `WHERE` and `HAVING` clauses with [the same semantics as `STRUCT`s]({% link docs/archive/1.0/sql/data_types/struct.md %}#comparison-operators). The “tag” is always stored as the first struct entry, which ensures that the `UNION` types are compared and ordered by “tag” first.
+
+## Functions
+
+See [Nested Functions]({% link docs/archive/1.0/sql/functions/nested.md %}#union-functions).
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/dialect/friendly_sql.md b/docs/archive/1.0/sql/dialect/friendly_sql.md
new file mode 100644
index 00000000000..6c5e11e9ccd
--- /dev/null
+++ b/docs/archive/1.0/sql/dialect/friendly_sql.md
@@ -0,0 +1,99 @@
+---
+layout: docu
+redirect_from:
+- docs/archive/1.0/guides/sql_features/friendly_sql
+title: Friendly SQL
+---
+
+DuckDB offers several advanced SQL features as well as syntactic sugar to make SQL queries more concise. We colloquially call these features “friendly SQL”.
+
+> Several of these features are also supported in other systems while some are (currently) exclusive to DuckDB.
+
+## Clauses
+
+* Creating tables and inserting data:
+    * [`CREATE OR REPLACE TABLE`]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-or-replace): this clause allows avoiding `DROP TABLE IF EXISTS` statements in scripts.
+    * [`CREATE TABLE ... AS SELECT` (CTAS)]({% link docs/archive/1.0/sql/statements/create_table.md %}#create-table--as-select-ctas): this clause allows creating a new table from the output of a query without manually defining a schema.
+    * [`INSERT INTO ... BY NAME`]({% link docs/archive/1.0/sql/statements/insert.md %}#insert-into--by-name): this variant of the `INSERT` statement allows using column names instead of positions.
+* Describing tables and computing statistics:
+    * [`DESCRIBE`]({% link docs/archive/1.0/guides/meta/describe.md %}): this clause provides a succinct summary of the schema of a table or query.
+    * [`SUMMARIZE`]({% link docs/archive/1.0/guides/meta/summarize.md %}): this clause returns summary statistics for a table or query.
+* Making SQL clauses more compact:
+    * [`FROM`-first syntax with an optional `SELECT` clause]({% link docs/archive/1.0/sql/query_syntax/from.md %}#from-first-syntax): DuckDB allows queries in the form of `FROM tbl` which selects all columns (performing a `SELECT *` statement).
+    * [`GROUP BY ALL`]({% link docs/archive/1.0/sql/query_syntax/groupby.md %}#group-by-all): this clause allows omitting the group-by columns by inferring them from the list of attributes in the `SELECT` clause.
+    * [`ORDER BY ALL`]({% link docs/archive/1.0/sql/query_syntax/orderby.md %}#order-by-all): this clause allows ordering on all columns (e.g., to ensure deterministic results).
+    * [`SELECT * EXCLUDE`]({% link docs/archive/1.0/sql/expressions/star.md %}#exclude-clause): the `EXCLUDE` option allows excluding specific columns from the `*` expression.
+    * [`SELECT * REPLACE`]({% link docs/archive/1.0/sql/expressions/star.md %}#replace-clause): the `REPLACE` option allows replacing specific columns with different expressions in a `*` expression.
+    * [`UNION BY NAME`]({% link docs/archive/1.0/sql/query_syntax/setops.md %}#union-all-by-name): this clause performs the `UNION` operation by matching columns on their names (instead of relying on their positions).
+* Transforming tables:
+    * [`PIVOT`]({% link docs/archive/1.0/sql/statements/pivot.md %}) to turn long tables to wide tables.
+    * [`UNPIVOT`]({% link docs/archive/1.0/sql/statements/unpivot.md %}) to turn wide tables to long tables.
+
+## Query Features
+
+* [Column aliases in `WHERE`, `GROUP BY`, and `HAVING`]({% post_url 2022-05-04-friendlier-sql %}#column-aliases-in-where--group-by--having)
+* [`COLUMNS()` expression]({% link docs/archive/1.0/sql/expressions/star.md %}#columns-expression) can be used to execute the same expression on multiple columns:
+    * [with regular expressions]({% post_url 2023-08-23-even-friendlier-sql %}#columns-with-regular-expressions)
+    * [with `EXCLUDE` and `REPLACE`]({% post_url 2023-08-23-even-friendlier-sql %}#columns-with-exclude-and-replace)
+    * [with lambda functions]({% post_url 2023-08-23-even-friendlier-sql %}#columns-with-lambda-functions)
+* Reusable column aliases, e.g.: `SELECT i + 1 AS j, j + 2 AS k FROM range(0, 3) t(i)`
+* Advanced aggregation features for analytical (OLAP) queries:
+    * [`FILTER` clause]({% link docs/archive/1.0/sql/query_syntax/filter.md %})
+    * [`GROUPING SETS`, `GROUP BY CUBE`, `GROUP BY ROLLUP` clauses]({% link docs/archive/1.0/sql/query_syntax/grouping_sets.md %})
+* [`count()` shorthand]({% link docs/archive/1.0/sql/functions/aggregates.md %}) for `count(*)`
+
+## Literals and Identifiers
+
+* [Case-insensitivity while maintaining case of entities in the catalog]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#case-sensitivity-of-identifiers)
+* [Deduplicating identifiers]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#deduplicating-identifiers)
+* [Underscores as digit separators in numeric literals]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#numeric-literals)
+
+## Data Types
+
+* [`MAP` data type]({% link docs/archive/1.0/sql/data_types/map.md %})
+* [`UNION` data type]({% link docs/archive/1.0/sql/data_types/union.md %})
+
+## Data Import
+
+* [Auto-detecting the headers and schema of CSV files]({% link docs/archive/1.0/data/csv/auto_detection.md %})
+* Directly querying [CSV files]({% link docs/archive/1.0/data/csv/overview.md %}) and [Parquet files]({% link docs/archive/1.0/data/parquet/overview.md %})
+* Loading from files using the syntax `FROM 'my.csv'`, `FROM 'my.csv.gz'`, `FROM 'my.parquet'`, etc.
+* [Filename expansion (globbing)]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#globbing), e.g.: `FROM 'my-data/part-*.parquet'` + +## Functions and Expressions + +* [Dot operator for function chaining]({% link docs/archive/1.0/sql/functions/overview.md %}#function-chaining-via-the-dot-operator): `SELECT ('hello').upper()` +* String formatters: + the [`format()` function with the `fmt` syntax]({% link docs/archive/1.0/sql/functions/char.md %}#fmt-syntax) and + the [`printf() function`]({% link docs/archive/1.0/sql/functions/char.md %}#printf-syntax) +* [List comprehensions]({% post_url 2023-08-23-even-friendlier-sql %}#list-comprehensions) +* [List slicing]({% post_url 2022-05-04-friendlier-sql %}#string-slicing) +* [String slicing]({% post_url 2022-05-04-friendlier-sql %}#string-slicing) +* [`STRUCT.*` notation]({% post_url 2022-05-04-friendlier-sql %}#struct-dot-notation) +* [Simple `LIST` and `STRUCT` creation]({% post_url 2022-05-04-friendlier-sql %}#simple-list-and-struct-creation) + +## Join Types + +* [`ASOF` joins]({% link docs/archive/1.0/sql/query_syntax/from.md %}#as-of-joins) +* [`LATERAL` joins]({% link docs/archive/1.0/sql/query_syntax/from.md %}#lateral-joins) +* [`POSITIONAL` joins]({% link docs/archive/1.0/sql/query_syntax/from.md %}#positional-joins) + +## Trailing Commas + +DuckDB allows [trailing commas](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Trailing_commas), +both when listing entities (e.g., column and table names) and when constructing [`LIST` items]({% link docs/archive/1.0/sql/data_types/list.md %}#creating-lists). +For example, the following query works: + +```sql +SELECT + 42 AS x, + ['a', 'b', 'c',] AS y, + 'hello world' AS z, +; +``` + +## Related Blog Posts + +* [“Friendlier SQL with DuckDB”]({% post_url 2022-05-04-friendlier-sql %}) blog post +* [“Even Friendlier SQL with DuckDB”]({% post_url 2023-08-23-even-friendlier-sql %}) blog post +* [“SQL Gymnastics: Bending SQL into Flexible New Shapes”]({% post_url 2024-03-01-sql-gymnastics %}) blog post \ No newline at end of file diff --git a/docs/archive/1.0/sql/dialect/keywords_and_identifiers.md b/docs/archive/1.0/sql/dialect/keywords_and_identifiers.md new file mode 100644 index 00000000000..7dec8808027 --- /dev/null +++ b/docs/archive/1.0/sql/dialect/keywords_and_identifiers.md @@ -0,0 +1,119 @@ +--- +layout: docu +redirect_from: +- /docs/archive/1.0/sql/case_sensitivity +- /docs/archive/1.0/sql/case_sensitivity/ +- /docs/archive/1.0/sql/keywords-and-identifiers +- /docs/archive/1.0/sql/dialect/keywords-and-identifiers +title: Keywords and Identifiers +--- + +## Identifiers + +Similarly to other SQL dialects and programming languages, identifiers in DuckDB's SQL are subject to several rules. + +* Unquoted identifiers need to conform to a number of rules: + * They must not be a reserved keyword (see [`duckdb_keywords()`]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_keywords)), e.g., `SELECT 123 AS SELECT` will fail. + * They must not start with a number or special character, e.g., `SELECT 123 AS 1col` is invalid. + * They cannot contain whitespaces (including tabs and newline characters). +* Identifiers can be quoted using double-quote characters (`"`). Quoted identifiers can use any keyword, whitespace or special character, e.g., `"SELECT"` and `" § 🦆 ¶ "` are valid identifiers. +* Double quotes can be escaped by repeating the quote character, e.g., to create an identifier named `IDENTIFIER "X"`, use `"IDENTIFIER ""X"""`. 
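+
+As a short, illustrative sketch of the quoting rules above (the table and column names are invented for this example), quoted identifiers can carry keywords, spaces, and escaped double quotes, and are referenced with the same quoting later on:
+
+```sql
+-- a reserved keyword, a name with a space, and an identifier containing escaped double quotes
+CREATE TABLE "SELECT" ("my column" INTEGER, "IDENTIFIER ""X""" INTEGER);
+INSERT INTO "SELECT" VALUES (1, 2);
+SELECT "my column", "IDENTIFIER ""X""" FROM "SELECT";
+```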
+
+### Deduplicating Identifiers
+
+In some cases, duplicate identifiers can occur, e.g., column names may conflict when unnesting a nested data structure.
+In these cases, DuckDB automatically deduplicates column names by renaming them according to the following rules:
+
+* For a column named `⟨name⟩`, the first instance is not renamed.
+* Subsequent instances are renamed to `⟨name⟩_⟨count⟩`, where `⟨count⟩` starts at 1.
+
+For example:
+
+```sql
+SELECT *
+FROM (SELECT UNNEST({'a': 42, 'b': {'a': 88, 'b': 99}}, recursive := true));
+```
+
+| a | a_1 | b |
+|---:|----:|---:|
+| 42 | 88 | 99 |
+
+## Database Names
+
+Database names are subject to the rules for [identifiers](#identifiers).
+
+Additionally, it is best practice to avoid DuckDB's two internal [database schema names]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_databases), `system` and `temp`.
+By default, persistent databases are named after their filename without the extension.
+Therefore, the filenames `system.db` and `temp.db` (as well as `system.duckdb` and `temp.duckdb`) result in the database names `system` and `temp`, respectively.
+If you need to attach to a database that has one of these names, use an alias, e.g.:
+
+```sql
+ATTACH 'temp.db' AS temp2;
+USE temp2;
+```
+
+## Rules for Case-Sensitivity
+
+### Keywords and Function Names
+
+SQL keywords and function names are case-insensitive in DuckDB.
+
+For example, the following two queries are equivalent:
+
+```sql
+select COS(Pi()) as CosineOfPi;
+SELECT cos(pi()) AS CosineOfPi;
+```
+
+| CosineOfPi |
+|-----------:|
+| -1.0 |
+
+### Case-Sensitivity of Identifiers
+
+Identifiers in DuckDB are always case-insensitive, similarly to PostgreSQL.
+However, unlike PostgreSQL (and some other major SQL implementations), DuckDB also treats quoted identifiers as case-insensitive.
+
+Despite treating identifiers in a case-insensitive manner, each character's case (uppercase/lowercase) is maintained as originally specified by the user even if a query uses different cases when referring to the identifier.
+For example:
+
+```sql
+CREATE TABLE tbl AS SELECT cos(pi()) AS CosineOfPi;
+SELECT cosineofpi FROM tbl;
+```
+
+| CosineOfPi |
+|-----------:|
+| -1.0 |
+
+To change this behavior, set the `preserve_identifier_case` [configuration option]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference) to `false`.
+
+#### Handling Conflicts
+
+In case of a conflict, when the same identifier is spelt with different cases, one will be selected randomly.
+For example:
+
+```sql
+CREATE TABLE t1 (idfield INTEGER, x INTEGER);
+CREATE TABLE t2 (IdField INTEGER, y INTEGER);
+INSERT INTO t1 VALUES (1, 123);
+INSERT INTO t2 VALUES (1, 456);
+SELECT * FROM t1 NATURAL JOIN t2;
+```
+
+| idfield |  x  |  y  |
+|--------:|----:|----:|
+|       1 | 123 | 456 |
+
+#### Disabling Preserving Cases
+
+With the `preserve_identifier_case` [configuration option]({% link docs/archive/1.0/configuration/overview.md %}#configuration-reference) set to `false`, all identifiers are turned into lowercase:
+
+```sql
+SET preserve_identifier_case = false;
+CREATE TABLE tbl AS SELECT cos(pi()) AS CosineOfPi;
+SELECT CosineOfPi FROM tbl;
+```
+
+| cosineofpi |
+|-----------:|
+| -1.0 |
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/dialect/order_preservation.md b/docs/archive/1.0/sql/dialect/order_preservation.md
new file mode 100644
index 00000000000..ac394d5af82
--- /dev/null
+++ b/docs/archive/1.0/sql/dialect/order_preservation.md
@@ -0,0 +1,17 @@
+---
+layout: docu
+title: Order Preservation
+---
+
+For many operations, DuckDB preserves the insertion order of rows, similarly to data frame libraries such as Pandas.
+The following operations and components respect insertion order:
+
+* [The CSV reader]({% link docs/archive/1.0/data/csv/overview.md %}#order-preservation)
+
+Preservation of insertion order is controlled by the `preserve_insertion_order` [configuration option]({% link docs/archive/1.0/configuration/overview.md %}).
+This setting is `true` by default, indicating that the order should be preserved.
+To change this setting, use:
+
+```sql
+SET preserve_insertion_order = false;
+```
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/dialect/overview.md b/docs/archive/1.0/sql/dialect/overview.md
new file mode 100644
index 00000000000..0eb57d4500f
--- /dev/null
+++ b/docs/archive/1.0/sql/dialect/overview.md
@@ -0,0 +1,9 @@
+---
+layout: docu
+title: Overview
+---
+
+DuckDB's SQL dialect is based on PostgreSQL.
+DuckDB tries to closely match PostgreSQL's semantics; however, some use cases require slightly different behavior.
+For example, interchangeability with data frame libraries necessitates [order preservation of inserts]({% link docs/archive/1.0/sql/dialect/order_preservation.md %}) to be supported by default.
+These differences are documented in the pages below.
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/dialect/postgresql_compatibility.md b/docs/archive/1.0/sql/dialect/postgresql_compatibility.md
new file mode 100644
index 00000000000..18251bfb83c
--- /dev/null
+++ b/docs/archive/1.0/sql/dialect/postgresql_compatibility.md
@@ -0,0 +1,171 @@
+---
+layout: docu
+redirect_from:
+- docs/archive/1.0/sql/postgresl_compatibility
+title: PostgreSQL Compatibility
+---
+
+DuckDB's SQL dialect closely follows the conventions of the PostgreSQL dialect.
+The few exceptions to this are listed on this page.
+
+## Floating-Point Arithmetic
+
+DuckDB and PostgreSQL handle floating-point arithmetic differently for division by zero. Neither system conforms to the [IEEE Standard for Floating-Point Arithmetic (IEEE 754)](https://en.wikipedia.org/wiki/IEEE_754).
+On operations involving infinity values, DuckDB and PostgreSQL align with each other and conform to IEEE 754.
+To show the differences, run the following SQL queries: + +```sql +SELECT 1.0 / 0.0 AS x; +SELECT 0.0 / 0.0 AS x; +SELECT -1.0 / 0.0 AS x; +SELECT 'Infinity'::FLOAT / 'Infinity'::FLOAT AS x; +SELECT 1.0 / 'Infinity'::FLOAT AS x; +SELECT 'Infinity'::FLOAT - 'Infinity'::FLOAT AS x; +SELECT 'Infinity'::FLOAT - 1.0 AS x; +``` + +
+ +| Expression | DuckDB | PostgreSQL | IEEE 754 | +| :---------------------- | -------: | ---------: | --------: | +| 1.0 / 0.0 | NULL | error | Infinity | +| 0.0 / 0.0 | NULL | error | NaN | +| -1.0 / 0.0 | NULL | error | -Infinity | +| 'Infinity' / 'Infinity' | NaN | NaN | NaN | +| 1.0 / 'Infinity' | 0.0 | 0.0 | 0.0 | +| 'Infinity' - 'Infinity' | NaN | NaN | NaN | +| 'Infinity' - 1.0 | Infinity | Infinity | Infinity | + +## Division on Integers + +When computing division on integers, PostgreSQL performs integer division, while DuckDB performs float division: + +```sql +SELECT 1 / 2 AS x; +``` + +PostgreSQL returns: + +```text + x +--- + 0 +(1 row) +``` + +DuckDB returns: + +| x | +| ---: | +| 0.5 | + +To perform integer division in DuckDB, use the `//` operator: + +```sql +SELECT 1 // 2 AS x; +``` + +| x | +| ---: | +| 0 | + +## `UNION` of Boolean and Integer Values + +The following query fails in PostgreSQL but successfully completes in DuckDB: + +```sql +SELECT true AS x +UNION +SELECT 2; +``` + +PostgreSQL returns an error: + +```console +ERROR: UNION types boolean and integer cannot be matched +``` + +DuckDB performs an enforced cast, therefore, it completes the query and returns the following: + +| x | +| ---: | +| 1 | +| 2 | + +## Case Sensitivity for Quoted Identifiers + +PostgreSQL is case-insensitive. The way PostgreSQL achieves case insensitivity is by lowercasing unquoted identifiers within SQL, whereas quoting preserves case, e.g., the following command creates a table named `mytable` but tries to query for `MyTaBLe` because quotes preserve the case. + +```sql +CREATE TABLE MyTaBLe(x INT); +SELECT * FROM "MyTaBLe"; +``` + +```console +ERROR: relation "MyTaBLe" does not exist +``` + +PostgreSQL does not only treat quoted identifiers as case-sensitive, PostgreSQL treats all identifiers as case-sensitive, e.g., this also does not work: + +```sql +CREATE TABLE "PreservedCase"(x INT); +SELECT * FROM PreservedCase; +``` + +```console +ERROR: relation "preservedcase" does not exist +``` + +Therefore, case-insensitivity in PostgreSQL only works if you never use quoted identifiers with different cases. + +For DuckDB, this behavior was problematic when interfacing with other tools (e.g., Parquet, Pandas) that are case-sensitive by default - since all identifiers would be lowercased all the time. +Therefore, DuckDB achieves case insensitivity by making identifiers fully case insensitive throughout the system but [_preserving their case_]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#rules-for-case-sensitivity). + +In DuckDB, the scripts above complete successfully: + +```sql +CREATE TABLE MyTaBLe(x INT); +SELECT * FROM "MyTaBLe"; +CREATE TABLE "PreservedCase"(x INT); +SELECT * FROM PreservedCase; +SELECT table_name FROM duckdb_tables(); +``` + +
+ +| table_name | +| ------------- | +| MyTaBLe | +| PreservedCase | + +PostgreSQL's behavior of lowercasing identifiers is accessible using the [`preserve_identifier_case` option]({% link docs/archive/1.0/configuration/overview.md %}#local-configuration-options): + +```sql +SET preserve_identifier_case = false; +CREATE TABLE MyTaBLe(x INT); +SELECT table_name FROM duckdb_tables(); +``` + +
+
+| table_name |
+| ---------- |
+| mytable |
+
+However, the case-insensitive matching of identifiers cannot be turned off.
+
+## Scalar Subqueries
+
+Subqueries in DuckDB are not required to return a single row. Take the following query for example:
+
+```sql
+SELECT (SELECT 1 UNION SELECT 2) AS b;
+```
+
+PostgreSQL returns an error:
+
+```console
+ERROR: more than one row returned by a subquery used as an expression
+```
+
+DuckDB non-deterministically returns either `1` or `2`.
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/expressions/case.md b/docs/archive/1.0/sql/expressions/case.md
new file mode 100644
index 00000000000..bdc4941fa7b
--- /dev/null
+++ b/docs/archive/1.0/sql/expressions/case.md
@@ -0,0 +1,77 @@
+---
+layout: docu
+railroad: expressions/case.js
+title: CASE Statement
+---
+
+ +The `CASE` statement performs a switch based on a condition. The basic form is identical to the ternary condition used in many programming languages (`CASE WHEN cond THEN a ELSE b END` is equivalent to `cond ? a : b`). With a single condition this can be expressed with `IF(cond, a, b)`. + +```sql +CREATE OR REPLACE TABLE integers AS SELECT unnest([1, 2, 3]) AS i; +SELECT i, CASE WHEN i > 2 THEN 1 ELSE 0 END AS test +FROM integers; +``` + +| i | test | +|--:|-----:| +| 1 | 0 | +| 2 | 0 | +| 3 | 1 | + +This is equivalent to: + +```sql +SELECT i, IF(i > 2, 1, 0) AS test +FROM integers; +``` + +The `WHEN cond THEN expr` part of the `CASE` statement can be chained, whenever any of the conditions returns true for a single tuple, the corresponding expression is evaluated and returned. + +```sql +CREATE OR REPLACE TABLE integers AS SELECT unnest([1, 2, 3]) AS i; +SELECT i, CASE WHEN i = 1 THEN 10 WHEN i = 2 THEN 20 ELSE 0 END AS test +FROM integers; +``` + +| i | test | +|--:|-----:| +| 1 | 10 | +| 2 | 20 | +| 3 | 0 | + +The `ELSE` part of the `CASE` statement is optional. If no else statement is provided and none of the conditions match, the `CASE` statement will return `NULL`. + +```sql +CREATE OR REPLACE TABLE integers AS SELECT unnest([1, 2, 3]) AS i; +SELECT i, CASE WHEN i = 1 THEN 10 END AS test +FROM integers; +``` + +| i | test | +|--:|-----:| +| 1 | 10 | +| 2 | NULL | +| 3 | NULL | + +It is also possible to provide an individual expression after the `CASE` but before the `WHEN`. When this is done, the `CASE` statement is effectively transformed into a switch statement. + +```sql +CREATE OR REPLACE TABLE integers AS SELECT unnest([1, 2, 3]) AS i; +SELECT i, CASE i WHEN 1 THEN 10 WHEN 2 THEN 20 WHEN 3 THEN 30 END AS test +FROM integers; +``` + +| i | test | +|--:|-----:| +| 1 | 10 | +| 2 | 20 | +| 3 | 30 | + +This is equivalent to: + +```sql +SELECT i, CASE WHEN i = 1 THEN 10 WHEN i = 2 THEN 20 WHEN i = 3 THEN 30 END AS test +FROM integers; +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/cast.md b/docs/archive/1.0/sql/expressions/cast.md new file mode 100644 index 00000000000..f677014845a --- /dev/null +++ b/docs/archive/1.0/sql/expressions/cast.md @@ -0,0 +1,60 @@ +--- +layout: docu +railroad: expressions/cast.js +title: Casting +--- + +
+ +Casting refers to the operation of converting a value in a particular data type to the corresponding value in another data type. +Casting can occur either implicitly or explicitly. The syntax described here performs an explicit cast. More information on casting can be found on the [typecasting page]({% link docs/archive/1.0/sql/data_types/typecasting.md %}). + +## Explicit Casting + +The standard SQL syntax for explicit casting is `CAST(expr AS TYPENAME)`, where `TYPENAME` is a name (or alias) of one of [DuckDB's data types]({% link docs/archive/1.0/sql/data_types/overview.md %}). DuckDB also supports the shorthand `expr::TYPENAME`, which is also present in PostgreSQL. + +```sql +SELECT CAST(i AS VARCHAR) AS i FROM generate_series(1, 3) tbl(i); +``` + +| i | +|---| +| 1 | +| 2 | +| 3 | + +```sql +SELECT i::DOUBLE AS i FROM generate_series(1, 3) tbl(i); +``` + +| i | +|----:| +| 1.0 | +| 2.0 | +| 3.0 | + +### Casting Rules + +Not all casts are possible. For example, it is not possible to convert an `INTEGER` to a `DATE`. Casts may also throw errors when the cast could not be successfully performed. For example, trying to cast the string `'hello'` to an `INTEGER` will result in an error being thrown. + +```sql +SELECT CAST('hello' AS INTEGER); +``` + +```console +Conversion Error: Could not convert string 'hello' to INT32 +``` + +The exact behavior of the cast depends on the source and destination types. For example, when casting from `VARCHAR` to any other type, the string will be attempted to be converted. + +### `TRY_CAST` + +`TRY_CAST` can be used when the preferred behavior is not to throw an error, but instead to return a `NULL` value. `TRY_CAST` will never throw an error, and will instead return `NULL` if a cast is not possible. + +```sql +SELECT TRY_CAST('hello' AS INTEGER) AS i; +``` + +| i | +|------| +| NULL | \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/collations.md b/docs/archive/1.0/sql/expressions/collations.md new file mode 100644 index 00000000000..f7ec5fc6d18 --- /dev/null +++ b/docs/archive/1.0/sql/expressions/collations.md @@ -0,0 +1,178 @@ +--- +layout: docu +railroad: expressions/collate.js +title: Collations +--- + +
+ +Collations provide rules for how text should be sorted or compared in the execution engine. Collations are useful for localization, as the rules for how text should be ordered are different for different languages or for different countries. These orderings are often incompatible with one another. For example, in English the letter `y` comes between `x` and `z`. However, in Lithuanian the letter `y` comes between the `i` and `j`. For that reason, different collations are supported. The user must choose which collation they want to use when performing sorting and comparison operations. + +By default, the `BINARY` collation is used. That means that strings are ordered and compared based only on their binary contents. This makes sense for standard ASCII characters (i.e., the letters A-Z and numbers 0-9), but generally does not make much sense for special unicode characters. It is, however, by far the fastest method of performing ordering and comparisons. Hence it is recommended to stick with the `BINARY` collation unless required otherwise. + +> Warning Collation support in DuckDB is experimental with [several planned improvements](https://github.com/duckdb/duckdb/issues/604) and a [few known issues](https://github.com/duckdb/duckdb/issues?q=is%3Aissue+is%3Aopen+collation+). + +## Using Collations + +In the stand-alone installation of DuckDB three collations are included: `NOCASE`, `NOACCENT` and `NFC`. The `NOCASE` collation compares characters as equal regardless of their casing. The `NOACCENT` collation compares characters as equal regardless of their accents. The `NFC` collation performs NFC-normalized comparisons, see [Unicode normalization](https://en.wikipedia.org/wiki/Unicode_equivalence#Normalization) for more information. + +```sql +SELECT 'hello' = 'hElLO'; +``` + +```text +false +``` + +```sql +SELECT 'hello' COLLATE NOCASE = 'hElLO'; +``` + +```text +true +``` + +```sql +SELECT 'hello' = 'hëllo'; +``` + +```text +false +``` + +```sql +SELECT 'hello' COLLATE NOACCENT = 'hëllo'; +``` + +```text +true +``` + +Collations can be combined by chaining them using the dot operator. Note, however, that not all collations can be combined together. In general, the `NOCASE` collation can be combined with any other collator, but most other collations cannot be combined. + +```sql +SELECT 'hello' COLLATE NOCASE = 'hElLÖ'; +``` + +```text +false +``` + +```sql +SELECT 'hello' COLLATE NOACCENT = 'hElLÖ'; +``` + +```text +false +``` + +```sql +SELECT 'hello' COLLATE NOCASE.NOACCENT = 'hElLÖ'; +``` + +```text +true +``` + +## Default Collations + +The collations we have seen so far have all been specified *per expression*. It is also possible to specify a default collator, either on the global database level or on a base table column. The `PRAGMA` `default_collation` can be used to specify the global default collator. This is the collator that will be used if no other one is specified. + +```sql +SET default_collation = NOCASE; +SELECT 'hello' = 'HeLlo'; +``` + +```text +true +``` + +Collations can also be specified per-column when creating a table. When that column is then used in a comparison, the per-column collation is used to perform that comparison. + +```sql +CREATE TABLE names (name VARCHAR COLLATE NOACCENT); +INSERT INTO names VALUES ('hännes'); +``` + +```sql +SELECT name +FROM names +WHERE name = 'hannes'; +``` + +```text +hännes +``` + +Be careful here, however, as different collations cannot be combined. 
This can be problematic when you want to compare columns that have a different collation specified. + +```sql +SELECT name +FROM names +WHERE name = 'hannes' COLLATE NOCASE; +``` + +```console +ERROR: Cannot combine types with different collation! +``` + +```sql +CREATE TABLE other_names (name VARCHAR COLLATE NOCASE); +INSERT INTO other_names VALUES ('HÄNNES'); +``` + +```sql +SELECT names.name AS name, other_names.name AS other_name +FROM names, other_names +WHERE names.name = other_names.name; +``` + +```console +ERROR: Cannot combine types with different collation! +``` + +We need to manually overwrite the collation: + +```sql +SELECT names.name AS name, other_names.name AS other_name +FROM names, other_names +WHERE names.name COLLATE NOACCENT.NOCASE = other_names.name COLLATE NOACCENT.NOCASE; +``` + +| name | other_name | +|--------|------------| +| hännes | HÄNNES | + +## ICU Collations + +The collations we have seen so far are not region-dependent, and do not follow any specific regional rules. If you wish to follow the rules of a specific region or language, you will need to use one of the ICU collations. For that, you need to [load the ICU extension]({% link docs/archive/1.0/extensions/icu.md %}#installing-and-loading). + +If you are using the C++ API, you may find the extension in the `extension/icu` folder of the DuckDB project. Using the C++ API, the extension can be loaded as follows: + +```cpp +DuckDB db; +db.LoadExtension(); +``` + +Loading this extension will add a number of language and region specific collations to your database. These can be queried using `PRAGMA collations` command, or by querying the `pragma_collations` function. + +```sql +PRAGMA collations; +SELECT * FROM pragma_collations(); +``` + +```text +[af, am, ar, as, az, be, bg, bn, bo, bs, bs, ca, ceb, chr, cs, cy, da, de, de_AT, dsb, dz, ee, el, en, en_US, en_US, eo, es, et, fa, fa_AF, fi, fil, fo, fr, fr_CA, ga, gl, gu, ha, haw, he, he_IL, hi, hr, hsb, hu, hy, id, id_ID, ig, is, it, ja, ka, kk, kl, km, kn, ko, kok, ku, ky, lb, lkt, ln, lo, lt, lv, mk, ml, mn, mr, ms, mt, my, nb, nb_NO, ne, nl, nn, om, or, pa, pa, pa_IN, pl, ps, pt, ro, ru, se, si, sk, sl, smn, sq, sr, sr, sr_BA, sr_ME, sr_RS, sr, sr_BA, sr_RS, sv, sw, ta, te, th, tk, to, tr, ug, uk, ur, uz, vi, wae, wo, xh, yi, yo, zh, zh, zh_CN, zh_SG, zh, zh_HK, zh_MO, zh_TW, zu] +``` + +These collations can then be used as the other collations would be used before. They can also be combined with the `NOCASE` collation. For example, to use the German collation rules you could use the following code snippet: + +```sql +CREATE TABLE strings (s VARCHAR COLLATE DE); +INSERT INTO strings VALUES ('Gabel'), ('Göbel'), ('Goethe'), ('Goldmann'), ('Göthe'), ('Götz'); +SELECT * FROM strings ORDER BY s; +``` + +```text +"Gabel", "Göbel", "Goethe", "Goldmann", "Göthe", "Götz" +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/comparison_operators.md b/docs/archive/1.0/sql/expressions/comparison_operators.md new file mode 100644 index 00000000000..593a8b1735d --- /dev/null +++ b/docs/archive/1.0/sql/expressions/comparison_operators.md @@ -0,0 +1,54 @@ +--- +layout: docu +railroad: expressions/comparison.js +title: Comparisons +--- + +## Comparison Operators + +
+ +The table below shows the standard comparison operators. +Whenever either of the input arguments is `NULL`, the output of the comparison is `NULL`. + +
+ +| Operator | Description | Example | Result | +|:---|:---|:---|:---| +| `<` | less than | `2 < 3` | `true` | +| `>` | greater than | `2 > 3` | `false` | +| `<=` | less than or equal to | `2 <= 3` | `true` | +| `>=` | greater than or equal to | `4 >= NULL` | `NULL` | +| `=` | equal | `NULL = NULL` | `NULL` | +| `<>` or `!=` | not equal | `2 <> 2` | `false` | + +The table below shows the standard distinction operators. +These operators treat `NULL` values as equal. + +
+ +| Operator | Description | Example | Result | +|:---|:---|:---|:-| +| `IS DISTINCT FROM` | not equal, including `NULL` | `2 IS DISTINCT FROM NULL` | `true` | +| `IS NOT DISTINCT FROM` | equal, including `NULL` | `NULL IS NOT DISTINCT FROM NULL` | `true` | + +## `BETWEEN` and `IS [NOT] NULL` + +
+
+Besides the standard comparison operators there are also the `BETWEEN` and `IS (NOT) NULL` operators. These behave much like operators, but have special syntax mandated by the SQL standard. They are shown in the table below.
+
+Note that `BETWEEN` and `NOT BETWEEN` are only equivalent to the examples below in the cases where `a`, `x`, and `y` are all of the same type, as `BETWEEN` will cast all of its inputs to the same type.
+
+ +| Predicate | Description | +|:---|:---| +| `a BETWEEN x AND y` | equivalent to `x <= a AND a <= y` | +| `a NOT BETWEEN x AND y` | equivalent to `x > a OR a > y` | +| `expression IS NULL` | `true` if expression is `NULL`, `false` otherwise | +| `expression ISNULL` | alias for `IS NULL` (non-standard) | +| `expression IS NOT NULL` | `false` if expression is `NULL`, `true` otherwise | +| `expression NOTNULL` | alias for `IS NOT NULL` (non-standard) | + +> For the expression `BETWEEN x AND y`, `x` is used as the lower bound and `y` is used as the upper bound. Therefore, if `x > y`, the result will always be `false`. \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/in.md b/docs/archive/1.0/sql/expressions/in.md new file mode 100644 index 00000000000..5e81ddd2f1e --- /dev/null +++ b/docs/archive/1.0/sql/expressions/in.md @@ -0,0 +1,51 @@ +--- +layout: docu +railroad: expressions/in.js +title: IN Operator +--- + +
+ +## `IN` + +The `IN` operator checks containment of the left expression inside the set of expressions on the right hand side (RHS). The `IN` operator returns true if the expression is present in the RHS, false if the expression is not in the RHS and the RHS has no `NULL` values, or `NULL` if the expression is not in the RHS and the RHS has `NULL` values. + +```sql +SELECT 'Math' IN ('CS', 'Math'); +``` + +```text +true +``` + +```sql +SELECT 'English' IN ('CS', 'Math'); +``` + +```text +false +``` + +```sql +SELECT 'Math' IN ('CS', 'Math', NULL); +``` + +```text +true +``` + +```sql +SELECT 'English' IN ('CS', 'Math', NULL); +``` + +```text +NULL +``` + +## `NOT IN` + +`NOT IN` can be used to check if an element is not present in the set. `x NOT IN y` is equivalent to `NOT (x IN y)`. + +## Use with Subqueries + +The `IN` operator can also be used with a subquery that returns a single column. See the [subqueries page for more information]({% link docs/archive/1.0/sql/expressions/subqueries.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/logical_operators.md b/docs/archive/1.0/sql/expressions/logical_operators.md new file mode 100644 index 00000000000..57a6a4f1f88 --- /dev/null +++ b/docs/archive/1.0/sql/expressions/logical_operators.md @@ -0,0 +1,34 @@ +--- +layout: docu +railroad: expressions/logical.js +title: Logical Operators +--- + +
+
+The following logical operators are available: `AND`, `OR` and `NOT`. SQL uses a three-valued logic system with `true`, `false` and `NULL`. Note that logical operators involving `NULL` do not always evaluate to `NULL`. For example, `NULL AND false` will evaluate to `false`, and `NULL OR true` will evaluate to `true`. Below are the complete truth tables.
+
+### Binary Operators: `AND` and `OR`
+
+ +| `a` | `b` | `a AND b` | `a OR b` | +|:---|:---|:---|:---| +| `true` | `true` | `true` | `true` | +| `true` | `false` | `false` | `true` | +| `true` | `NULL` | `NULL` | `true` | +| `false` | `false` | `false` | `false` | +| `false` | `NULL` | `false` | `NULL` | +| `NULL` | `NULL` | `NULL` | `NULL`| + +### Unary Operator: `NOT` + +
+ +| `a` | `NOT a` | +|:---|:---| +| `true` | `false` | +| `false` | `true` | +| `NULL` | `NULL` | + +The operators `AND` and `OR` are commutative, that is, you can switch the left and right operand without affecting the result. \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/overview.md b/docs/archive/1.0/sql/expressions/overview.md new file mode 100644 index 00000000000..9699636e7d4 --- /dev/null +++ b/docs/archive/1.0/sql/expressions/overview.md @@ -0,0 +1,6 @@ +--- +layout: docu +title: Expressions +--- + +An expression is a combination of values, operators and functions. Expressions are highly composable, and range from very simple to arbitrarily complex. They can be found in many different parts of SQL statements. In this section, we provide the different types of operators and functions that can be used within expressions. \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/star.md b/docs/archive/1.0/sql/expressions/star.md new file mode 100644 index 00000000000..e1efc30ecac --- /dev/null +++ b/docs/archive/1.0/sql/expressions/star.md @@ -0,0 +1,225 @@ +--- +layout: docu +railroad: expressions/star.js +title: Star Expression +--- + +## Examples + +Select all columns present in the `FROM` clause: + +```sql +SELECT * FROM table_name; +``` + +Count the number of rows in a table: + +```sql +SELECT count(*) FROM table_name; +``` + +DuckDB offers a shorthand for `count(*)` expressions where the `*` may be omitted: + +```sql +SELECT count() FROM table_name; +``` + +Select all columns from the table called `table_name`: + +```sql +SELECT table_name.* +FROM table_name +JOIN other_table_name USING (id); +``` + +Select all columns except the city column from the addresses table: + +```sql +SELECT * EXCLUDE (city) +FROM addresses; +``` + +Select all columns from the addresses table, but replace city with `lower(city)`: + +```sql +SELECT * REPLACE (lower(city) AS city) +FROM addresses; +``` + +Select all columns matching the given expression: + +```sql +SELECT COLUMNS(c -> c LIKE '%num%') +FROM addresses; +``` + +Select all columns matching the given regex from the table: + +```sql +SELECT COLUMNS('number\d+') +FROM addresses; +``` + +## Syntax + +
+
+## Star Expression
+
+The `*` expression can be used in a `SELECT` statement to select all columns that are projected in the `FROM` clause.
+
+```sql
+SELECT *
+FROM tbl;
+```
+
+The `*` expression can be modified using the `EXCLUDE` and `REPLACE` clauses.
+
+### `EXCLUDE` Clause
+
+`EXCLUDE` allows us to exclude specific columns from the `*` expression.
+
+```sql
+SELECT * EXCLUDE (col)
+FROM tbl;
+```
+
+### `REPLACE` Clause
+
+`REPLACE` allows us to replace specific columns with different expressions.
+
+```sql
+SELECT * REPLACE (col / 1000 AS col)
+FROM tbl;
+```
+
+## `COLUMNS` Expression
+
+The `COLUMNS` expression can be used to execute the same expression on multiple columns. For example:
+
+```sql
+CREATE TABLE numbers (id INTEGER, number INTEGER);
+INSERT INTO numbers VALUES (1, 10), (2, 20), (3, NULL);
+SELECT min(COLUMNS(*)), count(COLUMNS(*)) FROM numbers;
+```
+
+ +| id | number | id | number | +|---:|-------:|---:|-------:| +| 1 | 10 | 3 | 2 | + +The `*` expression in the `COLUMNS` statement can also contain `EXCLUDE` or `REPLACE`, similar to regular star expressions. + +```sql +SELECT + min(COLUMNS(* REPLACE (number + id AS number))), + count(COLUMNS(* EXCLUDE (number))) +FROM numbers; +``` + +
+ +| id | min(number := (number + id)) | id | +|---:|-----------------------------:|---:| +| 1 | 11 | 3 | + +`COLUMNS` expressions can also be combined, as long as the `COLUMNS` contains the same (star) expression: + +```sql +SELECT COLUMNS(*) + COLUMNS(*) FROM numbers; +``` + +
+ +| id | number | +|---:|-------:| +| 2 | 20 | +| 4 | 40 | +| 6 | NULL | + +`COLUMNS` expressions can also be used in `WHERE` clauses. The conditions are applied to all columns and are combined using the logical `AND` operator. + +```sql +SELECT * +FROM ( + SELECT 0 AS x, 1 AS y, 2 AS z + UNION ALL + SELECT 1 AS x, 2 AS y, 3 AS z + UNION ALL + SELECT 2 AS x, 3 AS y, 4 AS z +) +WHERE COLUMNS(*) > 1; -- equivalent to: x > 1 AND y > 1 AND z > 1 +``` + +
+ +| x | y | z | +|--:|--:|--:| +| 2 | 3 | 4 | + +## `COLUMNS` Regular Expression + +`COLUMNS` supports passing a regex in as a string constant: + +```sql +SELECT COLUMNS('(id|numbers?)') FROM numbers; +``` + +
+ +| id | number | +|---:|-------:| +| 1 | 10 | +| 2 | 20 | +| 3 | NULL | + +The matches of capture groups can be used to rename columns selected by a regular expression: + +```sql +SELECT COLUMNS('(\w{2}).*') AS '\1' FROM numbers; +``` + +
+ +| id | nu | +|---:|-----:| +| 1 | 10 | +| 2 | 20 | +| 3 | NULL | + +The capture groups are one-indexed; `\0` is the original column name. + +## `COLUMNS` Lambda Function + +`COLUMNS` also supports passing in a lambda function. The lambda function will be evaluated for all columns present in the `FROM` clause, and only columns that match the lambda function will be returned. This allows the execution of arbitrary expressions in order to select columns. + +```sql +SELECT COLUMNS(c -> c LIKE '%num%') FROM numbers; +``` + +
+ +| number | +|--------| +| 10 | +| 20 | +| NULL | + +## `STRUCT.*` + +The `*` expression can also be used to retrieve all keys from a struct as separate columns. +This is particularly useful when a prior operation creates a struct of unknown shape, or if a query must handle any potential struct keys. +See the [`STRUCT` data type]({% link docs/archive/1.0/sql/data_types/struct.md %}) and [nested functions]({% link docs/archive/1.0/sql/functions/nested.md %}) pages for more details on working with structs. + +For example: + +```sql +SELECT st.* FROM (SELECT {'x': 1, 'y': 2, 'z': 3} AS st); +``` + +
+ +| x | y | z | +|--:|--:|--:| +| 1 | 2 | 3 | \ No newline at end of file diff --git a/docs/archive/1.0/sql/expressions/subqueries.md b/docs/archive/1.0/sql/expressions/subqueries.md new file mode 100644 index 00000000000..0aec6267e17 --- /dev/null +++ b/docs/archive/1.0/sql/expressions/subqueries.md @@ -0,0 +1,219 @@ +--- +layout: docu +railroad: expressions/subqueries.js +title: Subqueries +--- + +Subqueries are parenthesized query expressions that appear as part of a larger, outer query. Subqueries are usually based on `SELECT ... FROM`, but in DuckDB other query constructs such as [`PIVOT`]({% link docs/archive/1.0/sql/statements/pivot.md %}) can also appear as a subquery. + +## Scalar Subquery + +
+ +Scalar subqueries are subqueries that return a single value. They can be used anywhere where an expression can be used. If a scalar subquery returns more than a single value, a row is selected randomly. This behavior is [different from PostgreSQL]({% link docs/archive/1.0/sql/dialect/postgresql_compatibility.md %}#scalar-subqueries). + +Consider the following table: + +### Grades + +
+ +| grade | course | +|---:|:---| +| 7 | Math | +| 9 | Math | +| 8 | CS | + +```sql +CREATE TABLE grades (grade INTEGER, course VARCHAR); +INSERT INTO grades VALUES (7, 'Math'), (9, 'Math'), (8, 'CS'); +``` + +We can run the following query to obtain the minimum grade: + +```sql +SELECT min(grade) FROM grades; +``` + +| min(grade) | +|-----------:| +| 7 | + +By using a scalar subquery in the `WHERE` clause, we can figure out for which course this grade was obtained: + +```sql +SELECT course FROM grades WHERE grade = (SELECT min(grade) FROM grades); +``` + +| course | +|--------| +| Math | + +## Subquery Comparisons: `ALL`, `ANY` and `SOME` + +In the section on [scalar subqueries](#scalar-subquery), a scalar expression was compared directly to a subquery using the equality [comparison operator]({% link docs/archive/1.0/sql/expressions/comparison_operators.md %}#comparison-operators) (`=`). +Such direct comparisons only make sense with scalar subqueries. + +Scalar expressions can still be compared to single-column subqueries returning multiple rows by specifying a quantifier. Available quantifiers are `ALL`, `ANY` and `SOME`. The quantifiers `ANY` and `SOME` are equivalent. + +### `ALL` + +The `ALL` quantifier specifies that the comparison as a whole evaluates to `true` when the individual comparison results of _the expression at the left hand side of the comparison operator_ with each of the values from _the subquery at the right hand side of the comparison operator_ **all** evaluate to `true`: + +```sql +SELECT 6 <= ALL (SELECT grade FROM grades) AS adequate; +``` + +returns: + +| adequate | +|----------| +| true | + +because 6 is less than or equal to each of the subquery results 7, 8 and 9. + +However, the following query + +```sql +SELECT 8 >= ALL (SELECT grade FROM grades) AS excellent; +``` + +returns + +| excellent | +|-----------| +| false | + +because 8 is not greater than or equal to the subquery result 7. And thus, because not all comparisons evaluate `true`, `>= ALL` as a whole evaluates to `false`. + +### `ANY` + +The `ANY` quantifier specifies that the comparison as a whole evaluates to `true` when at least one of the individual comparison results evaluates to `true`. +For example: + +```sql +SELECT 5 >= ANY (SELECT grade FROM grades) AS fail; +``` + +returns + +| fail | +|-------| +| false | + +because no result of the subquery is less than or equal to 5. + +The quantifier `SOME` maybe used instead of `ANY`: `ANY` and `SOME` are interchangeable. + +> In DuckDB, and contrary to most SQL implementations, a comparison of a scalar with a single-column subquery returning multiple values still executes without error. However, the result is unstable, as the final comparison result is based on comparing just one (non-deterministically selected) value returned by the subquery. + +## `EXISTS` + +
+
+The `EXISTS` operator tests for the existence of any row inside the subquery. It returns true when the subquery returns one or more records, and false otherwise. The `EXISTS` operator is generally the most useful as a *correlated* subquery to express semijoin operations. However, it can be used as an uncorrelated subquery as well.
+
+For example, we can use it to figure out if there are any grades present for a given course:
+
+```sql
+SELECT EXISTS (SELECT * FROM grades WHERE course = 'Math') AS math_grades_present;
+```
+
+| math_grades_present |
+|--------------------:|
+| true |
+
+```sql
+SELECT EXISTS (SELECT * FROM grades WHERE course = 'History') AS history_grades_present;
+```
+
+| history_grades_present |
+|-----------------------:|
+| false |
+
+### `NOT EXISTS`
+
+The `NOT EXISTS` operator tests for the absence of any row inside the subquery. It returns true when the subquery returns an empty result, and false otherwise. The `NOT EXISTS` operator is generally the most useful as a *correlated* subquery to express antijoin operations. For example, to find Person nodes without an interest:
+
+```sql
+CREATE TABLE Person (id BIGINT, name VARCHAR);
+CREATE TABLE interest (PersonId BIGINT, topic VARCHAR);
+
+INSERT INTO Person VALUES (1, 'Jane'), (2, 'Joe');
+INSERT INTO interest VALUES (2, 'Music');
+
+SELECT *
+FROM Person
+WHERE NOT EXISTS (SELECT * FROM interest WHERE interest.PersonId = Person.id);
+```
+
+| id | name |
+|---:|------|
+| 1 | Jane |
+
+> DuckDB automatically detects when a `NOT EXISTS` query expresses an antijoin operation. There is no need to manually rewrite such queries to use `LEFT OUTER JOIN ... WHERE ... IS NULL`.
+
+## `IN` Operator
+
+ +The `IN` operator checks containment of the left expression inside the result defined by the subquery or the set of expressions on the right hand side (RHS). The `IN` operator returns true if the expression is present in the RHS, false if the expression is not in the RHS and the RHS has no `NULL` values, or `NULL` if the expression is not in the RHS and the RHS has `NULL` values. + +We can use the `IN` operator in a similar manner as we used the `EXISTS` operator: + +```sql +SELECT 'Math' IN (SELECT course FROM grades) AS math_grades_present; +``` + +| math_grades_present | +|--------------------:| +| true | + +## Correlated Subqueries + +All the subqueries presented here so far have been **uncorrelated** subqueries, where the subqueries themselves are entirely self-contained and can be run without the parent query. There exists a second type of subqueries called **correlated** subqueries. For correlated subqueries, the subquery uses values from the parent subquery. + +Conceptually, the subqueries are run once for every single row in the parent query. Perhaps a simple way of envisioning this is that the correlated subquery is a **function** that is applied to every row in the source data set. + +For example, suppose that we want to find the minimum grade for every course. We could do that as follows: + +```sql +SELECT * +FROM grades grades_parent +WHERE grade = + (SELECT min(grade) + FROM grades + WHERE grades.course = grades_parent.course); +``` + +| grade | course | +|------:|--------| +| 7 | Math | +| 8 | CS | + +The subquery uses a column from the parent query (`grades_parent.course`). Conceptually, we can see the subquery as a function where the correlated column is a parameter to that function: + +```sql +SELECT min(grade) +FROM grades +WHERE course = ?; +``` + +Now when we execute this function for each of the rows, we can see that for `Math` this will return `7`, and for `CS` it will return `8`. We then compare it against the grade for that actual row. As a result, the row `(Math, 9)` will be filtered out, as `9 <> 7`. + +## Returning Each Row of the Subquery as a Struct + +Using the name of a subquery in the `SELECT` clause (without referring to a specific column) turns each row of the subquery into a struct whose fields correspond to the columns of the subquery. For example: + +```sql +SELECT t +FROM (SELECT unnest(generate_series(41, 43)) AS x, 'hello' AS y) t; +``` + +
+ +| t | +|-----------------------| +| {'x': 41, 'y': hello} | +| {'x': 42, 'y': hello} | +| {'x': 43, 'y': hello} | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/aggregates.md b/docs/archive/1.0/sql/functions/aggregates.md new file mode 100644 index 00000000000..207444ab9ae --- /dev/null +++ b/docs/archive/1.0/sql/functions/aggregates.md @@ -0,0 +1,623 @@ +--- +layout: docu +railroad: expressions/aggregate.js +redirect_from: +- docs/archive/1.0/sql/aggregates +title: Aggregate Functions +--- + + + +## Examples + +Produce a single row containing the sum of the `amount` column: + +```sql +SELECT sum(amount) +FROM sales; +``` + +Produce one row per unique region, containing the sum of `amount` for each group: + +```sql +SELECT region, sum(amount) +FROM sales +GROUP BY region; +``` + +Return only the regions that have a sum of `amount` higher than 100: + +```sql +SELECT region +FROM sales +GROUP BY region +HAVING sum(amount) > 100; +``` + +Return the number of unique values in the `region` column: + +```sql +SELECT count(DISTINCT region) +FROM sales; +``` + +Return two values, the total sum of `amount` and the sum of `amount` minus columns where the region is `north` using the [`FILTER` clause]({% link docs/archive/1.0/sql/query_syntax/filter.md %}): + +```sql +SELECT sum(amount), sum(amount) FILTER (region != 'north') +FROM sales; +``` + +Returns a list of all regions in order of the `amount` column: + +```sql +SELECT list(region ORDER BY amount DESC) +FROM sales; +``` + +Returns the amount of the first sale using the `first()` aggregate function: + +```sql +SELECT first(amount ORDER BY date ASC) +FROM sales; +``` + +## Syntax + +
+ +Aggregates are functions that *combine* multiple rows into a single value. Aggregates are different from scalar functions and window functions because they change the cardinality of the result. As such, aggregates can only be used in the `SELECT` and `HAVING` clauses of a SQL query. + +### `DISTINCT` Clause in Aggregate Functions + +When the `DISTINCT` clause is provided, only distinct values are considered in the computation of the aggregate. This is typically used in combination with the `count` aggregate to get the number of distinct elements; but it can be used together with any aggregate function in the system. + +### `ORDER BY` Clause in Aggregate Functions + +An `ORDER BY` clause can be provided after the last argument of the function call. Note the lack of the comma separator before the clause. + +```sql +SELECT ⟨aggregate_function⟩(⟨arg⟩, ⟨sep⟩ ORDER BY ⟨ordering_criteria⟩); +``` + +This clause ensures that the values being aggregated are sorted before applying the function. +Most aggregate functions are order-insensitive, and for them this clause is parsed and discarded. +However, there are some order-sensitive aggregates that can have non-deterministic results without ordering, e.g., `first`, `last`, `list` and `string_agg` / `group_concat` / `listagg`. +These can be made deterministic by ordering the arguments. + +For example: + +```sql +CREATE TABLE tbl AS + SELECT s FROM range(1, 4) r(s); + +SELECT string_agg(s, ', ' ORDER BY s DESC) AS countdown +FROM tbl; +``` + +| countdown | +|-----------| +| 3, 2, 1 | + +## General Aggregate Functions + +The table below shows the available general aggregate functions. + +| Function | Description | +|:--|:--------| +| [`any_value(arg)`](#any_valuearg) | Returns the first non-null value from `arg`. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`arbitrary(arg)`](#arbitraryarg) | Returns the first value (null or non-null) from `arg`. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`arg_max(arg, val)`](#arg_maxarg-val) | Finds the row with the maximum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`arg_min(arg, val)`](#arg_minarg-val) | Finds the row with the minimum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`array_agg(arg)`](#array_aggarg) | Returns a `LIST` containing all the values of a column. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`avg(arg)`](#avgarg) | Calculates the average value for all tuples in `arg`. | +| [`bit_and(arg)`](#bit_andarg) | Returns the bitwise AND of all bits in a given expression. | +| [`bit_or(arg)`](#bit_orarg) | Returns the bitwise OR of all bits in a given expression. | +| [`bit_xor(arg)`](#bit_xorarg) | Returns the bitwise XOR of all bits in a given expression. | +| [`bitstring_agg(arg)`](#bitstring_aggarg) | Returns a bitstring with bits set for each distinct value. | +| [`bool_and(arg)`](#bool_andarg) | Returns `true` if every input value is `true`, otherwise `false`. | +| [`bool_or(arg)`](#bool_orarg) | Returns `true` if any input value is `true`, otherwise `false`. | +| [`count(arg)`](#countarg) | Calculates the number of tuples in `arg`. | +| [`favg(arg)`](#favgarg) | Calculates the average using a more accurate floating point summation (Kahan Sum). 
| +| [`first(arg)`](#firstarg) | Returns the first value (null or non-null) from `arg`. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`fsum(arg)`](#fsumarg) | Calculates the sum using a more accurate floating point summation (Kahan Sum). | +| [`geomean(arg)`](#geomeanarg) | Calculates the geometric mean for all tuples in `arg`. | +| [`histogram(arg)`](#histogramarg) | Returns a `MAP` of key-value pairs representing buckets and counts. | +| [`last(arg)`](#lastarg) | Returns the last value of a column. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`list(arg)`](#listarg) | Returns a `LIST` containing all the values of a column. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`max(arg)`](#maxarg) | Returns the maximum value present in `arg`. | +| [`max_by(arg, val)`](#max_byarg-val) | Finds the row with the maximum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`min(arg)`](#minarg) | Returns the minimum value present in `arg`. | +| [`min_by(arg, val)`](#min_byarg-val) | Finds the row with the minimum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`product(arg)`](#productarg) | Calculates the product of all tuples in `arg`. | +| [`string_agg(arg, sep)`](#string_aggarg-sep) | Concatenates the column string values with a separator. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| [`sum(arg)`](#sumarg) | Calculates the sum value for all tuples in `arg`. | +| [`sum_no_overflow(arg)`](#sum_no_overflowarg) | Calculates the sum value for all tuples in `arg` without [overflow](https://en.wikipedia.org/wiki/Integer_overflow) checks. Unlike `sum`, which works on floating-point values, `sum_no_overflow` only accepts `INTEGER` and `DECIMAL` values. | + +#### `any_value(arg)` + +
+ +| **Description** | Returns the first non-null value from `arg`. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `any_value(A)` | +| **Alias(es)** | - | + +#### `arbitrary(arg)` + +
+ +| **Description** | Returns the first value (null or non-null) from `arg`. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `arbitrary(A)` | +| **Alias(es)** | `first(A)` | + +#### `arg_max(arg, val)` + +
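+A minimal usage sketch (the inline rows below are invented for illustration and are not part of the original examples):
+
+```sql
+-- expected result: 'bob', the name on the row with the highest score
+SELECT arg_max(name, score) AS top_scorer
+FROM (VALUES ('alice', 80), ('bob', 95), ('carol', 90)) t(name, score);
+```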
+ +| **Description** | Finds the row with the maximum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `arg_max(A, B)` | +| **Alias(es)** | `argMax(arg, val)`, `max_by(arg, val)` | + +#### `arg_min(arg, val)` + +
+ +| **Description** | Finds the row with the minimum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `arg_min(A, B)` | +| **Alias(es)** | `argMin(arg, val)`, `min_by(arg, val)` | + +#### `array_agg(arg)` + +
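+A small sketch with made-up inline values; the `ORDER BY` modifier makes the resulting list deterministic:
+
+```sql
+-- expected result: [3, 2, 1]
+SELECT array_agg(x ORDER BY x DESC) AS xs
+FROM (VALUES (1), (3), (2)) t(x);
+```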
+ +| **Description** | Returns a `LIST` containing all the values of a column. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `array_agg(A)` | +| **Alias(es)** | `list` | + +#### `avg(arg)` + +
+ +| **Description** | Calculates the average value for all tuples in `arg`. | +| **Example** | `avg(A)` | +| **Alias(es)** | `mean` | + +#### `bit_and(arg)` + +
+ +| **Description** | Returns the bitwise `AND` of all bits in a given expression. | +| **Example** | `bit_and(A)` | +| **Alias(es)** | - | + +#### `bit_or(arg)` + +
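+For illustration, a minimal sketch using invented inline values:
+
+```sql
+-- 4 is 100 in binary and 1 is 001, so the bitwise OR over the column is 101, i.e., 5
+SELECT bit_or(x) AS result
+FROM (VALUES (4), (1)) t(x);
+```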
+ +| **Description** | Returns the bitwise `OR` of all bits in a given expression. | +| **Example** | `bit_or(A)` | +| **Alias(es)** | - | + +#### `bit_xor(arg)` + +
+ +| **Description** | Returns the bitwise `XOR` of all bits in a given expression. | +| **Example** | `bit_xor(A)` | +| **Alias(es)** | - | + +#### `bitstring_agg(arg)` + +
+ +| **Description** | Returns a bitstring with bits set for each distinct value. | +| **Example** | `bitstring_agg(A)` | +| **Alias(es)** | - | + +#### `bool_and(arg)` + +
+ +| **Description** | Returns `true` if every input value is `true`, otherwise `false`. | +| **Example** | `bool_and(A)` | +| **Alias(es)** | - | + +#### `bool_or(arg)` + +
+ +| **Description** | Returns `true` if any input value is `true`, otherwise `false`. | +| **Example** | `bool_or(A)` | +| **Alias(es)** | - | + +#### `count(arg)` + +
+ +| **Description** | Calculates the number of tuples in `arg`. If no `arg` is provided, the expression is evaluated as `count(*)`. | +| **Example** | `count(A)` | +| **Alias(es)** | - | + +#### `favg(arg)` + +
+ +| **Description** | Calculates the average using a more accurate floating point summation (Kahan Sum). | +| **Example** | `favg(A)` | +| **Alias(es)** | - | + +#### `first(arg)` + +
+ +| **Description** | Returns the first value (null or non-null) from `arg`. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `first(A)` | +| **Alias(es)** | `arbitrary(A)` | + +#### `fsum(arg)` + +
+ +| **Description** | Calculates the sum using a more accurate floating point summation (Kahan Sum). | +| **Example** | `fsum(A)` | +| **Alias(es)** | `sumKahan`, `kahan_sum` | + +#### `geomean(arg)` + +
+ +| **Description** | Calculates the geometric mean for all tuples in `arg`. | +| **Example** | `geomean(A)` | +| **Alias(es)** | `geometric_mean(A)` | + +#### `histogram(arg)` + +
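+A minimal sketch with made-up inline values; the result is a `MAP` from distinct values to their counts:
+
+```sql
+-- expected result: a MAP along the lines of {CS=1, Math=2}
+SELECT histogram(course) AS course_counts
+FROM (VALUES ('Math'), ('Math'), ('CS')) t(course);
+```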
+ +| **Description** | Returns a `MAP` of key-value pairs representing buckets and counts. | +| **Example** | `histogram(A)` | +| **Alias(es)** | - | + +#### `last(arg)` + +
+ +| **Description** | Returns the last value of a column. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `last(A)` | +| **Alias(es)** | - | + +#### `list(arg)` + +
+ +| **Description** | Returns a `LIST` containing all the values of a column. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `list(A)` | +| **Alias(es)** | `array_agg` | + +#### `max(arg)` + +
+ +| **Description** | Returns the maximum value present in `arg`. | +| **Example** | `max(A)` | +| **Alias(es)** | - | + +#### `max_by(arg, val)` + +
+ +| **Description** | Finds the row with the maximum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `max_by(A, B)` | +| **Alias(es)** | `argMax(arg, val)`, `arg_max(arg, val)` | + +#### `min(arg)` + +
+ +| **Description** | Returns the minimum value present in `arg`. | +| **Example** | `min(A)` | +| **Alias(es)** | - | + +#### `min_by(arg, val)` + +
+ +| **Description** | Finds the row with the minimum `val`. Calculates the `arg` expression at that row. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `min_by(A, B)` | +| **Alias(es)** | `argMin(arg, val)`, `arg_min(arg, val)` | + +#### `product(arg)` + +
+ +| **Description** | Calculates the product of all tuples in `arg`. | +| **Example** | `product(A)` | +| **Alias(es)** | - | + +#### `string_agg(arg, sep)` + +
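+A small sketch with invented inline values; adding an `ORDER BY` modifier makes the concatenation order deterministic:
+
+```sql
+-- expected result: 'Art | CS | Math'
+SELECT string_agg(course, ' | ' ORDER BY course) AS courses
+FROM (VALUES ('Math'), ('CS'), ('Art')) t(course);
+```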
+ +| **Description** | Concatenates the column string values with a separator. This function is [affected by ordering](#order-by-clause-in-aggregate-functions). | +| **Example** | `string_agg(S, ',')` | +| **Alias(es)** | `group_concat(arg, sep)`, `listagg(arg, sep)` | + +#### `sum(arg)` + +
+ +| **Description** | Calculates the sum value for all tuples in `arg`. | +| **Example** | `sum(A)` | +| **Alias(es)** | - | + +#### `sum_no_overflow(arg)` + +
+ +| **Description** | Calculates the sum value for all tuples in `arg` without [overflow](https://en.wikipedia.org/wiki/Integer_overflow) checks. Unlike `sum`, which works on floating-point values, `sum_no_overflow` only accepts `INTEGER` and `DECIMAL` values. | +| **Example** | `sum_no_overflow(A)` | +| **Alias(es)** | - | + +## Approximate Aggregates + +The table below shows the available approximate aggregate functions. + +| Function | Description | Example | +|:---|:---|:---| +| `approx_count_distinct(x)` | Gives the approximate count of distinct elements using HyperLogLog. | `approx_count_distinct(A)` | +| `approx_quantile(x, pos)` | Gives the approximate quantile using T-Digest. | `approx_quantile(A, 0.5)` | +| `reservoir_quantile(x, quantile, sample_size = 8192)` | Gives the approximate quantile using reservoir sampling, the sample size is optional and uses 8192 as a default size. | `reservoir_quantile(A, 0.5, 1024)` | + +## Statistical Aggregates + +The table below shows the available statistical aggregate functions. +They all ignore `NULL` values (in the case of a single input column `x`), or pairs where either input is `NULL` (in the case of two input columns `y` and `x`). + +| Function | Description | +|:--|:--------| +| [`corr(y, x)`](#corry-x) | The correlation coefficient. | +| [`covar_pop(y, x)`](#covar_popy-x) | The population covariance, which does not include bias correction. | +| [`covar_samp(y, x)`](#covar_sampy-x) | The sample covariance, which includes Bessel's bias correction. | +| [`entropy(x)`](#entropyx) | The log-2 entropy. | +| [`kurtosis_pop(x)`](#kurtosis_popx) | The excess kurtosis (Fisher’s definition) without bias correction. | +| [`kurtosis(x)`](#kurtosisx) | The excess kurtosis (Fisher's definition) with bias correction according to the sample size. | +| [`mad(x)`](#madx) | The median absolute deviation. Temporal types return a positive `INTERVAL`. | +| [`median(x)`](#medianx) | The middle value of the set. For even value counts, quantitative values are averaged and ordinal values return the lower value. | +| [`mode(x)`](#modex)| The most frequent value. | +| [`quantile_cont(x, pos)`](#quantile_contx-pos) | The interpolated `pos`-quantile of `x` for `0 <= pos <= 1`, i.e., orders the values of `x` and returns the `pos * (n_nonnull_values - 1)`th (zero-indexed) element (or an interpolation between the adjacent elements if the index is not an integer). If `pos` is a `LIST` of `FLOAT`s, then the result is a `LIST` of the corresponding interpolated quantiles. | +| [`quantile_disc(x, pos)`](#quantile_discx-pos) | The discrete `pos`-quantile of `x` for `0 <= pos <= 1`, i.e., orders the values of `x` and returns the `floor(pos * (n_nonnull_values - 1))`th (zero-indexed) element. If `pos` is a `LIST` of `FLOAT`s, then the result is a `LIST` of the corresponding discrete quantiles. | +| [`regr_avgx(y, x)`](#regr_avgxy-x) | The average of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. | +| [`regr_avgy(y, x)`](#regr_avgyy-x) | The average of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. | +| [`regr_count(y, x)`](#regr_county-x) | The number of non-`NULL` pairs. | +| [`regr_intercept(y, x)`](#regr_intercepty-x) | The intercept of the univariate linear regression line, where x is the independent variable and y is the dependent variable. | +| [`regr_r2(y, x)`](#regr_r2y-x) | The squared Pearson correlation coefficient between y and x. 
Also: The coefficient of determination in a linear regression, where x is the independent variable and y is the dependent variable. |
+| [`regr_slope(y, x)`](#regr_slopey-x) | The slope of the linear regression line, where x is the independent variable and y is the dependent variable. |
+| [`regr_sxx(y, x)`](#regr_sxxy-x) | The sample variance, which includes Bessel's bias correction, of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. |
+| [`regr_sxy(y, x)`](#regr_sxyy-x) | The sample covariance, which includes Bessel's bias correction. |
+| [`regr_syy(y, x)`](#regr_syyy-x) | The sample variance, which includes Bessel's bias correction, of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. |
+| [`skewness(x)`](#skewnessx) | The skewness. |
+| [`stddev_pop(x)`](#stddev_popx) | The population standard deviation. |
+| [`stddev_samp(x)`](#stddev_sampx) | The sample standard deviation. |
+| [`var_pop(x)`](#var_popx) | The population variance, which does not include bias correction. |
+| [`var_samp(x)`](#var_sampx) | The sample variance, which includes Bessel's bias correction. |
+
+#### `corr(y, x)`
+
+ +| **Description** | The correlation coefficient. +| **Formula** | `covar_pop(y, x) / (stddev_pop(x) * stddev_pop(y))` | +| **Alias(es)** | - | + +#### `covar_pop(y, x)` + +
+ +| **Description** | The population covariance, which does not include bias correction. | +| **Formula** | `(sum(x*y) - sum(x) * sum(y) / regr_count(y, x)) / regr_count(y, x)`, `covar_samp(y, x) * (1 - 1 / regr_count(y, x))` | +| **Alias(es)** | - | + +#### `covar_samp(y, x)` + +
+ +| **Description** | The sample covariance, which includes Bessel's bias correction. | +| **Formula** | `(sum(x*y) - sum(x) * sum(y) / regr_count(y, x)) / (regr_count(y, x) - 1)`, `covar_pop(y, x) / (1 - 1 / regr_count(y, x))` | +| **Alias(es)** | `regr_sxy(y, x)` | + +#### `entropy(x)` + +
+ +| **Description** | The log-2 entropy. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `kurtosis_pop(x)` + +
+ +| **Description** | The excess kurtosis (Fisher’s definition) without bias correction. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `kurtosis(x)` + +
+ +| **Description** | The excess kurtosis (Fisher's definition) with bias correction according to the sample size. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `mad(x)` + +
+ +| **Description** | The median absolute deviation. Temporal types return a positive `INTERVAL`. | +| **Formula** | `median(abs(x - median(x)))` | +| **Alias(es)** | - | + +#### `median(x)` + +
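+A minimal sketch with made-up inline values; with an even number of quantitative values, the two middle values are averaged:
+
+```sql
+-- expected result: 2.5, the average of the two middle values 2 and 3
+SELECT median(x) AS med
+FROM (VALUES (1), (2), (3), (100)) t(x);
+```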
+ +| **Description** | The middle value of the set. For even value counts, quantitative values are averaged and ordinal values return the lower value. | +| **Formula** | `quantile_cont(x, 0.5)` | +| **Alias(es)** | - | + +#### `mode(x)` + +
+ +| **Description** | The most frequent value. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `quantile_cont(x, pos)` + +
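+A worked sketch with invented inline values: for the four values 1, 2, 3, 4 and `pos = 0.25`, the zero-indexed position is `0.25 * (4 - 1) = 0.75`, so the result interpolates between the elements 1 and 2:
+
+```sql
+-- expected result: 1.75
+SELECT quantile_cont(x, 0.25) AS q1
+FROM (VALUES (1), (2), (3), (4)) t(x);
+```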
+ +| **Description** | The interpolated `pos`-quantile of `x` for `0 <= pos <= 1`, i.e., orders the values of `x` and returns the `pos * (n_nonnull_values - 1)`th (zero-indexed) element (or an interpolation between the adjacent elements if the index is not an integer). If `pos` is a `LIST` of `FLOAT`s, then the result is a `LIST` of the corresponding interpolated quantiles. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `quantile_disc(x, pos)` + +
+ +| **Description** | The discrete `pos`-quantile of `x` for `0 <= pos <= 1`, i.e., orders the values of `x` and returns the `floor(pos * (n_nonnull_values - 1))`th (zero-indexed) element. If `pos` is a `LIST` of `FLOAT`s, then the result is a `LIST` of the corresponding discrete quantiles. | +| **Formula** | - | +| **Alias(es)** | `quantile` | + +#### `regr_avgx(y, x)` + +
+ +| **Description** | The average of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `regr_avgy(y, x)` + +
+ +| **Description** | The average of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `regr_count(y, x)` + +
+ +| **Description** | The number of non-`NULL` pairs. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `regr_intercept(y, x)` + +
+ +| **Description** | The intercept of the univariate linear regression line, where x is the independent variable and y is the dependent variable. | +| **Formula** | `regr_avgy(y, x) - regr_slope(y, x) * regr_avgx(y, x)` | +| **Alias(es)** | - | + +#### `regr_r2(y, x)` + +
+ +| **Description** | The squared Pearson correlation coefficient between y and x. Also: The coefficient of determination in a linear regression, where x is the independent variable and y is the dependent variable. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `regr_slope(y, x)` + +
+ +| **Description** | Returns the slope of the linear regression line, where x is the independent variable and y is the dependent variable. | +| **Formula** | `regr_sxy(y, x) / regr_sxx(y, x)` | +| **Alias(es)** | - | + +#### `regr_sxx(y, x)` + +
+ +| **Description** | The sample variance, which includes Bessel's bias correction, of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `regr_sxy(y, x)` + +
+ +| **Description** | The sample covariance, which includes Bessel's bias correction. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `regr_syy(y, x)` + +
+ +| **Description** | The sample variance, which includes Bessel's bias correction, of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `skewness(x)` + +
+ +| **Description** | The skewness. | +| **Formula** | - | +| **Alias(es)** | - | + +#### `stddev_pop(x)` + +
+ +| **Description** | The population standard deviation. | +| **Formula** | `sqrt(var_pop(x))` | +| **Alias(es)** | - | + +#### `stddev_samp(x)` + +
+ +| **Description** | The sample standard deviation. | +| **Formula** | `sqrt(var_samp(x))`| +| **Alias(es)** | `stddev(x)`| + +#### `var_pop(x)` + +
+
+| **Description** | The population variance, which does not include bias correction. |
+| **Formula** | `(sum(x^2) - sum(x)^2 / count(x)) / count(x)`, `var_samp(x) * (1 - 1 / count(x))` |
+| **Alias(es)** | - |
+
+#### `var_samp(x)`
+
+
+
+| **Description** | The sample variance, which includes Bessel's bias correction. |
+| **Formula** | `(sum(x^2) - sum(x)^2 / count(x)) / (count(x) - 1)`, `var_pop(x) / (1 - 1 / count(x))` |
+| **Alias(es)** | `variance(x)` |
+
+## Ordered Set Aggregate Functions
+
+The table below shows the available “ordered set” aggregate functions.
+These functions are specified using the `WITHIN GROUP (ORDER BY sort_expression)` syntax,
+and they are converted to an equivalent aggregate function that takes the ordering expression
+as the first argument.
+
+| Function | Equivalent |
+|:---|:---|
+| mode() WITHIN GROUP (ORDER BY column [(ASC\|DESC)]) | mode(column ORDER BY column [(ASC\|DESC)]) |
+| percentile_cont(fraction) WITHIN GROUP (ORDER BY column [(ASC\|DESC)]) | quantile_cont(column, fraction ORDER BY column [(ASC\|DESC)]) |
+| percentile_cont(fractions) WITHIN GROUP (ORDER BY column [(ASC\|DESC)]) | quantile_cont(column, fractions ORDER BY column [(ASC\|DESC)]) |
+| percentile_disc(fraction) WITHIN GROUP (ORDER BY column [(ASC\|DESC)]) | quantile_disc(column, fraction ORDER BY column [(ASC\|DESC)]) |
+| percentile_disc(fractions) WITHIN GROUP (ORDER BY column [(ASC\|DESC)]) | quantile_disc(column, fractions ORDER BY column [(ASC\|DESC)]) |
+
+## Miscellaneous Aggregate Functions
+
+| Function | Description | Alias |
+|:--|:---|:--|
+| `grouping()` | For queries with `GROUP BY` and either [`ROLLUP` or `GROUPING SETS`]({% link docs/archive/1.0/sql/query_syntax/grouping_sets.md %}#identifying-grouping-sets-with-grouping_id): Returns an integer identifying which of the argument expressions were used to group on to create the current super-aggregate row. | `grouping_id()` |
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/functions/array.md b/docs/archive/1.0/sql/functions/array.md
new file mode 100644
index 00000000000..780ba67ac61
--- /dev/null
+++ b/docs/archive/1.0/sql/functions/array.md
@@ -0,0 +1,67 @@
+---
+layout: docu
+title: Array Functions
+---
+
+
+All [`LIST` functions]({% link docs/archive/1.0/sql/functions/nested.md %}#list-functions) work with the [`ARRAY` data type]({% link docs/archive/1.0/sql/data_types/array.md %}). Additionally, several `ARRAY`-native functions are also supported.
+
+## Array-Native Functions
+
+| Function | Description |
+|----|-----|
+| [`array_value(index)`](#array_valueindex) | Create an `ARRAY` containing the argument values. |
+| [`array_cross_product(array1, array2)`](#array_cross_productarray1-array2) | Compute the cross product of two arrays of size 3. The array elements can not be `NULL`. |
+| [`array_cosine_similarity(array1, array2)`](#array_cosine_similarityarray1-array2) | Compute the cosine similarity between two arrays of the same size. The array elements can not be `NULL`. The arrays can have any size as long as the size is the same for both arguments. |
+| [`array_distance(array1, array2)`](#array_distancearray1-array2) | Compute the distance between two arrays of the same size. The array elements can not be `NULL`. The arrays can have any size as long as the size is the same for both arguments. |
+| [`array_inner_product(array1, array2)`](#array_inner_productarray1-array2) | Compute the inner product between two arrays of the same size. The array elements can not be `NULL`. The arrays can have any size as long as the size is the same for both arguments. |
+| [`array_dot_product(array1, array2)`](#array_dot_productarray1-array2) | Alias for `array_inner_product(array1, array2)`. |
+
+#### `array_value(index)`
+
+
+ +| **Description** | Create an `ARRAY` containing the argument values. | +| **Example** | `array_value(1.0::FLOAT, 2.0::FLOAT, 3.0::FLOAT)` | +| **Result** | `[1.0, 2.0, 3.0]` | + +#### `array_cross_product(array1, array2)` + +
+ +| **Description** | Compute the cross product of two arrays of size 3. The array elements can not be `NULL`. | +| **Example** | `array_cross_product(array_value(1.0::FLOAT, 2.0::FLOAT, 3.0::FLOAT), array_value(2.0::FLOAT, 3.0::FLOAT, 4.0::FLOAT))` | +| **Result** | `[-1.0, 2.0, -1.0]` | + +#### `array_cosine_similarity(array1, array2)` + +
+ +| **Description** | Compute the cosine similarity between two arrays of the same size. The array elements can not be `NULL`. The arrays can have any size as long as the size is the same for both arguments. | +| **Example** | `array_cosine_similarity(array_value(1.0::FLOAT, 2.0::FLOAT, 3.0::FLOAT), array_value(2.0::FLOAT, 3.0::FLOAT, 4.0::FLOAT))` | +| **Result** | `0.9925833` | + +#### `array_distance(array1, array2)` + +
+ +| **Description** | Compute the distance between two arrays of the same size. The array elements can not be `NULL`. The arrays can have any size as long as the size is the same for both arguments. | +| **Example** | `array_distance(array_value(1.0::FLOAT, 2.0::FLOAT, 3.0::FLOAT), array_value(2.0::FLOAT, 3.0::FLOAT, 4.0::FLOAT))` | +| **Result** | `1.7320508` | + +#### `array_inner_product(array1, array2)` + +
+ +| **Description** | Compute the inner product between two arrays of the same size. The array elements can not be `NULL`. The arrays can have any size as long as the size is the same for both arguments. | +| **Example** | `array_inner_product(array_value(1.0::FLOAT, 2.0::FLOAT, 3.0::FLOAT), array_value(2.0::FLOAT, 3.0::FLOAT, 4.0::FLOAT))` | +| **Result** | `20.0` | + +#### `array_dot_product(array1, array2)` + +
+ +| **Description** | Alias for `array_inner_product(array1, array2)`. | +| **Example** | `array_dot_product(l1, l2)` | +| **Result** | `20.0` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/bitstring.md b/docs/archive/1.0/sql/functions/bitstring.md new file mode 100644 index 00000000000..9625766865a --- /dev/null +++ b/docs/archive/1.0/sql/functions/bitstring.md @@ -0,0 +1,158 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/bitstring +title: Bitstring Functions +--- + + + +This section describes functions and operators for examining and manipulating [`BITSTRING`]({% link docs/archive/1.0/sql/data_types/bitstring.md %}) values. +Bitstrings must be of equal length when performing the bitwise operands AND, OR and XOR. When bit shifting, the original length of the string is preserved. + +## Bitstring Operators + +The table below shows the available mathematical operators for `BIT` type. + +
+ + + +| Operator | Description | Example | Result | +|:---|:---|:---|:---| +| `&` | Bitwise AND | `'10101'::BITSTRING & '10001'::BITSTRING` | `10001` | +| `|` | Bitwise OR | `'1011'::BITSTRING | '0001'::BITSTRING` | `1011` | +| `xor` | Bitwise XOR | `xor('101'::BITSTRING, '001'::BITSTRING)` | `100` | +| `~` | Bitwise NOT | `~('101'::BITSTRING)` | `010` | +| `<<` | Bitwise shift left | `'1001011'::BITSTRING << 3` | `1011000` | +| `>>` | Bitwise shift right | `'1001011'::BITSTRING >> 3` | `0001001` | + + + +## Bitstring Functions + +The table below shows the available scalar functions for `BIT` type. + +| Name | Description | +|:--|:-------| +| [`bit_count(bitstring)`](#bit_countbitstring) | Returns the number of set bits in the bitstring. | +| [`bit_length(bitstring)`](#bit_lengthbitstring) | Returns the number of bits in the bitstring. | +| [`bit_position(substring, bitstring)`](#bit_positionsubstring-bitstring) | Returns first starting index of the specified substring within bits, or zero if it's not present. The first (leftmost) bit is indexed 1. | +| [`bitstring(bitstring, length)`](#bitstringbitstring-length) | Returns a bitstring of determined length. | +| [`get_bit(bitstring, index)`](#get_bitbitstring-index) | Extracts the nth bit from bitstring; the first (leftmost) bit is indexed 0. | +| [`length(bitstring)`](#lengthbitstring) | Alias for `bit_length`. | +| [`octet_length(bitstring)`](#octet_lengthbitstring) | Returns the number of bytes in the bitstring. | +| [`set_bit(bitstring, index, new_value)`](#set_bitbitstring-index-new_value) | Sets the nth bit in bitstring to newvalue; the first (leftmost) bit is indexed 0. Returns a new bitstring. | + +#### `bit_count(bitstring)` + +
+ +| **Description** | Returns the number of set bits in the bitstring. | +| **Example** | `bit_count('1101011'::BITSTRING)` | +| **Result** | `5` | + +#### `bit_length(bitstring)` + +
+ +| **Description** | Returns the number of bits in the bitstring. | +| **Example** | `bit_length('1101011'::BITSTRING)` | +| **Result** | `7` | + +#### `bit_position(substring, bitstring)` + +
+
+| **Description** | Returns first starting index of the specified substring within bits, or zero if it's not present. The first (leftmost) bit is indexed 1. |
+| **Example** | `bit_position('010'::BITSTRING, '1110101'::BITSTRING)` |
+| **Result** | `4` |
+
+#### `bitstring(bitstring, length)`
+
+
+ +| **Description** | Returns a bitstring of determined length. | +| **Example** | `bitstring('1010'::BITSTRING, 7)` | +| **Result** | `0001010` | + +#### `get_bit(bitstring, index)` + +
+ +| **Description** | Extracts the nth bit from bitstring; the first (leftmost) bit is indexed 0. | +| **Example** | `get_bit('0110010'::BITSTRING, 2)` | +| **Result** | `1` | + +#### `length(bitstring)` + +
+ +| **Description** | Alias for `bit_length`. | +| **Example** | `length('1101011'::BITSTRING)` | +| **Result** | `7` | + +#### `octet_length(bitstring)` + +
+ +| **Description** | Returns the number of bytes in the bitstring. | +| **Example** | `octet_length('1101011'::BITSTRING)` | +| **Result** | `1` | + +#### `set_bit(bitstring, index, new_value)` + +
+ +| **Description** | Sets the nth bit in bitstring to newvalue; the first (leftmost) bit is indexed 0. Returns a new bitstring. | +| **Example** | `set_bit('0110010'::BITSTRING, 2, 0)` | +| **Result** | `0100010` | + +## Bitstring Aggregate Functions + +These aggregate functions are available for `BIT` type. + +| Name | Description | +|:--|:-------| +| [`bit_and(arg)`](#bit_andarg) | Returns the bitwise AND operation performed on all bitstrings in a given expression. | +| [`bit_or(arg)`](#bit_orarg) | Returns the bitwise OR operation performed on all bitstrings in a given expression. | +| [`bit_xor(arg)`](#bit_xorarg) | Returns the bitwise XOR operation performed on all bitstrings in a given expression. | +| [`bitstring_agg(arg)`](#bitstring_aggarg) | Returns a bitstring with bits set for each distinct position defined in `arg`. | +| [`bitstring_agg(arg, min, max)`](#bitstring_aggarg-min-max) | Returns a bitstring with bits set for each distinct position defined in `arg`. All positions must be within the range [`min`, `max`] or an `Out of Range Error` will be thrown. | + +#### `bit_and(arg)` + +
+ +| **Description** | Returns the bitwise AND operation performed on all bitstrings in a given expression. | +| **Example** | `bit_and(A)` | + +#### `bit_or(arg)` + +
+ +| **Description** | Returns the bitwise OR operation performed on all bitstrings in a given expression. | +| **Example** | `bit_or(A)` | + +#### `bit_xor(arg)` + +
+ +| **Description** | Returns the bitwise XOR operation performed on all bitstrings in a given expression. | +| **Example** | `bit_xor(A)` | + +#### `bitstring_agg(arg)` + +
+ +| **Description** | The `bitstring_agg` function takes any integer type as input and returns a bitstring with bits set for each distinct value. The left-most bit represents the smallest value in the column and the right-most bit the maximum value. If possible, the min and max are retrieved from the column statistics. Otherwise, it is also possible to provide the min and max values. | +| **Example** | `bitstring_agg(A)` | + +> Tip The combination of `bit_count` and `bitstring_agg` can be used as an alternative to `count(DISTINCT ...)`, with possible performance improvements in cases of low cardinality and dense values. + +#### `bitstring_agg(arg, min, max)` + +
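+
+A hypothetical sketch of the bounded variant described below, which also illustrates the `bit_count` tip above; the inline relation `t(x)` stands in for a real integer column:
+
+```sql
+-- distinct values 2, 4 and 7 within the declared range [1, 8]
+SELECT
+    bitstring_agg(x, 1, 8)            AS bits,        -- expected: 01010010
+    bit_count(bitstring_agg(x, 1, 8)) AS n_distinct   -- expected: 3, matching count(DISTINCT x)
+FROM (VALUES (2), (4), (4), (7)) t(x);
+```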
+ +| **Description** | Returns a bitstring with bits set for each distinct position defined in `arg`. All positions must be within the range [`min`, `max`] or an `Out of Range Error` will be thrown. | +| **Example** | `bitstring_agg(A, 1, 42)` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/blob.md b/docs/archive/1.0/sql/functions/blob.md new file mode 100644 index 00000000000..78ee5eddd2d --- /dev/null +++ b/docs/archive/1.0/sql/functions/blob.md @@ -0,0 +1,62 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/blob +title: Blob Functions +--- + + + +This section describes functions and operators for examining and manipulating [`BLOB` values]({% link docs/archive/1.0/sql/data_types/blob.md %}). + + + +| Name | Description | +|:--|:-------| +| [`blob || blob`](#blob--blob) | `BLOB` concatenation. | +| [`decode(blob)`](#decodeblob) | Converts `blob` to `VARCHAR`. Fails if `blob` is not valid UTF-8. | +| [`encode(string)`](#encodestring) | Converts the `string` to `BLOB`. Converts UTF-8 characters into literal encoding. | +| [`octet_length(blob)`](#octet_lengthblob) | Number of bytes in `blob`. | +| [`read_blob(source)`](#read_blobsource) | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `BLOB`. See the [`read_blob` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_blob) for more details. | + + + +#### `blob || blob` + +
+ +| **Description** | `BLOB` concatenation. | +| **Example** | `'\xAA'::BLOB || '\xBB'::BLOB` | +| **Result** | `\xAA\xBB` | + +#### `decode(blob)` + +
+ +| **Description** | Convert `blob` to `VARCHAR`. Fails if `blob` is not valid UTF-8. | +| **Example** | `decode('\xC3\xBC'::BLOB)` | +| **Result** | `ü` | + +#### `encode(string)` + +
+ +| **Description** | Converts the `string` to `BLOB`. Converts UTF-8 characters into literal encoding. | +| **Example** | `encode('my_string_with_ü')` | +| **Result** | `my_string_with_\xC3\xBC` | + +#### `octet_length(blob)` + +
+ +| **Description** | Number of bytes in `blob`. | +| **Example** | `octet_length('\xAA\xBB'::BLOB)` | +| **Result** | `2` | + +#### `read_blob(source)` + +
+ +| **Description** | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `BLOB`. See the [`read_blob` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_blob) for more details. | +| **Example** | `read_blob('hello.bin')` | +| **Result** | `hello\x0A` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/char.md b/docs/archive/1.0/sql/functions/char.md new file mode 100644 index 00000000000..8d0aac5bd1d --- /dev/null +++ b/docs/archive/1.0/sql/functions/char.md @@ -0,0 +1,1044 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/char +title: Text Functions +--- + + + +## Text Functions and Operators + +This section describes functions and operators for examining and manipulating [`STRING` values]({% link docs/archive/1.0/sql/data_types/text.md %}). + + + +| Name | Description | +|:--|:-------| +| [`string ^@ search_string`](#string--search_string) | Return true if `string` begins with `search_string`. | +| [`string || string`](#string--string) | String concatenation. | +| [`string[index]`](#stringindex) | Extract a single character using a (1-based) index. | +| [`string[begin:end]`](#stringbeginend) | Extract a string using slice conventions (like in Python). Missing `begin` or `end` arguments are interpreted as the beginning or end of the list respectively. Negative values are accepted. | +| [`string LIKE target`](#string-like-target) | Returns true if the `string` matches the like specifier (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})). | +| [`string SIMILAR TO regex`](#string-similar-to-regex) | Returns `true` if the `string` matches the `regex`; identical to `regexp_full_match` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})). | +| [`array_extract(list, index)`](#array_extractlist-index) | Extract a single character using a (1-based) index. | +| [`array_slice(list, begin, end)`](#array_slicelist-begin-end) | Extract a string using slice conventions. Negative values are accepted. | +| [`ascii(string)`](#asciistring) | Returns an integer that represents the Unicode code point of the first character of the `string`. | +| [`bar(x, min, max[, width])`](#barx-min-max-width) | Draw a band whose width is proportional to (`x - min`) and equal to `width` characters when `x` = `max`. `width` defaults to 80. | +| [`bit_length(string)`](#bit_lengthstring) | Number of bits in a string. | +| [`chr(x)`](#chrx) | Returns a character which is corresponding the ASCII code value or Unicode code point. | +| [`concat_ws(separator, string, ...)`](#concat_wsseparator-string-) | Concatenate strings together separated by the specified separator. | +| [`concat(string, ...)`](#concatstring-) | Concatenate many strings together. | +| [`contains(string, search_string)`](#containsstring-search_string) | Return true if `search_string` is found within `string`. | +| [`ends_with(string, search_string)`](#ends_withstring-search_string) | Return true if `string` ends with `search_string`. | +| [`format_bytes(bytes)`](#format_bytesbytes) | Converts bytes to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.). | +| [`format(format, parameters, ...)`](#formatformat-parameters-) | Formats a string using the [fmt syntax](#fmt-syntax). | +| [`from_base64(string)`](#from_base64string) | Convert a base64 encoded string to a character string. 
| +| [`greatest(x1, x2, ...)`](#greatestx1-x2-) | Selects the largest value using lexicographical ordering. Note that lowercase characters are considered “larger” than uppercase characters and [collations]({% link docs/archive/1.0/sql/expressions/collations.md %}) are not supported. | +| [`hash(value)`](#hashvalue) | Returns a `UBIGINT` with the hash of the `value`. | +| [`ilike_escape(string, like_specifier, escape_character)`](#ilike_escapestring-like_specifier-escape_character) | Returns true if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. | +| [`instr(string, search_string)`](#instrstring-search_string) | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. | +| [`least(x1, x2, ...)`](#leastx1-x2-) | Selects the smallest value using lexicographical ordering. Note that uppercase characters are considered “smaller” than lowercase characters, and [collations]({% link docs/archive/1.0/sql/expressions/collations.md %}) are not supported. | +| [`left_grapheme(string, count)`](#left_graphemestring-count) | Extract the left-most grapheme clusters. | +| [`left(string, count)`](#leftstring-count) | Extract the left-most count characters. | +| [`length_grapheme(string)`](#length_graphemestring) | Number of grapheme clusters in `string`. | +| [`length(string)`](#lengthstring) | Number of characters in `string`. | +| [`like_escape(string, like_specifier, escape_character)`](#like_escapestring-like_specifier-escape_character) | Returns true if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. | +| [`lower(string)`](#lowerstring) | Convert `string` to lower case. | +| [`lpad(string, count, character)`](#lpadstring-count-character) | Pads the `string` with the character from the left until it has count characters. | +| [`ltrim(string, characters)`](#ltrimstring-characters) | Removes any occurrences of any of the `characters` from the left side of the `string`. | +| [`ltrim(string)`](#ltrimstring) | Removes any spaces from the left side of the `string`. | +| [`md5(value)`](#md5value) | Returns the [MD5 hash](https://en.wikipedia.org/wiki/MD5) of the `value`. | +| [`nfc_normalize(string)`](#nfc_normalizestring) | Convert string to Unicode NFC normalized string. Useful for comparisons and ordering if text data is mixed between NFC normalized and not. | +| [`not_ilike_escape(string, like_specifier, escape_character)`](#not_ilike_escapestring-like_specifier-escape_character) | Returns false if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. | +| [`not_like_escape(string, like_specifier, escape_character)`](#not_like_escapestring-like_specifier-escape_character) | Returns false if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. 
| +| [`ord(string)`](#ordstring) | Return ASCII character code of the leftmost character in a string. | +| [`parse_dirname(path, separator)`](#parse_dirnamepath-separator) | Returns the top-level directory name from the given path. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| [`parse_dirpath(path, separator)`](#parse_dirpathpath-separator) | Returns the head of the path (the pathname until the last slash) similarly to Python's [`os.path.dirname`](https://docs.python.org/3.7/library/os.path.html#os.path.dirname) function. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| [`parse_filename(path, trim_extension, separator)`](#parse_filenamepath-trim_extension-separator) | Returns the last component of the path similarly to Python's [`os.path.basename`](https://docs.python.org/3.7/library/os.path.html#os.path.basename) function. If `trim_extension` is true, the file extension will be removed (defaults to `false`). `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| [`parse_path(path, separator)`](#parse_pathpath-separator) | Returns a list of the components (directories and filename) in the path similarly to Python's [`pathlib.parts`](https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.parts) function. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| [`position(search_string IN string)`](#positionsearch_string-in-string) | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. | +| [`printf(format, parameters...)`](#printfformat-parameters) | Formats a `string` using [printf syntax](#printf-syntax). | +| [`read_text(source)`](#read_textsource) | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `VARCHAR`. The file content is first validated to be valid UTF-8. If `read_text` attempts to read a file with invalid UTF-8 an error is thrown suggesting to use `read_blob` instead. See the [`read_text` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_text) for more details. | +| [`regexp_escape(string)`](#regexp_escapestring) | Escapes special patterns to turn `string` into a regular expression similarly to Python's [`re.escape` function](https://docs.python.org/3/library/re.html#re.escape). | +| [`regexp_extract(string, pattern[, group = 0])`](#regexp_extractstring-pattern-group--0) | If `string` contains the regexp `pattern`, returns the capturing group specified by optional parameter `group` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_extract)). | +| [`regexp_extract(string, pattern, name_list)`](#regexp_extractstring-pattern-name_list) | If `string` contains the regexp `pattern`, returns the capturing groups as a struct with corresponding names from `name_list` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_extract)). | +| [`regexp_extract_all(string, regex[, group = 0])`](#regexp_extract_allstring-regex-group--0) | Split the `string` along the `regex` and extract all occurrences of `group`. | +| [`regexp_full_match(string, regex)`](#regexp_full_matchstring-regex) | Returns `true` if the entire `string` matches the `regex` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})). 
| +| [`regexp_matches(string, pattern)`](#regexp_matchesstring-pattern) | Returns `true` if `string` contains the regexp `pattern`, `false` otherwise (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_matches)). | +| [`regexp_replace(string, pattern, replacement)`](#regexp_replacestring-pattern-replacement) | If `string` contains the regexp `pattern`, replaces the matching part with `replacement` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_replace)). | +| [`regexp_split_to_array(string, regex)`](#regexp_split_to_arraystring-regex) | Splits the `string` along the `regex`. | +| [`regexp_split_to_table(string, regex)`](#regexp_split_to_tablestring-regex) | Splits the `string` along the `regex` and returns a row for each part. | +| [`repeat(string, count)`](#repeatstring-count) | Repeats the `string` `count` number of times. | +| [`replace(string, source, target)`](#replacestring-source-target) | Replaces any occurrences of the `source` with `target` in `string`. | +| [`reverse(string)`](#reversestring) | Reverses the `string`. | +| [`right_grapheme(string, count)`](#right_graphemestring-count) | Extract the right-most `count` grapheme clusters. | +| [`right(string, count)`](#rightstring-count) | Extract the right-most `count` characters. | +| [`rpad(string, count, character)`](#rpadstring-count-character) | Pads the `string` with the character from the right until it has `count` characters. | +| [`rtrim(string, characters)`](#rtrimstring-characters) | Removes any occurrences of any of the `characters` from the right side of the `string`. | +| [`rtrim(string)`](#rtrimstring) | Removes any spaces from the right side of the `string`. | +| [`sha256(value)`](#sha256value) | Returns a `VARCHAR` with the SHA-256 hash of the `value`. | +| [`split_part(string, separator, index)`](#split_partstring-separator-index) | Split the `string` along the `separator` and return the data at the (1-based) `index` of the list. If the `index` is outside the bounds of the list, return an empty string (to match PostgreSQL's behavior). | +| [`starts_with(string, search_string)`](#starts_withstring-search_string) | Return true if `string` begins with `search_string`. | +| [`str_split_regex(string, regex)`](#str_split_regexstring-regex) | Splits the `string` along the `regex`. | +| [`string_split_regex(string, regex)`](#string_split_regexstring-regex) | Splits the `string` along the `regex`. | +| [`string_split(string, separator)`](#string_splitstring-separator) | Splits the `string` along the `separator`. | +| [`strip_accents(string)`](#strip_accentsstring) | Strips accents from `string`. | +| [`strlen(string)`](#strlenstring) | Number of bytes in `string`. | +| [`strpos(string, search_string)`](#strposstring-search_string) | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. | +| [`substring(string, start, length)`](#substringstring-start-length) | Extract substring of `length` characters starting from character `start`. Note that a `start` value of `1` refers to the `first` character of the string. | +| [`substring_grapheme(string, start, length)`](#substring_graphemestring-start-length) | Extract substring of `length` grapheme clusters starting from character `start`. Note that a `start` value of `1` refers to the `first` character of the string. | +| [`to_base64(blob)`](#to_base64blob) | Convert a blob to a base64 encoded string. 
| +| [`trim(string, characters)`](#trimstring-characters) | Removes any occurrences of any of the `characters` from either side of the `string`. | +| [`trim(string)`](#trimstring) | Removes any spaces from either side of the `string`. | +| [`unicode(string)`](#unicodestring) | Returns the Unicode code of the first character of the `string`. | +| [`upper(string)`](#upperstring) | Convert `string` to upper case. | + + + +#### `string ^@ search_string` + +
+ +| **Description** | Return true if `string` begins with `search_string`. | +| **Example** | `'abc' ^@ 'a'` | +| **Result** | `true` | +| **Alias** | `starts_with` | + +#### `string || string` + +
+ +| **Description** | String concatenation. | +| **Example** | `'Duck' || 'DB'` | +| **Result** | `DuckDB` | +| **Alias** | `concat` | + +#### `string[index]` + +
+ +| **Description** | Extract a single character using a (1-based) index. | +| **Example** | `'DuckDB'[4]` | +| **Result** | `k` | +| **Alias** | `array_extract` | + +#### `string[begin:end]` + +
+
+| **Description** | Extract a string using slice conventions similar to Python. Missing `begin` or `end` arguments are interpreted as the beginning or end of the list respectively. Negative values are accepted. |
+| **Example** | `'DuckDB'[:4]` |
+| **Result** | `Duck` |
+| **Alias** | `array_slice` |
+
+More examples:
+
+```sql
+SELECT 'abcdefghi' AS str
+, str[3] -- get char at position 3, 'c'
+, str[3:5] -- substring from position 3 up to and including position 5, 'cde'
+, str[6:] -- substring from position 6 till the end, 'fghi'
+, str[:3] -- substring from the start up to and including position 3, 'abc'
+, str[3:-4] -- substring from position 3 up to and including the 4th position from the end, 'cdef'
+```
+
+#### `string LIKE target`
+
+
+ +| **Description** | Returns true if the `string` matches the like specifier (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})). | +| **Example** | `'hello' LIKE '%lo'` | +| **Result** | `true` | + +#### `string SIMILAR TO regex` + +
+ +| **Description** | Returns `true` if the `string` matches the `regex`; identical to `regexp_full_match` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) | +| **Example** | `'hello' SIMILAR TO 'l+'` | +| **Result** | `false` | + +#### `array_extract(list, index)` + +
+ +| **Description** | Extract a single character using a (1-based) index. | +| **Example** | `array_extract('DuckDB', 2)` | +| **Result** | `u` | +| **Aliases** | `list_element`, `list_extract` | + +#### `array_slice(list, begin, end)` + +
+ +| **Description** | Extract a string using slice conventions (like in Python). Negative values are accepted. | +| **Example 1** | `array_slice('DuckDB', 3, 4)` | +| **Result** | `ck` | +| **Example 2** | `array_slice('DuckDB', 3, NULL)` | +| **Result** | `NULL` | +| **Example 3** | `array_slice('DuckDB', 0, -3)` | +| **Result** | `Duck` | + +#### `ascii(string)` + +
+ +| **Description** | Returns an integer that represents the Unicode code point of the first character of the `string`. | +| **Example** | `ascii('Ω')` | +| **Result** | `937` | + +#### `bar(x, min, max[, width])` + +
+ +| **Description** | Draw a band whose width is proportional to (`x - min`) and equal to `width` characters when `x` = `max`. `width` defaults to 80. | +| **Example** | `bar(5, 0, 20, 10)` | +| **Result** | `██▌` | + +#### `bit_length(string)` + +
+ +| **Description** | Number of bits in a string. | +| **Example** | `bit_length('abc')` | +| **Result** | `24` | + +#### `chr(x)` + +
+
+| **Description** | Returns the character corresponding to the ASCII code value or Unicode code point. |
+| **Example** | `chr(65)` |
+| **Result** | A |
+
+#### `concat_ws(separator, string, ...)`
+
+
+ +| **Description** | Concatenate strings together separated by the specified separator. | +| **Example** | `concat_ws(', ', 'Banana', 'Apple', 'Melon')` | +| **Result** | `Banana, Apple, Melon` | + +#### `concat(string, ...)` + +
+ +| **Description** | Concatenate many strings together. | +| **Example** | `concat('Hello', ' ', 'World')` | +| **Result** | `Hello World` | + +#### `contains(string, search_string)` + +
+ +| **Description** | Return true if `search_string` is found within `string`. | +| **Example** | `contains('abc', 'a')` | +| **Result** | `true` | + +#### `ends_with(string, search_string)` + +
+ +| **Description** | Return true if `string` ends with `search_string`. | +| **Example** | `ends_with('abc', 'c')` | +| **Result** | `true` | +| **Alias** | `suffix` | + +#### `format_bytes(bytes)` + +
+ +| **Description** | Converts bytes to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.). | +| **Example** | `format_bytes(16384)` | +| **Result** | `16.0 KiB` | + +#### `format(format, parameters, ...)` + +
+ +| **Description** | Formats a string using the [fmt syntax](#fmt-syntax). | +| **Example** | `format('Benchmark "{}" took {} seconds', 'CSV', 42)` | +| **Result** | `Benchmark "CSV" took 42 seconds` | + +#### `from_base64(string)` + +
+ +| **Description** | Convert a base64 encoded string to a character string. | +| **Example** | `from_base64('QQ==')` | +| **Result** | `'A'` | + +#### `greatest(x1, x2, ...)` + +
+ +| **Description** | Selects the largest value using lexicographical ordering. Note that lowercase characters are considered “larger” than uppercase characters and [collations]({% link docs/archive/1.0/sql/expressions/collations.md %}) are not supported. | +| **Example** | `greatest('abc', 'bcd', 'cde', 'EFG')` | +| **Result** | `'cde'` | + +#### `hash(value)` + +
+ +| **Description** | Returns a `UBIGINT` with the hash of the `value`. | +| **Example** | `hash('🦆')` | +| **Result** | `2595805878642663834` | + +#### `ilike_escape(string, like_specifier, escape_character)` + +
+ +| **Description** | Returns true if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. | +| **Example** | `ilike_escape('A%c', 'a$%C', '$')` | +| **Result** | `true` | + +#### `instr(string, search_string)` + +
+ +| **Description** | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. | +| **Example** | `instr('test test', 'es')` | +| **Result** | 2 | + +#### `least(x1, x2, ...)` + +
+ +| **Description** | Selects the smallest value using lexicographical ordering. Note that uppercase characters are considered “smaller” than lowercase characters, and [collations]({% link docs/archive/1.0/sql/expressions/collations.md %}) are not supported. | +| **Example** | `least('abc', 'BCD', 'cde', 'EFG')` | +| **Result** | `'BCD'` | + +#### `left_grapheme(string, count)` + +
+ +| **Description** | Extract the left-most grapheme clusters. | +| **Example** | `left_grapheme('🤦🏼‍♂️🤦🏽‍♀️', 1)` | +| **Result** | `🤦🏼‍♂️` | + +#### `left(string, count)` + +
+ +| **Description** | Extract the left-most count characters. | +| **Example** | `left('Hello🦆', 2)` | +| **Result** | `He` | + +#### `length_grapheme(string)` + +
+ +| **Description** | Number of grapheme clusters in `string`. | +| **Example** | `length_grapheme('🤦🏼‍♂️🤦🏽‍♀️')` | +| **Result** | `2` | + +#### `length(string)` + +
+ +| **Description** | Number of characters in `string`. | +| **Example** | `length('Hello🦆')` | +| **Result** | `6` | + +#### `like_escape(string, like_specifier, escape_character)` + +
+ +| **Description** | Returns true if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. | +| **Example** | `like_escape('a%c', 'a$%c', '$')` | +| **Result** | `true` | + +#### `lower(string)` + +
+ +| **Description** | Convert `string` to lower case. | +| **Example** | `lower('Hello')` | +| **Result** | `hello` | +| **Alias** | `lcase` | + +#### `lpad(string, count, character)` + +
+ +| **Description** | Pads the `string` with the character from the left until it has count characters. | +| **Example** | `lpad('hello', 8, '>')` | +| **Result** | `>>>hello` | + +#### `ltrim(string, characters)` + +
+ +| **Description** | Removes any occurrences of any of the `characters` from the left side of the `string`. | +| **Example** | `ltrim('>>>>test<<', '><')` | +| **Result** | `test<<` | + +#### `ltrim(string)` + +
+ +| **Description** | Removes any spaces from the left side of the `string`. In the example, the `␣` symbol denotes a space character. | +| **Example** | `ltrim('␣␣␣␣test␣␣')` | +| **Result** | `test␣␣` | + +#### `md5(value)` + +
+ +| **Description** | Returns the [MD5 hash](https://en.wikipedia.org/wiki/MD5) of the `value`. | +| **Example** | `md5('123')` | +| **Result** | `202cb962ac59075b964b07152d234b70` | + +#### `nfc_normalize(string)` + +
+ +| **Description** | Convert string to Unicode NFC normalized string. Useful for comparisons and ordering if text data is mixed between NFC normalized and not. | +| **Example** | `nfc_normalize('ardèch')` | +| **Result** | `ardèch` | + +#### `not_ilike_escape(string, like_specifier, escape_character)` + +
+
+| **Description** | Returns false if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-insensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
+| **Example** | `not_ilike_escape('A%c', 'a$%C', '$')` |
+| **Result** | `false` |
+
+#### `not_like_escape(string, like_specifier, escape_character)`
+
+
+
+| **Description** | Returns false if the `string` matches the `like_specifier` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})) using case-sensitive matching. `escape_character` is used to search for wildcard characters in the `string`. |
+| **Example** | `not_like_escape('a%c', 'a$%c', '$')` |
+| **Result** | `false` |
+
+#### `ord(string)`
+
+
+ +| **Description** | Return ASCII character code of the leftmost character in a string. | +| **Example** | `ord('ü')` | +| **Result** | `252` | + +#### `parse_dirname(path, separator)` + +
+ +| **Description** | Returns the top-level directory name from the given path. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| **Example** | `parse_dirname('path/to/file.csv', 'system')` | +| **Result** | `path` | + +#### `parse_dirpath(path, separator)` + +
+ +| **Description** | Returns the head of the path (the pathname until the last slash) similarly to Python's [`os.path.dirname`](https://docs.python.org/3.7/library/os.path.html#os.path.dirname) function. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| **Example** | `parse_dirpath('/path/to/file.csv', 'forward_slash')` | +| **Result** | `/path/to` | + +#### `parse_filename(path, trim_extension, separator)` + +
+ +| **Description** | Returns the last component of the path similarly to Python's [`os.path.basename`](https://docs.python.org/3.7/library/os.path.html#os.path.basename) function. If `trim_extension` is true, the file extension will be removed (defaults to `false`). `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| **Example** | `parse_filename('path/to/file.csv', true, 'system')` | +| **Result** | `file` | + +#### `parse_path(path, separator)` + +
+ +| **Description** | Returns a list of the components (directories and filename) in the path similarly to Python's [`pathlib.parts`](https://docs.python.org/3/library/pathlib.html#pathlib.PurePath.parts) function. `separator` options: `system`, `both_slash` (default), `forward_slash`, `backslash`. | +| **Example** | `parse_path('/path/to/file.csv', 'system')` | +| **Result** | `[/, path, to, file.csv]` | + +#### `position(search_string IN string)` + +
+ +| **Description** | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. | +| **Example** | `position('b' IN 'abc')` | +| **Result** | `2` | + +#### `printf(format, parameters...)` + +
+ +| **Description** | Formats a `string` using [printf syntax](#printf-syntax). | +| **Example** | `printf('Benchmark "%s" took %d seconds', 'CSV', 42)` | +| **Result** | `Benchmark "CSV" took 42 seconds` | + +#### `read_text(source)` + +
+ +| **Description** | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `VARCHAR`. The file content is first validated to be valid UTF-8. If `read_text` attempts to read a file with invalid UTF-8 an error is thrown suggesting to use `read_blob` instead. See the [`read_text` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_text) for more details. | +| **Example** | `read_text('hello.txt')` | +| **Result** | `hello\n` | + +#### `regexp_escape(string)` + +
+ +| **Description** | Escapes special patterns to turn `string` into a regular expression similarly to Python's [`re.escape` function](https://docs.python.org/3/library/re.html#re.escape). | +| **Example** | `regexp_escape('http://d.org')` | +| **Result** | `http\:\/\/d\.org` | + +#### `regexp_extract(string, pattern[, group = 0])` + +
+ +| **Description** | If `string` contains the regexp `pattern`, returns the capturing group specified by optional parameter `group` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_extract)). | +| **Example** | `regexp_extract('hello_world', '([a-z ]+)_?', 1)` | +| **Result** | `hello` | + +#### `regexp_extract(string, pattern, name_list)` + +
+ +| **Description** | If `string` contains the regexp `pattern`, returns the capturing groups as a struct with corresponding names from `name_list` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_extract)). | +| **Example** | `regexp_extract('2023-04-15', '(\d+)-(\d+)-(\d+)', ['y', 'm', 'd'])` | +| **Result** | `{'y':'2023', 'm':'04', 'd':'15'}` | + +#### `regexp_extract_all(string, regex[, group = 0])` + +
+ +| **Description** | Split the `string` along the `regex` and extract all occurrences of `group`. | +| **Example** | `regexp_extract_all('hello_world', '([a-z ]+)_?', 1)` | +| **Result** | `[hello, world]` | + +#### `regexp_full_match(string, regex)` + +
+ +| **Description** | Returns `true` if the entire `string` matches the `regex` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %})). | +| **Example** | `regexp_full_match('anabanana', '(an)')` | +| **Result** | `false` | + +#### `regexp_matches(string, pattern)` + +
+ +| **Description** | Returns `true` if `string` contains the regexp `pattern`, `false` otherwise (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_matches)). | +| **Example** | `regexp_matches('anabanana', '(an)')` | +| **Result** | `true` | + +#### `regexp_replace(string, pattern, replacement)` + +
+ +| **Description** | If `string` contains the regexp `pattern`, replaces the matching part with `replacement` (see [Pattern Matching]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#using-regexp_replace)). | +| **Example** | `regexp_replace('hello', '[lo]', '-')` | +| **Result** | `he-lo` | + +#### `regexp_split_to_array(string, regex)` + +
+ +| **Description** | Splits the `string` along the `regex`. | +| **Example** | `regexp_split_to_array('hello world; 42', ';? ')` | +| **Result** | `['hello', 'world', '42']` | +| **Aliases** | `string_split_regex`, `str_split_regex` | + +#### `regexp_split_to_table(string, regex)` + +
+ +| **Description** | Splits the `string` along the `regex` and returns a row for each part. | +| **Example** | `regexp_split_to_table('hello world; 42', ';? ')` | +| **Result** | Two rows: `'hello'`, `'world'` | + +#### `repeat(string, count)` + +
+ +| **Description** | Repeats the `string` `count` number of times. | +| **Example** | `repeat('A', 5)` | +| **Result** | `AAAAA` | + +#### `replace(string, source, target)` + +
+ +| **Description** | Replaces any occurrences of the `source` with `target` in `string`. | +| **Example** | `replace('hello', 'l', '-')` | +| **Result** | `he--o` | + +#### `reverse(string)` + +
+ +| **Description** | Reverses the `string`. | +| **Example** | `reverse('hello')` | +| **Result** | `olleh` | + +#### `right_grapheme(string, count)` + +
+ +| **Description** | Extract the right-most `count` grapheme clusters. | +| **Example** | `right_grapheme('🤦🏼‍♂️🤦🏽‍♀️', 1)` | +| **Result** | `🤦🏽‍♀️` | + +#### `right(string, count)` + +
+ +| **Description** | Extract the right-most `count` characters. | +| **Example** | `right('Hello🦆', 3)` | +| **Result** | `lo🦆` | + +#### `rpad(string, count, character)` + +
+ +| **Description** | Pads the `string` with the character from the right until it has `count` characters. | +| **Example** | `rpad('hello', 10, '<')` | +| **Result** | `hello<<<<<` | + +#### `rtrim(string, characters)` + +
+ +| **Description** | Removes any occurrences of any of the `characters` from the right side of the `string`. | +| **Example** | `rtrim('>>>>test<<', '><')` | +| **Result** | `>>>>test` | + +#### `rtrim(string)` + +
+ +| **Description** | Removes any spaces from the right side of the `string`. In the example, the `␣` symbol denotes a space character. | +| **Example** | `rtrim('␣␣␣␣test␣␣')` | +| **Result** | `␣␣␣␣test` | + +#### `sha256(value)` + +
+ +| **Description** | Returns a `VARCHAR` with the SHA-256 hash of the `value`. | +| **Example** | `sha256('🦆')` | +| **Result** | `d7a5c5e0d1d94c32218539e7e47d4ba9c3c7b77d61332fb60d633dde89e473fb` | + +#### `split_part(string, separator, index)` + +
+ +| **Description** | Split the `string` along the `separator` and return the data at the (1-based) `index` of the list. If the `index` is outside the bounds of the list, return an empty string (to match PostgreSQL's behavior). | +| **Example** | `split_part('a;b;c', ';', 2)` | +| **Result** | `b` | + +#### `starts_with(string, search_string)` + +
+ +| **Description** | Return true if `string` begins with `search_string`. | +| **Example** | `starts_with('abc', 'a')` | +| **Result** | `true` | + +#### `str_split_regex(string, regex)` + +
+ +| **Description** | Splits the `string` along the `regex`. | +| **Example** | `str_split_regex('hello world; 42', ';? ')` | +| **Result** | `['hello', 'world', '42']` | +| **Aliases** | `string_split_regex`, `regexp_split_to_array` | + +#### `string_split_regex(string, regex)` + +
+ +| **Description** | Splits the `string` along the `regex`. | +| **Example** | `string_split_regex('hello world; 42', ';? ')` | +| **Result** | `['hello', 'world', '42']` | +| **Aliases** | `str_split_regex`, `regexp_split_to_array` | + +#### `string_split(string, separator)` + +
+ +| **Description** | Splits the `string` along the `separator`. | +| **Example** | `string_split('hello world', ' ')` | +| **Result** | `['hello', 'world']` | +| **Aliases** | `str_split`, `string_to_array` | + +#### `strip_accents(string)` + +
+ +| **Description** | Strips accents from `string`. | +| **Example** | `strip_accents('mühleisen')` | +| **Result** | `muhleisen` | + +#### `strlen(string)` + +
+ +| **Description** | Number of bytes in `string`. | +| **Example** | `strlen('🦆')` | +| **Result** | `4` | + +#### `strpos(string, search_string)` + +
+ +| **Description** | Return location of first occurrence of `search_string` in `string`, counting from 1. Returns 0 if no match found. | +| **Example** | `strpos('test test', 'es')` | +| **Result** | 2 | +| **Alias** | `instr` | + +#### `substring(string, start, length)` + +
+ +| **Description** | Extract substring of `length` characters starting from character `start`. Note that a `start` value of `1` refers to the `first` character of the string. | +| **Example** | `substring('Hello', 2, 2)` | +| **Result** | `el` | +| **Alias** | `substr` | + +#### `substring_grapheme(string, start, length)` + +
+ +| **Description** | Extract substring of `length` grapheme clusters starting from character `start`. Note that a `start` value of `1` refers to the `first` character of the string. | +| **Example** | `substring_grapheme('🦆🤦🏼‍♂️🤦🏽‍♀️🦆', 3, 2)` | +| **Result** | `🤦🏽‍♀️🦆` | + +#### `to_base64(blob)` + +
+ +| **Description** | Convert a blob to a base64 encoded string. | +| **Example** | `to_base64('A'::blob)` | +| **Result** | `QQ==` | +| **Alias** | `base64` | + +#### `trim(string, characters)` + +
+ +| **Description** | Removes any occurrences of any of the `characters` from either side of the `string`. | +| **Example** | `trim('>>>>test<<', '><')` | +| **Result** | `test` | + +#### `trim(string)` + +
+ +| **Description** | Removes any spaces from either side of the `string`. | +| **Example** | `trim(' test ')` | +| **Result** | `test` | + +#### `unicode(string)` + +
+ +| **Description** | Returns the Unicode code of the first character of the `string`. Returns `-1` when `string` is empty, and `NULL` when `string` is `NULL`. | +| **Example** | `[unicode('âbcd'), unicode('â'), unicode(''), unicode(NULL)]` | +| **Result** | `[226, 226, -1, NULL]` | + +#### `upper(string)` + +
+
+| **Description** | Convert `string` to upper case. |
+| **Example** | `upper('Hello')` |
+| **Result** | `HELLO` |
+| **Alias** | `ucase` |
+
+## Text Similarity Functions
+
+These functions are used to measure the similarity of two strings using various [similarity measures](https://en.wikipedia.org/wiki/Similarity_measure).
+
+| Name | Description |
+|:--|:-------|
+| [`damerau_levenshtein(s1, s2)`](#damerau_levenshteins1-s2) | Extension of Levenshtein distance to also include transposition of adjacent characters as an allowed edit operation. In other words, the minimum number of edit operations (insertions, deletions, substitutions or transpositions) required to change one string to another. Characters of different cases (e.g., `a` and `A`) are considered different. |
+| [`editdist3(s1, s2)`](#editdist3s1-s2) | Alias of `levenshtein` for SQLite compatibility. The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. |
+| [`hamming(s1, s2)`](#hammings1-s2) | The Hamming distance between two strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. |
+| [`jaccard(s1, s2)`](#jaccards1-s2) | The Jaccard similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. |
+| [`jaro_similarity(s1, s2)`](#jaro_similaritys1-s2) | The Jaro similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. |
+| [`jaro_winkler_similarity(s1, s2)`](#jaro_winkler_similaritys1-s2) | The Jaro-Winkler similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. |
+| [`levenshtein(s1, s2)`](#levenshteins1-s2) | The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. |
+| [`mismatches(s1, s2)`](#mismatchess1-s2) | Alias for `hamming(s1, s2)`. The number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. |
+
+#### `damerau_levenshtein(s1, s2)`
+
+
+ +| **Description** | Extension of Levenshtein distance to also include transposition of adjacent characters as an allowed edit operation. In other words, the minimum number of edit operations (insertions, deletions, substitutions or transpositions) required to change one string to another. Characters of different cases (e.g., `a` and `A`) are considered different. | +| **Example** | `damerau_levenshtein('duckdb', 'udckbd')` | +| **Result** | `2` | + +#### `editdist3(s1, s2)` + +
+ +| **Description** | Alias of `levenshtein` for SQLite compatibility. The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. | +| **Example** | `editdist3('duck', 'db')` | +| **Result** | `3` | + +#### `hamming(s1, s2)` + +
+
+| **Description** | The Hamming distance between two strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. |
+| **Example** | `hamming('duck', 'luck')` |
+| **Result** | `1` |
+
+#### `jaccard(s1, s2)`
+
+
+ +| **Description** | The Jaccard similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. | +| **Example** | `jaccard('duck', 'luck')` | +| **Result** | `0.6` | + +#### `jaro_similarity(s1, s2)` + +
+ +| **Description** | The Jaro similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. | +| **Example** | `jaro_similarity('duck', 'duckdb')` | +| **Result** | `0.88` | + +#### `jaro_winkler_similarity(s1, s2)` + +
+ +| **Description** | The Jaro-Winkler similarity between two strings. Characters of different cases (e.g., `a` and `A`) are considered different. Returns a number between 0 and 1. | +| **Example** | `jaro_winkler_similarity('duck', 'duckdb')` | +| **Result** | `0.93` | + +#### `levenshtein(s1, s2)` + +
+ +| **Description** | The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., `a` and `A`) are considered different. | +| **Example** | `levenshtein('duck', 'db')` | +| **Result** | `3` | + +#### `mismatches(s1, s2)` + +
+ +| **Description** | Alias for `hamming(s1, s2)`. The number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., `a` and `A`) are considered different. | +| **Example** | `mismatches('duck', 'luck')` | +| **Result** | `1` | + +## Formatters + +### `fmt` Syntax + +The `format(format, parameters...)` function formats strings, loosely following the syntax of the [{fmt} open-source formatting library](https://fmt.dev/latest/syntax/). + +Format without additional parameters: + +```sql +SELECT format('Hello world'); -- Hello world +``` + +Format a string using {}: + +```sql +SELECT format('The answer is {}', 42); -- The answer is 42 +``` + +Format a string using positional arguments: + +```sql +SELECT format('I''d rather be {1} than {0}.', 'right', 'happy'); -- I'd rather be happy than right. +``` + +#### Format Specifiers + +
+ +| Specifier | Description | Example | +|:-|:------|:---| +| `{:d}` | integer | `123456` | +| `{:E}` | scientific notation | `3.141593E+00` | +| `{:f}` | float | `4.560000` | +| `{:o}` | octal | `361100` | +| `{:s}` | string | `asd` | +| `{:x}` | hexadecimal | `1e240` | +| `{:tX}` | integer, `X` is the thousand separator | `123 456` | + +#### Formatting Types + +Integers: + +```sql +SELECT format('{} + {} = {}', 3, 5, 3 + 5); -- 3 + 5 = 8 +``` + +Booleans: + +```sql +SELECT format('{} != {}', true, false); -- true != false +``` + +Format datetime values: + +```sql +SELECT format('{}', DATE '1992-01-01'); -- 1992-01-01 +SELECT format('{}', TIME '12:01:00'); -- 12:01:00 +SELECT format('{}', TIMESTAMP '1992-01-01 12:01:00'); -- 1992-01-01 12:01:00 +``` + +Format BLOB: + +```sql +SELECT format('{}', BLOB '\x00hello'); -- \x00hello +``` + +Pad integers with 0s: + +```sql +SELECT format('{:04d}', 33); -- 0033 +``` + +Create timestamps from integers: + +```sql +SELECT format('{:02d}:{:02d}:{:02d} {}', 12, 3, 16, 'AM'); -- 12:03:16 AM +``` + +Convert to hexadecimal: + +```sql +SELECT format('{:x}', 123_456_789); -- 75bcd15 +``` + +Convert to binary: + +```sql +SELECT format('{:b}', 123_456_789); -- 111010110111100110100010101 +``` + +#### Print Numbers with Thousand Separators + +```sql +SELECT format('{:,}', 123_456_789); -- 123,456,789 +SELECT format('{:t.}', 123_456_789); -- 123.456.789 +SELECT format('{:''}', 123_456_789); -- 123'456'789 +SELECT format('{:_}', 123_456_789); -- 123_456_789 +SELECT format('{:t }', 123_456_789); -- 123 456 789 +SELECT format('{:tX}', 123_456_789); -- 123X456X789 +``` + +### `printf` Syntax + +The `printf(format, parameters...)` function formats strings using the [`printf` syntax](https://cplusplus.com/reference/cstdio/printf/). + +Format without additional parameters: + +```sql +SELECT printf('Hello world'); +``` + +```text +Hello world +``` + +Format a string using arguments in a given order: + +```sql +SELECT printf('The answer to %s is %d', 'life', 42); +``` + +```text +The answer to life is 42 +``` + +Format a string using positional arguments `%position$formatter`, e.g., the second parameter as a string is encoded as `%2$s`: + +```sql +SELECT printf('I''d rather be %2$s than %1$s.', 'right', 'happy'); +``` + +```text +I'd rather be happy than right. +``` + +#### Format Specifiers + +
+ +| Specifier | Description | Example | +|:-|:------|:---| +| `%c` | character code to character | `a` | +| `%d` | integer | `123456` | +| `%Xd` | integer with thousand seperarator `X` from `,`, `.`, `''`, `_` | `123_456` | +| `%E` | scientific notation | `3.141593E+00` | +| `%f` | float | `4.560000` | +| `%hd` | integer | `123456` | +| `%hhd` | integer | `123456` | +| `%lld` | integer | `123456` | +| `%o` | octal | `361100` | +| `%s` | string | `asd` | +| `%x` | hexadecimal | `1e240` | + +#### Formatting Types + +Integers: + +```sql +SELECT printf('%d + %d = %d', 3, 5, 3 + 5); -- 3 + 5 = 8 +``` + +Booleans: + +```sql +SELECT printf('%s != %s', true, false); -- true != false +``` + +Format datetime values: + +```sql +SELECT printf('%s', DATE '1992-01-01'); -- 1992-01-01 +SELECT printf('%s', TIME '12:01:00'); -- 12:01:00 +SELECT printf('%s', TIMESTAMP '1992-01-01 12:01:00'); -- 1992-01-01 12:01:00 +``` + +Format BLOB: + +```sql +SELECT printf('%s', BLOB '\x00hello'); -- \x00hello +``` + +Pad integers with 0s: + +```sql +SELECT printf('%04d', 33); -- 0033 +``` + +Create timestamps from integers: + +```sql +SELECT printf('%02d:%02d:%02d %s', 12, 3, 16, 'AM'); -- 12:03:16 AM +``` + +Convert to hexadecimal: + +```sql +SELECT printf('%x', 123_456_789); -- 75bcd15 +``` + +Convert to binary: + +```sql +SELECT printf('%b', 123_456_789); -- 111010110111100110100010101 +``` + +#### Thousand Separators + +```sql +SELECT printf('%,d', 123_456_789); -- 123,456,789 +SELECT printf('%.d', 123_456_789); -- 123.456.789 +SELECT printf('%''d', 123_456_789); -- 123'456'789 +SELECT printf('%_d', 123_456_789); -- 123_456_789 +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/date.md b/docs/archive/1.0/sql/functions/date.md new file mode 100644 index 00000000000..1a60d9aab16 --- /dev/null +++ b/docs/archive/1.0/sql/functions/date.md @@ -0,0 +1,253 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/date +title: Date Functions +--- + + + +This section describes functions and operators for examining and manipulating [`DATE`]({% link docs/archive/1.0/sql/data_types/date.md %}) values. + +## Date Operators + +The table below shows the available mathematical operators for `DATE` types. + +| Operator | Description | Example | Result | +|:-|:--|:---|:--| +| `+` | Addition of days (integers) | `DATE '1992-03-22' + 5` | `1992-03-27` | +| `+` | Addition of an `INTERVAL` | `DATE '1992-03-22' + INTERVAL 5 DAY` | `1992-03-27` | +| `+` | Addition of a variable `INTERVAL` | `SELECT DATE '1992-03-22' + INTERVAL (d.days) DAY FROM (VALUES (5), (11)) AS d(days)` | `1992-03-27` and `1992-04-02` | +| `-` | Subtraction of `DATE`s | `DATE '1992-03-27' - DATE '1992-03-22'` | `5` | +| `-` | Subtraction of an `INTERVAL` | `DATE '1992-03-27' - INTERVAL 5 DAY` | `1992-03-22` | +| `-` | Subtraction of a variable `INTERVAL` | `SELECT DATE '1992-03-27' - INTERVAL (d.days) DAY FROM (VALUES (5), (11)) AS d(days)` | `1992-03-22` and `1992-03-16` | + +Adding to or subtracting from [infinite values]({% link docs/archive/1.0/sql/data_types/date.md %}#special-values) produces the same infinite value. + +## Date Functions + +The table below shows the available functions for `DATE` types. +Dates can also be manipulated with the [timestamp functions]({% link docs/archive/1.0/sql/functions/timestamp.md %}) through type promotion. + +| Name | Description | +|:--|:-------| +| [`current_date`](#current_date) | Current date (at start of current transaction). 
| +| [`date_add(date, interval)`](#date_adddate-interval) | Add the interval to the date. | +| [`date_diff(part, startdate, enddate)`](#date_diffpart-startdate-enddate) | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the dates. | +| [`date_part(part, date)`](#date_partpart-date) | Get the [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| [`date_sub(part, startdate, enddate)`](#date_subpart-startdate-enddate) | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the dates. | +| [`date_trunc(part, date)`](#date_truncpart-date) | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| [`datediff(part, startdate, enddate)`](#datediffpart-startdate-enddate) | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the dates. Alias of `date_diff`. | +| [`datepart(part, date)`](#datepartpart-date) | Get the [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). Alias of `date_part`. | +| [`datesub(part, startdate, enddate)`](#datesubpart-startdate-enddate) | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the dates. Alias of `date_sub`. | +| [`datetrunc(part, date)`](#datetruncpart-date) | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). Alias of `date_trunc`. | +| [`dayname(date)`](#daynamedate) | The (English) name of the weekday. | +| [`extract(part from date)`](#extractpart-from-date) | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) from a date. | +| [`greatest(date, date)`](#greatestdate-date) | The later of two dates. | +| [`isfinite(date)`](#isfinitedate) | Returns true if the date is finite, false otherwise. | +| [`isinf(date)`](#isinfdate) | Returns true if the date is infinite, false otherwise. | +| [`last_day(date)`](#last_daydate) | The last day of the corresponding month in the date. | +| [`least(date, date)`](#leastdate-date) | The earlier of two dates. | +| [`make_date(year, month, day)`](#make_dateyear-month-day) | The date for the given parts. | +| [`monthname(date)`](#monthnamedate) | The (English) name of the month. | +| [`strftime(date, format)`](#strftimedate-format) | Converts a date to a string according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | +| [`time_bucket(bucket_width, date[, offset])`](#time_bucketbucket_width-date-offset) | Truncate `date` by the specified interval `bucket_width`. Buckets are offset by `offset` interval. | +| [`time_bucket(bucket_width, date[, origin])`](#time_bucketbucket_width-date-origin) | Truncate `date` by the specified interval `bucket_width`. Buckets are aligned relative to `origin` date. `origin` defaults to 2000-01-03 for buckets that don't include a month or year interval, and to 2000-01-01 for month and year buckets. | +| [`today()`](#today) | Current date (start of current transaction). | + +#### `current_date` + +
+ +| **Description** | Current date (at start of current transaction). | +| **Example** | `current_date` | +| **Result** | `2022-10-08` | + +#### `date_add(date, interval)` + +
+ +| **Description** | Add the interval to the date. | +| **Example** | `date_add(DATE '1992-09-15', INTERVAL 2 MONTH)` | +| **Result** | `1992-11-15` | + +#### `date_diff(part, startdate, enddate)` + +
+ +| **Description** | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the dates. | +| **Example** | `date_diff('month', DATE '1992-09-15', DATE '1992-11-14')` | +| **Result** | `2` | + +#### `date_part(part, date)` + +
+ +| **Description** | Get the [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| **Example** | `date_part('year', DATE '1992-09-20')` | +| **Result** | `1992` | + +#### `date_sub(part, startdate, enddate)` + +
+ +| **Description** | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the dates. | +| **Example** | `date_sub('month', DATE '1992-09-15', DATE '1992-11-14')` | +| **Result** | `1` | + +#### `date_trunc(part, date)` + +
+ +| **Description** | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| **Example** | `date_trunc('month', DATE '1992-03-07')` | +| **Result** | `1992-03-01` | + +#### `datediff(part, startdate, enddate)` + +
+ +| **Description** | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the dates. | +| **Example** | `datediff('month', DATE '1992-09-15', DATE '1992-11-14')` | +| **Result** | `2` | +| **Alias** | `date_diff`. | + +#### `datepart(part, date)` + +
+ +| **Description** | Get the [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| **Example** | `datepart('year', DATE '1992-09-20')` | +| **Result** | `1992` | +| **Alias** | `date_part`. | + +#### `datesub(part, startdate, enddate)` + +
+ +| **Description** | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the dates. | +| **Example** | `datesub('month', DATE '1992-09-15', DATE '1992-11-14')` | +| **Result** | `1` | +| **Alias** | `date_sub`. | + +#### `datetrunc(part, date)` + +
+ +| **Description** | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| **Example** | `datetrunc('month', DATE '1992-03-07')` | +| **Result** | `1992-03-01` | +| **Alias** | `date_trunc`. | + +#### `dayname(date)` + +
+ +| **Description** | The (English) name of the weekday. | +| **Example** | `dayname(DATE '1992-09-20')` | +| **Result** | `Sunday` | + +#### `extract(part from date)` + +
+ +| **Description** | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) from a date. | +| **Example** | `extract('year' FROM DATE '1992-09-20')` | +| **Result** | `1992` | + +#### `greatest(date, date)` + +
+ +| **Description** | The later of two dates. | +| **Example** | `greatest(DATE '1992-09-20', DATE '1992-03-07')` | +| **Result** | `1992-09-20` | + +#### `isfinite(date)` + +
+ +| **Description** | Returns `true` if the date is finite, false otherwise. | +| **Example** | `isfinite(DATE '1992-03-07')` | +| **Result** | `true` | + +#### `isinf(date)` + +
+ +| **Description** | Returns `true` if the date is infinite, false otherwise. | +| **Example** | `isinf(DATE '-infinity')` | +| **Result** | `true` | + +#### `last_day(date)` + +
+ +| **Description** | The last day of the corresponding month in the date. | +| **Example** | `last_day(DATE '1992-09-20')` | +| **Result** | `1992-09-30` | + +#### `least(date, date)` + +
+ +| **Description** | The earlier of two dates. | +| **Example** | `least(DATE '1992-09-20', DATE '1992-03-07')` | +| **Result** | `1992-03-07` | + +#### `make_date(year, month, day)` + +
+ +| **Description** | The date for the given parts. | +| **Example** | `make_date(1992, 9, 20)` | +| **Result** | `1992-09-20` | + +#### `monthname(date)` + +
+ +| **Description** | The (English) name of the month. | +| **Example** | `monthname(DATE '1992-09-20')` | +| **Result** | `September` | + +#### `strftime(date, format)` + +
+ +| **Description** | Converts a date to a string according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | +| **Example** | `strftime(date '1992-01-01', '%a, %-d %B %Y')` | +| **Result** | `Wed, 1 January 1992` | + +#### `time_bucket(bucket_width, date[, offset])` + +
+ +| **Description** | Truncate `date` by the specified interval `bucket_width`. Buckets are offset by `offset` interval. | +| **Example** | `time_bucket(INTERVAL '2 months', DATE '1992-04-20', INTERVAL '1 month')` | +| **Result** | `1992-04-01` | + +#### `time_bucket(bucket_width, date[, origin])` + +
+ +| **Description** | Truncate `date` by the specified interval `bucket_width`. Buckets are aligned relative to `origin` date. `origin` defaults to `2000-01-03` for buckets that don't include a month or year interval, and to `2000-01-01` for month and year buckets. | +| **Example** | `time_bucket(INTERVAL '2 weeks', DATE '1992-04-20', DATE '1992-04-01')` | +| **Result** | `1992-04-15` | + +#### `today()` + +
+ +| **Description** | Current date (start of current transaction). | +| **Example** | `today()` | +| **Result** | `2022-10-08` | + +## Date Part Extraction Functions + +There are also dedicated extraction functions to get the [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}#part-functions). +A few examples include extracting the day from a date, or the day of the week from a date. + +Functions applied to infinite dates will either return the same infinite dates +(e.g, `greatest`) or `NULL` (e.g., `date_part`) depending on what “makes sense”. +In general, if the function needs to examine the parts of the infinite date, the result will be `NULL`. \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/dateformat.md b/docs/archive/1.0/sql/functions/dateformat.md new file mode 100644 index 00000000000..d54987e5cbc --- /dev/null +++ b/docs/archive/1.0/sql/functions/dateformat.md @@ -0,0 +1,128 @@ +--- +layout: docu +title: Date Format Functions +--- + +The `strftime` and `strptime` functions can be used to convert between [`DATE`]({% link docs/archive/1.0/sql/data_types/date.md %}) / [`TIMESTAMP`]({% link docs/archive/1.0/sql/data_types/timestamp.md %}) values and strings. This is often required when parsing CSV files, displaying output to the user or transferring information between programs. Because there are many possible date representations, these functions accept a format string that describes how the date or timestamp should be structured. + +## `strftime` Examples + +The [`strftime(timestamp, format)`]({% link docs/archive/1.0/sql/functions/timestamp.md %}#strftimetimestamp-format) converts timestamps or dates to strings according to the specified pattern. + +```sql +SELECT strftime(DATE '1992-03-02', '%d/%m/%Y'); +``` + +```text +02/03/1992 +``` + +```sql +SELECT strftime(TIMESTAMP '1992-03-02 20:32:45', '%A, %-d %B %Y - %I:%M:%S %p'); +``` + +```text +Monday, 2 March 1992 - 08:32:45 PM +``` + +## `strptime` Examples + +The [`strptime(text, format)` function]({% link docs/archive/1.0/sql/functions/timestamp.md %}#strptimetext-format) converts strings to timestamps according to the specified pattern. + +```sql +SELECT strptime('02/03/1992', '%d/%m/%Y'); +``` + +```text +1992-03-02 00:00:00 +``` + +```sql +SELECT strptime('Monday, 2 March 1992 - 08:32:45 PM', '%A, %-d %B %Y - %I:%M:%S %p'); +``` + +```text +1992-03-02 20:32:45 +``` + +The `strptime` function throws an error on failure: + +```sql +SELECT strptime('02/50/1992', '%d/%m/%Y') AS x; +``` + +```console +Invalid Input Error: Could not parse string "02/50/1992" according to format specifier "%d/%m/%Y" +02/50/1992 + ^ +Error: Month out of range, expected a value between 1 and 12 +``` + +To return `NULL` on failure, use the [`try_strptime` function]({% link docs/archive/1.0/sql/functions/timestamp.md %}#try_strptimetext-format): + +```text +NULL +``` + +## CSV Parsing + +The date formats can also be specified during CSV parsing, either in the [`COPY` statement]({% link docs/archive/1.0/sql/statements/copy.md %}) or in the `read_csv` function. This can be done by either specifying a `DATEFORMAT` or a `TIMESTAMPFORMAT` (or both). `DATEFORMAT` will be used for converting dates, and `TIMESTAMPFORMAT` will be used for converting timestamps. Below are some examples for how to use this. 
+ +In a `COPY` statement: + +```sql +COPY dates FROM 'test.csv' (DATEFORMAT '%d/%m/%Y', TIMESTAMPFORMAT '%A, %-d %B %Y - %I:%M:%S %p'); +``` + +In a `read_csv` function: + +```sql +SELECT * +FROM read_csv('test.csv', dateformat = '%m/%d/%Y'); +``` + +## Format Specifiers + +Below is a full list of all available format specifiers. + +
+ +| Specifier | Description | Example | +|:-|:------|:---| +| `%a` | Abbreviated weekday name. | Sun, Mon, ... | +| `%A` | Full weekday name. | Sunday, Monday, ... | +| `%b` | Abbreviated month name. | Jan, Feb, ..., Dec | +| `%B` | Full month name. | January, February, ... | +| `%c` | ISO date and time representation | 1992-03-02 10:30:20 | +| `%d` | Day of the month as a zero-padded decimal. | 01, 02, ..., 31 | +| `%-d` | Day of the month as a decimal number. | 1, 2, ..., 30 | +| `%f` | Microsecond as a decimal number, zero-padded on the left. | 000000 - 999999 | +| `%g` | Millisecond as a decimal number, zero-padded on the left. | 000 - 999 | +| `%G` | ISO 8601 year with century representing the year that contains the greater part of the ISO week (see `%V`). | 0001, 0002, ..., 2013, 2014, ..., 9998, 9999 | +| `%H` | Hour (24-hour clock) as a zero-padded decimal number. | 00, 01, ..., 23 | +| `%-H` | Hour (24-hour clock) as a decimal number. | 0, 1, ..., 23 | +| `%I` | Hour (12-hour clock) as a zero-padded decimal number. | 01, 02, ..., 12 | +| `%-I` | Hour (12-hour clock) as a decimal number. | 1, 2, ... 12 | +| `%j` | Day of the year as a zero-padded decimal number. | 001, 002, ..., 366 | +| `%-j` | Day of the year as a decimal number. | 1, 2, ..., 366 | +| `%m` | Month as a zero-padded decimal number. | 01, 02, ..., 12 | +| `%-m` | Month as a decimal number. | 1, 2, ..., 12 | +| `%M` | Minute as a zero-padded decimal number. | 00, 01, ..., 59 | +| `%-M` | Minute as a decimal number. | 0, 1, ..., 59 | +| `%n` | Nanosecond as a decimal number, zero-padded on the left. | 000000000 - 999999999 | +| `%p` | Locale's AM or PM. | AM, PM | +| `%S` | Second as a zero-padded decimal number. | 00, 01, ..., 59 | +| `%-S` | Second as a decimal number. | 0, 1, ..., 59 | +| `%u` | ISO 8601 weekday as a decimal number where 1 is Monday. | 1, 2, ..., 7 | +| `%U` | Week number of the year. Week 01 starts on the first Sunday of the year, so there can be week 00. Note that this is not compliant with the week date standard in ISO-8601. | 00, 01, ..., 53 | +| `%V` | ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4. | 01, ..., 53 | +| `%w` | Weekday as a decimal number. | 0, 1, ..., 6 | +| `%W` | Week number of the year. Week 01 starts on the first Monday of the year, so there can be week 00. Note that this is not compliant with the week date standard in ISO-8601. | 00, 01, ..., 53 | +| `%x` | ISO date representation | 1992-03-02 | +| `%X` | ISO time representation | 10:30:20 | +| `%y` | Year without century as a zero-padded decimal number. | 00, 01, ..., 99 | +| `%-y` | Year without century as a decimal number. | 0, 1, ..., 99 | +| `%Y` | Year with century as a decimal number. | 2013, 2019 etc. | +| `%z` | [Time offset from UTC](https://en.wikipedia.org/wiki/ISO_8601#Time_offsets_from_UTC) in the form ±HH:MM, ±HHMM, or ±HH. | -0700 | +| `%Z` | Time zone name. | Europe/Amsterdam | +| `%%` | A literal `%` character. 
| % | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/datepart.md b/docs/archive/1.0/sql/functions/datepart.md new file mode 100644 index 00000000000..5680c59d1da --- /dev/null +++ b/docs/archive/1.0/sql/functions/datepart.md @@ -0,0 +1,289 @@ +--- +layout: docu +title: Date Part Functions +--- + + + +The `date_part` and `date_diff` and `date_trunc` functions can be used to manipulate the fields of temporal types such as [`DATE`]({% link docs/archive/1.0/sql/data_types/date.md %}) and [`TIMESTAMP`]({% link docs/archive/1.0/sql/data_types/timestamp.md %}). +The fields are specified as strings that contain the part name of the field. + +Below is a full list of all available date part specifiers. +The examples are the corresponding parts of the timestamp `2021-08-03 11:59:44.123456`. + +## Part Specifiers Usable as Date Part Specifiers and in Intervals + +| Specifier | Description | Synonyms | Example | +|:--|:--|:---|:-| +| `'century'` | Gregorian century | `'cent'`, `'centuries'`, `'c'` | `21` | +| `'day'` | Gregorian day | `'days'`, `'d'`, `'dayofmonth'` | `3` | +| `'decade'` | Gregorian decade | `'dec'`, `'decades'`, `'decs'` | `202` | +| `'hour'` | Hours | `'hr'`, `'hours'`, `'hrs'`, `'h'` | `11` | +| `'microseconds'` | Sub-minute microseconds | `'microsecond'`, `'us'`, `'usec'`, `'usecs'`, `'usecond'`, `'useconds'` | `44123456` | +| `'millennium'` | Gregorian millennium | `'mil'`, `'millenniums'`, `'millenia'`, `'mils'`, `'millenium'` | `3` | +| `'milliseconds'` | Sub-minute milliseconds | `'millisecond'`, `'ms'`, `'msec'`, `'msecs'`, `'msecond'`, `'mseconds'` | `44123` | +| `'minute'` | Minutes | `'min'`, `'minutes'`, `'mins'`, `'m'` | `59` | +| `'month'` | Gregorian month | `'mon'`, `'months'`, `'mons'` | `8` | +| `'quarter'` | Quarter of the year (1-4) | `'quarters'` | `3` | +| `'second'` | Seconds | `'sec'`, `'seconds'`, `'secs'`, `'s'` | `44` | +| `'year'` | Gregorian year | `'yr'`, `'y'`, `'years'`, `'yrs'` | `2021` | + +## Part Specifiers Only Usable as Date Part Specifiers + +| Specifier | Description | Synonyms | Example | +|:--|:--|:---|:-| +| `'dayofweek'` | Day of the week (Sunday = 0, Saturday = 6) | `'weekday'`, `'dow'` | `2` | +| `'dayofyear'` | Day of the year (1-365/366) | `'doy'` | `215` | +| `'epoch'` | Seconds since 1970-01-01 | | `1627991984` | +| `'era'` | Gregorian era (CE/AD, BCE/BC) | | `1` | +| `'isodow'` | ISO day of the week (Monday = 1, Sunday = 7) | | `2` | +| `'isoyear'` | ISO Year number (Starts on Monday of week containing Jan 4th) | | `2021` | +| `'timezone_hour'` | Time zone offset hour portion | | `0` | +| `'timezone_minute'` | Time zone offset minute portion | | `0` | +| `'timezone'` | Time zone offset in seconds | | `0` | +| `'week'` | Week number | `'weeks'`, `'w'` | `31` | +| `'yearweek'` | ISO year and week number in `YYYYWW` format | | `202131` | + +Note that the time zone parts are all zero unless a time zone plugin such as ICU +has been installed to support `TIMESTAMP WITH TIME ZONE`. + +## Part Functions + +There are dedicated extraction functions to get certain subfields: + +| Name | Description | +|:--|:-------| +| [`century(date)`](#centurydate) | Century. | +| [`day(date)`](#daydate) | Day. | +| [`dayofmonth(date)`](#dayofmonthdate) | Day (synonym). | +| [`dayofweek(date)`](#dayofweekdate) | Numeric weekday (Sunday = 0, Saturday = 6). | +| [`dayofyear(date)`](#dayofyeardate) | Day of the year (starts from 1, i.e., January 1 = 1). | +| [`decade(date)`](#decadedate) | Decade (year / 10). 
| +| [`epoch(date)`](#epochdate) | Seconds since 1970-01-01. | +| [`era(date)`](#eradate) | Calendar era. | +| [`hour(date)`](#hourdate) | Hours. | +| [`isodow(date)`](#isodowdate) | Numeric ISO weekday (Monday = 1, Sunday = 7). | +| [`isoyear(date)`](#isoyeardate) | ISO Year number (Starts on Monday of week containing Jan 4th). | +| [`microsecond(date)`](#microseconddate) | Sub-minute microseconds. | +| [`millennium(date)`](#millenniumdate) | Millennium. | +| [`millisecond(date)`](#milliseconddate) | Sub-minute milliseconds. | +| [`minute(date)`](#minutedate) | Minutes. | +| [`month(date)`](#monthdate) | Month. | +| [`quarter(date)`](#quarterdate) | Quarter. | +| [`second(date)`](#seconddate) | Seconds. | +| [`timezone_hour(date)`](#timezone_hourdate) | Time zone offset hour portion. | +| [`timezone_minute(date)`](#timezone_minutedate) | Time zone offset minutes portion. | +| [`timezone(date)`](#timezonedate) | Time Zone offset in minutes. | +| [`week(date)`](#weekdate) | ISO Week. | +| [`weekday(date)`](#weekdaydate) | Numeric weekday synonym (Sunday = 0, Saturday = 6). | +| [`weekofyear(date)`](#weekofyeardate) | ISO Week (synonym). | +| [`year(date)`](#yeardate) | Year. | +| [`yearweek(date)`](#yearweekdate) | `BIGINT` of combined ISO Year number and 2-digit version of ISO Week number. | + +#### `century(date)` + +
+ +| **Description** | Century. | +| **Example** | `century(date '1992-02-15')` | +| **Result** | `20` | + +#### `day(date)` + +
+ +| **Description** | Day. | +| **Example** | `day(date '1992-02-15')` | +| **Result** | `15` | + +#### `dayofmonth(date)` + +
+ +| **Description** | Day (synonym). | +| **Example** | `dayofmonth(date '1992-02-15')` | +| **Result** | `15` | + +#### `dayofweek(date)` + +
+ +| **Description** | Numeric weekday (Sunday = 0, Saturday = 6). | +| **Example** | `dayofweek(date '1992-02-15')` | +| **Result** | `6` | + +#### `dayofyear(date)` + +
+ +| **Description** | Day of the year (starts from 1, i.e., January 1 = 1). | +| **Example** | `dayofyear(date '1992-02-15')` | +| **Result** | `46` | + +#### `decade(date)` + +
+ +| **Description** | Decade (year / 10). | +| **Example** | `decade(date '1992-02-15')` | +| **Result** | `199` | + +#### `epoch(date)` + +
+ +| **Description** | Seconds since 1970-01-01. | +| **Example** | `epoch(date '1992-02-15')` | +| **Result** | `698112000` | + +#### `era(date)` + +
+ +| **Description** | Calendar era. | +| **Example** | `era(date '0044-03-15 (BC)')` | +| **Result** | `0` | + +#### `hour(date)` + +
+ +| **Description** | Hours. | +| **Example** | `hour(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `11` | + +#### `isodow(date)` + +
+ +| **Description** | Numeric ISO weekday (Monday = 1, Sunday = 7). | +| **Example** | `isodow(date '1992-02-15')` | +| **Result** | `6` | + +#### `isoyear(date)` + +
+ +| **Description** | ISO Year number (Starts on Monday of week containing Jan 4th). | +| **Example** | `isoyear(date '2022-01-01')` | +| **Result** | `2021` | + +#### `microsecond(date)` + +
+ +| **Description** | Sub-minute microseconds. | +| **Example** | `microsecond(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `44123456` | + +#### `millennium(date)` + +
+ +| **Description** | Millennium. | +| **Example** | `millennium(date '1992-02-15')` | +| **Result** | `2` | + +#### `millisecond(date)` + +
+ +| **Description** | Sub-minute milliseconds. | +| **Example** | `millisecond(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `44123` | + +#### `minute(date)` + +
+ +| **Description** | Minutes. | +| **Example** | `minute(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `59` | + +#### `month(date)` + +
+ +| **Description** | Month. | +| **Example** | `month(date '1992-02-15')` | +| **Result** | `2` | + +#### `quarter(date)` + +
+ +| **Description** | Quarter. | +| **Example** | `quarter(date '1992-02-15')` | +| **Result** | `1` | + +#### `second(date)` + +
+ +| **Description** | Seconds. | +| **Example** | `second(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `44` | + +#### `timezone_hour(date)` + +
+ +| **Description** | Time zone offset hour portion. | +| **Example** | `timezone_hour(date '1992-02-15')` | +| **Result** | `0` | + +#### `timezone_minute(date)` + +
+ +| **Description** | Time zone offset minutes portion. | +| **Example** | `timezone_minute(date '1992-02-15')` | +| **Result** | `0` | + +#### `timezone(date)` + +
+ +| **Description** | Time Zone offset in minutes. | +| **Example** | `timezone(date '1992-02-15')` | +| **Result** | `0` | + +#### `week(date)` + +
+ +| **Description** | ISO Week. | +| **Example** | `week(date '1992-02-15')` | +| **Result** | `7` | + +#### `weekday(date)` + +
+ +| **Description** | Numeric weekday synonym (Sunday = 0, Saturday = 6). | +| **Example** | `weekday(date '1992-02-15')` | +| **Result** | `6` | + +#### `weekofyear(date)` + +
+ +| **Description** | ISO Week (synonym). | +| **Example** | `weekofyear(date '1992-02-15')` | +| **Result** | `7` | + +#### `year(date)` + +
+ +| **Description** | Year. | +| **Example** | `year(date '1992-02-15')` | +| **Result** | `1992` | + +#### `yearweek(date)` + +
+ +| **Description** | `BIGINT` of combined ISO Year number and 2-digit version of ISO Week number. | +| **Example** | `yearweek(date '1992-02-15')` | +| **Result** | `199207` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/enum.md b/docs/archive/1.0/sql/functions/enum.md new file mode 100644 index 00000000000..a178a743bf4 --- /dev/null +++ b/docs/archive/1.0/sql/functions/enum.md @@ -0,0 +1,66 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/enum +title: Enum Functions +--- + + + +This section describes functions and operators for examining and manipulating [`ENUM` values]({% link docs/archive/1.0/sql/data_types/enum.md %}). +The examples assume an enum type created as: + +```sql +CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy', 'anxious'); +``` + +These functions can take `NULL` or a specific value of the type as argument(s). +With the exception of `enum_range_boundary`, the result depends only on the type of the argument and not on its value. + +| Name | Description | +|:--|:-------| +| [`enum_code(enum_value)`](#enum_codeenum_value) | Returns the numeric value backing the given enum value. | +| [`enum_first(enum)`](#enum_firstenum) | Returns the first value of the input enum type. | +| [`enum_last(enum)`](#enum_lastenum) | Returns the last value of the input enum type. | +| [`enum_range(enum)`](#enum_rangeenum) | Returns all values of the input enum type as an array. | +| [`enum_range_boundary(enum, enum)`](#enum_range_boundaryenum-enum) | Returns the range between the two given enum values as an array. | + +#### `enum_code(enum_value)` + +
+ +| **Description** | Returns the numeric value backing the given enum value. | +| **Example** | `enum_code('happy'::mood)` | +| **Result** | `2` | + +#### `enum_first(enum)` + +
+ +| **Description** | Returns the first value of the input enum type. | +| **Example** | `enum_first(NULL::mood)` | +| **Result** | `sad` | + +#### `enum_last(enum)` + +
+ +| **Description** | Returns the last value of the input enum type. | +| **Example** | `enum_last(NULL::mood)` | +| **Result** | `anxious` | + +#### `enum_range(enum)` + +
+ +| **Description** | Returns all values of the input enum type as an array. | +| **Example** | `enum_range(NULL::mood)` | +| **Result** | `[sad, ok, happy, anxious]` | + +#### `enum_range_boundary(enum, enum)` + +
+ +| **Description** | Returns the range between the two given enum values as an array. The values must be of the same enum type. When the first parameter is `NULL`, the result starts with the first value of the enum type. When the second parameter is `NULL`, the result ends with the last value of the enum type. | +| **Example** | `enum_range_boundary(NULL, 'happy'::mood)` | +| **Result** | `[sad, ok, happy]` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/interval.md b/docs/archive/1.0/sql/functions/interval.md new file mode 100644 index 00000000000..74002ddfa80 --- /dev/null +++ b/docs/archive/1.0/sql/functions/interval.md @@ -0,0 +1,176 @@ +--- +layout: docu +title: Interval Functions +--- + + + +This section describes functions and operators for examining and manipulating [`INTERVAL`]({% link docs/archive/1.0/sql/data_types/interval.md %}) values. + +## Interval Operators + +The table below shows the available mathematical operators for `INTERVAL` types. + +| Operator | Description | Example | Result | +|:-|:--|:----|:--| +| `+` | addition of an `INTERVAL` | `INTERVAL 1 HOUR + INTERVAL 5 HOUR` | `INTERVAL 6 HOUR` | +| `+` | addition to a `DATE` | `DATE '1992-03-22' + INTERVAL 5 DAY` | `1992-03-27` | +| `+` | addition to a `TIMESTAMP` | `TIMESTAMP '1992-03-22 01:02:03' + INTERVAL 5 DAY` | `1992-03-27 01:02:03` | +| `+` | addition to a `TIME` | `TIME '01:02:03' + INTERVAL 5 HOUR` | `06:02:03` | +| `-` | subtraction of an `INTERVAL` | `INTERVAL 5 HOUR - INTERVAL 1 HOUR` | `INTERVAL 4 HOUR` | +| `-` | subtraction from a `DATE` | `DATE '1992-03-27' - INTERVAL 5 DAY` | `1992-03-22` | +| `-` | subtraction from a `TIMESTAMP` | `TIMESTAMP '1992-03-27 01:02:03' - INTERVAL 5 DAY` | `1992-03-22 01:02:03` | +| `-` | subtraction from a `TIME` | `TIME '06:02:03' - INTERVAL 5 HOUR` | `01:02:03` | + +## Interval Functions + +The table below shows the available scalar functions for `INTERVAL` types. + +| Name | Description | +|:--|:-------| +| [`date_part(part, interval)`](#date_partpart-interval) | Extract [datepart component]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). See [`INTERVAL`]({% link docs/archive/1.0/sql/data_types/interval.md %}) for the sometimes surprising rules governing this extraction. | +| [`datepart(part, interval)`](#datepartpart-interval) | Alias of `date_part`. | +| [`extract(part FROM interval)`](#extractpart-from-interval) | Alias of `date_part`. | +| [`epoch(interval)`](#epochinterval) | Get total number of seconds, as double precision floating point number, in interval. | +| [`to_centuries(integer)`](#to_centuriesinteger) | Construct a century interval. | +| [`to_days(integer)`](#to_daysinteger) | Construct a day interval. | +| [`to_decades(integer)`](#to_decadesinteger) | Construct a decade interval. | +| [`to_hours(integer)`](#to_hoursinteger) | Construct a hour interval. | +| [`to_microseconds(integer)`](#to_microsecondsinteger) | Construct a microsecond interval. | +| [`to_millennia(integer)`](#to_millenniainteger) | Construct a millennium interval. | +| [`to_milliseconds(integer)`](#to_millisecondsinteger) | Construct a millisecond interval. | +| [`to_minutes(integer)`](#to_minutesinteger) | Construct a minute interval. | +| [`to_months(integer)`](#to_monthsinteger) | Construct a month interval. | +| [`to_seconds(integer)`](#to_secondsinteger) | Construct a second interval. | +| [`to_weeks(integer)`](#to_weeksinteger) | Construct a week interval. 
| +| [`to_years(integer)`](#to_yearsinteger) | Construct a year interval. | + +> Only the documented [date part components]({% link docs/archive/1.0/sql/functions/datepart.md %}) are defined for intervals. + +#### `date_part(part, interval)` + +
+ +| **Description** | Extract [datepart component]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). See [`INTERVAL`]({% link docs/archive/1.0/sql/data_types/interval.md %}) for the sometimes surprising rules governing this extraction. | +| **Example** | `date_part('year', INTERVAL '14 months')` | +| **Result** | `1` | + +#### `datepart(part, interval)` + +
+ +| **Description** | Alias of `date_part`. | +| **Example** | `datepart('year', INTERVAL '14 months')` | +| **Result** | `1` | + +#### `extract(part FROM interval)` + +
+ +| **Description** | Alias of `date_part`. | +| **Example** | `extract('month' FROM INTERVAL '14 months')` | +| **Result** | `2` | + +#### `epoch(interval)` + +
+ +| **Description** | Get total number of seconds, as double precision floating point number, in interval. | +| **Example** | `epoch(INTERVAL 5 HOUR)` | +| **Result** | `18000.0` | + +#### `to_centuries(integer)` + +
+ +| **Description** | Construct a century interval. | +| **Example** | `to_centuries(5)` | +| **Result** | `INTERVAL 500 YEAR` | + +#### `to_days(integer)` + +
+ +| **Description** | Construct a day interval. | +| **Example** | `to_days(5)` | +| **Result** | `INTERVAL 5 DAY` | + +#### `to_decades(integer)` + +
+ +| **Description** | Construct a decade interval. | +| **Example** | `to_decades(5)` | +| **Result** | `INTERVAL 50 YEAR` | + +#### `to_hours(integer)` + +
+ +| **Description** | Construct an hour interval. | +| **Example** | `to_hours(5)` | +| **Result** | `INTERVAL 5 HOUR` | + +#### `to_microseconds(integer)` + +
+ +| **Description** | Construct a microsecond interval. | +| **Example** | `to_microseconds(5)` | +| **Result** | `INTERVAL 5 MICROSECOND` | + +#### `to_millennia(integer)` + +
+ +| **Description** | Construct a millennium interval. | +| **Example** | `to_millennia(5)` | +| **Result** | `INTERVAL 5000 YEAR` | + +#### `to_milliseconds(integer)` + +
+ +| **Description** | Construct a millisecond interval. | +| **Example** | `to_milliseconds(5)` | +| **Result** | `INTERVAL 5 MILLISECOND` | + +#### `to_minutes(integer)` + +
+ +| **Description** | Construct a minute interval. | +| **Example** | `to_minutes(5)` | +| **Result** | `INTERVAL 5 MINUTE` | + +#### `to_months(integer)` + +
+ +| **Description** | Construct a month interval. | +| **Example** | `to_months(5)` | +| **Result** | `INTERVAL 5 MONTH` | + +#### `to_seconds(integer)` + +
+ +| **Description** | Construct a second interval. | +| **Example** | `to_seconds(5)` | +| **Result** | `INTERVAL 5 SECOND` | + +#### `to_weeks(integer)` + +
+ +| **Description** | Construct a week interval. | +| **Example** | `to_weeks(5)` | +| **Result** | `INTERVAL 35 DAY` | + +#### `to_years(integer)` + +
+ +| **Description** | Construct a year interval. | +| **Example** | `to_years(5)` | +| **Result** | `INTERVAL 5 YEAR` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/lambda.md b/docs/archive/1.0/sql/functions/lambda.md new file mode 100644 index 00000000000..c6d49287b42 --- /dev/null +++ b/docs/archive/1.0/sql/functions/lambda.md @@ -0,0 +1,269 @@ +--- +layout: docu +title: Lambda Functions +--- + +Lambda functions enable the use of more complex and flexible expressions in queries. +DuckDB supports several scalar functions that operate on [`LIST`s]({% link docs/archive/1.0/sql/data_types/list.md %}) and +accept lambda functions as parameters +in the form `(parameter1, parameter2, ...) -> expression`. +If the lambda function has only one parameter, then the parentheses can be omitted. +The parameters can have any names. +For example, the following are all valid lambda functions: + +* `param -> param > 1` +* `s -> contains(concat(s, 'DB'), 'duck')` +* `(x, y) -> x + y` + +## Scalar Functions That Accept Lambda Functions + +| Name | Description | +|:--|:-------| +| [`list_transform(list, lambda)`](#list_transformlist-lambda) | Returns a list that is the result of applying the lambda function to each element of the input list. | +| [`list_filter(list, lambda)`](#list_filterlist-lambda) | Constructs a list from those elements of the input list for which the lambda function returns `true`. | +| [`list_reduce(list, lambda)`](#list_reducelist-lambda) | Reduces all elements of the input list into a single value by executing the lambda function on a running result and the next list element. The list must have at least one element – the use of an initial accumulator value is currently not supported. | + +### `list_transform(list, lambda)` + +
+ +| **Description** | Returns a list that is the result of applying the lambda function to each element of the input list. For more information, see [Transform](#transform). | +| **Example** | `list_transform([4, 5, 6], x -> x + 1)` | +| **Result** | `[5, 6, 7]` | +| **Aliases** | `array_transform`, `apply`, `list_apply`, `array_apply` | + +### `list_filter(list, lambda)` + +
+ +| **Description** | Constructs a list from those elements of the input list for which the lambda function returns `true`. For more information, see [Filter](#filter). | +| **Example** | `list_filter([4, 5, 6], x -> x > 4)` | +| **Result** | `[5, 6]` | +| **Aliases** | `array_filter`, `filter` | + +### `list_reduce(list, lambda)` + +
+ +| **Description** | Reduces all elements of the input list into a single value by executing the lambda function on a running result and the next list element. The list must have at least one element – the use of an initial accumulator value is currently not supported. For more information, see [Reduce](#reduce). | +| **Example** | `list_reduce([4, 5, 6], (x, y) -> x + y)` | +| **Result** | `15` | +| **Aliases** | `array_reduce`, `reduce` | + +## Nesting + +All scalar functions can be arbitrarily nested. + +_Nested lambda functions to get all squares of even list elements:_ + +```sql +SELECT list_transform( + list_filter([0, 1, 2, 3, 4, 5], x -> x % 2 = 0), + y -> y * y + ); +``` + +```text +[0, 4, 16] +``` + +_Nested lambda function to add each element of the first list to the sum of the second list:_ + +```sql +SELECT list_transform( + [1, 2, 3], + x -> list_reduce([4, 5, 6], (a, b) -> a + b) + x + ); +``` + +```text +[16, 17, 18] +``` + +## Scoping + +Lambda functions confirm to scoping rules in the following order: + +* inner lambda parameters +* outer lambda parameters +* column names +* macro parameters + +```sql +CREATE TABLE tbl (x INTEGER); +INSERT INTO tbl VALUES (10); +SELECT apply([1, 2], x -> apply([4], x -> x + tbl.x)[1] + x) FROM tbl; +``` + +```text +[15, 16] +``` + +## Indexes as Parameters + +All lambda functions accept an optional extra parameter that represents the index of the current element. +This is always the last parameter of the lambda function, and is 1-based (i.e., the first element has index 1). + +_Get all elements that are larger than their index:_ + +```sql +SELECT list_filter([1, 3, 1, 5], (x, i) -> x > i); +``` + +```text +[3, 5] +``` + +## Transform + +**Signature:** `list_transform(list, lambda)` + +**Description:** `list_transform` returns a list that is the result of applying the lambda function to each element of the input list. + +**Aliases:** + +* `array_transform` +* `apply` +* `list_apply` +* `array_apply` + +**Number of parameters excluding indexes:** 1 + +**Return type:** Defined by the return type of the lambda function + +### Examples + +Incrementing each list element by one: + +```sql +SELECT list_transform([1, 2, NULL, 3], x -> x + 1); +``` + +```text +[2, 3, NULL, 4] +``` + +Transforming strings: + +```sql +SELECT list_transform(['Duck', 'Goose', 'Sparrow'], s -> concat(s, 'DB')); +``` + +```text +[DuckDB, GooseDB, SparrowDB] +``` + +Combining lambda functions with other functions: + +```sql +SELECT list_transform([5, NULL, 6], x -> coalesce(x, 0) + 1); +``` + +```text +[6, 1, 7] +``` + +## Filter + +**Signature:** `list_filter(list, lambda)` + +**Description:** +Constructs a list from those elements of the input list for which the lambda function returns `true`. +DuckDB must be able to cast the lambda function's return type to `BOOL`. 
+ +**Aliases:** + +* `array_filter` +* `filter` + +**Number of parameters excluding indexes:** 1 + +**Return type:** The same type as the input list + +### Examples + +Filter out negative values: + +```sql +SELECT list_filter([5, -6, NULL, 7], x -> x > 0); +``` + +```text +[5, 7] +``` + +Divisible by 2 and 5: + +```sql +SELECT list_filter(list_filter([2, 4, 3, 1, 20, 10, 3, 30], x -> x % 2 == 0), y -> y % 5 == 0); +``` + +```text +[20, 10, 30] +``` + +In combination with `range(...)` to construct lists: + +```sql +SELECT list_filter([1, 2, 3, 4], x -> x > #1) FROM range(4); +``` + +```text +[1, 2, 3, 4] +[2, 3, 4] +[3, 4] +[4] +[] +``` + +## Reduce + +**Signature:** `list_reduce(list, lambda)` + +**Description:** +The scalar function returns a single value +that is the result of applying the lambda function to each element of the input list. +Starting with the first element +and then repeatedly applying the lambda function to the result of the previous application and the next element of the list. +The list must have at least one element. + +**Aliases:** + +* `array_reduce` +* `reduce` + +**Number of parameters excluding indexes:** 2 + +**Return type:** The type of the input list's elements + +### Examples + +Sum of all list elements: + +```sql +SELECT list_reduce([1, 2, 3, 4], (x, y) -> x + y); +``` + +```text +10 +``` + +Only add up list elements if they are greater than 2: + +```sql +SELECT list_reduce(list_filter([1, 2, 3, 4], x -> x > 2), (x, y) -> x + y); +``` + +```text +7 +``` + +Concat all list elements: + +```sql +SELECT list_reduce(['DuckDB', 'is', 'awesome'], (x, y) -> concat(x, ' ', y)); +``` + +```text +DuckDB is awesome +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/list.md b/docs/archive/1.0/sql/functions/list.md new file mode 100644 index 00000000000..a7521696df9 --- /dev/null +++ b/docs/archive/1.0/sql/functions/list.md @@ -0,0 +1,887 @@ +--- +layout: docu +title: List Functions +--- + + + +| Name | Description | +|:--|:-------| +| [`list[index]`](#listindex) | Bracket notation serves as an alias for `list_extract`. | +| [`list[begin:end]`](#listbeginend) | Bracket notation with colon is an alias for `list_slice`. | +| [`list[begin:end:step]`](#listbeginendstep) | `list_slice` in bracket notation with an added `step` feature. | +| [`array_pop_back(list)`](#array_pop_backlist) | Returns the list without the last element. | +| [`array_pop_front(list)`](#array_pop_frontlist) | Returns the list without the first element. | +| [`flatten(list_of_lists)`](#flattenlist_of_lists) | Concatenate a list of lists into a single list. This only flattens one level of the list (see [examples](#flattening)). | +| [`len(list)`](#lenlist) | Return the length of the list. | +| [`list_aggregate(list, name)`](#list_aggregatelist-name) | Executes the aggregate function `name` on the elements of `list`. See the [List Aggregates]({% link docs/archive/1.0/sql/functions/list.md %}#list-aggregates) section for more details. | +| [`list_any_value(list)`](#list_any_valuelist) | Returns the first non-null value in the list. | +| [`list_append(list, element)`](#list_appendlist-element) | Appends `element` to `list`. | +| [`list_concat(list1, list2)`](#list_concatlist1-list2) | Concatenates two lists. | +| [`list_contains(list, element)`](#list_containslist-element) | Returns true if the list contains the element. | +| [`list_cosine_similarity(list1, list2)`](#list_cosine_similaritylist1-list2) | Compute the cosine similarity between two lists. 
| +| [`list_distance(list1, list2)`](#list_distancelist1-list2) | Calculates the Euclidean distance between two points with coordinates given in two inputs lists of equal length. | +| [`list_distinct(list)`](#list_distinctlist) | Removes all duplicates and `NULL` values from a list. Does not preserve the original order. | +| [`list_dot_product(list1, list2)`](#list_dot_productlist1-list2) | Computes the dot product of two same-sized lists of numbers. | +| [`list_extract(list, index)`](#list_extractlist-index) | Extract the `index`th (1-based) value from the list. | +| [`list_filter(list, lambda)`](#list_filterlist-lambda) | Constructs a list from those elements of the input list for which the lambda function returns true. See the [Lambda Functions]({% link docs/archive/1.0/sql/functions/lambda.md %}#filter) page for more details. | +| [`list_grade_up(list)`](#list_grade_uplist) | Works like sort, but the results are the indexes that correspond to the position in the original `list` instead of the actual values. | +| [`list_has_all(list, sub-list)`](#list_has_alllist-sub-list) | Returns true if all elements of sub-list exist in list. | +| [`list_has_any(list1, list2)`](#list_has_anylist1-list2) | Returns true if any elements exist is both lists. | +| [`list_intersect(list1, list2)`](#list_intersectlist1-list2) | Returns a list of all the elements that exist in both `l1` and `l2`, without duplicates. | +| [`list_position(list, element)`](#list_positionlist-element) | Returns the index of the element if the list contains the element. | +| [`list_prepend(element, list)`](#list_prependelement-list) | Prepends `element` to `list`. | +| [`list_reduce(list, lambda)`](#list_reducelist-lambda) | Returns a single value that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/archive/1.0/sql/functions/lambda.md %}#reduce) page for more details. | +| [`list_resize(list, size[, value])`](#list_resizelist-size-value) | Resizes the list to contain `size` elements. Initializes new elements with `value` or `NULL` if `value` is not set. | +| [`list_reverse_sort(list)`](#list_reverse_sortlist) | Sorts the elements of the list in reverse order. See the [Sorting Lists]({% link docs/archive/1.0/sql/functions/list.md %}#sorting-lists) section for more details about the `NULL` sorting order. | +| [`list_reverse(list)`](#list_reverselist) | Reverses the list. | +| [`list_select(value_list, index_list)`](#list_selectvalue_list-index_list) | Returns a list based on the elements selected by the `index_list`. | +| [`list_slice(list, begin, end, step)`](#list_slicelist-begin-end-step) | `list_slice` with added `step` feature. | +| [`list_slice(list, begin, end)`](#list_slicelist-begin-end) | Extract a sublist using slice conventions. Negative values are accepted. See [slicing]({% link docs/archive/1.0/sql/functions/list.md %}#slicing). | +| [`list_sort(list)`](#list_sortlist) | Sorts the elements of the list. See the [Sorting Lists]({% link docs/archive/1.0/sql/functions/list.md %}#sorting-lists) section for more details about the sorting order and the `NULL` sorting order. | +| [`list_transform(list, lambda)`](#list_transformlist-lambda) | Returns a list that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/archive/1.0/sql/functions/lambda.md %}#transform) page for more details. | +| [`list_unique(list)`](#list_uniquelist) | Counts the unique elements of a list. 
| +| [`list_value(any, ...)`](#list_valueany-) | Create a `LIST` containing the argument values. | +| [`list_where(value_list, mask_list)`](#list_wherevalue_list-mask_list) | Returns a list with the `BOOLEAN`s in `mask_list` applied as a mask to the `value_list`. | +| [`list_zip(list_1, list_2, ...[, truncate])`](#list_ziplist1-list2-) | Zips _k_ `LIST`s to a new `LIST` whose length will be that of the longest list. Its elements are structs of _k_ elements from each list `list_1`, ..., `list_k`, missing elements are replaced with `NULL`. If `truncate` is set, all lists are truncated to the smallest list length. | +| [`unnest(list)`](#unnestlist) | Unnests a list by one level. Note that this is a special function that alters the cardinality of the result. See the [`unnest` page]({% link docs/archive/1.0/sql/query_syntax/unnest.md %}) for more details. | + +#### `list[index]` + +
+ +| **Description** | Bracket notation serves as an alias for `list_extract`. | +| **Example** | `[4, 5, 6][3]` | +| **Result** | `6` | +| **Alias** | `list_extract` | + +#### `list[begin:end]` + +
+ +| **Description** | Bracket notation with colon is an alias for `list_slice`. | +| **Example** | `[4, 5, 6][2:3]` | +| **Result** | `[5, 6]` | +| **Alias** | `list_slice` | + +#### `list[begin:end:step]` + +
+ +| **Description** | `list_slice` in bracket notation with an added `step` feature. | +| **Example** | `[4, 5, 6][:-:2]` | +| **Result** | `[4, 6]` | +| **Alias** | `list_slice` | + +#### `array_pop_back(list)` + +
+ +| **Description** | Returns the list without the last element. | +| **Example** | `array_pop_back([4, 5, 6])` | +| **Result** | `[4, 5]` | + +#### `array_pop_front(list)` + +
+ +| **Description** | Returns the list without the first element. | +| **Example** | `array_pop_front([4, 5, 6])` | +| **Result** | `[5, 6]` | + +#### `flatten(list_of_lists)` + +
+ +| **Description** | Concatenate a list of lists into a single list. This only flattens one level of the list (see [examples](#flattening)). | +| **Example** | `flatten([[1, 2], [3, 4]])` | +| **Result** | `[1, 2, 3, 4]` | + +#### `len(list)` + +
+ +| **Description** | Return the length of the list. | +| **Example** | `len([1, 2, 3])` | +| **Result** | `3` | +| **Alias** | `array_length` | + +#### `list_aggregate(list, name)` + +
+ +| **Description** | Executes the aggregate function `name` on the elements of `list`. See the [List Aggregates]({% link docs/archive/1.0/sql/functions/list.md %}#list-aggregates) section for more details. | +| **Example** | `list_aggregate([1, 2, NULL], 'min')` | +| **Result** | `1` | +| **Aliases** | `list_aggr`, `aggregate`, `array_aggregate`, `array_aggr` | + +#### `list_any_value(list)` + +
+ +| **Description** | Returns the first non-null value in the list. | +| **Example** | `list_any_value([NULL, -3])` | +| **Result** | `-3` | + +#### `list_append(list, element)` + +
+ +| **Description** | Appends `element` to `list`. | +| **Example** | `list_append([2, 3], 4)` | +| **Result** | `[2, 3, 4]` | +| **Aliases** | `array_append`, `array_push_back` | + +#### `list_concat(list1, list2)` + +
+ +| **Description** | Concatenates two lists. | +| **Example** | `list_concat([2, 3], [4, 5, 6])` | +| **Result** | `[2, 3, 4, 5, 6]` | +| **Aliases** | `list_cat`, `array_concat`, `array_cat` | + +#### `list_contains(list, element)` + +
+ +| **Description** | Returns true if the list contains the element. | +| **Example** | `list_contains([1, 2, NULL], 1)` | +| **Result** | `true` | +| **Aliases** | `list_has`, `array_contains`, `array_has` | + +#### `list_cosine_similarity(list1, list2)` + +
+ +| **Description** | Compute the cosine similarity between two lists. | +| **Example** | `list_cosine_similarity([1, 2, 3], [1, 2, 5])` | +| **Result** | `0.9759000729485332` | + +#### `list_distance(list1, list2)` + +
+ +| **Description** | Calculates the Euclidean distance between two points with coordinates given in two inputs lists of equal length. | +| **Example** | `list_distance([1, 2, 3], [1, 2, 5])` | +| **Result** | `2.0` | + +#### `list_distinct(list)` + +
+ +| **Description** | Removes all duplicates and `NULL` values from a list. Does not preserve the original order. | +| **Example** | `list_distinct([1, 1, NULL, -3, 1, 5])` | +| **Result** | `[1, 5, -3]` | +| **Alias** | `array_distinct` | + +#### `list_dot_product(list1, list2)` + +
+ +| **Description** | Computes the dot product of two same-sized lists of numbers. | +| **Example** | `list_dot_product([1, 2, 3], [1, 2, 5])` | +| **Result** | `20.0` | +| **Alias** | `list_inner_product` | + +#### `list_extract(list, index)` + +
+ +| **Description** | Extract the `index`th (1-based) value from the list. | +| **Example** | `list_extract([4, 5, 6], 3)` | +| **Result** | `6` | +| **Aliases** | `list_element`, `array_extract` | + +#### `list_filter(list, lambda)` + +
+ +| **Description** | Constructs a list from those elements of the input list for which the lambda function returns true. See the [Lambda Functions]({% link docs/archive/1.0/sql/functions/lambda.md %}#filter) page for more details. | +| **Example** | `list_filter([4, 5, 6], x -> x > 4)` | +| **Result** | `[5, 6]` | +| **Aliases** | `array_filter`, `filter` | + +#### `list_grade_up(list)` + +
+ +| **Description** | Works like sort, but the results are the indexes that correspond to the position in the original `list` instead of the actual values. | +| **Example** | `list_grade_up([30, 10, 40, 20])` | +| **Result** | `[2, 4, 1, 3]` | +| **Alias** | `array_grade_up` | + +#### `list_has_all(list, sub-list)` + +
+ +| **Description** | Returns true if all elements of sub-list exist in list. | +| **Example** | `list_has_all([4, 5, 6], [4, 6])` | +| **Result** | `true` | +| **Alias** | `array_has_all` | + +#### `list_has_any(list1, list2)` + +
+ +| **Description** | Returns true if any elements exist in both lists. | +| **Example** | `list_has_any([1, 2, 3], [2, 3, 4])` | +| **Result** | `true` | +| **Alias** | `array_has_any` | + +#### `list_intersect(list1, list2)` + +
+ +| **Description** | Returns a list of all the elements that exist in both `l1` and `l2`, without duplicates. | +| **Example** | `list_intersect([1, 2, 3], [2, 3, 4])` | +| **Result** | `[2, 3]` | +| **Alias** | `array_intersect` | + +#### `list_position(list, element)` + +
+ +| **Description** | Returns the index of the element if the list contains the element. | +| **Example** | `list_position([1, 2, NULL], 2)` | +| **Result** | `2` | +| **Aliases** | `list_indexof`, `array_position`, `array_indexof` | + +#### `list_prepend(element, list)` + +
+ +| **Description** | Prepends `element` to `list`. | +| **Example** | `list_prepend(3, [4, 5, 6])` | +| **Result** | `[3, 4, 5, 6]` | +| **Aliases** | `array_prepend`, `array_push_front` | + +#### `list_reduce(list, lambda)` + +
+ +| **Description** | Returns a single value that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/archive/1.0/sql/functions/lambda.md %}#reduce) page for more details. | +| **Example** | `list_reduce([4, 5, 6], (x, y) -> x + y)` | +| **Result** | `15` | +| **Aliases** | `array_reduce`, `reduce` | + +#### `list_resize(list, size[, value])` + +
+ +| **Description** | Resizes the list to contain `size` elements. Initializes new elements with `value` or `NULL` if `value` is not set. | +| **Example** | `list_resize([1, 2, 3], 5, 0)` | +| **Result** | `[1, 2, 3, 0, 0]` | +| **Alias** | `array_resize` | + +#### `list_reverse_sort(list)` + +
+ +| **Description** | Sorts the elements of the list in reverse order. See the [Sorting Lists]({% link docs/archive/1.0/sql/functions/list.md %}#sorting-lists) section for more details about the `NULL` sorting order. | +| **Example** | `list_reverse_sort([3, 6, 1, 2])` | +| **Result** | `[6, 3, 2, 1]` | +| **Alias** | `array_reverse_sort` | + +#### `list_reverse(list)` + +
+ +| **Description** | Reverses the list. | +| **Example** | `list_reverse([3, 6, 1, 2])` | +| **Result** | `[2, 1, 6, 3]` | +| **Alias** | `array_reverse` | + +#### `list_select(value_list, index_list)` + +
+ +| **Description** | Returns a list based on the elements selected by the `index_list`. | +| **Example** | `list_select([10, 20, 30, 40], [1, 4])` | +| **Result** | `[10, 40]` | +| **Alias** | `array_select` | + +#### `list_slice(list, begin, end, step)` + +
+
+| **Description** | Extract a sublist using slice conventions, selecting every `step`th element. Negative values are accepted. See [slicing]({% link docs/archive/1.0/sql/functions/list.md %}#slicing). |
+| **Example** | `list_slice([4, 5, 6], 1, 3, 2)` |
+| **Result** | `[4, 6]` |
+| **Alias** | `array_slice` |
+
+#### `list_slice(list, begin, end)`
+
+ +| **Description** | Extract a sublist using slice conventions. Negative values are accepted. See [slicing]({% link docs/archive/1.0/sql/functions/list.md %}#slicing). | +| **Example** | `list_slice([4, 5, 6], 2, 3)` | +| **Result** | `[5, 6]` | +| **Alias** | `array_slice` | + +#### `list_sort(list)` + +
+ +| **Description** | Sorts the elements of the list. See the [Sorting Lists]({% link docs/archive/1.0/sql/functions/list.md %}#sorting-lists) section for more details about the sorting order and the `NULL` sorting order. | +| **Example** | `list_sort([3, 6, 1, 2])` | +| **Result** | `[1, 2, 3, 6]` | +| **Alias** | `array_sort` | + +#### `list_transform(list, lambda)` + +
+ +| **Description** | Returns a list that is the result of applying the lambda function to each element of the input list. See the [Lambda Functions]({% link docs/archive/1.0/sql/functions/lambda.md %}#transform) page for more details. | +| **Example** | `list_transform([4, 5, 6], x -> x + 1)` | +| **Result** | `[5, 6, 7]` | +| **Aliases** | `array_transform`, `apply`, `list_apply`, `array_apply` | + +#### `list_unique(list)` + +
+ +| **Description** | Counts the unique elements of a list. | +| **Example** | `list_unique([1, 1, NULL, -3, 1, 5])` | +| **Result** | `3` | +| **Alias** | `array_unique` | + +#### `list_value(any, ...)` + +
+ +| **Description** | Create a `LIST` containing the argument values. | +| **Example** | `list_value(4, 5, 6)` | +| **Result** | `[4, 5, 6]` | +| **Alias** | `list_pack` | + +#### `list_where(value_list, mask_list)` + +
+ +| **Description** | Returns a list with the `BOOLEAN`s in `mask_list` applied as a mask to the `value_list`. | +| **Example** | `list_where([10, 20, 30, 40], [true, false, false, true])` | +| **Result** | `[10, 40]` | +| **Alias** | `array_where` | + +#### `list_zip(list1, list2, ...)` + +
+
+| **Description** | Zips _k_ `LIST`s to a new `LIST` whose length will be that of the longest list. Its elements are structs of _k_ elements from each list `list_1`, ..., `list_k`; missing elements are replaced with `NULL`. If the optional `truncate` argument is set, all lists are truncated to the smallest list length instead. |
+| **Example** | `list_zip([1, 2], [3, 4], [5, 6])` |
+| **Result** | `[(1, 3, 5), (2, 4, 6)]` |
+| **Alias** | `array_zip` |
+
+#### `unnest(list)`
+
+ +| **Description** | Unnests a list by one level. Note that this is a special function that alters the cardinality of the result. See the [`unnest` page]({% link docs/archive/1.0/sql/query_syntax/unnest.md %}) for more details. | +| **Example** | `unnest([1, 2, 3])` | +| **Result** | `1`, `2`, `3` | + +## List Operators + +The following operators are supported for lists: + + + +| Operator | Description | Example | Result | +|-|--|---|-| +| `&&` | Alias for [`list_has_any`](#list_has_anylist1-list2). | `[1, 2, 3, 4, 5] && [2, 5, 5, 6]` | `true` | +| `@>` | Alias for [`list_has_all`](#list_has_alllist-sub-list), where the list on the **right** of the operator is the sublist. | `[1, 2, 3, 4] @> [3, 4, 3]` | `true` | +| `<@` | Alias for [`list_has_all`](#list_has_alllist-sub-list), where the list on the **left** of the operator is the sublist. | `[1, 4] <@ [1, 2, 3, 4]` | `true` | +| `||` | Alias for [`list_concat`](#list_concatlist1-list2). | `[1, 2, 3] || [4, 5, 6]` | `[1, 2, 3, 4, 5, 6]` | +| `<=>` | Alias for [`list_cosine_similarity`](#list_cosine_similaritylist1-list2). | `[1, 2, 3] <=> [1, 2, 5]` | `0.9759000729485332` | +| `<->` | Alias for [`list_distance`](#list_distancelist1-list2). | `[1, 2, 3] <-> [1, 2, 5]` | `2.0` | + + + +## List Comprehension + +Python-style list comprehension can be used to compute expressions over elements in a list. For example: + +```sql +SELECT [lower(x) FOR x IN strings] +FROM (VALUES (['Hello', '', 'World'])) t(strings); +``` + +```text +[hello, , world] +``` + +```sql +SELECT [upper(x) FOR x IN strings IF len(x) > 0] +FROM (VALUES (['Hello', '', 'World'])) t(strings); +``` + +```text +[HELLO, WORLD] +``` + +## Range Functions + +DuckDB offers two range functions, [`range(start, stop, step)`](#range) and [`generate_series(start, stop, step)`](#generate_series), and their variants with default arguments for `stop` and `step`. The two functions' behavior is different regarding their `stop` argument. This is documented below. + +### `range` + +The `range` function creates a list of values in the range between `start` and `stop`. +The `start` parameter is inclusive, while the `stop` parameter is exclusive. +The default value of `start` is 0 and the default value of `step` is 1. + +Based on the number of arguments, the following variants of `range` exist. + +#### `range(stop)` + +```sql +SELECT range(5); +``` + +```text +[0, 1, 2, 3, 4] +``` + +#### `range(start, stop)` + +```sql +SELECT range(2, 5); +``` + +```text +[2, 3, 4] +``` + +#### `range(start, stop, step)` + +```sql +SELECT range(2, 5, 3); +``` + +```text +[2] +``` + +### `generate_series` + +The `generate_series` function creates a list of values in the range between `start` and `stop`. +Both the `start` and the `stop` parameters are inclusive. +The default value of `start` is 0 and the default value of `step` is 1. +Based on the number of arguments, the following variants of `generate_series` exist. + +#### `generate_series(stop)` + +```sql +SELECT generate_series(5); +``` + +```text +[0, 1, 2, 3, 4, 5] +``` + +#### `generate_series(start, stop)` + +```sql +SELECT generate_series(2, 5); +``` + +```text +[2, 3, 4, 5] +``` + +#### `generate_series(start, stop, step)` + +```sql +SELECT generate_series(2, 5, 3); +``` + +```text +[2, 5] +``` + +#### `generate_subscripts(arr, dim)` + +The `generate_subscripts(arr, dim)` function generates indexes along the `dim`th dimension of array `arr`. 
+ +```sql +SELECT generate_subscripts([4, 5, 6], 1) AS i; +``` + +| i | +|--:| +| 1 | +| 2 | +| 3 | + +### Date Ranges + +Date ranges are also supported for `TIMESTAMP` and `TIMESTAMP WITH TIME ZONE` values. +Note that for these types, the `stop` and `step` arguments have to be specified explicitly (a default value is not provided). + +#### `range` for Date Ranges + +```sql +SELECT * +FROM range(DATE '1992-01-01', DATE '1992-03-01', INTERVAL '1' MONTH); +``` + +| range | +|---------------------| +| 1992-01-01 00:00:00 | +| 1992-02-01 00:00:00 | + +#### `generate_series` for Date Ranges + +```sql +SELECT * +FROM generate_series(DATE '1992-01-01', DATE '1992-03-01', INTERVAL '1' MONTH); +``` + +| generate_series | +|---------------------| +| 1992-01-01 00:00:00 | +| 1992-02-01 00:00:00 | +| 1992-03-01 00:00:00 | + +## Slicing + +The function [`list_slice`](#list_slicelist-begin-end) can be used to extract a sublist from a list. The following variants exist: + +* `list_slice(list, begin, end)` +* `list_slice(list, begin, end, step)` +* `array_slice(list, begin, end)` +* `array_slice(list, begin, end, step)` +* `list[begin:end]` +* `list[begin:end:step]` + +The arguments are as follows: + +* `list` + * Is the list to be sliced +* `begin` + * Is the index of the first element to be included in the slice + * When `begin < 0` the index is counted from the end of the list + * When `begin < 0` and `-begin > length`, `begin` is clamped to the beginning of the list + * When `begin > length`, the result is an empty list + * **Bracket Notation:** When `begin` is omitted, it defaults to the beginning of the list +* `end` + * Is the index of the last element to be included in the slice + * When `end < 0` the index is counted from the end of the list + * When `end > length`, end is clamped to `length` + * When `end < begin`, the result is an empty list + * **Bracket Notation:** When `end` is omitted, it defaults to the end of the list. When `end` is omitted and a `step` is provided, `end` must be replaced with a `-` +* `step` *(optional)* + * Is the step size between elements in the slice + * When `step < 0` the slice is reversed, and `begin` and `end` are swapped + * Must be non-zero + +Examples: + +```sql +SELECT list_slice([1, 2, 3, 4, 5], 2, 4); +``` + +```text +[2, 3, 4] +``` + +```sql +SELECT ([1, 2, 3, 4, 5])[2:4:2]; +``` + +```text +[2, 4] +``` + +```sql +SELECT([1, 2, 3, 4, 5])[4:2:-2]; +``` + +```text +[4, 2] +``` + +```sql +SELECT ([1, 2, 3, 4, 5])[:]; +``` + +```text +[1, 2, 3, 4, 5] +``` + +```sql +SELECT ([1, 2, 3, 4, 5])[:-:2]; +``` + +```text +[1, 3, 5] +``` + +```sql +SELECT ([1, 2, 3, 4, 5])[:-:-2]; +``` + +```text +[5, 3, 1] +``` + +## List Aggregates + +The function [`list_aggregate`](#list_aggregatelist-name) allows the execution of arbitrary existing aggregate functions on the elements of a list. Its first argument is the list (column), its second argument is the aggregate function name, e.g., `min`, `histogram` or `sum`. + +`list_aggregate` accepts additional arguments after the aggregate function name. These extra arguments are passed directly to the aggregate function, which serves as the second argument of `list_aggregate`. 
+ +```sql +SELECT list_aggregate([1, 2, -4, NULL], 'min'); +``` + +```text +-4 +``` + +```sql +SELECT list_aggregate([2, 4, 8, 42], 'sum'); +``` + +```text +56 +``` + +```sql +SELECT list_aggregate([[1, 2], [NULL], [2, 10, 3]], 'last'); +``` + +```text +[2, 10, 3] +``` + +```sql +SELECT list_aggregate([2, 4, 8, 42], 'string_agg', '|'); +``` + +```text +2|4|8|42 +``` + +### `list_*` Rewrite Functions + +The following is a list of existing rewrites. Rewrites simplify the use of the list aggregate function by only taking the list (column) as their argument. `list_avg`, `list_var_samp`, `list_var_pop`, `list_stddev_pop`, `list_stddev_samp`, `list_sem`, `list_approx_count_distinct`, `list_bit_xor`, `list_bit_or`, `list_bit_and`, `list_bool_and`, `list_bool_or`, `list_count`, `list_entropy`, `list_last`, `list_first`, `list_kurtosis`, `list_kurtosis_pop`, `list_min`, `list_max`, `list_product`, `list_skewness`, `list_sum`, `list_string_agg`, `list_mode`, `list_median`, `list_mad` and `list_histogram`. + +```sql +SELECT list_min([1, 2, -4, NULL]); +``` + +```text +-4 +``` + +```sql +SELECT list_sum([2, 4, 8, 42]); +``` + +```text +56 +``` + +```sql +SELECT list_last([[1, 2], [NULL], [2, 10, 3]]); +``` + +```text +[2, 10, 3] +``` + +#### `array_to_string` + +Concatenates list/array elements using an optional delimiter. + +```sql +SELECT array_to_string([1, 2, 3], '-') AS str; +``` + +```text +1-2-3 +``` + +This is equivalent to the following SQL: + +```sql +SELECT list_aggr([1, 2, 3], 'string_agg', '-') AS str; +``` + +```text +1-2-3 +``` + +## Sorting Lists + +The function `list_sort` sorts the elements of a list either in ascending or descending order. +In addition, it allows to provide whether `NULL` values should be moved to the beginning or to the end of the list. +It has the same sorting behavior as DuckDB's `ORDER BY` clause. +Therefore, (nested) values compare the same in `list_sort` as in `ORDER BY`. + +By default, if no modifiers are provided, DuckDB sorts `ASC NULLS FIRST`. +I.e., the values are sorted in ascending order and `NULL` values are placed first. +This is identical to the default sort order of SQLite. +The default sort order can be changed using [`PRAGMA` statements.](../query_syntax/orderby). + +`list_sort` leaves it open to the user whether they want to use the default sort order or a custom order. +`list_sort` takes up to two additional optional parameters. +The second parameter provides the sort order and can be either `ASC` or `DESC`. +The third parameter provides the `NULL` order and can be either `NULLS FIRST` or `NULLS LAST`. + +This query uses the default sort order and the default `NULL` order. + +```sql +SELECT list_sort([1, 3, NULL, 5, NULL, -5]); +``` + +```sql +[NULL, NULL, -5, 1, 3, 5] +``` + +This query provides the sort order. +The `NULL` order uses the configurable default value. + +```sql +SELECT list_sort([1, 3, NULL, 2], 'ASC'); +``` + +```sql +[NULL, 1, 2, 3] +``` + +This query provides both the sort order and the `NULL` order. +```sql +SELECT list_sort([1, 3, NULL, 2], 'DESC', 'NULLS FIRST'); +``` + +```sql +[NULL, 3, 2, 1] +``` + +`list_reverse_sort` has an optional second parameter providing the `NULL` sort order. +It can be either `NULLS FIRST` or `NULLS LAST`. + +This query uses the default `NULL` sort order. + +```sql +SELECT list_sort([1, 3, NULL, 5, NULL, -5]); +``` + +```sql +[NULL, NULL, -5, 1, 3, 5] +``` + +This query provides the `NULL` sort order. 
+ +```sql +SELECT list_reverse_sort([1, 3, NULL, 2], 'NULLS LAST'); +``` + +```sql +[3, 2, 1, NULL] +``` + +## Flattening + +The flatten function is a scalar function that converts a list of lists into a single list by concatenating each sub-list together. +Note that this only flattens one level at a time, not all levels of sub-lists. + +Convert a list of lists into a single list: + +```sql +SELECT + flatten([ + [1, 2], + [3, 4] + ]); +``` + +```text +[1, 2, 3, 4] +``` + +If the list has multiple levels of lists, only the first level of sub-lists is concatenated into a single list: + +```sql +SELECT + flatten([ + [ + [1, 2], + [3, 4], + ], + [ + [5, 6], + [7, 8], + ] + ]); +``` + +```text +[[1, 2], [3, 4], [5, 6], [7, 8]] +``` + +In general, the input to the flatten function should be a list of lists (not a single level list). +However, the behavior of the flatten function has specific behavior when handling empty lists and `NULL` values. + +If the input list is empty, return an empty list: + +```sql +SELECT flatten([]); +``` + +```text +[] +``` + +If the entire input to flatten is `NULL`, return `NULL`: + +```sql +SELECT flatten(NULL); +``` + +```text +NULL +``` + +If a list whose only entry is `NULL` is flattened, return an empty list: + +```sql +SELECT flatten([NULL]); +``` + +```text +[] +``` + +If the sub-list in a list of lists only contains `NULL`, do not modify the sub-list: + +```sql +-- (Note the extra set of parentheses vs. the prior example) +SELECT flatten([[NULL]]); +``` + +```text +[NULL] +``` + +Even if the only contents of each sub-list is `NULL`, still concatenate them together. Note that no de-duplication occurs when flattening. See `list_distinct` function for de-duplication: + +```sql +SELECT flatten([[NULL],[NULL]]); +``` + +```text +[NULL, NULL] +``` + +## Lambda Functions + +DuckDB supports lambda functions in the form `(parameter1, parameter2, ...) -> expression`. +For details, see the [lambda functions page]({% link docs/archive/1.0/sql/functions/lambda.md %}). + +## Related Functions + +There are also [aggregate functions]({% link docs/archive/1.0/sql/functions/aggregates.md %}) `list` and `histogram` that produces lists and lists of structs. +The [`unnest`]({% link docs/archive/1.0/sql/query_syntax/unnest.md %}) function is used to unnest a list by one level. \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/map.md b/docs/archive/1.0/sql/functions/map.md new file mode 100644 index 00000000000..4107c5dab02 --- /dev/null +++ b/docs/archive/1.0/sql/functions/map.md @@ -0,0 +1,90 @@ +--- +layout: docu +title: Map Functions +--- + + + +| Name | Description | +|:--|:-------| +| [`cardinality(map)`](#cardinalitymap) | Return the size of the map (or the number of entries in the map). | +| [`element_at(map, key)`](#element_atmap-key) | Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map's keys else an error is returned. | +| [`map_entries(map)`](#map_entriesmap) | Return a list of struct(k, v) for each key-value pair in the map. | +| [`map_extract(map, key)`](#map_extractmap-key) | Alias of `element_at`. Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map's keys else an error is returned. 
| +| [`map_from_entries(STRUCT(k, v)[])`](#map_from_entriesstructk-v) | Returns a map created from the entries of the array. | +| [`map_keys(map)`](#map_keysmap) | Return a list of all keys in the map. | +| [`map_values(map)`](#map_valuesmap) | Return a list of all values in the map. | +| [`map()`](#map) | Returns an empty map. | +| [`map[entry]`](#mapentry) | Alias for `element_at`. | + +#### `cardinality(map)` + +
+ +| **Description** | Return the size of the map (or the number of entries in the map). | +| **Example** | `cardinality(map([4, 2], ['a', 'b']))` | +| **Result** | `2` | + +#### `element_at(map, key)` + +
+
+| **Description** | Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map's keys; otherwise, an error is returned. |
+| **Example** | `element_at(map([100, 5], [42, 43]), 100)` |
+| **Result** | `[42]` |
+
+#### `map_entries(map)`
+
+ +| **Description** | Return a list of struct(k, v) for each key-value pair in the map. | +| **Example** | `map_entries(map([100, 5], [42, 43]))` | +| **Result** | `[{'key': 100, 'value': 42}, {'key': 5, 'value': 43}]` | + +#### `map_extract(map, key)` + +
+
+| **Description** | Alias of `element_at`. Return a list containing the value for a given key or an empty list if the key is not contained in the map. The type of the key provided in the second parameter must match the type of the map's keys; otherwise, an error is returned. |
+| **Example** | `map_extract(map([100, 5], [42, 43]), 100)` |
+| **Result** | `[42]` |
+
+#### `map_from_entries(STRUCT(k, v)[])`
+
+ +| **Description** | Returns a map created from the entries of the array. | +| **Example** | `map_from_entries([{k: 5, v: 'val1'}, {k: 3, v: 'val2'}])` | +| **Result** | `{5=val1, 3=val2}` | + +#### `map_keys(map)` + +
+
+| **Description** | Return a list of all keys in the map. |
+| **Example** | `map_keys(map([100, 5], [42, 43]))` |
+| **Result** | `[100, 5]` |
+
+#### `map_values(map)`
+
+ +| **Description** | Return a list of all values in the map. | +| **Example** | `map_values(map([100, 5], [42, 43]))` | +| **Result** | `[42, 43]` | + +#### `map()` + +
+ +| **Description** | Returns an empty map. | +| **Example** | `map()` | +| **Result** | `{}` | + +#### `map[entry]` + +
+ +| **Description** | Alias for `element_at`. | +| **Example** | `map([100, 5], ['a', 'b'])[100]` | +| **Result** | `[a]` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/nested.md b/docs/archive/1.0/sql/functions/nested.md new file mode 100644 index 00000000000..1a94d2b9f63 --- /dev/null +++ b/docs/archive/1.0/sql/functions/nested.md @@ -0,0 +1,16 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/nested +title: Nested Functions +--- + +There are five [nested data types]({% link docs/archive/1.0/sql/data_types/overview.md %}#nested--composite-types): + +| Name | Type page | Functions page | +|--|---|---| +| `ARRAY` | [`ARRAY` type]({% link docs/archive/1.0/sql/data_types/array.md %}) | [`ARRAY` functions]({% link docs/archive/1.0/sql/functions/array.md %}) | +| `LIST` | [`LIST` type]({% link docs/archive/1.0/sql/data_types/list.md %}) | [`LIST` functions]({% link docs/archive/1.0/sql/functions/list.md %}) | +| `MAP` | [`MAP` type]({% link docs/archive/1.0/sql/data_types/map.md %}) | [`MAP` functions]({% link docs/archive/1.0/sql/functions/map.md %}) | +| `STRUCT` | [`STRUCT` type]({% link docs/archive/1.0/sql/data_types/struct.md %}) | [`STRUCT` functions]({% link docs/archive/1.0/sql/functions/struct.md %}) | +| `UNION` | [`UNION` type]({% link docs/archive/1.0/sql/data_types/union.md %}) | [`UNION` functions]({% link docs/archive/1.0/sql/functions/union.md %}) | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/numeric.md b/docs/archive/1.0/sql/functions/numeric.md new file mode 100644 index 00000000000..10fabd5e1e3 --- /dev/null +++ b/docs/archive/1.0/sql/functions/numeric.md @@ -0,0 +1,540 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/math +title: Numeric Functions +--- + + + +## Numeric Operators + +The table below shows the available mathematical operators for [numeric types]({% link docs/archive/1.0/sql/data_types/numeric.md %}). + +
+ + + +| Operator | Description | Example | Result | +|-|-----|--|-| +| `+` | addition | `2 + 3` | `5` | +| `-` | subtraction | `2 - 3` | `-1` | +| `*` | multiplication | `2 * 3` | `6` | +| `/` | float division | `5 / 2` | `2.5` | +| `//` | division | `5 // 2` | `2` | +| `%` | modulo (remainder) | `5 % 4` | `1` | +| `**` | exponent | `3 ** 4` | `81` | +| `^` | exponent (alias for `**`) | `3 ^ 4` | `81` | +| `&` | bitwise AND | `91 & 15` | `11` | +| `|` | bitwise OR | `32 | 3` | `35` | +| `<<` | bitwise shift left | `1 << 4` | `16` | +| `>>` | bitwise shift right | `8 >> 2` | `2` | +| `~` | bitwise negation | `~15` | `-16` | +| `!` | factorial of `x` | `4!` | `24` | + + + +### Division and Modulo Operators + +There are two division operators: `/` and `//`. +They are equivalent when at least one of the operands is a `FLOAT` or a `DOUBLE`. +When both operands are integers, `/` performs floating points division (`5 / 2 = 2.5`) while `//` performs integer division (`5 // 2 = 2`). + +### Supported Types + +The modulo, bitwise, and negation and factorial operators work only on integral data types, +whereas the others are available for all numeric data types. + +## Numeric Functions + +The table below shows the available mathematical functions. + +| Name | Description | +|:--|:-------| +| [`@(x)`](#x) | Absolute value. Parentheses are optional if `x` is a column name. | +| [`abs(x)`](#absx) | Absolute value. | +| [`acos(x)`](#acosx) | Computes the arccosine of `x`. | +| [`add(x, y)`](#addx-y) | Alias for `x + y`. | +| [`asin(x)`](#asinx) | Computes the arcsine of `x`. | +| [`atan(x)`](#atanx) | Computes the arctangent of `x`. | +| [`atan2(y, x)`](#atan2y-x) | Computes the arctangent `(y, x)`. | +| [`bit_count(x)`](#bit_countx) | Returns the number of bits that are set. | +| [`cbrt(x)`](#cbrtx) | Returns the cube root of the number. | +| [`ceil(x)`](#ceilx) | Rounds the number up. | +| [`ceiling(x)`](#ceilingx) | Rounds the number up. Alias of `ceil`. | +| [`cos(x)`](#cosx) | Computes the cosine of `x`. | +| [`cot(x)`](#cotx) | Computes the cotangent of `x`. | +| [`degrees(x)`](#degreesx) | Converts radians to degrees. | +| [`divide(x, y)`](#dividex-y) | Alias for `x // y`. | +| [`even(x)`](#evenx) | Round to next even number by rounding away from zero. | +| [`exp(x)`](#expx) | Computes `e ** x`. | +| [`factorial(x)`](#factorialx) | See `!` operator. Computes the product of the current integer and all integers below it. | +| [`fdiv(x, y)`](#fdivx-y) | Performs integer division (`x // y`) but returns a `DOUBLE` value. | +| [`floor(x)`](#floorx) | Rounds the number down. | +| [`fmod(x, y)`](#fmodx-y) | Calculates the modulo value. Always returns a `DOUBLE` value. | +| [`gamma(x)`](#gammax) | Interpolation of the factorial of `x - 1`. Fractional inputs are allowed. | +| [`gcd(x, y)`](#gcdx-y) | Computes the greatest common divisor of `x` and `y`. | +| [`greatest_common_divisor(x, y)`](#greatest_common_divisorx-y) | Computes the greatest common divisor of `x` and `y`. | +| [`greatest(x1, x2, ...)`](#greatestx1-x2-) | Selects the largest value. | +| [`isfinite(x)`](#isfinitex) | Returns true if the floating point value is finite, false otherwise. | +| [`isinf(x)`](#isinfx) | Returns true if the floating point value is infinite, false otherwise. | +| [`isnan(x)`](#isnanx) | Returns true if the floating point value is not a number, false otherwise. | +| [`lcm(x, y)`](#lcmx-y) | Computes the least common multiple of `x` and `y`. 
| +| [`least_common_multiple(x, y)`](#least_common_multiplex-y) | Computes the least common multiple of `x` and `y`. | +| [`least(x1, x2, ...)`](#leastx1-x2-) | Selects the smallest value. | +| [`lgamma(x)`](#lgammax) | Computes the log of the `gamma` function. | +| [`ln(x)`](#lnx) | Computes the natural logarithm of `x`. | +| [`log(x)`](#logx) | Computes the base-10 logarithm of `x`. | +| [`log10(x)`](#log10x) | Alias of `log`. Computes the base-10 logarithm of `x`. | +| [`log2(x)`](#log2x) | Computes the base-2 log of `x`. | +| [`multiply(x, y)`](#multiplyx-y) | Alias for `x * y`. | +| [`nextafter(x, y)`](#nextafterx-y) | Return the next floating point value after `x` in the direction of `y`. | +| [`pi()`](#pi) | Returns the value of pi. | +| [`pow(x, y)`](#powx-y) | Computes `x` to the power of `y`. | +| [`power(x, y)`](#powerx-y) | Alias of `pow`. computes `x` to the power of `y`. | +| [`radians(x)`](#radiansx) | Converts degrees to radians. | +| [`random()`](#random) | Returns a random number `x` in the range `0.0 <= x < 1.0`. | +| [`round_even(v NUMERIC, s INTEGER)`](#round_evenv-numeric-s-integer) | Alias of `roundbankers(v, s)`. Round to `s` decimal places using the [_rounding half to even_ rule](https://en.wikipedia.org/wiki/Rounding#Rounding_half_to_even). Values `s < 0` are allowed. | +| [`round(v NUMERIC, s INTEGER)`](#roundv-numeric-s-integer) | Round to `s` decimal places. Values `s < 0` are allowed. | +| [`setseed(x)`](#setseedx) | Sets the seed to be used for the random function. | +| [`sign(x)`](#signx) | Returns the sign of `x` as -1, 0 or 1. | +| [`signbit(x)`](#signbitx) | Returns whether the signbit is set or not. | +| [`sin(x)`](#sinx) | Computes the sin of `x`. | +| [`sqrt(x)`](#sqrtx) | Returns the square root of the number. | +| [`subtract(x, y)`](#subtractx-y) | Alias for `x - y`. | +| [`tan(x)`](#tanx) | Computes the tangent of `x`. | +| [`trunc(x)`](#truncx) | Truncates the number. | +| [`xor(x, y)`](#xorx-y) | Bitwise XOR. | + +#### `@(x)` + +
+ +| **Description** | Absolute value. Parentheses are optional if `x` is a column name. | +| **Example** | `@(-17.4)` | +| **Result** | `17.4` | +| **Alias** | `abs` | + +#### `abs(x)` + +
+ +| **Description** | Absolute value. | +| **Example** | `abs(-17.4)` | +| **Result** | `17.4` | +| **Alias** | `@` | + +#### `acos(x)` + +
+ +| **Description** | Computes the arccosine of `x`. | +| **Example** | `acos(0.5)` | +| **Result** | `1.0471975511965976` | + +#### `add(x, y)` + +
+ +| **Description** | Alias for `x + y`. | +| **Example** | `add(2, 3)` | +| **Result** | `5` | + +#### `asin(x)` + +
+ +| **Description** | Computes the arcsine of `x`. | +| **Example** | `asin(0.5)` | +| **Result** | `0.5235987755982989` | + +#### `atan(x)` + +
+ +| **Description** | Computes the arctangent of `x`. | +| **Example** | `atan(0.5)` | +| **Result** | `0.4636476090008061` | + +#### `atan2(y, x)` + +
+ +| **Description** | Computes the arctangent (y, x). | +| **Example** | `atan2(0.5, 0.5)` | +| **Result** | `0.7853981633974483` | + +#### `bit_count(x)` + +
+ +| **Description** | Returns the number of bits that are set. | +| **Example** | `bit_count(31)` | +| **Result** | `5` | + +#### `cbrt(x)` + +
+ +| **Description** | Returns the cube root of the number. | +| **Example** | `cbrt(8)` | +| **Result** | `2` | + +#### `ceil(x)` + +
+ +| **Description** | Rounds the number up. | +| **Example** | `ceil(17.4)` | +| **Result** | `18` | + +#### `ceiling(x)` + +
+ +| **Description** | Rounds the number up. Alias of `ceil`. | +| **Example** | `ceiling(17.4)` | +| **Result** | `18` | + +#### `cos(x)` + +
+ +| **Description** | Computes the cosine of `x`. | +| **Example** | `cos(90)` | +| **Result** | `-0.4480736161291701` | + +#### `cot(x)` + +
+ +| **Description** | Computes the cotangent of `x`. | +| **Example** | `cot(0.5)` | +| **Result** | `1.830487721712452` | + +#### `degrees(x)` + +
+ +| **Description** | Converts radians to degrees. | +| **Example** | `degrees(pi())` | +| **Result** | `180` | + +#### `divide(x, y)` + +
+ +| **Description** | Alias for `x // y`. | +| **Example** | `divide(5, 2)` | +| **Result** | `2` | + +#### `even(x)` + +
+ +| **Description** | Round to next even number by rounding away from zero. | +| **Example** | `even(2.9)` | +| **Result** | `4` | + +#### `exp(x)` + +
+ +| **Description** | Computes `e ** x`. | +| **Example** | `exp(0.693)` | +| **Result** | `2` | + +#### `factorial(x)` + +
+ +| **Description** | See `!` operator. Computes the product of the current integer and all integers below it. | +| **Example** | `factorial(4)` | +| **Result** | `24` | + +#### `fdiv(x, y)` + +
+ +| **Description** | Performs integer division (`x // y`) but returns a `DOUBLE` value. | +| **Example** | `fdiv(5, 2)` | +| **Result** | `2.0` | + +#### `floor(x)` + +
+ +| **Description** | Rounds the number down. | +| **Example** | `floor(17.4)` | +| **Result** | `17` | + +#### `fmod(x, y)` + +
+ +| **Description** | Calculates the modulo value. Always returns a `DOUBLE` value. | +| **Example** | `fmod(5, 2)` | +| **Result** | `1.0` | + +#### `gamma(x)` + +
+ +| **Description** | Interpolation of the factorial of `x - 1`. Fractional inputs are allowed. | +| **Example** | `gamma(5.5)` | +| **Result** | `52.34277778455352` | + +#### `gcd(x, y)` + +
+ +| **Description** | Computes the greatest common divisor of `x` and `y`. | +| **Example** | `gcd(42, 57)` | +| **Result** | `3` | + +#### `greatest_common_divisor(x, y)` + +
+ +| **Description** | Computes the greatest common divisor of `x` and `y`. | +| **Example** | `greatest_common_divisor(42, 57)` | +| **Result** | `3` | + +#### `greatest(x1, x2, ...)` + +
+ +| **Description** | Selects the largest value. | +| **Example** | `greatest(3, 2, 4, 4)` | +| **Result** | `4` | + +#### `isfinite(x)` + +
+ +| **Description** | Returns true if the floating point value is finite, false otherwise. | +| **Example** | `isfinite(5.5)` | +| **Result** | `true` | + +#### `isinf(x)` + +
+ +| **Description** | Returns true if the floating point value is infinite, false otherwise. | +| **Example** | `isinf('Infinity'::float)` | +| **Result** | `true` | + +#### `isnan(x)` + +
+ +| **Description** | Returns true if the floating point value is not a number, false otherwise. | +| **Example** | `isnan('NaN'::float)` | +| **Result** | `true` | + +#### `lcm(x, y)` + +
+ +| **Description** | Computes the least common multiple of `x` and `y`. | +| **Example** | `lcm(42, 57)` | +| **Result** | `798` | + +#### `least_common_multiple(x, y)` + +
+ +| **Description** | Computes the least common multiple of `x` and `y`. | +| **Example** | `least_common_multiple(42, 57)` | +| **Result** | `798` | + +#### `least(x1, x2, ...)` + +
+ +| **Description** | Selects the smallest value. | +| **Example** | `least(3, 2, 4, 4)` | +| **Result** | `2` | + +#### `lgamma(x)` + +
+ +| **Description** | Computes the log of the `gamma` function. | +| **Example** | `lgamma(2)` | +| **Result** | `0` | + +#### `ln(x)` + +
+ +| **Description** | Computes the natural logarithm of `x`. | +| **Example** | `ln(2)` | +| **Result** | `0.693` | + +#### `log(x)` + +
+ +| **Description** | Computes the base-10 log of `x`. | +| **Example** | `log(100)` | +| **Result** | `2` | + +#### `log10(x)` + +
+ +| **Description** | Alias of `log`. Computes the base-10 log of `x`. | +| **Example** | `log10(1000)` | +| **Result** | `3` | + +#### `log2(x)` + +
+ +| **Description** | Computes the base-2 log of `x`. | +| **Example** | `log2(8)` | +| **Result** | `3` | + +#### `multiply(x, y)` + +
+ +| **Description** | Alias for `x * y`. | +| **Example** | `multiply(2, 3)` | +| **Result** | `6` | + +#### `nextafter(x, y)` + +
+ +| **Description** | Return the next floating point value after `x` in the direction of `y`. | +| **Example** | `nextafter(1::float, 2::float)` | +| **Result** | `1.0000001` | + +#### `pi()` + +
+ +| **Description** | Returns the value of pi. | +| **Example** | `pi()` | +| **Result** | `3.141592653589793` | + +#### `pow(x, y)` + +
+ +| **Description** | Computes `x` to the power of `y`. | +| **Example** | `pow(2, 3)` | +| **Result** | `8` | + +#### `power(x, y)` + +
+
+| **Description** | Alias of `pow`. Computes `x` to the power of `y`. |
+| **Example** | `power(2, 3)` |
+| **Result** | `8` |
+
+#### `radians(x)`
+
+ +| **Description** | Converts degrees to radians. | +| **Example** | `radians(90)` | +| **Result** | `1.5707963267948966` | + +#### `random()` + +
+ +| **Description** | Returns a random number `x` in the range `0.0 <= x < 1.0`. | +| **Example** | `random()` | +| **Result** | various | + +#### `round_even(v NUMERIC, s INTEGER)` + +
+ +| **Description** | Alias of `roundbankers(v, s)`. Round to `s` decimal places using the [_rounding half to even_ rule](https://en.wikipedia.org/wiki/Rounding#Rounding_half_to_even). Values `s < 0` are allowed. | +| **Example** | `round_even(24.5, 0)` | +| **Result** | `24.0` | + +#### `round(v NUMERIC, s INTEGER)` + +
+ +| **Description** | Round to `s` decimal places. Values `s < 0` are allowed. | +| **Example** | `round(42.4332, 2)` | +| **Result** | `42.43` | + +#### `setseed(x)` + +
+ +| **Description** | Sets the seed to be used for the random function. | +| **Example** | `setseed(0.42)` | + +#### `sign(x)` + +
+ +| **Description** | Returns the sign of `x` as -1, 0 or 1. | +| **Example** | `sign(-349)` | +| **Result** | `-1` | + +#### `signbit(x)` + +
+ +| **Description** | Returns whether the signbit is set or not. | +| **Example** | `signbit(-1.0)` | +| **Result** | `true` | + +#### `sin(x)` + +
+ +| **Description** | Computes the sin of `x`. | +| **Example** | `sin(90)` | +| **Result** | `0.8939966636005579` | + +#### `sqrt(x)` + +
+ +| **Description** | Returns the square root of the number. | +| **Example** | `sqrt(9)` | +| **Result** | `3` | + +#### `subtract(x, y)` + +
+ +| **Description** | Alias for `x - y`. | +| **Example** | `subtract(2, 3)` | +| **Result** | `-1` | + +#### `tan(x)` + +
+ +| **Description** | Computes the tangent of `x`. | +| **Example** | `tan(90)` | +| **Result** | `-1.995200412208242` | + +#### `trunc(x)` + +
+ +| **Description** | Truncates the number. | +| **Example** | `trunc(17.4)` | +| **Result** | `17` | + +#### `xor(x, y)` + +
+ +| **Description** | Bitwise XOR. | +| **Example** | `xor(17, 5)` | +| **Result** | `20` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/overview.md b/docs/archive/1.0/sql/functions/overview.md new file mode 100644 index 00000000000..47a5709d6a7 --- /dev/null +++ b/docs/archive/1.0/sql/functions/overview.md @@ -0,0 +1,57 @@ +--- +layout: docu +railroad: expressions/function.js +redirect_from: +- docs/archive/1.0/test/functions/overview +title: Functions +--- + +## Function Syntax + +
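+
+As a minimal illustration of this syntax, the following query nests two of the scalar functions documented on the [Numeric Functions]({% link docs/archive/1.0/sql/functions/numeric.md %}) page:
+
+```sql
+SELECT round(pi(), 2) AS pi_rounded;
+```
+
+```text
+3.14
+```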
+ +## Function Chaining via the Dot Operator + +DuckDB supports the dot syntax for function chaining. This allows the function call `fn(arg1, arg2, arg3, ...)` to be rewritten as `arg1.fn(arg2, arg3, ...)`. For example, take the following use of the [`replace` function]({% link docs/archive/1.0/sql/functions/char.md %}#replacestring-source-target): + +```sql +SELECT replace(goose_name, 'goose', 'duck') AS duck_name +FROM unnest(['African goose', 'Faroese goose', 'Hungarian goose', 'Pomeranian goose']) breed(goose_name); +``` + +This can be rewritten as follows: + +```sql +SELECT goose_name.replace('goose', 'duck') AS duck_name +FROM unnest(['African goose', 'Faroese goose', 'Hungarian goose', 'Pomeranian goose']) breed(goose_name); +``` + +## Query Functions + +The `duckdb_functions()` table function shows the list of functions currently built into the system. + +```sql +SELECT DISTINCT ON(function_name) + function_name, + function_type, + return_type, + parameters, + parameter_types, + description +FROM duckdb_functions() +WHERE function_type = 'scalar' + AND function_name LIKE 'b%' +ORDER BY function_name; +``` + +| function_name | function_type | return_type | parameters | parameter_types | description | +|---------------|---------------|-------------|------------------------|----------------------------------|------------------------------------------------------------------------------------------------------------------------------------------| +| bar | scalar | VARCHAR | [x, min, max, width] | [DOUBLE, DOUBLE, DOUBLE, DOUBLE] | Draws a band whose width is proportional to (x - min) and equal to width characters when x = max. width defaults to 80 | +| base64 | scalar | VARCHAR | [blob] | [BLOB] | Convert a blob to a base64 encoded string | +| bin | scalar | VARCHAR | [value] | [VARCHAR] | Converts the value to binary representation | +| bit_count | scalar | TINYINT | [x] | [TINYINT] | Returns the number of bits that are set | +| bit_length | scalar | BIGINT | [col0] | [VARCHAR] | NULL | +| bit_position | scalar | INTEGER | [substring, bitstring] | [BIT, BIT] | Returns first starting index of the specified substring within bits, or zero if it is not present. The first (leftmost) bit is indexed 1 | +| bitstring | scalar | BIT | [bitstring, length] | [VARCHAR, INTEGER] | Pads the bitstring until the specified length | + +> Currently, the description and parameter names of functions are not available in the `duckdb_functions()` function. \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/pattern_matching.md b/docs/archive/1.0/sql/functions/pattern_matching.md new file mode 100644 index 00000000000..5fe4d6c6183 --- /dev/null +++ b/docs/archive/1.0/sql/functions/pattern_matching.md @@ -0,0 +1,210 @@ +--- +layout: docu +railroad: expressions/like.js +redirect_from: +- docs/archive/1.0/sql/functions/patternmatching +- docs/archive/1.0/sql/functions/patternmatching.html +title: Pattern Matching +--- + +There are four separate approaches to pattern matching provided by DuckDB: +the traditional SQL [`LIKE` operator](#like), +the more recent [`SIMILAR TO` operator](#similar-to) (added in SQL:1999), +a [`GLOB` operator](#glob), +and POSIX-style [regular expressions](#regular-expressions). + +## `LIKE` + +
+ +The `LIKE` expression returns `true` if the string matches the supplied pattern. (As expected, the `NOT LIKE` expression returns `false` if `LIKE` returns `true`, and vice versa. An equivalent expression is `NOT (string LIKE pattern)`.) + +If pattern does not contain percent signs or underscores, then the pattern only represents the string itself; in that case `LIKE` acts like the equals operator. An underscore (`_`) in pattern stands for (matches) any single character; a percent sign (`%`) matches any sequence of zero or more characters. + +`LIKE` pattern matching always covers the entire string. Therefore, if it's desired to match a sequence anywhere within a string, the pattern must start and end with a percent sign. + +Some examples: + +```sql +SELECT 'abc' LIKE 'abc'; -- true +SELECT 'abc' LIKE 'a%' ; -- true +SELECT 'abc' LIKE '_b_'; -- true +SELECT 'abc' LIKE 'c'; -- false +SELECT 'abc' LIKE 'c%' ; -- false +SELECT 'abc' LIKE '%c'; -- true +SELECT 'abc' NOT LIKE '%c'; -- false +``` + +The keyword `ILIKE` can be used instead of `LIKE` to make the match case-insensitive according to the active locale: + +```sql +SELECT 'abc' ILIKE '%C'; -- true +``` + +```sql +SELECT 'abc' NOT ILIKE '%C'; -- false +``` + +To search within a string for a character that is a wildcard (`%` or `_`), the pattern must use an `ESCAPE` clause and an escape character to indicate the wildcard should be treated as a literal character instead of a wildcard. See an example below. + +Additionally, the function `like_escape` has the same functionality as a `LIKE` expression with an `ESCAPE` clause, but using function syntax. See the [Text Functions Docs]({% link docs/archive/1.0/sql/functions/char.md %}) for details. + +Search for strings with 'a' then a literal percent sign then 'c': + +```sql +SELECT 'a%c' LIKE 'a$%c' ESCAPE '$'; -- true +SELECT 'azc' LIKE 'a$%c' ESCAPE '$'; -- false +``` + +Case-insensitive ILIKE with ESCAPE: + +```sql +SELECT 'A%c' ILIKE 'a$%c' ESCAPE '$'; -- true +``` + +There are also alternative characters that can be used as keywords in place of `LIKE` expressions. These enhance PostgreSQL compatibility. + +
+ +| LIKE-style | PostgreSQL-style | +|:---|:---| +| LIKE | ~~ | +| NOT LIKE | !~~ | +| ILIKE | ~~* | +| NOT ILIKE | !~~* | + +## `SIMILAR TO` + +
+ +The `SIMILAR TO` operator returns true or false depending on whether its pattern matches the given string. It is similar to `LIKE`, except that it interprets the pattern using a [regular expression]({% link docs/archive/1.0/sql/functions/regular_expressions.md %}). Like `LIKE`, the `SIMILAR TO` operator succeeds only if its pattern matches the entire string; this is unlike common regular expression behavior where the pattern can match any part of the string. + +A regular expression is a character sequence that is an abbreviated definition of a set of strings (a regular set). A string is said to match a regular expression if it is a member of the regular set described by the regular expression. As with `LIKE`, pattern characters match string characters exactly unless they are special characters in the regular expression language — but regular expressions use different special characters than `LIKE` does. + +Some examples: + +```sql +SELECT 'abc' SIMILAR TO 'abc'; -- true +SELECT 'abc' SIMILAR TO 'a'; -- false +SELECT 'abc' SIMILAR TO '.*(b|d).*'; -- true +SELECT 'abc' SIMILAR TO '(b|c).*'; -- false +SELECT 'abc' NOT SIMILAR TO 'abc'; -- false +``` + +There are also alternative characters that can be used as keywords in place of `SIMILAR TO` expressions. These follow POSIX syntax. + +
+ +| `SIMILAR TO`-style | POSIX-style | +|:---|:---| +| SIMILAR TO | ~ | +| NOT SIMILAR TO | !~ | + +## Globbing + +DuckDB supports file name expansion, also known as globbing, for discovering files. +DuckDB's glob syntax uses the question mark (`?`) wildcard to match any single character and the asterisk (`*`) to match zero or more characters. +In addition, you can use the bracket syntax (`[...]`) to match any single character contained within the brackets, or within the character range specified by the brackets. An exclamation mark (`!`) may be used inside the first bracket to search for a character that is not contained within the brackets. +To learn more, visit the [“glob (programming)” Wikipedia page](https://en.wikipedia.org/wiki/Glob_(programming)). + +### `GLOB` + +
+ +The `GLOB` operator returns `true` or `false` if the string matches the `GLOB` pattern. The `GLOB` operator is most commonly used when searching for filenames that follow a specific pattern (for example a specific file extension). + +Some examples: + +```sql +SELECT 'best.txt' GLOB '*.txt'; -- true +SELECT 'best.txt' GLOB '????.txt'; -- true +SELECT 'best.txt' GLOB '?.txt'; -- false +SELECT 'best.txt' GLOB '[abc]est.txt'; -- true +SELECT 'best.txt' GLOB '[a-z]est.txt'; -- true +``` + +The bracket syntax is case-sensitive: + +```sql +SELECT 'Best.txt' GLOB '[a-z]est.txt'; -- false +SELECT 'Best.txt' GLOB '[a-zA-Z]est.txt'; -- true +``` + +The `!` applies to all characters within the brackets: + +```sql +SELECT 'Best.txt' GLOB '[!a-zA-Z]est.txt'; -- false +``` + +To negate a GLOB operator, negate the entire expression: + +```sql +SELECT NOT 'best.txt' GLOB '*.txt'; -- false +``` + +Three tildes (`~~~`) may also be used in place of the `GLOB` keyword. + +
+ +| GLOB-style | Symbolic-style | +|:---|:---| +| `GLOB` | `~~~` | + +### Glob Function to Find Filenames + +The glob pattern matching syntax can also be used to search for filenames using the `glob` table function. +It accepts one parameter: the path to search (which may include glob patterns). + +Search the current directory for all files: + +```sql +SELECT * FROM glob('*'); +``` + +
+ +| file | +|---------------| +| duckdb.exe | +| test.csv | +| test.json | +| test.parquet | +| test2.csv | +| test2.parquet | +| todos.json | + +### Globbing Semantics + +DuckDB's globbing implementation follows the semantics of [Python's `glob`](https://docs.python.org/3/library/glob.html) and not the `glob` used in the shell. +A notable difference is the behavior of the `**/` construct: `**/⟨filename⟩` will not return a file with `⟨filename⟩` in top-level directory. +For example, with a `README.md` file present in the directory, the following query finds it: + +```sql +SELECT * FROM glob('README.md'); +``` + +
+ +| file | +|-----------| +| README.md | + +However, the following query returns an empty result: + +```sql +SELECT * FROM glob('**/README.md'); +``` + +Meanwhile, the globbing of Bash, Zsh, etc. finds the file using the same syntax: + +```bash +ls **/README.md +``` + +```text +README.md +``` + +## Regular Expressions + +DuckDB's regex support is documented on the [Regular Expressions page]({% link docs/archive/1.0/sql/functions/regular_expressions.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/regular_expressions.md b/docs/archive/1.0/sql/functions/regular_expressions.md new file mode 100644 index 00000000000..16949f97812 --- /dev/null +++ b/docs/archive/1.0/sql/functions/regular_expressions.md @@ -0,0 +1,204 @@ +--- +layout: docu +railroad: expressions/like.js +title: Regular Expressions +--- + + + +DuckDB offers [pattern matching operators]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}) +([`LIKE`]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#like), +[`SIMILAR TO`]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#similar-to), +[`GLOB`]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}#glob)), +as well as support for regular expressions via functions. + +## Regular Expression Syntax + +DuckDB uses the [RE2 library](https://github.com/google/re2) as its regular expression engine. For the regular expression syntax, see the [RE2 docs](https://github.com/google/re2/wiki/Syntax). + +## Functions + +All functions accept an optional set of [options](#options-for-regular-expression-functions). + +| Name | Description | +|:--|:-------| +| [`regexp_extract(string, pattern[, group = 0][, options])`](#regexp_extractstring-pattern-group--0-options) | If `string` contains the regexp `pattern`, returns the capturing group specified by optional parameter `group`. The `group` must be a constant value. If no `group` is given, it defaults to 0. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| [`regexp_extract(string, pattern, name_list[, options])`](#regexp_extractstring-pattern-name_list-options) | If `string` contains the regexp `pattern`, returns the capturing groups as a struct with corresponding names from `name_list`. | +| [`regexp_extract_all(string, regex[, group = 0][, options])`](#regexp_extract_allstring-regex-group--0-options) | Split the `string` along the `regex` and extract all occurrences of `group`. | +| [`regexp_full_match(string, regex[, options])`](#regexp_full_matchstring-regex-options) | Returns `true` if the entire `string` matches the `regex`. | +| [`regexp_matches(string, pattern[, options])`](#regexp_matchesstring-pattern-options) | Returns `true` if `string` contains the regexp `pattern`, `false` otherwise. | +| [`regexp_replace(string, pattern, replacement[, options])`](#regexp_replacestring-pattern-replacement-options) | If `string` contains the regexp `pattern`, replaces the matching part with `replacement`. | +| [`regexp_split_to_array(string, regex[, options])`](#regexp_split_to_arraystring-regex-options) | Alias of `string_split_regex`. Splits the `string` along the `regex`. | +| [`regexp_split_to_table(string, regex[, options])`](#regexp_split_to_tablestring-regex-options) | Splits the `string` along the `regex` and returns a row for each part. | + +#### `regexp_extract(string, pattern[, group = 0][, options])` + +
+ +| **Description** | If `string` contains the regexp `pattern`, returns the capturing group specified by optional parameter `group`. The `group` must be a constant value. If no `group` is given, it defaults to 0. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| **Example** | `regexp_extract('abc', '([a-z])(b)', 1)` | +| **Result** | `a` | + +#### `regexp_extract(string, pattern, name_list[, options])` + +
+ +| **Description** | If `string` contains the regexp `pattern`, returns the capturing groups as a struct with corresponding names from `name_list`. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| **Example** | `regexp_extract('2023-04-15', '(\d+)-(\d+)-(\d+)', ['y', 'm', 'd'])` | +| **Result** | `{'y':'2023', 'm':'04', 'd':'15'}` | + +#### `regexp_extract_all(string, regex[, group = 0][, options])` + +
+ +| **Description** | Split the `string` along the `regex` and extract all occurrences of `group`. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| **Example** | `regexp_extract_all('hello_world', '([a-z ]+)_?', 1)` | +| **Result** | `[hello, world]` | + +#### `regexp_full_match(string, regex[, options])` + +
+ +| **Description** | Returns `true` if the entire `string` matches the `regex`. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| **Example** | `regexp_full_match('anabanana', '(an)*')` | +| **Result** | `false` | + +#### `regexp_matches(string, pattern[, options])` + +
+ +| **Description** | Returns `true` if `string` contains the regexp `pattern`, `false` otherwise. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| **Example** | `regexp_matches('anabanana', '(an)*')` | +| **Result** | `true` | + +#### `regexp_replace(string, pattern, replacement[, options])` + +
+ +| **Description** | If `string` contains the regexp `pattern`, replaces the matching part with `replacement`. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| **Example** | `regexp_replace('hello', '[lo]', '-')` | +| **Result** | `he-lo` | + +#### `regexp_split_to_array(string, regex[, options])` + +
+ +| **Description** | Alias of `string_split_regex`. Splits the `string` along the `regex`. A set of optional [`options`](#options-for-regular-expression-functions) can be set. | +| **Example** | `regexp_split_to_array('hello world; 42', ';? ')` | +| **Result** | `['hello', 'world', '42']` | + +#### `regexp_split_to_table(string, regex[, options])` + +
+
+| **Description** | Splits the `string` along the `regex` and returns a row for each part. A set of optional [`options`](#options-for-regular-expression-functions) can be set. |
+| **Example** | `regexp_split_to_table('hello world; 42', ';? ')` |
+| **Result** | Three rows: `'hello'`, `'world'`, `'42'` |
+
+The `regexp_matches` function is similar to the `SIMILAR TO` operator; however, it does not require the entire string to match. Instead, `regexp_matches` returns `true` if the string merely contains the pattern (unless the special tokens `^` and `$` are used to anchor the regular expression to the start and end of the string). Below are some examples:
+
+```sql
+SELECT regexp_matches('abc', 'abc'); -- true
+SELECT regexp_matches('abc', '^abc$'); -- true
+SELECT regexp_matches('abc', 'a'); -- true
+SELECT regexp_matches('abc', '^a$'); -- false
+SELECT regexp_matches('abc', '.*(b|d).*'); -- true
+SELECT regexp_matches('abc', '(b|c).*'); -- true
+SELECT regexp_matches('abc', '^(b|c).*'); -- false
+SELECT regexp_matches('abc', '(?i)A'); -- true
+SELECT regexp_matches('abc', 'A', 'i'); -- true
+```
+
+## Options for Regular Expression Functions
+
+The regex functions support the following `options`.
+
+ +| Option | Description | +|:---|:---| +| `'c'` | case-sensitive matching | +| `'i'` | case-insensitive matching | +| `'l'` | match literals instead of regular expression tokens | +| `'m'`, `'n'`, `'p'` | newline sensitive matching | +| `'g'` | global replace, only available for `regexp_replace` | +| `'s'` | non-newline sensitive matching | + +For example: + +```sql +SELECT regexp_matches('abcd', 'ABC', 'c'); -- false +SELECT regexp_matches('abcd', 'ABC', 'i'); -- true +SELECT regexp_matches('ab^/$cd', '^/$', 'l'); -- true +SELECT regexp_matches(E'hello\nworld', 'hello.world', 'p'); -- false +SELECT regexp_matches(E'hello\nworld', 'hello.world', 's'); -- true +``` + +### Using `regexp_matches` + +The `regexp_matches` operator will be optimized to the `LIKE` operator when possible. To achieve best performance, the `'c'` option (case-sensitive matching) should be passed if applicable. Note that by default the [`RE2` library](#regular-expression-syntax) doesn't match the `.` character to newline. + +
+ +| Original | Optimized equivalent | +|:---|:---| +| `regexp_matches('hello world', '^hello', 'c')` | `prefix('hello world', 'hello')` | +| `regexp_matches('hello world', 'world$', 'c')` | `suffix('hello world', 'world')` | +| `regexp_matches('hello world', 'hello.world', 'c')` | `LIKE 'hello_world'` | +| `regexp_matches('hello world', 'he.*rld', 'c')` | `LIKE '%he%rld'` | + +### Using `regexp_replace` + +The `regexp_replace` function can be used to replace the part of a string that matches the regexp pattern with a replacement string. The notation `\d` (where `d` is a number indicating the group) can be used to refer to groups captured in the regular expression in the replacement string. Note that by default, `regexp_replace` only replaces the first occurrence of the regular expression. To replace all occurrences, use the global replace (`g`) flag. + +Some examples for using `regexp_replace`: + +```sql +SELECT regexp_replace('abc', '(b|c)', 'X'); -- aXc +SELECT regexp_replace('abc', '(b|c)', 'X', 'g'); -- aXX +SELECT regexp_replace('abc', '(b|c)', '\1\1\1\1'); -- abbbbc +SELECT regexp_replace('abc', '(.*)c', '\1e'); -- abe +SELECT regexp_replace('abc', '(a)(b)', '\2\1'); -- bac +``` + +### Using `regexp_extract` + +The `regexp_extract` function is used to extract a part of a string that matches the regexp pattern. +A specific capturing group within the pattern can be extracted using the `group` parameter. If `group` is not specified, it defaults to 0, extracting the first match with the whole pattern. + +```sql +SELECT regexp_extract('abc', '.b.'); -- abc +SELECT regexp_extract('abc', '.b.', 0); -- abc +SELECT regexp_extract('abc', '.b.', 1); -- (empty) +SELECT regexp_extract('abc', '([a-z])(b)', 1); -- a +SELECT regexp_extract('abc', '([a-z])(b)', 2); -- b +``` + +The `regexp_extract` function also supports a `name_list` argument, which is a `LIST` of strings. Using `name_list`, the `regexp_extract` will return the corresponding capture groups as fields of a `STRUCT`: + +```sql +SELECT regexp_extract('2023-04-15', '(\d+)-(\d+)-(\d+)', ['y', 'm', 'd']); +``` + +```text +{'y': 2023, 'm': 04, 'd': 15} +``` + +```sql +SELECT regexp_extract('2023-04-15 07:59:56', '^(\d+)-(\d+)-(\d+) (\d+):(\d+):(\d+)', ['y', 'm', 'd']); +``` + +```text +{'y': 2023, 'm': 04, 'd': 15} +``` + +```sql +SELECT regexp_extract('duckdb_0_7_1', '^(\w+)_(\d+)_(\d+)', ['tool', 'major', 'minor', 'fix']); +``` + +```console +Binder Error: Not enough group names in regexp_extract +``` + +If the number of column names is less than the number of capture groups, then only the first groups are returned. +If the number of column names is greater, then an error is generated. \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/struct.md b/docs/archive/1.0/sql/functions/struct.md new file mode 100644 index 00000000000..1dbe442f037 --- /dev/null +++ b/docs/archive/1.0/sql/functions/struct.md @@ -0,0 +1,81 @@ +--- +layout: docu +title: Struct Functions +--- + + + +| Name | Description | +|:--|:-------| +| [`struct.entry`](#structentry) | Dot notation that serves as an alias for `struct_extract` from named `STRUCT`s. | +| [`struct[entry]`](#structentry) | Bracket notation that serves as an alias for `struct_extract` from named `STRUCT`s. | +| [`struct[idx]`](#structidx) | Bracket notation that serves as an alias for `struct_extract` from unnamed `STRUCT`s (tuples), using an index (1-based). | +| [`row(any, ...)`](#rowany-) | Create an unnamed `STRUCT` (tuple) containing the argument values. 
| +| [`struct_extract(struct, 'entry')`](#struct_extractstruct-entry) | Extract the named entry from the `STRUCT`. | +| [`struct_extract(struct, idx)`](#struct_extractstruct-idx) | Extract the entry from an unnamed `STRUCT` (tuple) using an index (1-based). | +| [`struct_insert(struct, name := any, ...)`](#struct_insertstruct-name--any-) | Add field(s)/value(s) to an existing `STRUCT` with the argument values. The entry name(s) will be the bound variable name(s). | +| [`struct_pack(name := any, ...)`](#struct_packname--any-) | Create a `STRUCT` containing the argument values. The entry name will be the bound variable name. | + +#### `struct.entry` + +
+ +| **Description** | Dot notation that serves as an alias for `struct_extract` from named `STRUCT`s. | +| **Example** | `({'i': 3, 's': 'string'}).i` | +| **Result** | `3` | + +#### `struct[entry]` + +
+ +| **Description** | Bracket notation that serves as an alias for `struct_extract` from named `STRUCT`s. | +| **Example** | `({'i': 3, 's': 'string'})['i']` | +| **Result** | `3` | + +#### `struct[idx]` + +
+ +| **Description** | Bracket notation that serves as an alias for `struct_extract` from unnamed `STRUCT`s (tuples), using an index (1-based). | +| **Example** | `(row(42, 84))[1]` | +| **Result** | `42` | + +#### `row(any, ...)` + +
+ +| **Description** | Create an unnamed `STRUCT` (tuple) containing the argument values. | +| **Example** | `row(i, i % 4, i / 4)` | +| **Result** | `(10, 2, 2.5)` | + +#### `struct_extract(struct, 'entry')` + +
+ +| **Description** | Extract the named entry from the `STRUCT`. | +| **Example** | `struct_extract({'i': 3, 'v2': 3, 'v3': 0}, 'i')` | +| **Result** | `3` | + +#### `struct_extract(struct, idx)` + +
+ +| **Description** | Extract the entry from an unnamed `STRUCT` (tuple) using an index (1-based). | +| **Example** | `struct_extract(row(42, 84), 1)` | +| **Result** | `42` | + +#### `struct_insert(struct, name := any, ...)` + +
+ +| **Description** | Add field(s)/value(s) to an existing `STRUCT` with the argument values. The entry name(s) will be the bound variable name(s). | +| **Example** | `struct_insert({'a': 1}, b := 2)` | +| **Result** | `{'a': 1, 'b': 2}` | + +#### `struct_pack(name := any, ...)` + +
+ +| **Description** | Create a `STRUCT` containing the argument values. The entry name will be the bound variable name. | +| **Example** | `struct_pack(i := 4, s := 'string')` | +| **Result** | `{'i': 4, 's': string}` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/time.md b/docs/archive/1.0/sql/functions/time.md new file mode 100644 index 00000000000..c71d671ae12 --- /dev/null +++ b/docs/archive/1.0/sql/functions/time.md @@ -0,0 +1,120 @@ +--- +layout: docu +title: Time Functions +--- + + + +This section describes functions and operators for examining and manipulating [`TIME` values]({% link docs/archive/1.0/sql/data_types/time.md %}). + +## Time Operators + +The table below shows the available mathematical operators for `TIME` types. + +
+
+| Operator | Description | Example | Result |
+|:-|:---|:----|:--|
+| `+` | addition of an `INTERVAL` | `TIME '01:02:03' + INTERVAL 5 HOUR` | `06:02:03` |
+| `-` | subtraction of an `INTERVAL` | `TIME '06:02:03' - INTERVAL 5 HOUR` | `01:02:03` |
+
+## Time Functions
+
+The table below shows the available scalar functions for `TIME` types.
+
+| Name | Description |
+|:--|:-------|
+| [`current_time`](#current_time) | Current time (start of current transaction). |
+| [`date_diff(part, starttime, endtime)`](#date_diffpart-starttime-endtime) | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the times. |
+| [`date_part(part, time)`](#date_partpart-time) | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). |
+| [`date_sub(part, starttime, endtime)`](#date_subpart-starttime-endtime) | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the times. |
+| [`datediff(part, starttime, endtime)`](#datediffpart-starttime-endtime) | Alias of `date_diff`. The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the times. |
+| [`datepart(part, time)`](#datepartpart-time) | Alias of `date_part`. Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). |
+| [`datesub(part, starttime, endtime)`](#datesubpart-starttime-endtime) | Alias of `date_sub`. The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the times. |
+| [`extract(part FROM time)`](#extractpart-from-time) | Get subfield from a time. |
+| [`get_current_time()`](#get_current_time) | Current time (start of current transaction). |
+| [`make_time(bigint, bigint, double)`](#make_timebigint-bigint-double) | The time for the given parts. |
+
+The only [date parts]({% link docs/archive/1.0/sql/functions/datepart.md %}) that are defined for times are `epoch`, `hours`, `minutes`, `seconds`, `milliseconds`, and `microseconds`.
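+
+For example, the `epoch` part of a `TIME` value is its number of seconds since midnight, while requesting a part outside this list raises an error (a brief illustration; results are shown as comments and the exact error message may vary):
+
+```sql
+SELECT date_part('epoch', TIME '14:21:13');  -- 51673 (seconds since 00:00:00)
+SELECT date_part('year', TIME '14:21:13');   -- error: "year" is not defined for TIME values
+```
+
+#### `current_time`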
+ +| **Description** | Current time (start of current transaction). Note that parentheses should be omitted. | +| **Example** | `current_time` | +| **Result** | `10:31:58.578` | +| **Alias** | `get_current_time()` | + +#### `date_diff(part, starttime, endtime)` + +
+ +| **Description** | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the times. | +| **Example** | `date_diff('hour', TIME '01:02:03', TIME '06:01:03')` | +| **Result** | `5` | + +#### `date_part(part, time)` + +
+ +| **Description** | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| **Example** | `date_part('minute', TIME '14:21:13')` | +| **Result** | `21` | + +#### `date_sub(part, starttime, endtime)` + +
+ +| **Description** | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the times. | +| **Example** | `date_sub('hour', TIME '01:02:03', TIME '06:01:03')` | +| **Result** | `4` | + +#### `datediff(part, starttime, endtime)` + +
+ +| **Description** | Alias of `date_diff`. The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the times. | +| **Example** | `datediff('hour', TIME '01:02:03', TIME '06:01:03')` | +| **Result** | `5` | + +#### `datepart(part, time)` + +
+ +| **Description** | Alias of date_part. Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| **Example** | `datepart('minute', TIME '14:21:13')` | +| **Result** | `21` | + +#### `datesub(part, starttime, endtime)` + +
+ +| **Description** | Alias of date_sub. The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the times. | +| **Example** | `datesub('hour', TIME '01:02:03', TIME '06:01:03')` | +| **Result** | `4` | + +#### `extract(part FROM time)` + +
+ +| **Description** | Get subfield from a time. | +| **Example** | `extract('hour' FROM TIME '14:21:13')` | +| **Result** | `14` | + +#### `get_current_time()` + +
+ +| **Description** | Current time (start of current transaction). | +| **Example** | `get_current_time()` | +| **Result** | `10:31:58.578` | +| **Alias** | `current_time` | + +#### `make_time(bigint, bigint, double)` + +
+ +| **Description** | The time for the given parts. | +| **Example** | `make_time(13, 34, 27.123456)` | +| **Result** | `13:34:27.123456` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/timestamp.md b/docs/archive/1.0/sql/functions/timestamp.md new file mode 100644 index 00000000000..0f44127f0f4 --- /dev/null +++ b/docs/archive/1.0/sql/functions/timestamp.md @@ -0,0 +1,400 @@ +--- +layout: docu +title: Timestamp Functions +--- + + + +This section describes functions and operators for examining and manipulating [`TIMESTAMP` values]({% link docs/archive/1.0/sql/data_types/timestamp.md %}). + +## Timestamp Operators + +The table below shows the available mathematical operators for `TIMESTAMP` types. + +| Operator | Description | Example | Result | +|:-|:--|:----|:--| +| `+` | addition of an `INTERVAL` | `TIMESTAMP '1992-03-22 01:02:03' + INTERVAL 5 DAY` | `1992-03-27 01:02:03` | +| `-` | subtraction of `TIMESTAMP`s | `TIMESTAMP '1992-03-27' - TIMESTAMP '1992-03-22'` | `5 days` | +| `-` | subtraction of an `INTERVAL` | `TIMESTAMP '1992-03-27 01:02:03' - INTERVAL 5 DAY` | `1992-03-22 01:02:03` | + +Adding to or subtracting from [infinite values]({% link docs/archive/1.0/sql/data_types/timestamp.md %}#special-values) produces the same infinite value. + +## Scalar Timestamp Functions + +The table below shows the available scalar functions for `TIMESTAMP` values. + +| Name | Description | +|:--|:-------| +| [`age(timestamp, timestamp)`](#agetimestamp-timestamp) | Subtract arguments, resulting in the time difference between the two timestamps. | +| [`age(timestamp)`](#agetimestamp) | Subtract from current_date. | +| [`century(timestamp)`](#centurytimestamp) | Extracts the century of a timestamp. | +| [`current_timestamp`](#current_timestamp) | Returns the current timestamp (at the start of the transaction). | +| [`date_diff(part, startdate, enddate)`](#date_diffpart-startdate-enddate) | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| [`date_part([part, ...], timestamp)`](#date_partpart--timestamp) | Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| [`date_part(part, timestamp)`](#date_partpart-timestamp) | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| [`date_sub(part, startdate, enddate)`](#date_subpart-startdate-enddate) | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| [`date_trunc(part, timestamp)`](#date_truncpart-timestamp) | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| [`datediff(part, startdate, enddate)`](#datediffpart-startdate-enddate) | Alias of `date_diff`. The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| [`datepart([part, ...], timestamp)`](#datepartpart--timestamp) | Alias of `date_part`. Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| [`datepart(part, timestamp)`](#datepartpart-timestamp) | Alias of `date_part`. Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| [`datesub(part, startdate, enddate)`](#datesubpart-startdate-enddate) | Alias of `date_sub`. 
The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| [`datetrunc(part, timestamp)`](#datetruncpart-timestamp) | Alias of `date_trunc`. Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| [`dayname(timestamp)`](#daynametimestamp) | The (English) name of the weekday. | +| [`epoch_ms(ms)`](#epoch_msms) | Converts ms since epoch to a timestamp. | +| [`epoch_ms(timestamp)`](#epoch_mstimestamp) | Converts a timestamp to milliseconds since the epoch. | +| [`epoch_ms(timestamp)`](#epoch_mstimestamp) | Return the total number of milliseconds since the epoch. | +| [`epoch_ns(timestamp)`](#epoch_nstimestamp) | Return the total number of nanoseconds since the epoch. | +| [`epoch_us(timestamp)`](#epoch_ustimestamp) | Return the total number of microseconds since the epoch. | +| [`epoch(timestamp)`](#epochtimestamp) | Converts a timestamp to seconds since the epoch. | +| [`extract(field FROM timestamp)`](#extractfield-from-timestamp) | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) from a timestamp. | +| [`greatest(timestamp, timestamp)`](#greatesttimestamp-timestamp) | The later of two timestamps. | +| [`isfinite(timestamp)`](#isfinitetimestamp) | Returns true if the timestamp is finite, false otherwise. | +| [`isinf(timestamp)`](#isinftimestamp) | Returns true if the timestamp is infinite, false otherwise. | +| [`last_day(timestamp)`](#last_daytimestamp) | The last day of the month. | +| [`least(timestamp, timestamp)`](#leasttimestamp-timestamp) | The earlier of two timestamps. | +| [`make_timestamp(bigint, bigint, bigint, bigint, bigint, double)`](#make_timestampbigint-bigint-bigint-bigint-bigint-double) | The timestamp for the given parts. | +| [`make_timestamp(microseconds)`](#make_timestampmicroseconds) | The timestamp for the given number of µs since the epoch. | +| [`monthname(timestamp)`](#monthnametimestamp) | The (English) name of the month. | +| [`strftime(timestamp, format)`](#strftimetimestamp-format) | Converts timestamp to string according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | +| [`strptime(text, format-list)`](#strptimetext-format-list) | Converts the string `text` to timestamp applying the [format strings]({% link docs/archive/1.0/sql/functions/dateformat.md %}) in the list until one succeeds. Throws an error on failure. To return `NULL` on failure, use [`try_strptime`](#try_strptimetext-format-list). | +| [`strptime(text, format)`](#strptimetext-format) | Converts the string `text` to timestamp according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). Throws an error on failure. To return `NULL` on failure, use [`try_strptime`](#try_strptimetext-format). | +| [`time_bucket(bucket_width, timestamp[, offset])`](#time_bucketbucket_width-timestamp-offset) | Truncate `timestamp` by the specified interval `bucket_width`. Buckets are offset by `offset` interval. | +| [`time_bucket(bucket_width, timestamp[, origin])`](#time_bucketbucket_width-timestamp-origin) | Truncate `timestamp` by the specified interval `bucket_width`. Buckets are aligned relative to `origin` timestamp. `origin` defaults to 2000-01-03 00:00:00 for buckets that don't include a month or year interval, and to 2000-01-01 00:00:00 for month and year buckets. | +| [`to_timestamp(double)`](#to_timestampdouble) | Converts seconds since the epoch to a timestamp with time zone. 
|
+| [`try_strptime(text, format-list)`](#try_strptimetext-format-list) | Converts the string `text` to timestamp applying the [format strings]({% link docs/archive/1.0/sql/functions/dateformat.md %}) in the list until one succeeds. Returns `NULL` on failure. |
+| [`try_strptime(text, format)`](#try_strptimetext-format) | Converts the string `text` to timestamp according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). Returns `NULL` on failure. |
+
+There are also dedicated extraction functions to get the [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}).
+
+Functions applied to infinite dates will either return the same infinite dates
+(e.g., `greatest`) or `NULL` (e.g., `date_part`) depending on what “makes sense”.
+In general, if the function needs to examine the parts of the infinite date, the result will be `NULL`.
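+
+For example (results shown as comments):
+
+```sql
+SELECT greatest(TIMESTAMP 'infinity', TIMESTAMP '1992-03-22');  -- infinity
+SELECT date_part('year', TIMESTAMP 'infinity');                 -- NULL
+```
+
+#### `age(timestamp, timestamp)`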
+ +| **Description** | Subtract arguments, resulting in the time difference between the two timestamps. | +| **Example** | `age(TIMESTAMP '2001-04-10', TIMESTAMP '1992-09-20')` | +| **Result** | `8 years 6 months 20 days` | + +#### `age(timestamp)` + +
+ +| **Description** | Subtract from current_date. | +| **Example** | `age(TIMESTAMP '1992-09-20')` | +| **Result** | `29 years 1 month 27 days 12:39:00.844` | + +#### `century(timestamp)` + +
+ +| **Description** | Extracts the century of a timestamp. | +| **Example** | `century(TIMESTAMP '1992-03-22')` | +| **Result** | `20` | + +#### `current_timestamp` + +
+ +| **Description** | Returns the current timestamp with time zone (at the start of the transaction). | +| **Example** | `current_timestamp` | +| **Result** | `2024-04-16T09:14:36.098Z` | + +#### `date_diff(part, startdate, enddate)` + +
+ +| **Description** | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| **Example** | `date_diff('hour', TIMESTAMP '1992-09-30 23:59:59', TIMESTAMP '1992-10-01 01:58:00')` | +| **Result** | `2` | + +#### `date_part([part, ...], timestamp)` + +
+ +| **Description** | Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| **Example** | `date_part(['year', 'month', 'day'], TIMESTAMP '1992-09-20 20:38:40')` | +| **Result** | `{year: 1992, month: 9, day: 20}` | + +#### `date_part(part, timestamp)` + +
+ +| **Description** | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| **Example** | `date_part('minute', TIMESTAMP '1992-09-20 20:38:40')` | +| **Result** | `38` | + +#### `date_sub(part, startdate, enddate)` + +
+ +| **Description** | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| **Example** | `date_sub('hour', TIMESTAMP '1992-09-30 23:59:59', TIMESTAMP '1992-10-01 01:58:00')` | +| **Result** | `1` | + +#### `date_trunc(part, timestamp)` + +
+ +| **Description** | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| **Example** | `date_trunc('hour', TIMESTAMP '1992-09-20 20:38:40')` | +| **Result** | `1992-09-20 20:00:00` | + +#### `datediff(part, startdate, enddate)` + +
+ +| **Description** | Alias of `date_diff`. The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| **Example** | `datediff('hour', TIMESTAMP '1992-09-30 23:59:59', TIMESTAMP '1992-10-01 01:58:00')` | +| **Result** | `2` | + +#### `datepart([part, ...], timestamp)` + +
+ +| **Description** | Alias of `date_part`. Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| **Example** | `datepart(['year', 'month', 'day'], TIMESTAMP '1992-09-20 20:38:40')` | +| **Result** | `{year: 1992, month: 9, day: 20}` | + +#### `datepart(part, timestamp)` + +
+ +| **Description** | Alias of `date_part`. Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to `extract`). | +| **Example** | `datepart('minute', TIMESTAMP '1992-09-20 20:38:40')` | +| **Result** | `38` | + +#### `datesub(part, startdate, enddate)` + +
+ +| **Description** | Alias of `date_sub`. The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| **Example** | `datesub('hour', TIMESTAMP '1992-09-30 23:59:59', TIMESTAMP '1992-10-01 01:58:00')` | +| **Result** | `1` | + +#### `datetrunc(part, timestamp)` + +
+ +| **Description** | Alias of `date_trunc`. Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| **Example** | `datetrunc('hour', TIMESTAMP '1992-09-20 20:38:40')` | +| **Result** | `1992-09-20 20:00:00` | + +#### `dayname(timestamp)` + +
+ +| **Description** | The (English) name of the weekday. | +| **Example** | `dayname(TIMESTAMP '1992-03-22')` | +| **Result** | `Sunday` | + +#### `epoch_ms(ms)` + +
+ +| **Description** | Converts ms since epoch to a timestamp. | +| **Example** | `epoch_ms(701222400000)` | +| **Result** | `1992-03-22 00:00:00` | + +#### `epoch_ms(timestamp)` + +
+ +| **Description** | Converts a timestamp to milliseconds since the epoch. | +| **Example** | `epoch_ms('2022-11-07 08:43:04.123456'::TIMESTAMP);` | +| **Result** | `1667810584123` | + +#### `epoch_ms(timestamp)` + +
+ +| **Description** | Return the total number of milliseconds since the epoch. | +| **Example** | `epoch_ms(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `1627991984123` | + +#### `epoch_ns(timestamp)` + +
+ +| **Description** | Return the total number of nanoseconds since the epoch. | +| **Example** | `epoch_ns(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `1627991984123456000` | + +#### `epoch_us(timestamp)` + +
+ +| **Description** | Return the total number of microseconds since the epoch. | +| **Example** | `epoch_us(timestamp '2021-08-03 11:59:44.123456')` | +| **Result** | `1627991984123456` | + +#### `epoch(timestamp)` + +
+ +| **Description** | Converts a timestamp to seconds since the epoch. | +| **Example** | `epoch('2022-11-07 08:43:04'::TIMESTAMP);` | +| **Result** | `1667810584` | + +#### `extract(field FROM timestamp)` + +
+ +| **Description** | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) from a timestamp. | +| **Example** | `extract('hour' FROM TIMESTAMP '1992-09-20 20:38:48')` | +| **Result** | `20` | + +#### `greatest(timestamp, timestamp)` + +
+ +| **Description** | The later of two timestamps. | +| **Example** | `greatest(TIMESTAMP '1992-09-20 20:38:48', TIMESTAMP '1992-03-22 01:02:03.1234')` | +| **Result** | `1992-09-20 20:38:48` | + +#### `isfinite(timestamp)` + +
+ +| **Description** | Returns true if the timestamp is finite, false otherwise. | +| **Example** | `isfinite(TIMESTAMP '1992-03-07')` | +| **Result** | `true` | + +#### `isinf(timestamp)` + +
+ +| **Description** | Returns true if the timestamp is infinite, false otherwise. | +| **Example** | `isinf(TIMESTAMP '-infinity')` | +| **Result** | `true` | + +#### `last_day(timestamp)` + +
+ +| **Description** | The last day of the month. | +| **Example** | `last_day(TIMESTAMP '1992-03-22 01:02:03.1234')` | +| **Result** | `1992-03-31` | + +#### `least(timestamp, timestamp)` + +
+ +| **Description** | The earlier of two timestamps. | +| **Example** | `least(TIMESTAMP '1992-09-20 20:38:48', TIMESTAMP '1992-03-22 01:02:03.1234')` | +| **Result** | `1992-03-22 01:02:03.1234` | + +#### `make_timestamp(bigint, bigint, bigint, bigint, bigint, double)` + +
+ +| **Description** | The timestamp for the given parts. | +| **Example** | `make_timestamp(1992, 9, 20, 13, 34, 27.123456)` | +| **Result** | `1992-09-20 13:34:27.123456` | + +#### `make_timestamp(microseconds)` + +
+ +| **Description** | The timestamp for the given number of µs since the epoch. | +| **Example** | `make_timestamp(1667810584123456)` | +| **Result** | `2022-11-07 08:43:04.123456` | + +#### `monthname(timestamp)` + +
+ +| **Description** | The (English) name of the month. | +| **Example** | `monthname(TIMESTAMP '1992-09-20')` | +| **Result** | `September` | + +#### `strftime(timestamp, format)` + +
+ +| **Description** | Converts timestamp to string according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | +| **Example** | `strftime(timestamp '1992-01-01 20:38:40', '%a, %-d %B %Y - %I:%M:%S %p')` | +| **Result** | `Wed, 1 January 1992 - 08:38:40 PM` | + +#### `strptime(text, format-list)` + +
+ +| **Description** | Converts the string `text` to timestamp applying the [format strings]({% link docs/archive/1.0/sql/functions/dateformat.md %}) in the list until one succeeds. Throws an error on failure. To return `NULL` on failure, use [`try_strptime`](#try_strptimetext-format-list). | +| **Example** | `strptime('4/15/2023 10:56:00', ['%d/%m/%Y %H:%M:%S', '%m/%d/%Y %H:%M:%S'])` | +| **Result** | `2023-04-15 10:56:00` | + +#### `strptime(text, format)` + +
+ +| **Description** | Converts the string `text` to timestamp according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). Throws an error on failure. To return `NULL` on failure, use [`try_strptime`](#try_strptimetext-format). | +| **Example** | `strptime('Wed, 1 January 1992 - 08:38:40 PM', '%a, %-d %B %Y - %I:%M:%S %p')` | +| **Result** | `1992-01-01 20:38:40` | + +#### `time_bucket(bucket_width, timestamp[, offset])` + +
+ +| **Description** | Truncate `timestamp` by the specified interval `bucket_width`. Buckets are offset by `offset` interval. | +| **Example** | `time_bucket(INTERVAL '10 minutes', TIMESTAMP '1992-04-20 15:26:00-07', INTERVAL '5 minutes')` | +| **Result** | `1992-04-20 15:25:00` | + +#### `time_bucket(bucket_width, timestamp[, origin])` + +
+ +| **Description** | Truncate `timestamp` by the specified interval `bucket_width`. Buckets are aligned relative to `origin` timestamp. `origin` defaults to 2000-01-03 00:00:00 for buckets that don't include a month or year interval, and to 2000-01-01 00:00:00 for month and year buckets. | +| **Example** | `time_bucket(INTERVAL '2 weeks', TIMESTAMP '1992-04-20 15:26:00', TIMESTAMP '1992-04-01 00:00:00')` | +| **Result** | `1992-04-15 00:00:00` | + +#### `to_timestamp(double)` + +
+ +| **Description** | Converts seconds since the epoch to a timestamp with time zone. | +| **Example** | `to_timestamp(1284352323.5)` | +| **Result** | `2010-09-13 04:32:03.5+00` | + +#### `try_strptime(text, format-list)` + +
+ +| **Description** | Converts the string `text` to timestamp applying the [format strings]({% link docs/archive/1.0/sql/functions/dateformat.md %}) in the list until one succeeds. Returns `NULL` on failure. | +| **Example** | `try_strptime('4/15/2023 10:56:00', ['%d/%m/%Y %H:%M:%S', '%m/%d/%Y %H:%M:%S'])` | +| **Result** | `2023-04-15 10:56:00` | + +#### `try_strptime(text, format)` + +
+ +| **Description** | Converts the string `text` to timestamp according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). Returns `NULL` on failure. | +| **Example** | `try_strptime('Wed, 1 January 1992 - 08:38:40 PM', '%a, %-d %B %Y - %I:%M:%S %p')` | +| **Result** | `1992-01-01 20:38:40` | + +## Timestamp Table Functions + +The table below shows the available table functions for `TIMESTAMP` types. + +| Name | Description | +|:--|:-------| +| [`generate_series(timestamp, timestamp, interval)`](#generate_seriestimestamp-timestamp-interval) | Generate a table of timestamps in the closed range, stepping by the interval. | +| [`range(timestamp, timestamp, interval)`](#rangetimestamp-timestamp-interval) | Generate a table of timestamps in the half open range, stepping by the interval. | + +> Infinite values are not allowed as table function bounds. + +#### `generate_series(timestamp, timestamp, interval)` + +
+ +| **Description** | Generate a table of timestamps in the closed range, stepping by the interval. | +| **Example** | `generate_series(TIMESTAMP '2001-04-10', TIMESTAMP '2001-04-11', INTERVAL 30 MINUTE)` | + +#### `range(timestamp, timestamp, interval)` + +
+ +| **Description** | Generate a table of timestamps in the half open range, stepping by the interval. | +| **Example** | `range(TIMESTAMP '2001-04-10', TIMESTAMP '2001-04-11', INTERVAL 30 MINUTE)` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/timestamptz.md b/docs/archive/1.0/sql/functions/timestamptz.md new file mode 100644 index 00000000000..1f5b9d00460 --- /dev/null +++ b/docs/archive/1.0/sql/functions/timestamptz.md @@ -0,0 +1,495 @@ +--- +layout: docu +title: Timestamp with Time Zone Functions +--- + + + +This section describes functions and operators for examining and manipulating [`TIMESTAMP WITH TIME ZONE` +(or `TIMESTAMPTZ`) values]({% link docs/archive/1.0/sql/data_types/timestamp.md %}). + +Despite the name, these values do not store a time zone – just an instant like `TIMESTAMP`. +Instead, they request that the instant be binned and formatted using the current time zone. + +Time zone support is not built in but can be provided by an extension, +such as the [ICU extension]({% link docs/archive/1.0/extensions/icu.md %}) that ships with DuckDB. + +In the examples below, the current time zone is presumed to be `America/Los_Angeles` +using the Gregorian calendar. + +## Built-In Timestamp with Time Zone Functions + +The table below shows the available scalar functions for `TIMESTAMPTZ` values. +Since these functions do not involve binning or display, +they are always available. + +| Name | Description | +|:--|:-------| +| [`current_timestamp`](#current_timestamp) | Current date and time (start of current transaction). | +| [`get_current_timestamp()`](#get_current_timestamp) | Current date and time (start of current transaction). | +| [`greatest(timestamptz, timestamptz)`](#greatesttimestamptz-timestamptz) | The later of two timestamps. | +| [`isfinite(timestamptz)`](#isfinitetimestamptz) | Returns true if the timestamp with time zone is finite, false otherwise. | +| [`isinf(timestamptz)`](#isinftimestamptz) | Returns true if the timestamp with time zone is infinite, false otherwise. | +| [`least(timestamptz, timestamptz)`](#leasttimestamptz-timestamptz) | The earlier of two timestamps. | +| [`now()`](#now) | Current date and time (start of current transaction). | +| [`transaction_timestamp()`](#transaction_timestamp) | Current date and time (start of current transaction). | + +#### `current_timestamp` + +
+ +| **Description** | Current date and time (start of current transaction). | +| **Example** | `current_timestamp` | +| **Result** | `2022-10-08 12:44:46.122-07` | + +#### `get_current_timestamp()` + +
+ +| **Description** | Current date and time (start of current transaction). | +| **Example** | `get_current_timestamp()` | +| **Result** | `2022-10-08 12:44:46.122-07` | + +#### `greatest(timestamptz, timestamptz)` + +
+ +| **Description** | The later of two timestamps. | +| **Example** | `greatest(TIMESTAMPTZ '1992-09-20 20:38:48', TIMESTAMPTZ '1992-03-22 01:02:03.1234')` | +| **Result** | `1992-09-20 20:38:48-07` | + +#### `isfinite(timestamptz)` + +
+ +| **Description** | Returns true if the timestamp with time zone is finite, false otherwise. | +| **Example** | `isfinite(TIMESTAMPTZ '1992-03-07')` | +| **Result** | `true` | + +#### `isinf(timestamptz)` + +
+ +| **Description** | Returns true if the timestamp with time zone is infinite, false otherwise. | +| **Example** | `isinf(TIMESTAMPTZ '-infinity')` | +| **Result** | `true` | + +#### `least(timestamptz, timestamptz)` + +
+ +| **Description** | The earlier of two timestamps. | +| **Example** | `least(TIMESTAMPTZ '1992-09-20 20:38:48', TIMESTAMPTZ '1992-03-22 01:02:03.1234')` | +| **Result** | `1992-03-22 01:02:03.1234-08` | + +#### `now()` + +
+ +| **Description** | Current date and time (start of current transaction). | +| **Example** | `now()` | +| **Result** | `2022-10-08 12:44:46.122-07` | + +#### `transaction_timestamp()` + +
+ +| **Description** | Current date and time (start of current transaction). | +| **Example** | `transaction_timestamp()` | +| **Result** | `2022-10-08 12:44:46.122-07` | + +## Timestamp with Time Zone Strings + +With no time zone extension loaded, `TIMESTAMPTZ` values will be cast to and from strings +using offset notation. +This will let you specify an instant correctly without access to time zone information. +For portability, `TIMESTAMPTZ` values will always be displayed using GMT offsets: + +```sql +SELECT '2022-10-08 13:13:34-07'::TIMESTAMPTZ; +``` + +```text +2022-10-08 20:13:34+00 +``` + +If a time zone extension such as ICU is loaded, then a time zone can be parsed from a string +and cast to a representation in the local time zone: + +```sql +SELECT '2022-10-08 13:13:34 Europe/Amsterdam'::TIMESTAMPTZ::VARCHAR; +``` + +```text +2022-10-08 04:13:34-07 -- the offset will differ based on your local time zone +``` + +## ICU Timestamp with Time Zone Operators + +The table below shows the available mathematical operators for `TIMESTAMP WITH TIME ZONE` values +provided by the ICU extension. + +| Operator | Description | Example | Result | +|:-|:--|:----|:--| +| `+` | addition of an `INTERVAL` | `TIMESTAMPTZ '1992-03-22 01:02:03' + INTERVAL 5 DAY` | `1992-03-27 01:02:03` | +| `-` | subtraction of `TIMESTAMPTZ`s | `TIMESTAMPTZ '1992-03-27' - TIMESTAMPTZ '1992-03-22'` | `5 days` | +| `-` | subtraction of an `INTERVAL` | `TIMESTAMPTZ '1992-03-27 01:02:03' - INTERVAL 5 DAY` | `1992-03-22 01:02:03` | + +Adding to or subtracting from [infinite values]({% link docs/archive/1.0/sql/data_types/timestamp.md %}#special-values) produces the same infinite value. + +## ICU Timestamp with Time Zone Functions + +The table below shows the ICU provided scalar functions for `TIMESTAMP WITH TIME ZONE` values. + +| Name | Description | +|:--|:-------| +| [`age(timestamptz, timestamptz)`](#agetimestamptz-timestamptz) | Subtract arguments, resulting in the time difference between the two timestamps. | +| [`age(timestamptz)`](#agetimestamptz) | Subtract from current_date. | +| [`date_diff(part, startdate, enddate)`](#date_diffpart-startdate-enddate) | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| [`date_part([part, ...], timestamptz)`](#date_partpart--timestamptz) | Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| [`date_part(part, timestamptz)`](#date_partpart-timestamptz) | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to *extract*). | +| [`date_sub(part, startdate, enddate)`](#date_subpart-startdate-enddate) | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| [`date_trunc(part, timestamptz)`](#date_truncpart-timestamptz) | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| [`datediff(part, startdate, enddate)`](#datediffpart-startdate-enddate) | Alias of date_diff. The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| [`datepart([part, ...], timestamptz)`](#datepartpart--timestamptz) | Alias of date_part. Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| [`datepart(part, timestamptz)`](#datepartpart-timestamptz) | Alias of date_part. 
Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to *extract*). | +| [`datesub(part, startdate, enddate)`](#datesubpart-startdate-enddate) | Alias of date_sub. The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| [`datetrunc(part, timestamptz)`](#datetruncpart-timestamptz) | Alias of date_trunc. Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| [`epoch_ms(timestamptz)`](#epoch_mstimestamptz) | Converts a timestamptz to milliseconds since the epoch. | +| [`epoch_ns(timestamptz)`](#epoch_nstimestamptz) | Converts a timestamptz to nanoseconds since the epoch. | +| [`epoch_us(timestamptz)`](#epoch_ustimestamptz) | Converts a timestamptz to microseconds since the epoch. | +| [`extract(field FROM timestamptz)`](#extractfield-from-timestamptz) | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) from a `TIMESTAMP WITH TIME ZONE`. | +| [`last_day(timestamptz)`](#last_daytimestamptz) | The last day of the month. | +| [`make_timestamptz(bigint, bigint, bigint, bigint, bigint, double, string)`](#make_timestamptzbigint-bigint-bigint-bigint-bigint-double-string) | The `TIMESTAMP WITH TIME ZONE` for the given parts and time zone. | +| [`make_timestamptz(bigint, bigint, bigint, bigint, bigint, double)`](#make_timestamptzbigint-bigint-bigint-bigint-bigint-double) | The `TIMESTAMP WITH TIME ZONE` for the given parts in the current time zone. | +| [`make_timestamptz(microseconds)`](#make_timestamptzmicroseconds) | The `TIMESTAMP WITH TIME ZONE` for the given µs since the epoch. | +| [`strftime(timestamptz, format)`](#strftimetimestamptz-format) | Converts a `TIMESTAMP WITH TIME ZONE` value to string according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | +| [`strptime(text, format)`](#strptimetext-format) | Converts string to `TIMESTAMP WITH TIME ZONE` according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}) if `%Z` is specified. | +| [`time_bucket(bucket_width, timestamptz[, offset])`](#time_bucketbucket_width-timestamptz-offset) | Truncate `timestamptz` by the specified interval `bucket_width`. Buckets are offset by `offset` interval. | +| [`time_bucket(bucket_width, timestamptz[, origin])`](#time_bucketbucket_width-timestamptz-origin) | Truncate `timestamptz` by the specified interval `bucket_width`. Buckets are aligned relative to `origin` timestamptz. `origin` defaults to 2000-01-03 00:00:00+00 for buckets that don't include a month or year interval, and to 2000-01-01 00:00:00+00 for month and year buckets. | +| [`time_bucket(bucket_width, timestamptz[, timezone])`](#time_bucketbucket_width-timestamptz-timezone) | Truncate `timestamptz` by the specified interval `bucket_width`. Bucket starts and ends are calculated using `timezone`. `timezone` is a varchar and defaults to UTC. | + +#### `age(timestamptz, timestamptz)` + +
+ +| **Description** | Subtract arguments, resulting in the time difference between the two timestamps. | +| **Example** | `age(TIMESTAMPTZ '2001-04-10', TIMESTAMPTZ '1992-09-20')` | +| **Result** | `8 years 6 months 20 days` | + +#### `age(timestamptz)` + +
+ +| **Description** | Subtract from current_date. | +| **Example** | `age(TIMESTAMP '1992-09-20')` | +| **Result** | `29 years 1 month 27 days 12:39:00.844` | + +#### `date_diff(part, startdate, enddate)` + +
+ +| **Description** | The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| **Example** | `date_diff('hour', TIMESTAMPTZ '1992-09-30 23:59:59', TIMESTAMPTZ '1992-10-01 01:58:00')` | +| **Result** | `2` | + +#### `date_part([part, ...], timestamptz)` + +
+ +| **Description** | Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| **Example** | `date_part(['year', 'month', 'day'], TIMESTAMPTZ '1992-09-20 20:38:40-07')` | +| **Result** | `{year: 1992, month: 9, day: 20}` | + +#### `date_part(part, timestamptz)` + +
+ +| **Description** | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to *extract*). | +| **Example** | `date_part('minute', TIMESTAMPTZ '1992-09-20 20:38:40')` | +| **Result** | `38` | + +#### `date_sub(part, startdate, enddate)` + +
+ +| **Description** | The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| **Example** | `date_sub('hour', TIMESTAMPTZ '1992-09-30 23:59:59', TIMESTAMPTZ '1992-10-01 01:58:00')` | +| **Result** | `1` | + +#### `date_trunc(part, timestamptz)` + +
+ +| **Description** | Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| **Example** | `date_trunc('hour', TIMESTAMPTZ '1992-09-20 20:38:40')` | +| **Result** | `1992-09-20 20:00:00` | + +#### `datediff(part, startdate, enddate)` + +
+ +| **Description** | Alias of date_diff. The number of [partition]({% link docs/archive/1.0/sql/functions/datepart.md %}) boundaries between the timestamps. | +| **Example** | `datediff('hour', TIMESTAMPTZ '1992-09-30 23:59:59', TIMESTAMPTZ '1992-10-01 01:58:00')` | +| **Result** | `2` | + +#### `datepart([part, ...], timestamptz)` + +
+ +| **Description** | Alias of date_part. Get the listed [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}) as a `struct`. The list must be constant. | +| **Example** | `datepart(['year', 'month', 'day'], TIMESTAMPTZ '1992-09-20 20:38:40-07')` | +| **Result** | `{year: 1992, month: 9, day: 20}` | + +#### `datepart(part, timestamptz)` + +
+ +| **Description** | Alias of date_part. Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) (equivalent to *extract*). | +| **Example** | `datepart('minute', TIMESTAMPTZ '1992-09-20 20:38:40')` | +| **Result** | `38` | + +#### `datesub(part, startdate, enddate)` + +
+ +| **Description** | Alias of date_sub. The number of complete [partitions]({% link docs/archive/1.0/sql/functions/datepart.md %}) between the timestamps. | +| **Example** | `datesub('hour', TIMESTAMPTZ '1992-09-30 23:59:59', TIMESTAMPTZ '1992-10-01 01:58:00')` | +| **Result** | `1` | + +#### `datetrunc(part, timestamptz)` + +
+ +| **Description** | Alias of date_trunc. Truncate to specified [precision]({% link docs/archive/1.0/sql/functions/datepart.md %}). | +| **Example** | `datetrunc('hour', TIMESTAMPTZ '1992-09-20 20:38:40')` | +| **Result** | `1992-09-20 20:00:00` | + +#### `epoch_ms(timestamptz)` + +
+ +| **Description** | Converts a timestamptz to milliseconds since the epoch. | +| **Example** | `epoch_ms('2022-11-07 08:43:04.123456+00'::TIMESTAMPTZ);` | +| **Result** | `1667810584123` | + +#### `epoch_ns(timestamptz)` + +
+ +| **Description** | Converts a timestamptz to nanoseconds since the epoch. | +| **Example** | `epoch_ns('2022-11-07 08:43:04.123456+00'::TIMESTAMPTZ);` | +| **Result** | `1667810584123456000` | + +#### `epoch_us(timestamptz)` + +
+ +| **Description** | Converts a timestamptz to microseconds since the epoch. | +| **Example** | `epoch_us('2022-11-07 08:43:04.123456+00'::TIMESTAMPTZ);` | +| **Result** | `1667810584123456` | + +#### `extract(field FROM timestamptz)` + +
+ +| **Description** | Get [subfield]({% link docs/archive/1.0/sql/functions/datepart.md %}) from a `TIMESTAMP WITH TIME ZONE`. | +| **Example** | `extract('hour' FROM TIMESTAMPTZ '1992-09-20 20:38:48')` | +| **Result** | `20` | + +#### `last_day(timestamptz)` + +
+ +| **Description** | The last day of the month. | +| **Example** | `last_day(TIMESTAMPTZ '1992-03-22 01:02:03.1234')` | +| **Result** | `1992-03-31` | + +#### `make_timestamptz(bigint, bigint, bigint, bigint, bigint, double, string)` + +
+ +| **Description** | The `TIMESTAMP WITH TIME ZONE` for the given parts and time zone. | +| **Example** | `make_timestamptz(1992, 9, 20, 15, 34, 27.123456, 'CET')` | +| **Result** | `1992-09-20 06:34:27.123456-07` | + +#### `make_timestamptz(bigint, bigint, bigint, bigint, bigint, double)` + +
+ +| **Description** | The `TIMESTAMP WITH TIME ZONE` for the given parts in the current time zone. | +| **Example** | `make_timestamptz(1992, 9, 20, 13, 34, 27.123456)` | +| **Result** | `1992-09-20 13:34:27.123456-07` | + +#### `make_timestamptz(microseconds)` + +
+ +| **Description** | The `TIMESTAMP WITH TIME ZONE` for the given µs since the epoch. | +| **Example** | `make_timestamptz(1667810584123456)` | +| **Result** | `2022-11-07 16:43:04.123456-08` | + +#### `strftime(timestamptz, format)` + +
+ +| **Description** | Converts a `TIMESTAMP WITH TIME ZONE` value to string according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}). | +| **Example** | `strftime(timestamptz '1992-01-01 20:38:40', '%a, %-d %B %Y - %I:%M:%S %p')` | +| **Result** | `Wed, 1 January 1992 - 08:38:40 PM` | + +#### `strptime(text, format)` + +
+ +| **Description** | Converts string to `TIMESTAMP WITH TIME ZONE` according to the [format string]({% link docs/archive/1.0/sql/functions/dateformat.md %}) if `%Z` is specified. | +| **Example** | `strptime('Wed, 1 January 1992 - 08:38:40 PST', '%a, %-d %B %Y - %H:%M:%S %Z')` | +| **Result** | `1992-01-01 08:38:40-08` | + +#### `time_bucket(bucket_width, timestamptz[, offset])` + +
+ +| **Description** | Truncate `timestamptz` by the specified interval `bucket_width`. Buckets are offset by `offset` interval. | +| **Example** | `time_bucket(INTERVAL '10 minutes', TIMESTAMPTZ '1992-04-20 15:26:00-07', INTERVAL '5 minutes')` | +| **Result** | `1992-04-20 15:25:00-07` | + +#### `time_bucket(bucket_width, timestamptz[, origin])` + +
+ +| **Description** | Truncate `timestamptz` by the specified interval `bucket_width`. Buckets are aligned relative to `origin` timestamptz. `origin` defaults to 2000-01-03 00:00:00+00 for buckets that don't include a month or year interval, and to 2000-01-01 00:00:00+00 for month and year buckets. | +| **Example** | `time_bucket(INTERVAL '2 weeks', TIMESTAMPTZ '1992-04-20 15:26:00-07', TIMESTAMPTZ '1992-04-01 00:00:00-07')` | +| **Result** | `1992-04-15 00:00:00-07` | + +#### `time_bucket(bucket_width, timestamptz[, timezone])` + +
+ +| **Description** | Truncate `timestamptz` by the specified interval `bucket_width`. Bucket starts and ends are calculated using `timezone`. `timezone` is a varchar and defaults to UTC. | +| **Example** | `time_bucket(INTERVAL '2 days', TIMESTAMPTZ '1992-04-20 15:26:00-07', 'Europe/Berlin')` | +| **Result** | `1992-04-19 15:00:00-07` | + +There are also dedicated extraction functions to get the [subfields]({% link docs/archive/1.0/sql/functions/datepart.md %}). + +## ICU Timestamp Table Functions + +The table below shows the available table functions for `TIMESTAMP WITH TIME ZONE` types. + +| Name | Description | +|:--|:-------| +| [`generate_series(timestamptz, timestamptz, interval)`](#generate_seriestimestamptz-timestamptz-interval) | Generate a table of timestamps in the closed range (including both the starting timestamp and the ending timestamp), stepping by the interval. | +| [`range(timestamptz, timestamptz, interval)`](#rangetimestamptz-timestamptz-interval) | Generate a table of timestamps in the half open range (including the starting timestamp, but stopping before the ending timestamp), stepping by the interval. | + +> Infinite values are not allowed as table function bounds. + +#### `generate_series(timestamptz, timestamptz, interval)` + +
+ +| **Description** | Generate a table of timestamps in the closed range (including both the starting timestamp and the ending timestamp), stepping by the interval. | +| **Example** | `generate_series(TIMESTAMPTZ '2001-04-10', TIMESTAMPTZ '2001-04-11', INTERVAL 30 MINUTE)` | + +#### `range(timestamptz, timestamptz, interval)` + +
+
+| **Description** | Generate a table of timestamps in the half open range (including the starting timestamp, but stopping before the ending timestamp), stepping by the interval. |
+| **Example** | `range(TIMESTAMPTZ '2001-04-10', TIMESTAMPTZ '2001-04-11', INTERVAL 30 MINUTE)` |
+
+## ICU Timestamp Without Time Zone Functions
+
+The table below shows the ICU provided scalar functions that operate on plain `TIMESTAMP` values.
+These functions assume that the `TIMESTAMP` is a “local timestamp”.
+
+A local timestamp is effectively a way of encoding the part values from a time zone into a single value (see the example after the table below).
+They should be used with caution because the produced values can contain gaps and ambiguities due to daylight saving time.
+Often the same functionality can be implemented more reliably using the `struct` variant of the `date_part` function.
+
+| Name | Description |
+|:--|:-------|
+| [`current_localtime()`](#current_localtime) | Returns a `TIME` whose GMT bin values correspond to local time in the current time zone. |
+| [`current_localtimestamp()`](#current_localtimestamp) | Returns a `TIMESTAMP` whose GMT bin values correspond to local date and time in the current time zone. |
+| [`localtime`](#localtime) | Synonym for the `current_localtime()` function call. |
+| [`localtimestamp`](#localtimestamp) | Synonym for the `current_localtimestamp()` function call. |
+| [`timezone(text, timestamp)`](#timezonetext-timestamp) | Use the [date parts]({% link docs/archive/1.0/sql/functions/datepart.md %}) of the timestamp in GMT to construct a timestamp in the given time zone. Effectively, the argument is a “local” time. |
+| [`timezone(text, timestamptz)`](#timezonetext-timestamptz) | Use the [date parts]({% link docs/archive/1.0/sql/functions/datepart.md %}) of the timestamp in the given time zone to construct a timestamp. Effectively, the result is a “local” time. |
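+
+As an illustration (assuming the session time zone is `America/Los_Angeles`), the two `timezone` overloads below convert between instants and local timestamps:
+
+```sql
+SELECT timezone('America/Los_Angeles', TIMESTAMP '2001-02-16 20:38:40');
+-- 2001-02-16 20:38:40-08 (the local timestamp interpreted as an instant)
+SELECT timezone('America/Los_Angeles', TIMESTAMPTZ '2001-02-16 20:38:40-08');
+-- 2001-02-16 20:38:40 (the instant converted back to a local timestamp)
+```
+
+#### `current_localtime()`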
+ +| **Description** | Returns a `TIME` whose GMT bin values correspond to local time in the current time zone. | +| **Example** | `current_localtime()` | +| **Result** | `08:47:56.497` | + +#### `current_localtimestamp()` + +
+ +| **Description** | Returns a `TIMESTAMP` whose GMT bin values correspond to local date and time in the current time zone. | +| **Example** | `current_localtimestamp()` | +| **Result** | `2022-12-17 08:47:56.497` | + +#### `localtime` + +
+ +| **Description** | Synonym for the `current_localtime()` function call. | +| **Example** | `localtime` | +| **Result** | `08:47:56.497` | + +#### `localtimestamp` + +
+ +| **Description** | Synonym for the `current_localtimestamp()` function call. | +| **Example** | `localtimestamp` | +| **Result** | `2022-12-17 08:47:56.497` | + +#### `timezone(text, timestamp)` + +
+ +| **Description** | Use the [date parts]({% link docs/archive/1.0/sql/functions/datepart.md %}) of the timestamp in GMT to construct a timestamp in the given time zone. Effectively, the argument is a “local” time. | +| **Example** | `timezone('America/Denver', TIMESTAMP '2001-02-16 20:38:40')` | +| **Result** | `2001-02-16 19:38:40-08` | + +#### `timezone(text, timestamptz)` + +
+ +| **Description** | Use the [date parts]({% link docs/archive/1.0/sql/functions/datepart.md %}) of the timestamp in the given time zone to construct a timestamp. Effectively, the result is a “local” time. | +| **Example** | `timezone('America/Denver', TIMESTAMPTZ '2001-02-16 20:38:40-05')` | +| **Result** | `2001-02-16 18:38:40` | + +## At Time Zone + +The `AT TIME ZONE` syntax is syntactic sugar for the (two argument) `timezone` function listed above: + +```sql +TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver'; +``` + +```text +2001-02-16 19:38:40-08 +``` + +```sql +TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'America/Denver'; +``` + +```text +2001-02-16 18:38:40 +``` + +## Infinities + +Functions applied to infinite dates will either return the same infinite dates +(e.g, `greatest`) or `NULL` (e.g., `date_part`) depending on what “makes sense”. +In general, if the function needs to examine the parts of the infinite temporal value, +the result will be `NULL`. + +## Calendars + +The ICU extension also supports [non-Gregorian calendars]({% link docs/archive/1.0/sql/data_types/timestamp.md %}#calendars). +If such a calendar is current, then the display and binning operations will use that calendar. \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/union.md b/docs/archive/1.0/sql/functions/union.md new file mode 100644 index 00000000000..bd430c4edc2 --- /dev/null +++ b/docs/archive/1.0/sql/functions/union.md @@ -0,0 +1,45 @@ +--- +layout: docu +title: Union Functions +--- + + + +| Name | Description | +|:--|:-------| +| [`union.tag`](#uniontag) | Dot notation serves as an alias for `union_extract`. | +| [`union_extract(union, 'tag')`](#union_extractunion-tag) | Extract the value with the named tags from the union. `NULL` if the tag is not currently selected. | +| [`union_value(tag := any)`](#union_valuetag--any) | Create a single member `UNION` containing the argument value. The tag of the value will be the bound variable name. | +| [`union_tag(union)`](#union_tagunion) | Retrieve the currently selected tag of the union as an [Enum]({% link docs/archive/1.0/sql/data_types/enum.md %}). | + +#### `union.tag` + +
+ +| **Description** | Dot notation serves as an alias for `union_extract`. | +| **Example** | `(union_value(k := 'hello')).k` | +| **Result** | `string` | + +#### `union_extract(union, 'tag')` + +
+ +| **Description** | Extract the value with the named tags from the union. `NULL` if the tag is not currently selected. | +| **Example** | `union_extract(s, 'k')` | +| **Result** | `hello` | + +#### `union_value(tag := any)` + +
+ +| **Description** | Create a single member `UNION` containing the argument value. The tag of the value will be the bound variable name. | +| **Example** | `union_value(k := 'hello')` | +| **Result** | `'hello'::UNION(k VARCHAR)` | + +#### `union_tag(union)` + +
+ +| **Description** | Retrieve the currently selected tag of the union as an [Enum]({% link docs/archive/1.0/sql/data_types/enum.md %}). | +| **Example** | `union_tag(union_value(k := 'foo'))` | +| **Result** | `'k'` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/utility.md b/docs/archive/1.0/sql/functions/utility.md new file mode 100644 index 00000000000..73324abd108 --- /dev/null +++ b/docs/archive/1.0/sql/functions/utility.md @@ -0,0 +1,309 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/test/functions/utility +title: Utility Functions +--- + + + +## Scalar Utility Functions + +The functions below are difficult to categorize into specific function types and are broadly useful. + +| Name | Description | +|:--|:-------| +| [`alias(column)`](#aliascolumn) | Return the name of the column. | +| [`checkpoint(database)`](#checkpointdatabase) | Synchronize WAL with file for (optional) database without interrupting transactions. | +| [`coalesce(expr, ...)`](#coalesceexpr-) | Return the first expression that evaluates to a non-`NULL` value. Accepts 1 or more parameters. Each expression can be a column, literal value, function result, or many others. | +| [`constant_or_null(arg1, arg2)`](#constant_or_nullarg1-arg2) | If `arg2` is `NULL`, return `NULL`. Otherwise, return `arg1`. | +| [`count_if(x)`](#count_ifx) | Returns 1 if `x` is `true` or a non-zero number. | +| [`current_catalog()`](#current_catalog) | Return the name of the currently active catalog. Default is memory. | +| [`current_schema()`](#current_schema) | Return the name of the currently active schema. Default is main. | +| [`current_schemas(boolean)`](#current_schemasboolean) | Return list of schemas. Pass a parameter of `true` to include implicit schemas. | +| [`current_setting('setting_name')`](#current_settingsetting_name) | Return the current value of the configuration setting. | +| [`currval('sequence_name')`](#currvalsequence_name) | Return the current value of the sequence. Note that `nextval` must be called at least once prior to calling `currval`. | +| [`error(message)`](#errormessage) | Throws the given error `message`. | +| [`force_checkpoint(database)`](#force_checkpointdatabase) | Synchronize WAL with file for (optional) database interrupting transactions. | +| [`gen_random_uuid()`](#gen_random_uuid) | Alias of `uuid`. Return a random UUID similar to this: `eeccb8c5-9943-b2bb-bb5e-222f4e14b687`. | +| [`getenv(var)`](#getenvvar) | Returns the value of the environment variable `var`. Only available in the [command line client]({% link docs/archive/1.0/api/cli/overview.md %}). | +| [`hash(value)`](#hashvalue) | Returns a `UBIGINT` with the hash of the `value`. | +| [`icu_sort_key(string, collator)`](#icu_sort_keystring-collator) | Surrogate key used to sort special characters according to the specific locale. Collator parameter is optional. Valid only when ICU extension is installed. | +| [`if(a, b, c)`](#ifa-b-c) | Ternary operator. | +| [`ifnull(expr, other)`](#ifnullexpr-other) | A two-argument version of coalesce. | +| [`md5(string)`](#md5string) | Return an MD5 hash of the `string`. | +| [`nextval('sequence_name')`](#nextvalsequence_name) | Return the following value of the sequence. | +| [`nullif(a, b)`](#nullifa-b) | Return `NULL` if `a = b`, else return `a`. Equivalent to `CASE WHEN a = b THEN NULL ELSE a END`. | +| [`pg_typeof(expression)`](#pg_typeofexpression) | Returns the lower case name of the data type of the result of the expression. For PostgreSQL compatibility. 
|
+| [`read_blob(source)`](#read_blobsource) | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `BLOB`. See the [`read_blob` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_blob) for more details. |
+| [`read_text(source)`](#read_textsource) | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `VARCHAR`. The file content is first validated to be valid UTF-8. If `read_text` attempts to read a file with invalid UTF-8, an error is thrown suggesting the use of `read_blob` instead. See the [`read_text` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_text) for more details. |
+| [`sha256(value)`](#sha256value) | Returns a `VARCHAR` with the SHA-256 hash of the `value`. |
+| [`stats(expression)`](#statsexpression) | Returns a string with statistics about the expression. Expression can be a column, constant, or SQL expression. |
+| [`txid_current()`](#txid_current) | Returns the current transaction's identifier, a `BIGINT` value. It will assign a new one if the current transaction does not have one already. |
+| [`typeof(expression)`](#typeofexpression) | Returns the name of the data type of the result of the expression. |
+| [`uuid()`](#uuid) | Return a random UUID similar to this: `eeccb8c5-9943-b2bb-bb5e-222f4e14b687`. |
+| [`version()`](#version) | Return the currently active version of DuckDB. |
+
+#### `alias(column)`
+
+
+ +| **Description** | Return the name of the column. | +| **Example** | `alias(column1)` | +| **Result** | `column1` | + +#### `checkpoint(database)` + +
+
+| **Description** | Synchronizes the write-ahead log (WAL) with the file of the (optionally specified) database, without interrupting transactions. |
+| **Example** | `checkpoint(my_db)` |
+| **Result** | success boolean |
+
+#### `coalesce(expr, ...)`
+
+
+ +| **Description** | Return the first expression that evaluates to a non-`NULL` value. Accepts 1 or more parameters. Each expression can be a column, literal value, function result, or many others. | +| **Example** | `coalesce(NULL, NULL, 'default_string')` | +| **Result** | `default_string` | + +#### `constant_or_null(arg1, arg2)` + +
+ +| **Description** | If `arg2` is `NULL`, return `NULL`. Otherwise, return `arg1`. | +| **Example** | `constant_or_null(42, NULL)` | +| **Result** | `NULL` | + +#### `count_if(x)` + +
+ +| **Description** | Returns 1 if `x` is `true` or a non-zero number. | +| **Example** | `count_if(42)` | +| **Result** | 1 | + +#### `current_catalog()` + +
+ +| **Description** | Return the name of the currently active catalog. Default is memory. | +| **Example** | `current_catalog()` | +| **Result** | `memory` | + +#### `current_schema()` + +
+ +| **Description** | Return the name of the currently active schema. Default is main. | +| **Example** | `current_schema()` | +| **Result** | `main` | + +#### `current_schemas(boolean)` + +
+ +| **Description** | Return list of schemas. Pass a parameter of `true` to include implicit schemas. | +| **Example** | `current_schemas(true)` | +| **Result** | `['temp', 'main', 'pg_catalog']` | + +#### `current_setting('setting_name')` + +
+ +| **Description** | Return the current value of the configuration setting. | +| **Example** | `current_setting('access_mode')` | +| **Result** | `automatic` | + +#### `currval('sequence_name')` + +
+ +| **Description** | Return the current value of the sequence. Note that `nextval` must be called at least once prior to calling `currval`. | +| **Example** | `currval('my_sequence_name')` | +| **Result** | `1` | + +#### `error(message)` + +
+ +| **Description** | Throws the given error `message`. | +| **Example** | `error('access_mode')` | + +#### `force_checkpoint(database)` + +
+ +| **Description** | Synchronize WAL with file for (optional) database interrupting transactions. | +| **Example** | `force_checkpoint(my_db)` | +| **Result** | success boolean | + +#### `gen_random_uuid()` + +
+ +| **Description** | Alias of `uuid`. Return a random UUID similar to this: `eeccb8c5-9943-b2bb-bb5e-222f4e14b687`. | +| **Example** | `gen_random_uuid()` | +| **Result** | various | + +#### `getenv(var)` + +| **Description** | Returns the value of the environment variable `var`. Only available in the [command line client]({% link docs/archive/1.0/api/cli/overview.md %}). | +| **Example** | `getenv('HOME')` | +| **Result** | `/path/to/user/home` | + +#### `hash(value)` + +
+ +| **Description** | Returns a `UBIGINT` with the hash of the `value`. | +| **Example** | `hash('🦆')` | +| **Result** | `2595805878642663834` | + +#### `icu_sort_key(string, collator)` + +
+ +| **Description** | Surrogate key used to sort special characters according to the specific locale. Collator parameter is optional. Valid only when ICU extension is installed. | +| **Example** | `icu_sort_key('ö', 'DE')` | +| **Result** | `460145960106` | + +#### `if(a, b, c)` + +
+ +| **Description** | Ternary operator; returns b if a, else returns c. Equivalent to `CASE WHEN a THEN b ELSE c END`. | +| **Example** | `if(2 > 1, 3, 4)` | +| **Result** | `3` | + +#### `ifnull(expr, other)` + +
+ +| **Description** | A two-argument version of coalesce. | +| **Example** | `ifnull(NULL, 'default_string')` | +| **Result** | `default_string` | + +#### `md5(string)` + +
+ +| **Description** | Return an MD5 hash of the `string`. | +| **Example** | `md5('123')` | +| **Result** | `202cb962ac59075b964b07152d234b70` | + +#### `nextval('sequence_name')` + +
+ +| **Description** | Return the following value of the sequence. | +| **Example** | `nextval('my_sequence_name')` | +| **Result** | `2` | + +#### `nullif(a, b)` + +
+
+| **Description** | Return `NULL` if `a = b`, else return `a`. Equivalent to `CASE WHEN a = b THEN NULL ELSE a END`. |
+| **Example** | `nullif(1+1, 2)` |
+| **Result** | `NULL` |
+
+#### `pg_typeof(expression)`
+
+
+ +| **Description** | Returns the lower case name of the data type of the result of the expression. For PostgreSQL compatibility. | +| **Example** | `pg_typeof('abc')` | +| **Result** | `varchar` | + +#### `read_blob(source)` + +
+ +| **Description** | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `BLOB`. See the [`read_blob` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_blob) for more details. | +| **Example** | `read_blob('hello.bin')` | +| **Result** | `hello\x0A` | + +#### `read_text(source)` + +
+ +| **Description** | Returns the content from `source` (a filename, a list of filenames, or a glob pattern) as a `VARCHAR`. The file content is first validated to be valid UTF-8. If `read_text` attempts to read a file with invalid UTF-8 an error is thrown suggesting to use `read_blob` instead. See the [`read_text` guide]({% link docs/archive/1.0/guides/file_formats/read_file.md %}#read_text) for more details. | +| **Example** | `read_text('hello.txt')` | +| **Result** | `hello\n` | + +#### `sha256(value)` + +
+ +| **Description** | Returns a `VARCHAR` with the SHA-256 hash of the `value`. | +| **Example** | `sha256('🦆')` | +| **Result** | `d7a5c5e0d1d94c32218539e7e47d4ba9c3c7b77d61332fb60d633dde89e473fb` | + +#### `stats(expression)` + +
+ +| **Description** | Returns a string with statistics about the expression. Expression can be a column, constant, or SQL expression. | +| **Example** | `stats(5)` | +| **Result** | `'[Min: 5, Max: 5][Has Null: false]'` | + +#### `txid_current()` + +
+ +| **Description** | Returns the current transaction's identifier, a `BIGINT` value. It will assign a new one if the current transaction does not have one already. | +| **Example** | `txid_current()` | +| **Result** | various | + +#### `typeof(expression)` + +
+ +| **Description** | Returns the name of the data type of the result of the expression. | +| **Example** | `typeof('abc')` | +| **Result** | `VARCHAR` | + +#### `uuid()` + +
+ +| **Description** | Return a random UUID similar to this: `eeccb8c5-9943-b2bb-bb5e-222f4e14b687`. | +| **Example** | `uuid()` | +| **Result** | various | + +#### `version()` + +
+
+| **Description** | Return the currently active version of DuckDB. |
+| **Example** | `version()` |
+| **Result** | various |
+
+## Utility Table Functions
+
+A table function is used in place of a table in a `FROM` clause.
+
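+For example, the `glob` function listed below can be queried directly in `FROM` (a brief sketch; the `'*.csv'` pattern is only an illustrative argument):
+
+```sql
+SELECT file
+FROM glob('*.csv');
+```
+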
+ +| Name | Description | +|:--|:-------| +| [`glob(search_path)`](#globsearch_path) | Return filenames found at the location indicated by the *search_path* in a single column named `file`. The *search_path* may contain [glob pattern matching syntax]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}). | +| [`repeat_row(varargs, num_rows)`](#repeat_rowvarargs-num_rows) | Returns a table with `num_rows` rows, each containing the fields defined in `varargs`. | + +#### `glob(search_path)` + +
+ +| **Description** | Return filenames found at the location indicated by the *search_path* in a single column named `file`. The *search_path* may contain [glob pattern matching syntax]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}). | +| **Example** | `glob('*')` | +| **Result** | (table of filenames) | + +#### `repeat_row(varargs, num_rows)` + +
+ +| **Description** | Returns a table with `num_rows` rows, each containing the fields defined in `varargs`. | +| **Example** | `repeat_row(1, 2, 'foo', num_rows = 3)` | +| **Result** | 3 rows of `1, 2, 'foo'` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/functions/window_functions.md b/docs/archive/1.0/sql/functions/window_functions.md new file mode 100644 index 00000000000..9a2669f047c --- /dev/null +++ b/docs/archive/1.0/sql/functions/window_functions.md @@ -0,0 +1,437 @@ +--- +layout: docu +railroad: expressions/window.js +redirect_from: +- docs/archive/1.0/sql/window_functions +title: Window Functions +--- + + + +DuckDB supports [window functions](https://en.wikipedia.org/wiki/Window_function_(SQL)), which can use multiple rows to calculate a value for each row. +Window functions are [blocking operators]({% link docs/archive/1.0/guides/performance/how_to_tune_workloads.md %}#blocking-operators), i.e., they require their entire input to be buffered, making them one of the most memory-intensive operators in SQL. + +Window function are available in SQL since [SQL:2003](https://en.wikipedia.org/wiki/SQL:2003) and are supported by major SQL database systems. + +## Examples + +Generate a `row_number` column with containing incremental identifiers for each row: + +```sql +SELECT row_number() OVER () +FROM sales; +``` + +Generate a `row_number` column, by order of time: + +```sql +SELECT row_number() OVER (ORDER BY time) +FROM sales; +``` + +Generate a `row_number` column, by order of time partitioned by region: + +```sql +SELECT row_number() OVER (PARTITION BY region ORDER BY time) +FROM sales; +``` + +Compute the difference between the current amount, and the previous amount, by order of time: + +```sql +SELECT amount - lag(amount) OVER (ORDER BY time) +FROM sales; +``` + +Compute the percentage of the total amount of sales per region for each row: + +```sql +SELECT amount / sum(amount) OVER (PARTITION BY region) +FROM sales; +``` + +## Syntax + +
+
+Window functions can only be used in the `SELECT` clause. To share `OVER` specifications between functions, use the statement's `WINDOW` clause and use the `OVER ⟨window-name⟩` syntax.
+
+## General-Purpose Window Functions
+
+The table below shows the available general window functions.
+
+| Name | Description |
+|:--|:-------|
+| [`cume_dist()`](#cume_dist) | The cumulative distribution: (number of partition rows preceding or peer with current row) / total partition rows. |
+| [`dense_rank()`](#dense_rank) | The rank of the current row *without gaps;* this function counts peer groups. |
+| [`first_value(expr[ IGNORE NULLS])`](#first_valueexpr-ignore-nulls) | Returns `expr` evaluated at the row that is the first row (with a non-null value of `expr` if `IGNORE NULLS` is set) of the window frame. |
+| [`lag(expr[, offset[, default]][ IGNORE NULLS])`](#lagexpr-offset-default-ignore-nulls) | Returns `expr` evaluated at the row that is `offset` rows (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) before the current row within the window frame; if there is no such row, instead return `default` (which must be of the same type as `expr`). Both `offset` and `default` are evaluated with respect to the current row. If omitted, `offset` defaults to `1` and `default` to `NULL`. |
+| [`last_value(expr[ IGNORE NULLS])`](#last_valueexpr-ignore-nulls) | Returns `expr` evaluated at the row that is the last row (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) of the window frame. |
+| [`lead(expr[, offset[, default]][ IGNORE NULLS])`](#leadexpr-offset-default-ignore-nulls) | Returns `expr` evaluated at the row that is `offset` rows after the current row (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) within the window frame; if there is no such row, instead return `default` (which must be of the same type as `expr`). Both `offset` and `default` are evaluated with respect to the current row. If omitted, `offset` defaults to `1` and `default` to `NULL`. |
+| [`nth_value(expr, nth[ IGNORE NULLS])`](#nth_valueexpr-nth-ignore-nulls) | Returns `expr` evaluated at the nth row (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) of the window frame (counting from 1); `NULL` if no such row. |
+| [`ntile(num_buckets)`](#ntilenum_buckets) | An integer ranging from 1 to `num_buckets`, dividing the partition as equally as possible. |
+| [`percent_rank()`](#percent_rank) | The relative rank of the current row: `(rank() - 1) / (total partition rows - 1)`. |
+| [`rank_dense()`](#rank_dense) | The rank of the current row *without gaps;* this function counts peer groups. |
+| [`rank()`](#rank) | The rank of the current row *with gaps;* same as `row_number` of its first peer. |
+| [`row_number()`](#row_number) | The number of the current row within the partition, counting from 1. |
+
+#### `cume_dist()`
+
+
+ +| **Description** | The cumulative distribution: (number of partition rows preceding or peer with current row) / total partition rows. | +| **Return Type** | `DOUBLE` | +| **Example** | `cume_dist()` | + +#### `dense_rank()` + +
+ +| **Description** | The rank of the current row *without gaps;* this function counts peer groups. | +| **Return Type** | `BIGINT` | +| **Example** | `dense_rank()` | + +#### `first_value(expr[ IGNORE NULLS])` + +
+ +| **Description** | Returns `expr` evaluated at the row that is the first row (with a non-null value of `expr` if `IGNORE NULLS` is set) of the window frame. | +| **Return Type** | Same type as `expr` | +| **Example** | `first_value(column)` | + +#### `lag(expr[, offset[, default]][ IGNORE NULLS])` + +
+
+| **Description** | Returns `expr` evaluated at the row that is `offset` rows (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) before the current row within the window frame; if there is no such row, instead return `default` (which must be of the same type as `expr`). Both `offset` and `default` are evaluated with respect to the current row. If omitted, `offset` defaults to `1` and `default` to `NULL`. |
+| **Return Type** | Same type as `expr` |
+| **Example** | `lag(column, 3, 0)` |
+
+#### `last_value(expr[ IGNORE NULLS])`
+
+
+ +| **Description** | Returns `expr` evaluated at the row that is the last row (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) of the window frame. | +| **Return Type** | Same type as `expr` | +| **Example** | `last_value(column)` | + +#### `lead(expr[, offset[, default]][ IGNORE NULLS])` + +
+
+| **Description** | Returns `expr` evaluated at the row that is `offset` rows after the current row (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) within the window frame; if there is no such row, instead return `default` (which must be of the same type as `expr`). Both `offset` and `default` are evaluated with respect to the current row. If omitted, `offset` defaults to `1` and `default` to `NULL`. |
+| **Return Type** | Same type as `expr` |
+| **Example** | `lead(column, 3, 0)` |
+
+#### `nth_value(expr, nth[ IGNORE NULLS])`
+
+
+
+| **Description** | Returns `expr` evaluated at the nth row (among rows with a non-null value of `expr` if `IGNORE NULLS` is set) of the window frame (counting from 1); `NULL` if no such row. |
+| **Return Type** | Same type as `expr` |
+| **Example** | `nth_value(column, 2)` |
+
+#### `ntile(num_buckets)`
+
+
+ +| **Description** | An integer ranging from 1 to `num_buckets`, dividing the partition as equally as possible. | +| **Return Type** | `BIGINT` | +| **Example** | `ntile(4)` | + +#### `percent_rank()` + +
+ +| **Description** | The relative rank of the current row: `(rank() - 1) / (total partition rows - 1)`. | +| **Return Type** | `DOUBLE` | +| **Example** | `percent_rank()` | + +#### `rank_dense()` + +
+
+| **Description** | The rank of the current row *without gaps;* this function counts peer groups. |
+| **Return Type** | `BIGINT` |
+| **Example** | `rank_dense()` |
+| **Alias** | `dense_rank()` |
+
+#### `rank()`
+
+
+
+| **Description** | The rank of the current row *with gaps;* same as `row_number` of its first peer. |
+| **Return Type** | `BIGINT` |
+| **Example** | `rank()` |
+
+#### `row_number()`
+
+
+ +| **Description** | The number of the current row within the partition, counting from 1. | +| **Return Type** | `BIGINT` | +| **Example** | `row_number()` | + +## Aggregate Window Functions + +All [aggregate functions]({% link docs/archive/1.0/sql/functions/aggregates.md %}) can be used in a windowing context, including the optional [`FILTER` clause]({% link docs/archive/1.0/sql/query_syntax/filter.md %}). +The `first` and `last` aggregate functions are shadowed by the respective general-purpose window functions, with the minor consequence that the `FILTER` clause is not available for these but `IGNORE NULLS` is. + +## Nulls + +All [general-purpose window functions](#general-purpose-window-functions) that accept `IGNORE NULLS` respect nulls by default. This default behavior can optionally be made explicit via `RESPECT NULLS`. + +In contrast, all [aggregate window functions](#aggregate-window-functions) (except for `list` and its aliases, which can be made to ignore nulls via a `FILTER`) ignore nulls and do not accept `RESPECT NULLS`. For example, `sum(column) OVER (ORDER BY time) AS cumulativeColumn` computes a cumulative sum where rows with a `NULL` value of `column` have the same value of `cumulativeColumn` as the row that precedes them. + +## Evaluation + +Windowing works by breaking a relation up into independent *partitions*, +*ordering* those partitions, +and then computing a new column for each row as a function of the nearby values. +Some window functions depend only on the partition boundary and the ordering, +but a few (including all the aggregates) also use a *frame*. +Frames are specified as a number of rows on either side (*preceding* or *following*) of the *current row*. +The distance can either be specified as a number of *rows* or a *range* of values +using the partition's ordering value and a distance. + +The full syntax is shown in the diagram at the top of the page, +and this diagram visually illustrates computation environment: + +The Window Computation Environment + +### Partition and Ordering + +Partitioning breaks the relation up into independent, unrelated pieces. +Partitioning is optional, and if none is specified then the entire relation is treated as a single partition. +Window functions cannot access values outside of the partition containing the row they are being evaluated at. + +Ordering is also optional, but without it the results are not well-defined. +Each partition is ordered using the same ordering clause. + +Here is a table of power generation data, available as a CSV file ([`power-plant-generation-history.csv`](/data/power-plant-generation-history.csv)). To load the data, run: + +```sql +CREATE TABLE "Generation History" AS + FROM 'power-plant-generation-history.csv'; +``` + +After partitioning by plant and ordering by date, it will have this layout: + +
+ +| Plant | Date | MWh | +|:---|:---|---:| +| Boston | 2019-01-02 | 564337 | +| Boston | 2019-01-03 | 507405 | +| Boston | 2019-01-04 | 528523 | +| Boston | 2019-01-05 | 469538 | +| Boston | 2019-01-06 | 474163 | +| Boston | 2019-01-07 | 507213 | +| Boston | 2019-01-08 | 613040 | +| Boston | 2019-01-09 | 582588 | +| Boston | 2019-01-10 | 499506 | +| Boston | 2019-01-11 | 482014 | +| Boston | 2019-01-12 | 486134 | +| Boston | 2019-01-13 | 531518 | +| Worcester | 2019-01-02 | 118860 | +| Worcester | 2019-01-03 | 101977 | +| Worcester | 2019-01-04 | 106054 | +| Worcester | 2019-01-05 | 92182 | +| Worcester | 2019-01-06 | 94492 | +| Worcester | 2019-01-07 | 99932 | +| Worcester | 2019-01-08 | 118854 | +| Worcester | 2019-01-09 | 113506 | +| Worcester | 2019-01-10 | 96644 | +| Worcester | 2019-01-11 | 93806 | +| Worcester | 2019-01-12 | 98963 | +| Worcester | 2019-01-13 | 107170 | + +In what follows, +we shall use this table (or small sections of it) to illustrate various pieces of window function evaluation. + +The simplest window function is `row_number()`. +This function just computes the 1-based row number within the partition using the query: + +```sql +SELECT + "Plant", + "Date", + row_number() OVER (PARTITION BY "Plant" ORDER BY "Date") AS "Row" +FROM "Generation History" +ORDER BY 1, 2; +``` + +The result will be the following: + +
+ +| Plant | Date | Row | +|:---|:---|---:| +| Boston | 2019-01-02 | 1 | +| Boston | 2019-01-03 | 2 | +| Boston | 2019-01-04 | 3 | +| ... | ... | ... | +| Worcester | 2019-01-02 | 1 | +| Worcester | 2019-01-03 | 2 | +| Worcester | 2019-01-04 | 3 | +| ... | ... | ... | + +Note that even though the function is computed with an `ORDER BY` clause, +the result does not have to be sorted, +so the `SELECT` also needs to be explicitly sorted if that is desired. + +### Framing + +Framing specifies a set of rows relative to each row where the function is evaluated. +The distance from the current row is given as an expression either `PRECEDING` or `FOLLOWING` the current row. +This distance can either be specified as an integral number of `ROWS` +or as a `RANGE` delta expression from the value of the ordering expression. +For a `RANGE` specification, there must be only one ordering expression, +and it has to support addition and subtraction (i.e., numbers or `INTERVAL`s). +The default values for frames are from `UNBOUNDED PRECEDING` to `CURRENT ROW`. +It is invalid for a frame to start after it ends. +Using the [`EXCLUDE` clause](#exclude-clause), rows around the current row can be excluded from the frame. + +#### `ROW` Framing + +Here is a simple `ROW` frame query, using an aggregate function: + +```sql +SELECT points, + sum(points) OVER ( + ROWS BETWEEN 1 PRECEDING + AND 1 FOLLOWING) we +FROM results; +``` + +This query computes the `sum` of each point and the points on either side of it: + +Moving SUM of three values + +Notice that at the edge of the partition, there are only two values added together. +This is because frames are cropped to the edge of the partition. + +#### `RANGE` Framing + +Returning to the power data, suppose the data is noisy. +We might want to compute a 7 day moving average for each plant to smooth out the noise. +To do this, we can use this window query: + +```sql +SELECT "Plant", "Date", + avg("MWh") OVER ( + PARTITION BY "Plant" + ORDER BY "Date" ASC + RANGE BETWEEN INTERVAL 3 DAYS PRECEDING + AND INTERVAL 3 DAYS FOLLOWING) + AS "MWh 7-day Moving Average" +FROM "Generation History" +ORDER BY 1, 2; +``` + +This query partitions the data by `Plant` (to keep the different power plants' data separate), +orders each plant's partition by `Date` (to put the energy measurements next to each other), +and uses a `RANGE` frame of three days on either side of each day for the `avg` +(to handle any missing days). +This is the result: + +
+ +| Plant | Date | MWh 7-day Moving Average | +|:---|:---|---:| +| Boston | 2019-01-02 | 517450.75 | +| Boston | 2019-01-03 | 508793.20 | +| Boston | 2019-01-04 | 508529.83 | +| ... | ... | ... | +| Boston | 2019-01-13 | 499793.00 | +| Worcester | 2019-01-02 | 104768.25 | +| Worcester | 2019-01-03 | 102713.00 | +| Worcester | 2019-01-04 | 102249.50 | +| ... | ... | ... | + +#### `EXCLUDE` Clause + +The `EXCLUDE` clause allows rows around the current row to be excluded from the frame. It has the following options: + +* `EXCLUDE NO OTHERS`: exclude nothing (default) +* `EXCLUDE CURRENT ROW`: exclude the current row from the window frame +* `EXCLUDE GROUP`: exclude the current row and all its peers (according to the columns specified by `ORDER BY`) from the window frame +* `EXCLUDE TIES`: exclude only the current row's peers from the window frame + +### `WINDOW` Clauses + +Multiple different `OVER` clauses can be specified in the same `SELECT`, and each will be computed separately. +Often, however, we want to use the same layout for multiple window functions. +The `WINDOW` clause can be used to define a *named* window that can be shared between multiple window functions: + +```sql +SELECT "Plant", "Date", + min("MWh") OVER seven AS "MWh 7-day Moving Minimum", + avg("MWh") OVER seven AS "MWh 7-day Moving Average", + max("MWh") OVER seven AS "MWh 7-day Moving Maximum" +FROM "Generation History" +WINDOW seven AS ( + PARTITION BY "Plant" + ORDER BY "Date" ASC + RANGE BETWEEN INTERVAL 3 DAYS PRECEDING + AND INTERVAL 3 DAYS FOLLOWING) +ORDER BY 1, 2; +``` + +The three window functions will also share the data layout, which will improve performance. + +Multiple windows can be defined in the same `WINDOW` clause by comma-separating them: + +```sql +SELECT "Plant", "Date", + min("MWh") OVER seven AS "MWh 7-day Moving Minimum", + avg("MWh") OVER seven AS "MWh 7-day Moving Average", + max("MWh") OVER seven AS "MWh 7-day Moving Maximum", + min("MWh") OVER three AS "MWh 3-day Moving Minimum", + avg("MWh") OVER three AS "MWh 3-day Moving Average", + max("MWh") OVER three AS "MWh 3-day Moving Maximum" +FROM "Generation History" +WINDOW + seven AS ( + PARTITION BY "Plant" + ORDER BY "Date" ASC + RANGE BETWEEN INTERVAL 3 DAYS PRECEDING + AND INTERVAL 3 DAYS FOLLOWING), + three AS ( + PARTITION BY "Plant" + ORDER BY "Date" ASC + RANGE BETWEEN INTERVAL 1 DAYS PRECEDING + AND INTERVAL 1 DAYS FOLLOWING) +ORDER BY 1, 2; +``` + +The queries above do not use a number of clauses commonly found in select statements, like +`WHERE`, `GROUP BY`, etc. For more complex queries you can find where `WINDOW` clauses fall in +the canonical order of the [`SELECT statement`]({% link docs/archive/1.0/sql/statements/select.md %}). + +### Filtering the Results of Window Functions Using `QUALIFY` + +Window functions are executed after the [`WHERE`]({% link docs/archive/1.0/sql/query_syntax/where.md %}) and [`HAVING`]({% link docs/archive/1.0/sql/query_syntax/having.md %}) clauses have been already evaluated, so it's not possible to use these clauses to filter the results of window functions +The [`QUALIFY` clause]({% link docs/archive/1.0/sql/query_syntax/qualify.md %}) avoids the need for a subquery or [`WITH` clause]({% link docs/archive/1.0/sql/query_syntax/with.md %}) to perform this filtering. + +### Box and Whisker Queries + +All aggregates can be used as windowing functions, including the complex statistical functions. 
+These function implementations have been optimised for windowing, +and we can use the window syntax to write queries that generate the data for moving box-and-whisker plots: + +```sql +SELECT "Plant", "Date", + min("MWh") OVER seven AS "MWh 7-day Moving Minimum", + quantile_cont("MWh", [0.25, 0.5, 0.75]) OVER seven + AS "MWh 7-day Moving IQR", + max("MWh") OVER seven AS "MWh 7-day Moving Maximum", +FROM "Generation History" +WINDOW seven AS ( + PARTITION BY "Plant" + ORDER BY "Date" ASC + RANGE BETWEEN INTERVAL 3 DAYS PRECEDING + AND INTERVAL 3 DAYS FOLLOWING) +ORDER BY 1, 2; +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/indexes.md b/docs/archive/1.0/sql/indexes.md new file mode 100644 index 00000000000..063285d9c71 --- /dev/null +++ b/docs/archive/1.0/sql/indexes.md @@ -0,0 +1,70 @@ +--- +layout: docu +railroad: statements/indexes.js +title: Indexes +--- + +## Index Types + +DuckDB currently uses two index types: + +* A [min-max index](https://en.wikipedia.org/wiki/Block_Range_Index) (also known as zonemap and block range index) is automatically created for columns of all [general-purpose data types]({% link docs/archive/1.0/sql/data_types/overview.md %}). +* An [Adaptive Radix Tree (ART)](https://db.in.tum.de/~leis/papers/ART.pdf) is mainly used to ensure primary key constraints and to speed up point and very highly selective (i.e., < 0.1%) queries. Such an index is automatically created for columns with a `UNIQUE` or `PRIMARY KEY` constraint and can be defined using `CREATE INDEX`. + +> Warning ART indexes must currently be able to fit in-memory. Avoid creating ART indexes if the index does not fit in memory. + +## Persistence + +Both min-max indexes and ART indexes are persisted on disk. + +## `CREATE INDEX` and `DROP INDEX` + +To create an index, use the [`CREATE INDEX` statement]({% link docs/archive/1.0/sql/statements/create_index.md %}#create-index). +To drop an index, use the [`DROP INDEX` statement]({% link docs/archive/1.0/sql/statements/create_index.md %}#drop-index). + +## Limitations of ART Indexes + +ART indexes create a secondary copy of the data in a second location – this complicates processing, particularly when combined with transactions. Certain limitations apply when it comes to modifying data that is also stored in secondary indexes. + +> As expected, indexes have a strong effect on performance, slowing down loading and updates, but speeding up certain queries. Please consult the [Performance Guide]({% link docs/archive/1.0/guides/performance/indexing.md %}) for details. + +### Updates Become Deletes and Inserts + +When an update statement is executed on a column that is present in an index, the statement is transformed into a *delete* of the original row followed by an *insert*. +This has certain performance implications, particularly for wide tables, as entire rows are rewritten instead of only the affected columns. + +### Over-Eager Unique Constraint Checking + +Due to the presence of transactions, data can only be removed from the index after (1) the transaction that performed the delete is committed, and (2) no further transactions exist that refer to the old entry still present in the index. As a result of this – transactions that perform *deletions followed by insertions* may trigger unexpected unique constraint violations, as the deleted tuple has not actually been removed from the index yet. 
For example: + +```sql +CREATE TABLE students (id INTEGER, name VARCHAR); +INSERT INTO students VALUES (1, 'John Doe'); +CREATE UNIQUE INDEX students_id ON students (id); + +BEGIN; -- start transaction +DELETE FROM students WHERE id = 1; +INSERT INTO students VALUES (1, 'Jane Doe'); +``` + +The last statement fails with the following error: + +```console +Constraint Error: Duplicate key "id: 1" violates unique constraint. If this is an unexpected constraint violation please double check with the known index limitations section in our documentation (https://duckdb.org/docs/sql/indexes). +``` + +This, combined with the fact that updates are turned into deletions and insertions within the same transaction, means that updating rows in the presence of unique or primary key constraints can often lead to unexpected unique constraint violations. For example, in the following query, `SET id = 1` causes a `Constraint Error` to occur. + +```sql +CREATE TABLE students (id INTEGER PRIMARY KEY, name VARCHAR); +INSERT INTO students VALUES (1, 'John Doe'); + +UPDATE students SET id = 1 WHERE id = 1; +``` + +```console +Constraint Error: Duplicate key "id: 1" violates primary key constraint. +If this is an unexpected constraint violation please double check with the known index limitations section in our documentation (https://duckdb.org/docs/sql/indexes). +``` + +Currently, this is an expected limitation of DuckDB – although we aim to resolve this in the future. \ No newline at end of file diff --git a/docs/archive/1.0/sql/introduction.md b/docs/archive/1.0/sql/introduction.md new file mode 100644 index 00000000000..e5ea14ace7e --- /dev/null +++ b/docs/archive/1.0/sql/introduction.md @@ -0,0 +1,403 @@ +--- +layout: docu +title: SQL Introduction +--- + +Here we provide an overview of how to perform simple operations in SQL. +This tutorial is only intended to give you an introduction and is in no way a complete tutorial on SQL. +This tutorial is adapted from the [PostgreSQL tutorial](https://www.postgresql.org/docs/11/tutorial-sql-intro.html). + +> DuckDB's SQL dialect closely follows the conventions of the PostgreSQL dialect. +> The few exceptions to this are listed on the [PostgreSQL compatibility page]({% link docs/archive/1.0/sql/dialect/postgresql_compatibility.md %}). + +In the examples that follow, we assume that you have installed the DuckDB Command Line Interface (CLI) shell. See the [installation page]({% link docs/archive/1.0/installation/index.html %}?environment=cli) for information on how to install the CLI. + +## Concepts + +DuckDB is a relational database management system (RDBMS). That means it is a system for managing data stored in relations. A relation is essentially a mathematical term for a table. + +Each table is a named collection of rows. Each row of a given table has the same set of named columns, and each column is of a specific data type. Tables themselves are stored inside schemas, and a collection of schemas constitutes the entire database that you can access. + +## Creating a New Table + +You can create a new table by specifying the table name, along with all column names and their types: + +```sql +CREATE TABLE weather ( + city VARCHAR, + temp_lo INTEGER, -- minimum temperature on a day + temp_hi INTEGER, -- maximum temperature on a day + prcp FLOAT, + date DATE +); +``` + +You can enter this into the shell with the line breaks. The command is not terminated until the semicolon. + +White space (i.e., spaces, tabs, and newlines) can be used freely in SQL commands. 
That means you can type the command aligned differently than above, or even all on one line. Two dash characters (`--`) introduce comments. Whatever follows them is ignored up to the end of the line. SQL is case-insensitive about keywords and identifiers. When returning identifiers, [their original cases are preserved]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#rules-for-case-sensitivity). + +In the SQL command, we first specify the type of command that we want to perform: `CREATE TABLE`. After that follows the parameters for the command. First, the table name, `weather`, is given. Then the column names and column types follow. + +`city VARCHAR` specifies that the table has a column called `city` that is of type `VARCHAR`. `VARCHAR` specifies a data type that can store text of arbitrary length. The temperature fields are stored in an `INTEGER` type, a type that stores integer numbers (i.e., whole numbers without a decimal point). `FLOAT` columns store single precision floating-point numbers (i.e., numbers with a decimal point). `DATE` stores a date (i.e., year, month, day combination). `DATE` only stores the specific day, not a time associated with that day. + +DuckDB supports the standard SQL types `INTEGER`, `SMALLINT`, `FLOAT`, `DOUBLE`, `DECIMAL`, `CHAR(n)`, `VARCHAR(n)`, `DATE`, `TIME` and `TIMESTAMP`. + +The second example will store cities and their associated geographical location: + +```sql +CREATE TABLE cities ( + name VARCHAR, + lat DECIMAL, + lon DECIMAL +); +``` + +Finally, it should be mentioned that if you don't need a table any longer or want to recreate it differently you can remove it using the following command: + +```sql +DROP TABLE ⟨tablename⟩; +``` + +## Populating a Table with Rows + +The insert statement is used to populate a table with rows: + +```sql +INSERT INTO weather +VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27'); +``` + +Constants that are not numeric values (e.g., text and dates) must be surrounded by single quotes (`''`), as in the example. Input dates for the date type must be formatted as `'YYYY-MM-DD'`. + +We can insert into the `cities` table in the same manner. + +```sql +INSERT INTO cities +VALUES ('San Francisco', -194.0, 53.0); +``` + +The syntax used so far requires you to remember the order of the columns. An alternative syntax allows you to list the columns explicitly: + +```sql +INSERT INTO weather (city, temp_lo, temp_hi, prcp, date) +VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29'); +``` + +You can list the columns in a different order if you wish or even omit some columns, e.g., if the `prcp` is unknown: + +```sql +INSERT INTO weather (date, city, temp_hi, temp_lo) +VALUES ('1994-11-29', 'Hayward', 54, 37); +``` + +> Tip Many developers consider explicitly listing the columns better style than relying on the order implicitly. + +Please enter all the commands shown above so you have some data to work with in the following sections. + +Alternatively, you can use the `COPY` statement. This is faster for large amounts of data because the `COPY` command is optimized for bulk loading while allowing less flexibility than `INSERT`. An example with [`weather.csv`](/data/weather.csv) would be: + +```sql +COPY weather +FROM 'weather.csv'; +``` + +Where the file name for the source file must be available on the machine running the process. There are many other ways of loading data into DuckDB, see the [corresponding documentation section]({% link docs/archive/1.0/data/overview.md %}) for more information. 
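+As an additional sketch (not part of the original tutorial), a file can also be loaded through a regular query using DuckDB's `read_csv` function, assuming the CSV's columns line up with the table definition:
+
+```sql
+-- Read and auto-detect the CSV, then insert the result like any other query result
+INSERT INTO weather
+SELECT * FROM read_csv('weather.csv');
+```
+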
+ +## Querying a Table + +To retrieve data from a table, the table is queried. A SQL `SELECT` statement is used to do this. The statement is divided into a select list (the part that lists the columns to be returned), a table list (the part that lists the tables from which to retrieve the data), and an optional qualification (the part that specifies any restrictions). For example, to retrieve all the rows of table weather, type: + +```sql +SELECT * +FROM weather; +``` + +Here `*` is a shorthand for “all columns”. So the same result would be had with: + +```sql +SELECT city, temp_lo, temp_hi, prcp, date +FROM weather; +``` + +The output should be: + +| city | temp_lo | temp_hi | prcp | date | +|---------------|---------|---------|------|------------| +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | +| San Francisco | 43 | 57 | 0.0 | 1994-11-29 | +| Hayward | 37 | 54 | NULL | 1994-11-29 | + +You can write expressions, not just simple column references, in the select list. For example, you can do: + +```sql +SELECT city, (temp_hi + temp_lo) / 2 AS temp_avg, date +FROM weather; +``` + +This should give: + +| city | temp_avg | date | +|---------------|----------|------------| +| San Francisco | 48.0 | 1994-11-27 | +| San Francisco | 50.0 | 1994-11-29 | +| Hayward | 45.5 | 1994-11-29 | + +Notice how the `AS` clause is used to relabel the output column. (The `AS` clause is optional.) + +A query can be “qualified” by adding a `WHERE` clause that specifies which rows are wanted. The `WHERE` clause contains a Boolean (truth value) expression, and only rows for which the Boolean expression is true are returned. The usual Boolean operators (`AND`, `OR`, and `NOT`) are allowed in the qualification. For example, the following retrieves the weather of San Francisco on rainy days: + +```sql +SELECT * +FROM weather +WHERE city = 'San Francisco' + AND prcp > 0.0; +``` + +Result: + +| city | temp_lo | temp_hi | prcp | date | +|---------------|---------|---------|------|------------| +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | + +You can request that the results of a query be returned in sorted order: + +```sql +SELECT * +FROM weather +ORDER BY city; +``` + +| city | temp_lo | temp_hi | prcp | date | +|---------------|---------|---------|------|------------| +| Hayward | 37 | 54 | NULL | 1994-11-29 | +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | +| San Francisco | 43 | 57 | 0.0 | 1994-11-29 | + +In this example, the sort order isn't fully specified, and so you might get the San Francisco rows in either order. But you'd always get the results shown above if you do: + +```sql +SELECT * +FROM weather +ORDER BY city, temp_lo; +``` + +You can request that duplicate rows be removed from the result of a query: + +```sql +SELECT DISTINCT city +FROM weather; +``` + +| city | +|---------------| +| San Francisco | +| Hayward | + +Here again, the result row ordering might vary. You can ensure consistent results by using `DISTINCT` and `ORDER BY` together: + +```sql +SELECT DISTINCT city +FROM weather +ORDER BY city; +``` + +## Joins between Tables + +Thus far, our queries have only accessed one table at a time. Queries can access multiple tables at once, or access the same table in such a way that multiple rows of the table are being processed at the same time. A query that accesses multiple rows of the same or different tables at one time is called a join query. As an example, say you wish to list all the weather records together with the location of the associated city. 
To do that, we need to compare the city column of each row of the `weather` table with the name column of all rows in the `cities` table, and select the pairs of rows where these values match. + +This would be accomplished by the following query: + +```sql +SELECT * +FROM weather, cities +WHERE city = name; +``` + +| city | temp_lo | temp_hi | prcp | date | name | lat | lon | +|---------------|---------|---------|------|------------|---------------|----------|--------| +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | -194.000 | 53.000 | +| San Francisco | 43 | 57 | 0.0 | 1994-11-29 | San Francisco | -194.000 | 53.000 | + +Observe two things about the result set: + +* There is no result row for the city of Hayward. This is because there is no matching entry in the `cities` table for Hayward, so the join ignores the unmatched rows in the `weather` table. We will see shortly how this can be fixed. +* There are two columns containing the city name. This is correct because the lists of columns from the `weather` and `cities` tables are concatenated. In practice this is undesirable, though, so you will probably want to list the output columns explicitly rather than using `*`: + +```sql +SELECT city, temp_lo, temp_hi, prcp, date, lon, lat +FROM weather, cities +WHERE city = name; +``` + +| city | temp_lo | temp_hi | prcp | date | lon | lat | +|---------------|---------|---------|------|------------|--------|----------| +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | 53.000 | -194.000 | +| San Francisco | 43 | 57 | 0.0 | 1994-11-29 | 53.000 | -194.000 | + +Since the columns all had different names, the parser automatically found which table they belong to. If there were duplicate column names in the two tables you'd need to qualify the column names to show which one you meant, as in: + +```sql +SELECT weather.city, weather.temp_lo, weather.temp_hi, + weather.prcp, weather.date, cities.lon, cities.lat +FROM weather, cities +WHERE cities.name = weather.city; +``` + +It is widely considered good style to qualify all column names in a join query, so that the query won't fail if a duplicate column name is later added to one of the tables. + +Join queries of the kind seen thus far can also be written in this alternative form: + +```sql +SELECT * +FROM weather +INNER JOIN cities ON weather.city = cities.name; +``` + +This syntax is not as commonly used as the one above, but we show it here to help you understand the following topics. + +Now we will figure out how we can get the Hayward records back in. What we want the query to do is to scan the `weather` table and for each row to find the matching cities row(s). If no matching row is found we want some “empty values” to be substituted for the `cities` table's columns. This kind of query is called an outer join. (The joins we have seen so far are inner joins.) 
The command looks like this: + +```sql +SELECT * +FROM weather +LEFT OUTER JOIN cities ON weather.city = cities.name; +``` + +| city | temp_lo | temp_hi | prcp | date | name | lat | lon | +|---------------|---------|---------|------|------------|---------------|----------|--------| +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | -194.000 | 53.000 | +| San Francisco | 43 | 57 | 0.0 | 1994-11-29 | San Francisco | -194.000 | 53.000 | +| Hayward | 37 | 54 | NULL | 1994-11-29 | NULL | NULL | NULL | + +This query is called a left outer join because the table mentioned on the left of the join operator will have each of its rows in the output at least once, whereas the table on the right will only have those rows output that match some row of the left table. When outputting a left-table row for which there is no right-table match, empty (null) values are substituted for the right-table columns. + +## Aggregate Functions + +Like most other relational database products, DuckDB supports aggregate functions. An aggregate function computes a single result from multiple input rows. For example, there are aggregates to compute the `count`, `sum`, `avg` (average), `max` (maximum) and `min` (minimum) over a set of rows. + +As an example, we can find the highest low-temperature reading anywhere with: + +```sql +SELECT max(temp_lo) +FROM weather; +``` + +| max(temp_lo) | +|--------------| +| 46 | + +If we wanted to know what city (or cities) that reading occurred in, we might try: + +```sql +SELECT city +FROM weather +WHERE temp_lo = max(temp_lo); -- WRONG +``` + +but this will not work since the aggregate max cannot be used in the `WHERE` clause. (This restriction exists because the `WHERE` clause determines which rows will be included in the aggregate calculation; so obviously it has to be evaluated before aggregate functions are computed.) However, as is often the case the query can be restated to accomplish the desired result, here by using a subquery: + +```sql +SELECT city +FROM weather +WHERE temp_lo = (SELECT max(temp_lo) FROM weather); +``` + +| city | +|---------------| +| San Francisco | + +This is OK because the subquery is an independent computation that computes its own aggregate separately from what is happening in the outer query. + +Aggregates are also very useful in combination with `GROUP BY` clauses. For example, we can get the maximum low temperature observed in each city with: + +```sql +SELECT city, max(temp_lo) +FROM weather +GROUP BY city; +``` + +| city | max(temp_lo) | +|---------------|--------------| +| San Francisco | 46 | +| Hayward | 37 | + +Which gives us one output row per city. Each aggregate result is computed over the table rows matching that city. We can filter these grouped rows using `HAVING`: + +```sql +SELECT city, max(temp_lo) +FROM weather +GROUP BY city +HAVING max(temp_lo) < 40; +``` + +| city | max(temp_lo) | +|---------|--------------| +| Hayward | 37 | + +which gives us the same results for only the cities that have all `temp_lo` values below 40. Finally, if we only care about cities whose names begin with `S`, we can use the `LIKE` operator: + +```sql +SELECT city, max(temp_lo) +FROM weather +WHERE city LIKE 'S%' -- (1) +GROUP BY city +HAVING max(temp_lo) < 40; +``` + +More information about the `LIKE` operator can be found in the [pattern matching page]({% link docs/archive/1.0/sql/functions/pattern_matching.md %}). + +It is important to understand the interaction between aggregates and SQL's `WHERE` and `HAVING` clauses. 
The fundamental difference between `WHERE` and `HAVING` is this: `WHERE` selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas `HAVING` selects group rows after groups and aggregates are computed. Thus, the `WHERE` clause must not contain aggregate functions; it makes no sense to try to use an aggregate to determine which rows will be inputs to the aggregates. On the other hand, the `HAVING` clause always contains aggregate functions. + +In the previous example, we can apply the city name restriction in `WHERE`, since it needs no aggregate. This is more efficient than adding the restriction to `HAVING`, because we avoid doing the grouping and aggregate calculations for all rows that fail the `WHERE` check. + +## Updates + +You can update existing rows using the `UPDATE` command. Suppose you discover the temperature readings are all off by 2 degrees after November 28. You can correct the data as follows: + +```sql +UPDATE weather +SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2 +WHERE date > '1994-11-28'; +``` + +Look at the new state of the data: + +```sql +SELECT * +FROM weather; +``` + +| city | temp_lo | temp_hi | prcp | date | +|---------------|---------|---------|------|------------| +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | +| San Francisco | 43 | 57 | 0.0 | 1994-11-29 | +| Hayward | 37 | 54 | NULL | 1994-11-29 | + +## Deletions + +Rows can be removed from a table using the `DELETE` command. Suppose you are no longer interested in the weather of Hayward. Then you can do the following to delete those rows from the table: + +```sql +DELETE FROM weather +WHERE city = 'Hayward'; +``` + +All weather records belonging to Hayward are removed. + +```sql +SELECT * +FROM weather; +``` + +| city | temp_lo | temp_hi | prcp | date | +|---------------|---------|---------|------|------------| +| San Francisco | 46 | 50 | 0.25 | 1994-11-27 | +| San Francisco | 43 | 57 | 0.0 | 1994-11-29 | + +One should be cautious when issuing statements of the following form: + +```sql +DELETE FROM ⟨table_name⟩; +``` + +> Warning Without a qualification, `DELETE` will remove all rows from the given table, leaving it empty. The system will not request confirmation before doing this. \ No newline at end of file diff --git a/docs/archive/1.0/sql/meta/duckdb_table_functions.md b/docs/archive/1.0/sql/meta/duckdb_table_functions.md new file mode 100644 index 00000000000..2711db23337 --- /dev/null +++ b/docs/archive/1.0/sql/meta/duckdb_table_functions.md @@ -0,0 +1,340 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/sql/duckdb_table_functions +title: DuckDB_% Metadata Functions +--- + +DuckDB offers a collection of table functions that provide metadata about the current database. These functions reside in the `main` schema and their names are prefixed with `duckdb_`. + +The resultset returned by a `duckdb_` table function may be used just like an ordinary table or view. For example, you can use a `duckdb_` function call in the `FROM` clause of a `SELECT` statement, and you may refer to the columns of its returned resultset elsewhere in the statement, for example in the `WHERE` clause. 
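+For example, the columns returned by `duckdb_columns()` (described below) can be projected and filtered like those of an ordinary table (a brief sketch):
+
+```sql
+-- List all VARCHAR columns known to the current DuckDB instance
+SELECT table_name, column_name, data_type
+FROM duckdb_columns()
+WHERE data_type = 'VARCHAR';
+```
+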
+ +Table functions are still functions, and you should write parenthesis after the function name to call it to obtain its returned resultset: + +```sql +SELECT * FROM duckdb_settings(); +``` + +Alternatively, you may execute table functions also using the `CALL`-syntax: + +```sql +CALL duckdb_settings(); +``` + +In this case too, the parentheses are mandatory. + +> For some of the `duckdb_%` functions, there is also an identically named view available, which also resides in the `main` schema. Typically, these views do a `SELECT` on the `duckdb_` table function with the same name, while filtering out those objects that are marked as internal. We mention it here, because if you accidentally omit the parentheses in your `duckdb_` table function call, you might still get a result, but from the identically named view. + +Example: + +The `duckdb_views()` _table function_ returns all views, including those marked internal: + +```sql +SELECT * FROM duckdb_views(); +``` + +The `duckdb_views` _view_ returns views that are not marked as internal: + +```sql +SELECT * FROM duckdb_views; +``` + +## `duckdb_columns` + +The `duckdb_columns()` function provides metadata about the columns available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database that contains the column object. | `VARCHAR` | +| `database_oid` | Internal identifier of the database that contains the column object. | `BIGINT` | +| `schema_name` | The SQL name of the schema that contains the table object that defines this column. | `VARCHAR` | +| `schema_oid` | Internal identifier of the schema object that contains the table of the column. | `BIGINT` | +| `table_name` | The SQL name of the table that defines the column. | `VARCHAR` | +| `table_oid` | Internal identifier (name) of the table object that defines the column. | `BIGINT` | +| `column_name` | The SQL name of the column. | `VARCHAR` | +| `column_index` | The unique position of the column within its table. | `INTEGER` | +| `internal` | `true` if this column built-in, `false` if it is user-defined. | `BOOLEAN` | +| `column_default` | The default value of the column (expressed in SQL)| `VARCHAR` | +| `is_nullable` | `true` if the column can hold `NULL` values; `false` if the column cannot hold `NULL`-values. | `BOOLEAN` | +| `data_type` | The name of the column datatype. | `VARCHAR` | +| `data_type_id` | The internal identifier of the column data type. | `BIGINT` | +| `character_maximum_length` | Always `NULL`. DuckDB [text types]({% link docs/archive/1.0/sql/data_types/text.md %}) do not enforce a value length restriction based on a length type parameter. | `INTEGER` | +| `numeric_precision` | The number of units (in the base indicated by `numeric_precision_radix`) used for storing column values. For integral and approximate numeric types, this is the number of bits. For decimal types, this is the number of digits positions. | `INTEGER` | +| `numeric_precision_radix` | The number-base of the units in the `numeric_precision` column. For integral and approximate numeric types, this is `2`, indicating the precision is expressed as a number of bits. For the `decimal` type this is `10`, indicating the precision is expressed as a number of decimal positions. | `INTEGER` | +| `numeric_scale` | Applicable to `decimal` type. Indicates the maximum number of fractional digits (i.e., the number of digits that may appear after the decimal separator). 
| `INTEGER` | + +The [`information_schema.columns`]({% link docs/archive/1.0/sql/meta/information_schema.md %}#columns-columns) system view provides a more standardized way to obtain metadata about database columns, but the `duckdb_columns` function also returns metadata about DuckDB internal objects. (In fact, `information_schema.columns` is implemented as a query on top of `duckdb_columns()`) + +## `duckdb_constraints` + +The `duckdb_constraints()` function provides metadata about the constraints available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database that contains the constraint. | `VARCHAR` | +| `database_oid` | Internal identifier of the database that contains the constraint. | `BIGINT` | +| `schema_name` | The SQL name of the schema that contains the table on which the constraint is defined. | `VARCHAR` | +| `schema_oid` | Internal identifier of the schema object that contains the table on which the constraint is defined. | `BIGINT` | +| `table_name` | The SQL name of the table on which the constraint is defined. | `VARCHAR` | +| `table_oid` | Internal identifier (name) of the table object on which the constraint is defined. | `BIGINT` | +| `constraint_index` | Indicates the position of the constraint as it appears in its table definition. | `BIGINT` | +| `constraint_type` | Indicates the type of constraint. Applicable values are `CHECK`, `FOREIGN KEY`, `PRIMARY KEY`, `NOT NULL`, `UNIQUE`. | `VARCHAR` | +| `constraint_text` | The definition of the constraint expressed as a SQL-phrase. (Not necessarily a complete or syntactically valid DDL-statement.)| `VARCHAR` | +| `expression` | If constraint is a check constraint, the definition of the condition being checked, otherwise `NULL`. | `VARCHAR` | +| `constraint_column_indexes` | An array of table column indexes referring to the columns that appear in the constraint definition. | `BIGINT[]` | +| `constraint_column_names` | An array of table column names appearing in the constraint definition. | `VARCHAR[]` | + +The [`information_schema.referential_constraints`]({% link docs/archive/1.0/sql/meta/information_schema.md %}#referential_constraints-referential-constraints) and [`information_schema.table_constraints`]({% link docs/archive/1.0/sql/meta/information_schema.md %}#table_constraints-table-constraints) system views provide a more standardized way to obtain metadata about constraints, but the `duckdb_constraints` function also returns metadata about DuckDB internal objects. (In fact, `information_schema.referential_constraints` and `information_schema.table_constraints` are implemented as a query on top of `duckdb_constraints()`) + +## `duckdb_databases` + +The `duckdb_databases()` function lists the databases that are accessible from within the current DuckDB process. +Apart from the database associated at startup, the list also includes databases that were [attached]({% link docs/archive/1.0/sql/statements/attach.md %}) later on to the DuckDB process + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database, or the alias if the database was attached using an ALIAS-clause. | `VARCHAR` | +| `database_oid` | The internal identifier of the database. | `VARCHAR` | +| `path` | The file path associated with the database. | `VARCHAR` | +| `internal` | `true` indicates a system or built-in database. False indicates a user-defined database. 
| `BOOLEAN` | +| `type` | The type indicates the type of RDBMS implemented by the attached database. For DuckDB databases, that value is `duckdb`. | `VARCHAR` | + +## `duckdb_dependencies` + +The `duckdb_dependencies()` function provides metadata about the dependencies available in the DuckDB instance. + +| Column | Description | Type | +|:--|:------|:-| +| `classid` | Always 0| `BIGINT` | +| `objid` | The internal id of the object. | `BIGINT` | +| `objsubid` | Always 0| `INTEGER` | +| `refclassid` | Always 0| `BIGINT` | +| `refobjid` | The internal id of the dependent object. | `BIGINT` | +| `refobjsubid` | Always 0| `INTEGER` | +| `deptype` | The type of dependency. Either regular (n) or automatic (a). | `VARCHAR` | + +## `duckdb_extensions` + +The `duckdb_extensions()` function provides metadata about the extensions available in the DuckDB instance. + +| Column | Description | Type | +|:--|:------|:-| +| `extension_name` | The name of the extension. | `VARCHAR` | +| `loaded` | `true` if the extension is loaded, `false` if it's not loaded. | `BOOLEAN` | +| `installed` | `true` if the extension is installed, `false` if it's not installed. | `BOOLEAN` | +| `install_path` | `(BUILT-IN)` if the extension is built-in, otherwise, the filesystem path where binary that implements the extension resides. | `VARCHAR` | +| `description` | Human readable text that describes the extension's functionality. | `VARCHAR` | +| `aliases` | List of alternative names for this extension. | `VARCHAR[]` | + +## `duckdb_functions` + +The `duckdb_functions()` function provides metadata about the functions (including macros) available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database that contains this function. | `VARCHAR` | +| `schema_name` | The SQL name of the schema where the function resides. | `VARCHAR` | +| `function_name` | The SQL name of the function. | `VARCHAR` | +| `function_type` | The function kind. Value is one of: `table`,`scalar`,`aggregate`,`pragma`,`macro`| `VARCHAR` | +| `description` | Description of this function (always `NULL`)| `VARCHAR` | +| `return_type` | The logical data type name of the returned value. Applicable for scalar and aggregate functions. | `VARCHAR` | +| `parameters` | If the function has parameters, the list of parameter names. | `VARCHAR[]` | +| `parameter_types` | If the function has parameters, a list of logical data type names corresponding to the parameter list. | `VARCHAR[]` | +| `varargs` | The name of the data type in case the function has a variable number of arguments, or `NULL` if the function does not have a variable number of arguments. | `VARCHAR` | +| `macro_definition` | If this is a [macro]({% link docs/archive/1.0/sql/statements/create_macro.md %}), the SQL expression that defines it. | `VARCHAR` | +| `has_side_effects` | `false` if this is a pure function. `true` if this function changes the database state (like sequence functions `nextval()` and `curval()`). | `BOOLEAN` | +| `function_oid` | The internal identifier for this function | `BIGINT` | + +## `duckdb_indexes` + +The `duckdb_indexes()` function provides metadata about secondary indexes available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database that contains this index. | `VARCHAR` | +| `database_oid` | Internal identifier of the database containing the index. 
| `BIGINT` | +| `schema_name` | The SQL name of the schema that contains the table with the secondary index. | `VARCHAR` | +| `schema_oid` | Internal identifier of the schema object. | `BIGINT` | +| `index_name` | The SQL name of this secondary index. | `VARCHAR` | +| `index_oid` | The object identifier of this index. | `BIGINT` | +| `table_name` | The name of the table with the index. | `VARCHAR` | +| `table_oid` | Internal identifier (name) of the table object. | `BIGINT` | +| `is_unique` | `true` if the index was created with the `UNIQUE` modifier, `false` if it was not. | `BOOLEAN` | +| `is_primary` | Always `false`| `BOOLEAN` | +| `expressions` | Always `NULL`| `VARCHAR` | +| `sql` | The definition of the index, expressed as a `CREATE INDEX` SQL statement. | `VARCHAR` | + +Note that `duckdb_indexes` only provides metadata about secondary indexes – i.e., those indexes created by explicit [`CREATE INDEX`]({% link docs/archive/1.0/sql/indexes.md %}#create-index) statements. Primary keys, foreign keys, and `UNIQUE` constraints are maintained using indexes, but their details are included in the `duckdb_constraints()` function. + +## `duckdb_keywords` + +The `duckdb_keywords()` function provides metadata about DuckDB's keywords and reserved words. + +
+ +| Column | Description | Type | +|:-|:---|:-| +| `keyword_name` | The keyword. | `VARCHAR` | +| `keyword_category` | Indicates the category of the keyword. Values are `column_name`, `reserved`, `type_function` and `unreserved`. | `VARCHAR` | + +## `duckdb_memory` + +The `duckdb_memory()` function provides metadata about DuckDB's buffer manager. + +| Column | Description | Type | +|:-|:---|:-| +| `tag` | The memory tag. It has one of the following values: `BASE_TABLE`, `HASH_TABLE`, `PARQUET_READER`, `CSV_READER`, `ORDER_BY`, `ART_INDEX`, `COLUMN_DATA`, `METADATA`, `OVERFLOW_STRINGS`, `IN_MEMORY_TABLE`, `ALLOCATOR`, `EXTENSION`. | `VARCHAR` | +| `memory_usage_bytes` | The memory used (in bytes). | `BIGINT` | +| `temporary_storage_bytes` | The disk storage used (in bytes). | `BIGINT` | + +## `duckdb_optimizers` + +The `duckdb_optimizers()` function provides metadata about the optimization rules (e.g., `expression_rewriter`, `filter_pushdown`) available in the DuckDB instance. +These can be selectively turned off using [`PRAGMA disabled_optimizers`]({% link docs/archive/1.0/configuration/pragmas.md %}#selectively-disabling-optimizers). + +
| Column | Description | Type |
|:-|:---|:-|
| `name` | The name of the optimization rule. | `VARCHAR` |

## `duckdb_schemas`

The `duckdb_schemas()` function provides metadata about the schemas available in the DuckDB instance.

| Column | Description | Type |
|:-|:---|:-|
| `oid` | Internal identifier of the schema object. | `BIGINT` |
| `database_name` | The name of the database that contains this schema. | `VARCHAR` |
| `database_oid` | Internal identifier of the database containing the schema. | `BIGINT` |
| `schema_name` | The SQL name of the schema. | `VARCHAR` |
| `internal` | `true` if this is an internal (built-in) schema, `false` if this is a user-defined schema. | `BOOLEAN` |
| `sql` | Always `NULL`. | `VARCHAR` |

The [`information_schema.schemata`]({% link docs/archive/1.0/sql/meta/information_schema.md %}#schemata-database-catalog-and-schema) system view provides a more standardized way to obtain metadata about database schemas.

## `duckdb_secrets`

The `duckdb_secrets()` function provides metadata about the secrets available in the DuckDB instance.

| Column | Description | Type |
|:-|:---|:-|
| `name` | The name of the secret. | `VARCHAR` |
| `type` | The type of the secret, e.g., `S3`, `GCS`, `R2`, `AZURE`. | `VARCHAR` |
| `provider` | The provider of the secret. | `VARCHAR` |
| `persistent` | Denotes whether the secret is persistent. | `BOOLEAN` |
| `storage` | The backend for storing the secret. | `VARCHAR` |
| `scope` | The scope of the secret. | `VARCHAR[]` |
| `secret_string` | Returns the content of the secret as a string. Sensitive pieces of information, e.g., the access key, are redacted. | `VARCHAR` |

## `duckdb_sequences`

The `duckdb_sequences()` function provides metadata about the sequences available in the DuckDB instance.

| Column | Description | Type |
|:-|:---|:-|
| `database_name` | The name of the database that contains this sequence. | `VARCHAR` |
| `database_oid` | Internal identifier of the database containing the sequence. | `BIGINT` |
| `schema_name` | The SQL name of the schema that contains the sequence object. | `VARCHAR` |
| `schema_oid` | Internal identifier of the schema object that contains the sequence object. | `BIGINT` |
| `sequence_name` | The SQL name that identifies the sequence within the schema. | `VARCHAR` |
| `sequence_oid` | The internal identifier of this sequence object. | `BIGINT` |
| `temporary` | Whether this sequence is temporary. Temporary sequences are transient and only visible within the current connection. | `BOOLEAN` |
| `start_value` | The initial value of the sequence. This value will be returned when `nextval()` is called for the very first time on this sequence. | `BIGINT` |
| `min_value` | The minimum value of the sequence. | `BIGINT` |
| `max_value` | The maximum value of the sequence. | `BIGINT` |
| `increment_by` | The value that is added to the current value of the sequence to draw the next value from the sequence. | `BIGINT` |
| `cycle` | Whether the sequence should start over when drawing the next value would result in a value outside the range. | `BOOLEAN` |
| `last_value` | `null` if no value was ever drawn from the sequence using `nextval(...)`. `1` if a value was drawn. | `BIGINT` |
| `sql` | The definition of this object, expressed as SQL DDL-statement. | `VARCHAR` |

Attributes like `temporary`, `start_value`, etc. correspond to the various options available in the [`CREATE SEQUENCE`]({% link docs/archive/1.0/sql/statements/create_sequence.md %}) statement and are documented there in full. Note that the attributes will always be filled out in the `duckdb_sequences` resultset, even if they were not explicitly specified in the `CREATE SEQUENCE` statement.

> 1. The column name `last_value` suggests that it contains the last value that was drawn from the sequence, but that is not the case. It is `NULL` if no value has ever been drawn from the sequence, and `1` otherwise.
>
> 2. If the sequence cycles, then the sequence will start over from the boundary of its range, not necessarily from the value specified as the start value.

## `duckdb_settings`

The `duckdb_settings()` function provides metadata about the settings available in the DuckDB instance.
+ +| Column | Description | Type | +|:-|:---|:-| +| `name` | Name of the setting. | `VARCHAR` | +| `value` | Current value of the setting. | `VARCHAR` | +| `description` | A description of the setting. | `VARCHAR` | +| `input_type` | The logical datatype of the setting's value. | `VARCHAR` | + +The various settings are described in the [configuration page]({% link docs/archive/1.0/configuration/overview.md %}). + +## `duckdb_tables` + +The `duckdb_tables()` function provides metadata about the base tables available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database that contains this table | `VARCHAR` | +| `database_oid` | Internal identifier of the database containing the table. | `BIGINT` | +| `schema_name` | The SQL name of the schema that contains the base table. | `VARCHAR` | +| `schema_oid` | Internal identifier of the schema object that contains the base table. | `BIGINT` | +| `table_name` | The SQL name of the base table. | `VARCHAR` | +| `table_oid` | Internal identifier of the base table object. | `BIGINT` | +| `internal` | `false` if this is a user-defined table. | `BOOLEAN` | +| `temporary` | Whether this is a temporary table. Temporary tables are not persisted and only visible within the current connection. | `BOOLEAN` | +| `has_primary_key` | `true` if this table object defines a `PRIMARY KEY`. | `BOOLEAN` | +| `estimated_size` | The estimated number of rows in the table. | `BIGINT` | +| `column_count` | The number of columns defined by this object. | `BIGINT` | +| `index_count` | The number of indexes associated with this table. This number includes all secondary indexes, as well as internal indexes generated to maintain `PRIMARY KEY` and/or `UNIQUE` constraints. | `BIGINT` | +| `check_constraint_count` | The number of check constraints active on columns within the table. | `BIGINT` | +| `sql` | The definition of this object, expressed as SQL [`CREATE TABLE`-statement]({% link docs/archive/1.0/sql/statements/create_table.md %}). | `VARCHAR` | + +The [`information_schema.tables`]({% link docs/archive/1.0/sql/meta/information_schema.md %}#tables-tables-and-views) system view provides a more standardized way to obtain metadata about database tables that also includes views. But the resultset returned by `duckdb_tables` contains a few columns that are not included in `information_schema.tables`. + +## `duckdb_temporary_files` + +The `duckdb_temporary_files()` function provides metadata about the temporary files DuckDB has written to disk, to offload data from memory. This function mostly exists for debugging and testing purposes. + +
+ +| Column | Description | Type | +|:-|:---|:-| +| `path` | The name of the temporary file. | `VARCHAR` | +| `size` | The size in bytes of the temporary file. | `BIGINT` | + +## `duckdb_types` + +The `duckdb_types()` function provides metadata about the data types available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database that contains this schema. | `VARCHAR` | +| `database_oid` | Internal identifier of the database that contains the data type. | `BIGINT` | +| `schema_name` | The SQL name of the schema containing the type definition. Always `main`. | `VARCHAR` | +| `schema_oid` | Internal identifier of the schema object. | `BIGINT` | +| `type_name` | The name or alias of this data type. | `VARCHAR` | +| `type_oid` | The internal identifier of the data type object. If `NULL`, then this is an alias of the type (as identified by the value in the `logical_type` column). | `BIGINT` | +| `type_size` | The number of bytes required to represent a value of this type in memory. | `BIGINT` | +| `logical_type` | The 'canonical' name of this data type. The same `logical_type` may be referenced by several types having different `type_name`s. | `VARCHAR` | +| `type_category` | The category to which this type belongs. Data types within the same category generally expose similar behavior when values of this type are used in expression. For example, the `NUMERIC` type_category includes integers, decimals, and floating point numbers. | `VARCHAR` | +| `internal` | Whether this is an internal (built-in) or a user object. | `BOOLEAN` | + +## `duckdb_views` + +The `duckdb_views()` function provides metadata about the views available in the DuckDB instance. + +| Column | Description | Type | +|:-|:---|:-| +| `database_name` | The name of the database that contains this view. | `VARCHAR` | +| `database_oid` | Internal identifier of the database that contains this view. | `BIGINT` | +| `schema_name` | The SQL name of the schema where the view resides. | `VARCHAR` | +| `schema_oid` | Internal identifier of the schema object that contains the view. | `BIGINT` | +| `view_name` | The SQL name of the view object. | `VARCHAR` | +| `view_oid` | The internal identifier of this view object. | `BIGINT` | +| `internal` | `true` if this is an internal (built-in) view, `false` if this is a user-defined view. | `BOOLEAN` | +| `temporary` | `true` if this is a temporary view. Temporary views are not persistent and are only visible within the current connection. | `BOOLEAN` | +| `column_count` | The number of columns defined by this view object. | `BIGINT` | +| `sql` | The definition of this object, expressed as SQL DDL-statement. | `VARCHAR` | + +The [`information_schema.tables`]({% link docs/archive/1.0/sql/meta/information_schema.md %}#tables-tables-and-views) system view provides a more standardized way to obtain metadata about database views that also includes base tables. But the resultset returned by `duckdb_views` contains also definitions of internal view objects as well as a few columns that are not included in `information_schema.tables`. 
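
Like any other table function, these metadata functions can be queried, filtered, and joined with plain SQL. As a small usage sketch (the output naturally depends on the contents of your own catalog):

```sql
-- List user-defined (non-internal) views together with their column counts
SELECT database_name, schema_name, view_name, column_count
FROM duckdb_views()
WHERE NOT internal
ORDER BY database_name, schema_name, view_name;

-- Combine duckdb_tables() and duckdb_columns() to list the columns of every base table
SELECT table_name, column_name, data_type
FROM duckdb_tables() t
JOIN duckdb_columns() c USING (database_name, schema_name, table_name)
ORDER BY table_name, column_name;
```
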
\ No newline at end of file diff --git a/docs/archive/1.0/sql/meta/information_schema.md b/docs/archive/1.0/sql/meta/information_schema.md new file mode 100644 index 00000000000..026b1c813a3 --- /dev/null +++ b/docs/archive/1.0/sql/meta/information_schema.md @@ -0,0 +1,131 @@ +--- +layout: docu +redirect_from: +- docs/archive/1.0/sql/information_schema +title: Information Schema +--- + +The views in the `information_schema` are SQL-standard views that describe the catalog entries of the database. These views can be filtered to obtain information about a specific column or table. +DuckDB's implementation is based on [PostgreSQL's information schema](https://www.postgresql.org/docs/16/infoschema-columns.html). + +## Tables + +### `character_sets`: Character Sets + +| Column | Description | Type | Example | +|--------|-------------|------|---------| +| `character_set_catalog` | Currently not implemented – always `NULL`. | `VARCHAR` | `NULL` | +| `character_set_schema` | Currently not implemented – always `NULL`. | `VARCHAR` | `NULL` | +| `character_set_name` | Name of the character set, currently implemented as showing the name of the database encoding. | `VARCHAR` | `'UTF8'` | +| `character_repertoire` | Character repertoire, showing `UCS` if the encoding is `UTF8`, else just the encoding name. | `VARCHAR` | `'UCS'` | +| `form_of_use` | Character encoding form, same as the database encoding. | `VARCHAR` | `'UTF8'` | +| `default_collate_catalog`| Name of the database containing the default collation (always the current database). | `VARCHAR` | `'my_db'` | +| `default_collate_schema` | Name of the schema containing the default collation. | `VARCHAR` | `'pg_catalog'` | +| `default_collate_name` | Name of the default collation. | `VARCHAR` | `'ucs_basic'` | + +### `columns`: Columns + +The view that describes the catalog information for columns is `information_schema.columns`. It lists the column present in the database and has the following layout: + +| Column | Description | Type | Example | +|:--|:---|:-|:-| +| `table_catalog` | Name of the database containing the table (always the current database). | `VARCHAR` | `'my_db'` | +| `table_schema` | Name of the schema containing the table. | `VARCHAR` | `'main'` | +| `table_name` | Name of the table. | `VARCHAR` | `'widgets'` | +| `column_name` | Name of the column. | `VARCHAR` | `'price'` | +| `ordinal_position` | Ordinal position of the column within the table (count starts at 1). | `INTEGER` | `5` | +| `column_default` | Default expression of the column. |`VARCHAR`| `1.99` | +| `is_nullable` | `YES` if the column is possibly nullable, `NO` if it is known not nullable. |`VARCHAR`| `'YES'` | +| `data_type` | Data type of the column. |`VARCHAR`| `'DECIMAL(18, 2)'` | +| `character_maximum_length` | If `data_type` identifies a character or bit string type, the declared maximum length; null for all other data types or if no maximum length was declared. |`INTEGER`| `255` | +| `character_octet_length` | If `data_type` identifies a character type, the maximum possible length in octets (bytes) of a datum; null for all other data types. The maximum octet length depends on the declared character maximum length (see above) and the character encoding. |`INTEGER`| `1073741824` | +| `numeric_precision` | If `data_type` identifies a numeric type, this column contains the (declared or implicit) precision of the type for this column. The precision indicates the number of significant digits. For all other data types, this column is null. 
|`INTEGER`| `18` | +| `numeric_scale` | If `data_type` identifies a numeric type, this column contains the (declared or implicit) scale of the type for this column. The precision indicates the number of significant digits. For all other data types, this column is null. |`INTEGER`| `2` | +| `datetime_precision` | If `data_type` identifies a date, time, timestamp, or interval type, this column contains the (declared or implicit) fractional seconds precision of the type for this column, that is, the number of decimal digits maintained following the decimal point in the seconds value. No fractional seconds are currently supported in DuckDB. For all other data types, this column is null. |`INTEGER`| `0` | + +### `key_column_usage`: Key Column Usage + +| Column | Description | Type | Example | +|--------|-------------|------|---------| +| `constraint_catalog` | Name of the database that contains the constraint (always the current database). | `VARCHAR` | `'my_db'` | +| `constraint_schema` | Name of the schema that contains the constraint. | `VARCHAR` | `'main'` | +| `constraint_name` | Name of the constraint. | `VARCHAR` | `'exams_exam_id_fkey'` | +| `table_catalog` | Name of the database that contains the table that contains the column that is restricted by this constraint (always the current database). | `VARCHAR` | `'my_db'` | +| `table_schema` | Name of the schema that contains the table that contains the column that is restricted by this constraint. | `VARCHAR` | `'main'` | +| `table_name` | Name of the table that contains the column that is restricted by this constraint. | `VARCHAR` | `'exams'` | +| `column_name` | Name of the column that is restricted by this constraint. | `VARCHAR` | `'exam_id'` | +| `ordinal_position` | Ordinal position of the column within the constraint key (count starts at 1). | `INTEGER` | `1` | +| `position_in_unique_constraint` | For a foreign-key constraint, ordinal position of the referenced column within its unique constraint (count starts at `1`); otherwise `NULL`. | `INTEGER` | `1` | + +### `referential_constraints`: Referential Constraints + +| Column | Description | Type | Example | +|--------|-------------|------|---------| +| `constraint_catalog` | Name of the database containing the constraint (always the current database). | `VARCHAR` | `'my_db'` | +| `constraint_schema` | Name of the schema containing the constraint. | `VARCHAR` | `main` | +| `constraint_name` | Name of the constraint. | `VARCHAR` | `exam_id_students_id_fkey` | +| `unique_constraint_catalog` | Name of the database that contains the unique or primary key constraint that the foreign key constraint references. | `VARCHAR` | `'my_db'` | +| `unique_constraint_schema` | Name of the schema that contains the unique or primary key constraint that the foreign key constraint references. | `VARCHAR` | `'main'` | +| `unique_constraint_name` | Name of the unique or primary key constraint that the foreign key constraint references. | `VARCHAR` | `'students_id_pkey'` | +| `match_option` | Match option of the foreign key constraint. Always `NONE`. | `VARCHAR` | `NONE` | +| `update_rule` | Update rule of the foreign key constraint. Always `NO ACTION`. | `VARCHAR` | `NO ACTION` | +| `delete_rule` | Delete rule of the foreign key constraint. Always `NO ACTION`. | `VARCHAR` | `NO ACTION` | + +### `schemata`: Database, Catalog and Schema + +The top level catalog view is `information_schema.schemata`. 
It lists the catalogs and the schemas present in the database and has the following layout: + +| Column | Description | Type | Example | +|:--|:---|:-|:-| +| `catalog_name` | Name of the database that the schema is contained in. | `VARCHAR` | `'my_db'` | +| `schema_name` | Name of the schema. | `VARCHAR` | `'main'` | +| `schema_owner` | Name of the owner of the schema. Not yet implemented. | `VARCHAR` | `'duckdb'` | +| `default_character_set_catalog` | Applies to a feature not available in DuckDB. | `VARCHAR` | `NULL` | +| `default_character_set_schema` | Applies to a feature not available in DuckDB. | `VARCHAR` | `NULL` | +| `default_character_set_name` | Applies to a feature not available in DuckDB. | `VARCHAR` | `NULL` | +| `sql_path` | The file system location of the database. Currently unimplemented. | `VARCHAR` | `NULL` | + +### `tables`: Tables and Views + +The view that describes the catalog information for tables and views is `information_schema.tables`. It lists the tables present in the database and has the following layout: + +| Column | Description | Type | Example | +|:--|:---|:-|:-| +| `table_catalog` | The catalog the table or view belongs to. | `VARCHAR` | `'my_db'` | +| `table_schema` | The schema the table or view belongs to. | `VARCHAR` | `'main'` | +| `table_name` | The name of the table or view. | `VARCHAR` | `'widgets'` | +| `table_type` | The type of table. One of: `BASE TABLE`, `LOCAL TEMPORARY`, `VIEW`. | `VARCHAR` | `'BASE TABLE'` | +| `self_referencing_column_name` | Applies to a feature not available in DuckDB. | `VARCHAR` | `NULL` | +| `reference_generation` | Applies to a feature not available in DuckDB. | `VARCHAR` | `NULL` | +| `user_defined_type_catalog` | If the table is a typed table, the name of the database that contains the underlying data type (always the current database), else null. Currently unimplemented. | `VARCHAR` | `NULL` | +| `user_defined_type_schema` | If the table is a typed table, the name of the schema that contains the underlying data type, else null. Currently unimplemented. | `VARCHAR` | `NULL` | +| `user_defined_type_name` | If the table is a typed table, the name of the underlying data type, else null. Currently unimplemented. | `VARCHAR` | `NULL` | +| `is_insertable_into` | `YES` if the table is insertable into, `NO` if not (Base tables are always insertable into, views not necessarily.)| `VARCHAR` | `'YES'` | +| `is_typed` | `YES` if the table is a typed table, `NO` if not. | `VARCHAR` | `'NO'` | +| `commit_action` | Not yet implemented. | `VARCHAR` | `'NO'` | + +### `table_constraints`: Table Constraints + +| Column | Description | Type | Example | +|--------|-------------|------|---------| +| `constraint_catalog` | Name of the database that contains the constraint (always the current database). | `VARCHAR` | `'my_db'` | +| `constraint_schema` | Name of the schema that contains the constraint. | `VARCHAR` | `'main'` | +| `constraint_name` | Name of the constraint. | `VARCHAR` | `'exams_exam_id_fkey'` | +| `table_catalog` | Name of the database that contains the table (always the current database). | `VARCHAR` | `'my_db'` | +| `table_schema` | Name of the schema that contains the table. | `VARCHAR` | `'main'` | +| `table_name` | Name of the table. | `VARCHAR` | `'exams'` | +| `constraint_type` | Type of the constraint: `CHECK`, `FOREIGN KEY`, `PRIMARY KEY`, or `UNIQUE`. | `VARCHAR` | `'FOREIGN KEY'` | +| `is_deferrable` | `YES` if the constraint is deferrable, `NO` if not. 
| `VARCHAR` | `'NO'` | +| `initially_deferred` | `YES` if the constraint is deferrable and initially deferred, `NO` if not. | `VARCHAR` | `'NO'` | +| `enforced` | Always `YES`. | `VARCHAR` | `'YES'` | +| `nulls_distinct` | If the constraint is a unique constraint, then `YES` if the constraint treats nulls as distinct or `NO` if it treats nulls as not distinct, otherwise `NULL` for other types of constraints. | `VARCHAR` | `'YES'` | + +## Catalog Functions + +Several functions are also provided to see details about the catalogs and schemas that are configured in the database. + +| Function | Description | Example | Result | +|:--|:---|:--|:--| +| `current_catalog()` | Return the name of the currently active catalog. Default is memory. | `current_catalog()` | `'memory'` | +| `current_schema()` | Return the name of the currently active schema. Default is main. | `current_schema()` | `'main'` | +| `current_schemas(boolean)` | Return list of schemas. Pass a parameter of `true` to include implicit schemas. | `current_schemas(true)` | `['temp', 'main', 'pg_catalog']` | \ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/filter.md b/docs/archive/1.0/sql/query_syntax/filter.md new file mode 100644 index 00000000000..e8640436704 --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/filter.md @@ -0,0 +1,143 @@ +--- +layout: docu +railroad: query_syntax/filter.js +title: FILTER Clause +--- + +The `FILTER` clause may optionally follow an aggregate function in a `SELECT` statement. This will filter the rows of data that are fed into the aggregate function in the same way that a `WHERE` clause filters rows, but localized to the specific aggregate function. `FILTER`s are not currently able to be used when the aggregate function is in a windowing context. + +There are multiple types of situations where this is useful, including when evaluating multiple aggregates with different filters, and when creating a pivoted view of a dataset. `FILTER` provides a cleaner syntax for pivoting data when compared with the more traditional `CASE WHEN` approach discussed below. + +Some aggregate functions also do not filter out null values, so using a `FILTER` clause will return valid results when at times the `CASE WHEN` approach will not. This occurs with the functions `first` and `last`, which are desirable in a non-aggregating pivot operation where the goal is to simply re-orient the data into columns rather than re-aggregate it. `FILTER` also improves null handling when using the `list` and `array_agg` functions, as the `CASE WHEN` approach will include null values in the list result, while the `FILTER` clause will remove them. + +## Examples + +Return the following: + +* The total number of rows. +* The number of rows where `i <= 5` +* The number of rows where `i` is odd + +```sql +SELECT + count(*) AS total_rows, + count(*) FILTER (i <= 5) AS lte_five, + count(*) FILTER (i % 2 = 1) AS odds +FROM generate_series(1, 10) tbl(i); +``` + +
+ +| total_rows | lte_five | odds | +|:---|:---|:---| +| 10 | 5 | 5 | + +Different aggregate functions may be used, and multiple `WHERE` expressions are also permitted: + +```sql +SELECT + sum(i) FILTER (i <= 5) AS lte_five_sum, + median(i) FILTER (i % 2 = 1) AS odds_median, + median(i) FILTER (i % 2 = 1 AND i <= 5) AS odds_lte_five_median +FROM generate_series(1, 10) tbl(i); +``` + +
+ +| lte_five_sum | odds_median | odds_lte_five_median | +|:---|:---|:---| +| 15 | 5.0 | 3.0 | + +The `FILTER` clause can also be used to pivot data from rows into columns. This is a static pivot, as columns must be defined prior to runtime in SQL. However, this kind of statement can be dynamically generated in a host programming language to leverage DuckDB's SQL engine for rapid, larger than memory pivoting. + +First generate an example dataset: + +```sql +CREATE TEMP TABLE stacked_data AS + SELECT + i, + CASE WHEN i <= rows * 0.25 THEN 2022 + WHEN i <= rows * 0.5 THEN 2023 + WHEN i <= rows * 0.75 THEN 2024 + WHEN i <= rows * 0.875 THEN 2025 + ELSE NULL + END AS year + FROM ( + SELECT + i, + count(*) OVER () AS rows + FROM generate_series(1, 100_000_000) tbl(i) + ) tbl; +``` + +“Pivot” the data out by year (move each year out to a separate column): + +```sql +SELECT + count(i) FILTER (year = 2022) AS "2022", + count(i) FILTER (year = 2023) AS "2023", + count(i) FILTER (year = 2024) AS "2024", + count(i) FILTER (year = 2025) AS "2025", + count(i) FILTER (year IS NULL) AS "NULLs" +FROM stacked_data; +``` + +This syntax produces the same results as the `FILTER` clauses above: + +```sql +SELECT + count(CASE WHEN year = 2022 THEN i END) AS "2022", + count(CASE WHEN year = 2023 THEN i END) AS "2023", + count(CASE WHEN year = 2024 THEN i END) AS "2024", + count(CASE WHEN year = 2025 THEN i END) AS "2025", + count(CASE WHEN year IS NULL THEN i END) AS "NULLs" +FROM stacked_data; +``` + +
+ +| 2022 | 2023 | 2024 | 2025 | NULLs | +|:---|:---|:---|:---|:---| +| 25000000 | 25000000 | 25000000 | 12500000 | 12500000 | + +However, the `CASE WHEN` approach will not work as expected when using an aggregate function that does not ignore `NULL` values. The `first` function falls into this category, so `FILTER` is preferred in this case. + +“Pivot” the data out by year (move each year out to a separate column): + +```sql +SELECT + first(i) FILTER (year = 2022) AS "2022", + first(i) FILTER (year = 2023) AS "2023", + first(i) FILTER (year = 2024) AS "2024", + first(i) FILTER (year = 2025) AS "2025", + first(i) FILTER (year IS NULL) AS "NULLs" +FROM stacked_data; +``` + +
+ +| 2022 | 2023 | 2024 | 2025 | NULLs | +|:---|:---|:---|:---|:---| +| 1474561 | 25804801 | 50749441 | 76431361 | 87500001 | + +This will produce `NULL` values whenever the first evaluation of the `CASE WHEN` clause returns a `NULL`: + +```sql +SELECT + first(CASE WHEN year = 2022 THEN i END) AS "2022", + first(CASE WHEN year = 2023 THEN i END) AS "2023", + first(CASE WHEN year = 2024 THEN i END) AS "2024", + first(CASE WHEN year = 2025 THEN i END) AS "2025", + first(CASE WHEN year IS NULL THEN i END) AS "NULLs" +FROM stacked_data; +``` + +
+ +| 2022 | 2023 | 2024 | 2025 | NULLs | +|:---|:---|:---|:---|:---| +| 1228801 | NULL | NULL | NULL | NULL | + +## Aggregate Function Syntax (Including `FILTER` Clause) + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/from.md b/docs/archive/1.0/sql/query_syntax/from.md new file mode 100644 index 00000000000..542451e23ef --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/from.md @@ -0,0 +1,534 @@ +--- +blurb: The FROM clause can contain a single table, a combination of multiple tables + that are joined together, or another SELECT query inside a subquery node. +layout: docu +railroad: query_syntax/from.js +title: FROM & JOIN Clauses +--- + +The `FROM` clause specifies the *source* of the data on which the remainder of the query should operate. Logically, the `FROM` clause is where the query starts execution. The `FROM` clause can contain a single table, a combination of multiple tables that are joined together using `JOIN` clauses, or another `SELECT` query inside a subquery node. DuckDB also has an optional `FROM`-first syntax which enables you to also query without a `SELECT` statement. + +## Examples + +Select all columns from the table called `table_name`: + +```sql +SELECT * +FROM table_name; +``` + +Select all columns from the table using the `FROM`-first syntax: + +```sql +FROM table_name +SELECT *; +``` + +Select all columns using the `FROM`-first syntax and omitting the `SELECT` clause: + +```sql +FROM table_name; +``` + +Select all columns from the table called `table_name` through an alias `tn`: + +```sql +SELECT tn.* +FROM table_name tn; +``` + +Select all columns from the table `table_name` in the schema `schema_name`: + +```sql +SELECT * +FROM schema_name.table_name; +``` + +Select the column `i` from the table function `range`, where the first column of the range function is renamed to `i`: + +```sql +SELECT t.i +FROM range(100) AS t(i); +``` + +Select all columns from the CSV file called `test.csv`: + +```sql +SELECT * +FROM 'test.csv'; +``` + +Select all columns from a subquery: + +```sql +SELECT * +FROM (SELECT * FROM table_name); +``` + +Select the entire row of the table as a struct: + +```sql +SELECT t +FROM t; +``` + +Select the entire row of the subquery as a struct (i.e., a single column): + +```sql +SELECT t +FROM (SELECT unnest(generate_series(41, 43)) AS x, 'hello' AS y) t; +``` + +Join two tables together: + +```sql +SELECT * +FROM table_name +JOIN other_table + ON table_name.key = other_table.key; +``` + +Select a 10% sample from a table: + +```sql +SELECT * +FROM table_name +TABLESAMPLE 10%; +``` + +Select a sample of 10 rows from a table: + +```sql +SELECT * +FROM table_name +TABLESAMPLE 10 ROWS; +``` + +Use the `FROM`-first syntax with `WHERE` clause and aggregation: + +```sql +FROM range(100) AS t(i) +SELECT sum(t.i) +WHERE i % 2 = 0; +``` + +## Joins + +Joins are a fundamental relational operation used to connect two tables or relations horizontally. +The relations are referred to as the _left_ and _right_ sides of the join +based on how they are written in the join clause. +Each result row has the columns from both relations. + +A join uses a rule to match pairs of rows from each relation. +Often this is a predicate, but there are other implied rules that may be specified. + +### Outer Joins + +Rows that do not have any matches can still be returned if an `OUTER` join is specified. +Outer joins can be one of: + +* `LEFT` (All rows from the left relation appear at least once) +* `RIGHT` (All rows from the right relation appear at least once) +* `FULL` (All rows from both relations appear at least once) + +A join that is not `OUTER` is `INNER` (only rows that get paired are returned). 
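
For illustration, here is a minimal sketch using two ad hoc tables (`t1` and `t2` below are examples only, unrelated to the tables used later on this page). A `LEFT OUTER JOIN` keeps every row of the left relation, whether or not a matching right row exists:

```sql
CREATE TABLE t1 (id INTEGER, a VARCHAR);
CREATE TABLE t2 (id INTEGER, b VARCHAR);
INSERT INTO t1 VALUES (1, 'x'), (2, 'y');
INSERT INTO t2 VALUES (1, 'p');

SELECT id, a, b
FROM t1
LEFT OUTER JOIN t2 USING (id)
ORDER BY id;
```

| id | a | b    |
|---:|---|------|
| 1  | x | p    |
| 2  | y | NULL |
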
+ +When an unpaired row is returned, the attributes from the other table are set to `NULL`. + +### Cross Product Joins (Cartesian Product) + +The simplest type of join is a `CROSS JOIN`. +There are no conditions for this type of join, +and it just returns all the possible pairs. + +Return all pairs of rows: + +```sql +SELECT a.*, b.* +FROM a +CROSS JOIN b; +``` + +This is equivalent to omitting the `JOIN` clause: + +```sql +SELECT a.*, b.* +FROM a, b; +``` + +### Conditional Joins + +Most joins are specified by a predicate that connects +attributes from one side to attributes from the other side. +The conditions can be explicitly specified using an `ON` clause +with the join (clearer) or implied by the `WHERE` clause (old-fashioned). + +We use the `l_regions` and the `l_nations` tables from the TPC-H schema: + +```sql +CREATE TABLE l_regions ( + r_regionkey INTEGER NOT NULL PRIMARY KEY, + r_name CHAR(25) NOT NULL, + r_comment VARCHAR(152) +); + +CREATE TABLE l_nations ( + n_nationkey INTEGER NOT NULL PRIMARY KEY, + n_name CHAR(25) NOT NULL, + n_regionkey INTEGER NOT NULL, + n_comment VARCHAR(152), + FOREIGN KEY (n_regionkey) REFERENCES l_regions(r_regionkey) +); +``` + +Return the regions for the nations: + +```sql +SELECT n.*, r.* +FROM l_nations n +JOIN l_regions r ON (n_regionkey = r_regionkey); +``` + +If the column names are the same and are required to be equal, +then the simpler `USING` syntax can be used: + +```sql +CREATE TABLE l_regions (regionkey INTEGER NOT NULL PRIMARY KEY, + name CHAR(25) NOT NULL, + comment VARCHAR(152)); + +CREATE TABLE l_nations (nationkey INTEGER NOT NULL PRIMARY KEY, + name CHAR(25) NOT NULL, + regionkey INTEGER NOT NULL, + comment VARCHAR(152), + FOREIGN KEY (regionkey) REFERENCES l_regions(regionkey)); +``` + +Return the regions for the nations: + +```sql +SELECT n.*, r.* +FROM l_nations n +JOIN l_regions r USING (regionkey); +``` + +The expressions do not have to be equalities – any predicate can be used: + +Return the pairs of jobs where one ran longer but cost less: + +```sql +SELECT s1.t_id, s2.t_id +FROM west s1, west s2 +WHERE s1.time > s2.time + AND s1.cost < s2.cost; +``` + +### Natural Joins + +Natural joins join two tables based on attributes that share the same name. + +For example, take the following example with cities, airport codes and airport names. Note that both tables are intentionally incomplete, i.e., they do not have a matching pair in the other table. + +```sql +CREATE TABLE city_airport (city_name VARCHAR, iata VARCHAR); +CREATE TABLE airport_names (iata VARCHAR, airport_name VARCHAR); +INSERT INTO city_airport VALUES + ('Amsterdam', 'AMS'), + ('Rotterdam', 'RTM'), + ('Eindhoven', 'EIN'), + ('Groningen', 'GRQ'); +INSERT INTO airport_names VALUES + ('AMS', 'Amsterdam Airport Schiphol'), + ('RTM', 'Rotterdam The Hague Airport'), + ('MST', 'Maastricht Aachen Airport'); +``` + +To join the tables on their shared [`IATA`](https://en.wikipedia.org/wiki/IATA_airport_code) attributes, run: + +```sql +SELECT * +FROM city_airport +NATURAL JOIN airport_names; +``` + +This produces the following result: + +| city_name | iata | airport_name | +|-----------|------|-----------------------------| +| Amsterdam | AMS | Amsterdam Airport Schiphol | +| Rotterdam | RTM | Rotterdam The Hague Airport | + +Note that only rows where the same `iata` attribute was present in both tables were included in the result. 
+ +We can also express query using the vanilla `JOIN` clause with the `USING` keyword: + +```sql +SELECT * +FROM city_airport +JOIN airport_names +USING (iata); +``` + +### Semi and Anti Joins + +Semi joins return rows from the left table that have at least one match in the right table. +Anti joins return rows from the left table that have _no_ matches in the right table. +When using a semi or anti join the result will never have more rows than the left hand side table. +Semi and anti joins provide the same logic as [`IN`]({% link docs/archive/1.0/sql/expressions/in.md %}) and `NOT IN` statements, respectively. + +#### Semi Join Example + +Return a list of city–airport code pairs from the `city_airport` table where the airport name **is available** in the `airport_names` table: + +```sql +SELECT * +FROM city_airport +SEMI JOIN airport_names + USING (iata); +``` + +| city_name | iata | +|-----------|------| +| Amsterdam | AMS | +| Rotterdam | RTM | + +This query is equivalent with: + +```sql +SELECT * +FROM city_airport +WHERE iata IN (SELECT iata FROM airport_names); +``` + +#### Anti Join Example + +Return a list of city–airport code pairs from the `city_airport` table where the airport name **is not available** in the `airport_names` table: + +```sql +SELECT * +FROM city_airport +ANTI JOIN airport_names + USING (iata); +``` + +| city_name | iata | +|-----------|------| +| Eindhoven | EIN | +| Groningen | GRQ | + +This query is equivalent with: + +```sql +SELECT * +FROM city_airport +WHERE iata NOT IN (SELECT iata FROM airport_names); +``` + +### Lateral Joins + +The `LATERAL` keyword allows subqueries in the `FROM` clause to refer to previous subqueries. This feature is also known as a _lateral join_. + +```sql +SELECT * +FROM range(3) t(i), LATERAL (SELECT i + 1) t2(j); +``` + +| i | j | +|--:|--:| +| 0 | 1 | +| 2 | 3 | +| 1 | 2 | + +Lateral joins are a generalization of correlated subqueries, as they can return multiple values per input value rather than only a single value. + +```sql +SELECT * +FROM + generate_series(0, 1) t(i), + LATERAL (SELECT i + 10 UNION ALL SELECT i + 100) t2(j); +``` + +| i | j | +|--:|----:| +| 0 | 10 | +| 1 | 11 | +| 0 | 100 | +| 1 | 101 | + +It may be helpful to think about `LATERAL` as a loop where we iterate through the rows of the first subquery and use it as input to the second (`LATERAL`) subquery. +In the examples above, we iterate through table `t` and refer to its column `i` from the definition of table `t2`. The rows of `t2` form column `j` in the result. + +It is possible to refer to multiple attributes from the `LATERAL` subquery. Using the table from the first example: + +```sql +CREATE TABLE t1 AS + SELECT * + FROM range(3) t(i), LATERAL (SELECT i + 1) t2(j); + +SELECT * + FROM t1, LATERAL (SELECT i + j) t2(k) + ORDER BY ALL; +``` + +| i | j | k | +|--:|--:|--:| +| 0 | 1 | 1 | +| 1 | 2 | 3 | +| 2 | 3 | 5 | + +> DuckDB detects when `LATERAL` joins should be used, making the use of the `LATERAL` keyword optional. + +### Positional Joins + +When working with data frames or other embedded tables of the same size, +the rows may have a natural correspondence based on their physical order. 
+In scripting languages, this is easily expressed using a loop: + +```cpp +for (i = 0; i < n; i++) { + f(t1.a[i], t2.b[i]); +} +``` + +It is difficult to express this in standard SQL because +relational tables are not ordered, but imported tables such as [data frames]({% link docs/archive/1.0/api/python/data_ingestion.md %}#pandas-dataframes-–-object-columns) +or disk files (like [CSVs]({% link docs/archive/1.0/data/csv/overview.md %}) or [Parquet files]({% link docs/archive/1.0/data/parquet/overview.md %})) do have a natural ordering. + +Connecting them using this ordering is called a _positional join:_ + +```sql +CREATE TABLE t1 (x INTEGER); +CREATE TABLE t2 (s VARCHAR); + +INSERT INTO t1 VALUES (1), (2), (3); +INSERT INTO t2 VALUES ('a'), ('b'); + +SELECT * +FROM t1 +POSITIONAL JOIN t2; +``` + +| x | s | +|--:|------| +| 1 | a | +| 2 | b | +| 3 | NULL | + +Positional joins are always `FULL OUTER` joins, i.e., missing values (the last values in the shorter column) are set to `NULL`. + +### As-Of Joins + +A common operation when working with temporal or similarly-ordered data +is to find the nearest (first) event in a reference table (such as prices). +This is called an _as-of join:_ + +Attach prices to stock trades: + +```sql +SELECT t.*, p.price +FROM trades t +ASOF JOIN prices p + ON t.symbol = p.symbol AND t.when >= p.when; +``` + +The `ASOF` join requires at least one inequality condition on the ordering field. +The inequality can be any inequality condition (`>=`, `>`, `<=`, `<`) +on any data type, but the most common form is `>=` on a temporal type. +Any other conditions must be equalities (or `NOT DISTINCT`). +This means that the left/right order of the tables is significant. + +`ASOF` joins each left side row with at most one right side row. +It can be specified as an `OUTER` join to find unpaired rows +(e.g., trades without prices or prices which have no trades.) + +Attach prices or NULLs to stock trades: + +```sql +SELECT * +FROM trades t +ASOF LEFT JOIN prices p + ON t.symbol = p.symbol + AND t.when >= p.when; +``` + +`ASOF` joins can also specify join conditions on matching column names with the `USING` syntax, +but the *last* attribute in the list must be the inequality, +which will be greater than or equal to (`>=`): + +```sql +SELECT * +FROM trades t +ASOF JOIN prices p USING (symbol, "when"); +``` + +Returns symbol, trades.when, price (but NOT prices.when): + +If you combine `USING` with a `SELECT *` like this, +the query will return the left side (probe) column values for the matches, +not the right side (build) column values. +To get the `prices` times in the example, you will need to list the columns explicitly: + +```sql +SELECT t.symbol, t.when AS trade_when, p.when AS price_when, price +FROM trades t +ASOF LEFT JOIN prices p USING (symbol, "when"); +``` + +## `FROM`-First Syntax + +DuckDB's SQL supports the `FROM`-first syntax, i.e., it allows putting the `FROM` clause before the `SELECT` clause or completely omitting the `SELECT` clause. 
We use the following example to demonstrate it: + +```sql +CREATE TABLE tbl AS + SELECT * + FROM (VALUES ('a'), ('b')) t1(s), range(1, 3) t2(i); +``` + +### `FROM`-First Syntax with a `SELECT` Clause + +The following statement demonstrates the use of the `FROM`-first syntax: + +```sql +FROM tbl +SELECT i, s; +``` + +This is equivalent to: + +```sql +SELECT i, s +FROM tbl; +``` + +| i | s | +|--:|---| +| 1 | a | +| 2 | a | +| 1 | b | +| 2 | b | + +### `FROM`-First Syntax without a `SELECT` Clause + +The following statement demonstrates the use of the optional `SELECT` clause: + +```sql +FROM tbl; +``` + +This is equivalent to: + +```sql +SELECT * +FROM tbl; +``` + +| s | i | +|---|--:| +| a | 1 | +| a | 2 | +| b | 1 | +| b | 2 | + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/groupby.md b/docs/archive/1.0/sql/query_syntax/groupby.md new file mode 100644 index 00000000000..017f083df4f --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/groupby.md @@ -0,0 +1,64 @@ +--- +layout: docu +railroad: query_syntax/groupby.js +title: GROUP BY Clause +--- + +The `GROUP BY` clause specifies which grouping columns should be used to perform any aggregations in the `SELECT` clause. +If the `GROUP BY` clause is specified, the query is always an aggregate query, even if no aggregations are present in the `SELECT` clause. + +When a `GROUP BY` clause is specified, all tuples that have matching data in the grouping columns (i.e., all tuples that belong to the same group) will be combined. +The values of the grouping columns themselves are unchanged, and any other columns can be combined using an [aggregate function]({% link docs/archive/1.0/sql/functions/aggregates.md %}) (such as `count`, `sum`, `avg`, etc). + +## `GROUP BY ALL` + +Use `GROUP BY ALL` to `GROUP BY` all columns in the `SELECT` statement that are not wrapped in aggregate functions. +This simplifies the syntax by allowing the columns list to be maintained in a single location, and prevents bugs by keeping the `SELECT` granularity aligned to the `GROUP BY` granularity (Ex: Prevents any duplication). +See examples below and additional examples in the [“Friendlier SQL with DuckDB” blog post]({% post_url 2022-05-04-friendlier-sql %}#group-by-all). + +## Multiple Dimensions + +Normally, the `GROUP BY` clause groups along a single dimension. +Using the [`GROUPING SETS`, `CUBE` or `ROLLUP` clauses]({% link docs/archive/1.0/sql/query_syntax/grouping_sets.md %}) it is possible to group along multiple dimensions. +See the [`GROUPING SETS`]({% link docs/archive/1.0/sql/query_syntax/grouping_sets.md %}) page for more information. + +## Examples + +Count the number of entries in the `addresses` table that belong to each different city: + +```sql +SELECT city, count(*) +FROM addresses +GROUP BY city; +``` + +Compute the average income per city per street_name: + +```sql +SELECT city, street_name, avg(income) +FROM addresses +GROUP BY city, street_name; +``` + +### `GROUP BY ALL` Examples + +Group by city and street_name to remove any duplicate values: + +```sql +SELECT city, street_name +FROM addresses +GROUP BY ALL; +``` + +Compute the average income per city per street_name. Since income is wrapped in an aggregate function, do not include it in the `GROUP BY`: + +```sql +SELECT city, street_name, avg(income) +FROM addresses +GROUP BY ALL; +-- GROUP BY city, street_name: +``` + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/grouping_sets.md b/docs/archive/1.0/sql/query_syntax/grouping_sets.md new file mode 100644 index 00000000000..6c4689cfddc --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/grouping_sets.md @@ -0,0 +1,213 @@ +--- +layout: docu +railroad: query_syntax/groupby.js +title: GROUPING SETS +--- + +`GROUPING SETS`, `ROLLUP` and `CUBE` can be used in the `GROUP BY` clause to perform a grouping over multiple dimensions within the same query. +Note that this syntax is not compatible with [`GROUP BY ALL`]({% link docs/archive/1.0/sql/query_syntax/groupby.md %}#group-by-all). + +## Examples + +Compute the average income along the provided four different dimensions: + +```sql +-- the syntax () denotes the empty set (i.e., computing an ungrouped aggregate) +SELECT city, street_name, avg(income) +FROM addresses +GROUP BY GROUPING SETS ((city, street_name), (city), (street_name), ()); +``` + +Compute the average income along the same dimensions: + +```sql +SELECT city, street_name, avg(income) +FROM addresses +GROUP BY CUBE (city, street_name); +``` + +Compute the average income along the dimensions `(city, street_name)`, `(city)` and `()`: + +```sql +SELECT city, street_name, avg(income) +FROM addresses +GROUP BY ROLLUP (city, street_name); +``` + +## Description + +`GROUPING SETS` perform the same aggregate across different `GROUP BY clauses` in a single query. + +```sql +CREATE TABLE students (course VARCHAR, type VARCHAR); +INSERT INTO students (course, type) +VALUES + ('CS', 'Bachelor'), ('CS', 'Bachelor'), ('CS', 'PhD'), ('Math', 'Masters'), + ('CS', NULL), ('CS', NULL), ('Math', NULL); +``` + +```sql +SELECT course, type, count(*) +FROM students +GROUP BY GROUPING SETS ((course, type), course, type, ()); +``` + +| course | type | count_star() | +|--------|----------|-------------:| +| Math | NULL | 1 | +| NULL | NULL | 7 | +| CS | PhD | 1 | +| CS | Bachelor | 2 | +| Math | Masters | 1 | +| CS | NULL | 2 | +| Math | NULL | 2 | +| CS | NULL | 5 | +| NULL | NULL | 3 | +| NULL | Masters | 1 | +| NULL | Bachelor | 2 | +| NULL | PhD | 1 | + +In the above query, we group across four different sets: `course, type`, `course`, `type` and `()` (the empty group). The result contains `NULL` for a group which is not in the grouping set for the result, i.e., the above query is equivalent to the following UNION statement: + +Group by course, type: + +```sql +SELECT course, type, count(*) +FROM students +GROUP BY course, type +UNION ALL +``` + +Group by type: + +```sql +SELECT NULL AS course, type, count(*) +FROM students +GROUP BY type +UNION ALL +``` + +Group by course: + +```sql +SELECT course, NULL AS type, count(*) +FROM students +GROUP BY course +UNION ALL +``` + +Group by nothing: + +```sql +SELECT NULL AS course, NULL AS type, count(*) +FROM students; +``` + +`CUBE` and `ROLLUP` are syntactic sugar to easily produce commonly used grouping sets. + +The `ROLLUP` clause will produce all “sub-groups” of a grouping set, e.g., `ROLLUP (country, city, zip)` produces the grouping sets `(country, city, zip), (country, city), (country), ()`. This can be useful for producing different levels of detail of a group by clause. This produces `n+1` grouping sets where n is the amount of terms in the `ROLLUP` clause. + +`CUBE` produces grouping sets for all combinations of the inputs, e.g., `CUBE (country, city, zip)` will produce `(country, city, zip), (country, city), (country, zip), (city, zip), (country), (city), (zip), ()`. 
This produces `2^n` grouping sets. + +## Identifying Grouping Sets with `GROUPING_ID()` + +The super-aggregate rows generated by `GROUPING SETS`, `ROLLUP` and `CUBE` can often be identified by `NULL`-values returned for the respective column in the grouping. But if the columns used in the grouping can themselves contain actual `NULL`-values, then it can be challenging to distinguish whether the value in the resultset is a “real” `NULL`-value coming out of the data itself, or a `NULL`-value generated by the grouping construct. The `GROUPING_ID()` or `GROUPING()` function is designed to identify which groups generated the super-aggregate rows in the result. + +`GROUPING_ID()` is an aggregate function that takes the column expressions that make up the grouping(s). It returns a `BIGINT` value. The return value is `0` for the rows that are not super-aggregate rows. But for the super-aggregate rows, it returns an integer value that identifies the combination of expressions that make up the group for which the super-aggregate is generated. At this point, an example might help. Consider the following query: + +```sql +WITH days AS ( + SELECT + year("generate_series") AS y, + quarter("generate_series") AS q, + month("generate_series") AS m + FROM generate_series(DATE '2023-01-01', DATE '2023-12-31', INTERVAL 1 DAY) +) +SELECT y, q, m, GROUPING_ID(y, q, m) AS "grouping_id()" +FROM days +GROUP BY GROUPING SETS ( + (y, q, m), + (y, q), + (y), + () +) +ORDER BY y, q, m; +``` + +These are the results: + +| y | q | m | grouping_id() | +|-----:|-----:|-----:|--------------:| +| 2023 | 1 | 1 | 0 | +| 2023 | 1 | 2 | 0 | +| 2023 | 1 | 3 | 0 | +| 2023 | 1 | NULL | 1 | +| 2023 | 2 | 4 | 0 | +| 2023 | 2 | 5 | 0 | +| 2023 | 2 | 6 | 0 | +| 2023 | 2 | NULL | 1 | +| 2023 | 3 | 7 | 0 | +| 2023 | 3 | 8 | 0 | +| 2023 | 3 | 9 | 0 | +| 2023 | 3 | NULL | 1 | +| 2023 | 4 | 10 | 0 | +| 2023 | 4 | 11 | 0 | +| 2023 | 4 | 12 | 0 | +| 2023 | 4 | NULL | 1 | +| 2023 | NULL | NULL | 3 | +| NULL | NULL | NULL | 7 | + +In this example, the lowest level of grouping is at the month level, defined by the grouping set `(y, q, m)`. Result rows corresponding to that level are simply aggregate rows and the `GROUPING_ID(y, q, m)` function returns `0` for those. The grouping set `(y, q)` results in super-aggregate rows over the month level, leaving a `NULL`-value for the `m` column, and for which `GROUPING_ID(y, q, m)` returns `1`. The grouping set `(y)` results in super-aggregate rows over the quarter level, leaving `NULL`-values for the `m` and `q` column, for which `GROUPING_ID(y, q, m)` returns `3`. Finally, the `()` grouping set results in one super-aggregate row for the entire resultset, leaving `NULL`-values for `y`, `q` and `m` and for which `GROUPING_ID(y, q, m)` returns `7`. + +To understand the relationship between the return value and the grouping set, you can think of `GROUPING_ID(y, q, m)` writing to a bitfield, where the first bit corresponds to the last expression passed to `GROUPING_ID()`, the second bit to the one-but-last expression passed to `GROUPING_ID()`, and so on. 
This may become clearer by casting `GROUPING_ID()` to `BIT`:

```sql
WITH days AS (
    SELECT
        year("generate_series") AS y,
        quarter("generate_series") AS q,
        month("generate_series") AS m
    FROM generate_series(DATE '2023-01-01', DATE '2023-12-31', INTERVAL 1 DAY)
)
SELECT
    y, q, m,
    GROUPING_ID(y, q, m) AS "grouping_id(y, q, m)",
    right(GROUPING_ID(y, q, m)::BIT::VARCHAR, 3) AS "y_q_m_bits"
FROM days
GROUP BY GROUPING SETS (
    (y, q, m),
    (y, q),
    (y),
    ()
)
ORDER BY y, q, m;
```

Which returns these results:

| y | q | m | grouping_id(y, q, m) | y_q_m_bits |
|-----:|-----:|-----:|---------------------:|------------|
| 2023 | 1 | 1 | 0 | 000 |
| 2023 | 1 | 2 | 0 | 000 |
| 2023 | 1 | 3 | 0 | 000 |
| 2023 | 1 | NULL | 1 | 001 |
| 2023 | 2 | 4 | 0 | 000 |
| 2023 | 2 | 5 | 0 | 000 |
| 2023 | 2 | 6 | 0 | 000 |
| 2023 | 2 | NULL | 1 | 001 |
| 2023 | 3 | 7 | 0 | 000 |
| 2023 | 3 | 8 | 0 | 000 |
| 2023 | 3 | 9 | 0 | 000 |
| 2023 | 3 | NULL | 1 | 001 |
| 2023 | 4 | 10 | 0 | 000 |
| 2023 | 4 | 11 | 0 | 000 |
| 2023 | 4 | 12 | 0 | 000 |
| 2023 | 4 | NULL | 1 | 001 |
| 2023 | NULL | NULL | 3 | 011 |
| NULL | NULL | NULL | 7 | 111 |

Note that neither the number of expressions passed to `GROUPING_ID()` nor the order in which they are passed has to match the actual group definitions appearing in the `GROUPING SETS` clause (or the groups implied by `ROLLUP` and `CUBE`). As long as the expressions passed to `GROUPING_ID()` appear somewhere in the `GROUPING SETS` clause, `GROUPING_ID()` will set a bit corresponding to the position of the expression whenever that expression is rolled up to a super-aggregate.

## Syntax

\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/having.md b/docs/archive/1.0/sql/query_syntax/having.md new file mode 100644 index 00000000000..88f6e4afbd8 --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/having.md @@ -0,0 +1,31 @@ +--- +layout: docu +railroad: query_syntax/groupby.js +title: HAVING Clause +--- + +The `HAVING` clause can be used after the `GROUP BY` clause to provide filter criteria *after* the grouping has been completed. In terms of syntax the `HAVING` clause is identical to the `WHERE` clause, but while the `WHERE` clause occurs before the grouping, the `HAVING` clause occurs after the grouping. + +## Examples + +Count the number of entries in the `addresses` table that belong to each different `city`, filtering out cities with a count below 50: + +```sql +SELECT city, count(*) +FROM addresses +GROUP BY city +HAVING count(*) >= 50; +``` + +Compute the average income per city per `street_name`, filtering out cities with an average `income` bigger than twice the median `income`: + +```sql +SELECT city, street_name, avg(income) +FROM addresses +GROUP BY city, street_name +HAVING avg(income) > 2 * median(income); +``` + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/limit.md b/docs/archive/1.0/sql/query_syntax/limit.md new file mode 100644 index 00000000000..fd823e02821 --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/limit.md @@ -0,0 +1,42 @@ +--- +layout: docu +railroad: query_syntax/orderby.js +title: LIMIT and OFFSET Clauses +--- + +`LIMIT` is an output modifier. Logically it is applied at the very end of the query. The `LIMIT` clause restricts the amount of rows fetched. The `OFFSET` clause indicates at which position to start reading the values, i.e., the first `OFFSET` values are ignored. + +Note that while `LIMIT` can be used without an `ORDER BY` clause, the results might not be deterministic without the `ORDER BY` clause. This can still be useful, however, for example when you want to inspect a quick snapshot of the data. + +## Examples + +Select the first 5 rows from the addresses table: + +```sql +SELECT * +FROM addresses +LIMIT 5; +``` + +Select the 5 rows from the addresses table, starting at position 5 (i.e., ignoring the first 5 rows): + +```sql +SELECT * +FROM addresses +LIMIT 5 +OFFSET 5; +``` + +Select the top 5 cities with the highest population: + +```sql +SELECT city, count(*) AS population +FROM addresses +GROUP BY city +ORDER BY population DESC +LIMIT 5; +``` + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/orderby.md b/docs/archive/1.0/sql/query_syntax/orderby.md new file mode 100644 index 00000000000..9ac47c41e0a --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/orderby.md @@ -0,0 +1,134 @@ +--- +layout: docu +railroad: query_syntax/orderby.js +title: ORDER BY Clause +--- + +`ORDER BY` is an output modifier. Logically it is applied near the very end of the query (just prior to [`LIMIT`]({% link docs/archive/1.0/sql/query_syntax/limit.md %}) or [`OFFSET`]({% link docs/archive/1.0/sql/query_syntax/limit.md %}), if present). +The `ORDER BY` clause sorts the rows on the sorting criteria in either ascending or descending order. +In addition, every order clause can specify whether `NULL` values should be moved to the beginning or to the end. + +The `ORDER BY` clause may contain one or more expressions, separated by commas. +An error will be thrown if no expressions are included, since the `ORDER BY` clause should be removed in that situation. +The expressions may begin with either an arbitrary scalar expression (which could be a column name), a column position number (Ex: `1`. Note that it is 1-indexed), or the keyword `ALL`. +Each expression can optionally be followed by an order modifier (`ASC` or `DESC`, default is `ASC`), and/or a `NULL` order modifier (`NULLS FIRST` or `NULLS LAST`, default is `NULLS LAST`). + +## `ORDER BY ALL` + +The `ALL` keyword indicates that the output should be sorted by every column in order from left to right. +The direction of this sort may be modified using either `ORDER BY ALL ASC` or `ORDER BY ALL DESC` and/or `NULLS FIRST` or `NULLS LAST`. +Note that `ALL` may not be used in combination with other expressions in the `ORDER BY` clause – it must be by itself. +See examples below. + +## NULL Order Modifier + +By default if no modifiers are provided, DuckDB sorts `ASC NULLS LAST`, i.e., the values are sorted in ascending order and null values are placed last. +This is identical to the default sort order of PostgreSQL. The default sort order can be changed with the following configuration options. + +> Using `ASC NULLS LAST` as the default sorting order was a breaking change in version 0.8.0. Prior to 0.8.0, DuckDB sorted using `ASC NULLS FIRST`. + +Change the default null sorting order to either `NULLS FIRST` and `NULLS LAST`: + +```sql +SET default_null_order = 'NULLS FIRST'; +``` + +Change the default sorting order to either `DESC` or `ASC`: + +```sql +SET default_order = 'DESC'; +``` + +## Collations + +Text is sorted using the binary comparison collation by default, which means values are sorted on their binary UTF-8 values. +While this works well for ASCII text (e.g., for English language data), the sorting order can be incorrect for other languages. +For this purpose, DuckDB provides collations. +For more information on collations, see the [Collation page]({% link docs/archive/1.0/sql/expressions/collations.md %}). 
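
As a brief sketch of the difference (using an ad hoc `fruits` table), binary ordering places all upper-case ASCII letters before the lower-case ones, while a case-insensitive collation such as `NOCASE` compares letters regardless of case:

```sql
CREATE TABLE fruits (name VARCHAR);
INSERT INTO fruits VALUES ('apple'), ('Banana'), ('cherry');

-- Default binary ordering: 'Banana' sorts first because upper-case code points are smaller
SELECT name FROM fruits ORDER BY name;

-- Case-insensitive ordering with the built-in NOCASE collation: apple, Banana, cherry
SELECT name FROM fruits ORDER BY name COLLATE NOCASE;
```
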
+ +## Examples + +All examples use this example table: + +```sql +CREATE OR REPLACE TABLE addresses AS + SELECT '123 Quack Blvd' AS address, 'DuckTown' AS city, '11111' AS zip + UNION ALL + SELECT '111 Duck Duck Goose Ln', 'DuckTown', '11111' + UNION ALL + SELECT '111 Duck Duck Goose Ln', 'Duck Town', '11111' + UNION ALL + SELECT '111 Duck Duck Goose Ln', 'Duck Town', '11111-0001'; +``` + +Select the addresses, ordered by city name using the default null order and default order: + +```sql +SELECT * +FROM addresses +ORDER BY city; +``` + +Select the addresses, ordered by city name in descending order with nulls at the end: + +```sql +SELECT * +FROM addresses +ORDER BY city DESC NULLS LAST; +``` + +Order by city and then by zip code, both using the default orderings: + +```sql +SELECT * +FROM addresses +ORDER BY city, zip; +``` + +Order by city using German collation rules: + +```sql +SELECT * +FROM addresses +ORDER BY city COLLATE DE; +``` + +### `ORDER BY ALL` Examples + +Order from left to right (by address, then by city, then by zip) in ascending order: + +```sql +SELECT * +FROM addresses +ORDER BY ALL; +``` + +
+ +| address | city | zip | +|------------------------|-----------|------------| +| 111 Duck Duck Goose Ln | Duck Town | 11111 | +| 111 Duck Duck Goose Ln | Duck Town | 11111-0001 | +| 111 Duck Duck Goose Ln | DuckTown | 11111 | +| 123 Quack Blvd | DuckTown | 11111 | + +Order from left to right (by address, then by city, then by zip) in descending order: + +```sql +SELECT * +FROM addresses +ORDER BY ALL DESC; +``` + +
+ +| address | city | zip | +|------------------------|-----------|------------| +| 123 Quack Blvd | DuckTown | 11111 | +| 111 Duck Duck Goose Ln | DuckTown | 11111 | +| 111 Duck Duck Goose Ln | Duck Town | 11111-0001 | +| 111 Duck Duck Goose Ln | Duck Town | 11111 | + +## Syntax + +
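+
+Sort keys can also be given as 1-indexed column positions, as mentioned at the beginning of this page. A brief sketch, reusing the `addresses` table:
+
+```sql
+SELECT city, zip
+FROM addresses
+ORDER BY 2, 1;
+```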
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/prepared_statements.md b/docs/archive/1.0/sql/query_syntax/prepared_statements.md new file mode 100644 index 00000000000..eb5697945a7 --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/prepared_statements.md @@ -0,0 +1,100 @@ +--- +layout: docu +title: Prepared Statements +--- + +DuckDB supports prepared statements where parameters are substituted when the query is executed. +This can improve readability and is useful for preventing [SQL injections](https://en.wikipedia.org/wiki/SQL_injection). + +## Syntax + +There are three syntaxes for denoting parameters in prepared statements: +auto-incremented (`?`), +positional (`$1`), +and named (`$param`). +Note that not all clients support all of these syntaxes, e.g., the [JDBC client]({% link docs/archive/1.0/api/java.md %}) only supports auto-incremented parameters in prepared statements. + +### Example Data Set + +In the following, we introduce the three different syntaxes and illustrate them with examples using the following table. + +```sql +CREATE TABLE person (name VARCHAR, age BIGINT); +INSERT INTO person VALUES ('Alice', 37), ('Ana', 35), ('Bob', 41), ('Bea', 25); +``` + +In our example query, we'll look for people whose name starts with a `B` and are at least 40 years old. +This will return a single row `<'Bob', 41>`. + +### Auto-Incremented Parameters: `?` + +DuckDB support using prepared statements with auto-incremented indexing, +i.e., the position of the parameters in the query corresponds to their position in the execution statement. +For example: + +```sql +PREPARE query_person AS + SELECT * + FROM person + WHERE starts_with(name, ?) + AND age >= ?; +``` + +Using the CLI client, the statement is executed as follows. + +```sql +EXECUTE query_person('B', 40); +``` + +### Positional Parameters: `$1` + +Prepared statements can use positional parameters, where parameters are denoted with an integer (`$1`, `$2`). +For example: + +```sql +PREPARE query_person AS + SELECT * + FROM person + WHERE starts_with(name, $2) + AND age >= $1; +``` + +Using the CLI client, the statement is executed as follows. +Note that the first parameter corresponds to `$1`, the second to `$2`, and so on. + +```sql +EXECUTE query_person(40, 'B'); +``` + +### Named Parameters: `$parameter` + +DuckDB also supports names parameters where parameters are denoted with `$parameter_name`. +For example: + +```sql +PREPARE query_person AS + SELECT * + FROM person + WHERE starts_with(name, $name_start_letter) + AND age >= $minimum_age; +``` + +Using the CLI client, the statement is executed as follows. + +```sql +EXECUTE query_person(name_start_letter := 'B', minimum_age := 40); +``` + +## Dropping Prepared Statements: `DEALLOCATE` + +To drop a prepared statement, use the `DEALLOCATE` statement: + +```sql +DEALLOCATE query_person; +``` + +Alternatively, use: + +```sql +DEALLOCATE PREPARE query_person; +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/qualify.md b/docs/archive/1.0/sql/query_syntax/qualify.md new file mode 100644 index 00000000000..2ec527d08bc --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/qualify.md @@ -0,0 +1,102 @@ +--- +blurb: The QUALIFY clause is used to filter the results of WINDOW functions. +layout: docu +railroad: query_syntax/qualify.js +title: QUALIFY Clause +--- + +The `QUALIFY` clause is used to filter the results of [`WINDOW` functions]({% link docs/archive/1.0/sql/functions/window_functions.md %}). 
This filtering of results is similar to how a [`HAVING` clause]({% link docs/archive/1.0/sql/query_syntax/having.md %}) filters the results of aggregate functions applied based on the [`GROUP BY` clause]({% link docs/archive/1.0/sql/query_syntax/groupby.md %}). + +The `QUALIFY` clause avoids the need for a subquery or [`WITH` clause]({% link docs/archive/1.0/sql/query_syntax/with.md %}) to perform this filtering (much like `HAVING` avoids a subquery). An example using a `WITH` clause instead of `QUALIFY` is included below the `QUALIFY` examples. + +Note that this is filtering based on [`WINDOW` functions]({% link docs/archive/1.0/sql/functions/window_functions.md %}), not necessarily based on the [`WINDOW` clause]({% link docs/archive/1.0/sql/query_syntax/window.md %}). The `WINDOW` clause is optional and can be used to simplify the creation of multiple `WINDOW` function expressions. + +The position of where to specify a `QUALIFY` clause is following the [`WINDOW` clause]({% link docs/archive/1.0/sql/query_syntax/window.md %}) in a `SELECT` statement (`WINDOW` does not need to be specified), and before the [`ORDER BY`]({% link docs/archive/1.0/sql/query_syntax/orderby.md %}). + +## Examples + +Each of the following examples produce the same output, located below. + +Filter based on a window function defined in the `QUALIFY` clause: + +```sql +SELECT + schema_name, + function_name, + -- In this example the function_rank column in the select clause is for reference + row_number() OVER (PARTITION BY schema_name ORDER BY function_name) AS function_rank +FROM duckdb_functions() +QUALIFY + row_number() OVER (PARTITION BY schema_name ORDER BY function_name) < 3; +``` + +Filter based on a window function defined in the `SELECT` clause: + +```sql +SELECT + schema_name, + function_name, + row_number() OVER (PARTITION BY schema_name ORDER BY function_name) AS function_rank +FROM duckdb_functions() +QUALIFY + function_rank < 3; +``` + +Filter based on a window function defined in the `QUALIFY` clause, but using the `WINDOW` clause: + +```sql +SELECT + schema_name, + function_name, + -- In this example the function_rank column in the select clause is for reference + row_number() OVER my_window AS function_rank +FROM duckdb_functions() +WINDOW + my_window AS (PARTITION BY schema_name ORDER BY function_name) +QUALIFY + row_number() OVER my_window < 3; +``` + +Filter based on a window function defined in the `SELECT` clause, but using the `WINDOW` clause: + +```sql +SELECT + schema_name, + function_name, + row_number() OVER my_window AS function_rank +FROM duckdb_functions() +WINDOW + my_window AS (PARTITION BY schema_name ORDER BY function_name) +QUALIFY + function_rank < 3; +``` + +Equivalent query based on a `WITH` clause (without a `QUALIFY` clause): + +```sql +WITH ranked_functions AS ( + SELECT + schema_name, + function_name, + row_number() OVER (PARTITION BY schema_name ORDER BY function_name) AS function_rank + FROM duckdb_functions() +) +SELECT + * +FROM ranked_functions +WHERE + function_rank < 3; +``` + +
+ +| schema_name | function_name | function_rank | +|:---|:---|:---| +| main | !__postfix | 1 | +| main | !~~ | 2 | +| pg_catalog | col_description | 1 | +| pg_catalog | format_pg_type | 2 | + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/sample.md b/docs/archive/1.0/sql/query_syntax/sample.md new file mode 100644 index 00000000000..7f25c6cd319 --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/sample.md @@ -0,0 +1,37 @@ +--- +layout: docu +railroad: query_syntax/sample.js +title: SAMPLE Clause +--- + +The `SAMPLE` clause allows you to run the query on a sample from the base table. This can significantly speed up processing of queries, at the expense of accuracy in the result. Samples can also be used to quickly see a snapshot of the data when exploring a data set. The sample clause is applied right after anything in the `FROM` clause (i.e., after any joins, but before the `WHERE` clause or any aggregates). See the [`SAMPLE`]({% link docs/archive/1.0/sql/samples.md %}) page for more information. + +## Examples + +Select a sample of 1% of the addresses table using default (system) sampling: + +```sql +SELECT * +FROM addresses +USING SAMPLE 1%; +``` + +Select a sample of 1% of the addresses table using bernoulli sampling: + +```sql +SELECT * +FROM addresses +USING SAMPLE 1% (bernoulli); +``` + +Select a sample of 10 rows from the subquery: + +```sql +SELECT * +FROM (SELECT * FROM addresses) +USING SAMPLE 10 ROWS; +``` + +## Syntax + +
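+
+The `TABLESAMPLE` keyword, described on the [`SAMPLE`]({% link docs/archive/1.0/sql/samples.md %}) page, attaches the sample directly to a table reference instead. A brief sketch of the equivalent syntax:
+
+```sql
+SELECT *
+FROM addresses TABLESAMPLE reservoir(10%);
+```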
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/select.md b/docs/archive/1.0/sql/query_syntax/select.md new file mode 100644 index 00000000000..10f52126ccb --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/select.md @@ -0,0 +1,173 @@ +--- +blurb: The SELECT clause specifies the list of columns that will be returned by the + query. +layout: docu +railroad: query_syntax/select.js +title: SELECT Clause +--- + +The `SELECT` clause specifies the list of columns that will be returned by the query. While it appears first in the clause, *logically* the expressions here are executed only at the end. The `SELECT` clause can contain arbitrary expressions that transform the output, as well as aggregates and window functions. + +## Examples + +Select all columns from the table called `table_name`: + +```sql +SELECT * FROM table_name; +``` + +Perform arithmetic on the columns in a table, and provide an alias: + +```sql +SELECT col1 + col2 AS res, sqrt(col1) AS root FROM table_name; +``` + +Select all unique cities from the `addresses` table: + +```sql +SELECT DISTINCT city FROM addresses; +``` + +Return the total number of rows in the `addresses` table: + +```sql +SELECT count(*) FROM addresses; +``` + +Select all columns except the city column from the `addresses` table: + +```sql +SELECT * EXCLUDE (city) FROM addresses; +``` + +Select all columns from the `addresses` table, but replace `city` with `lower(city)`: + +```sql +SELECT * REPLACE (lower(city) AS city) FROM addresses; +``` + +Select all columns matching the given regular expression from the table: + +```sql +SELECT COLUMNS('number\d+') FROM addresses; +``` + +Compute a function on all given columns of a table: + +```sql +SELECT min(COLUMNS(*)) FROM addresses; +``` + +To select columns with spaces or special characters, use double quotes (`"`): + +```sql +SELECT "Some Column Name" FROM tbl; +``` + +## Syntax + +
+
+## `SELECT` List
+
+The `SELECT` clause contains a list of expressions that specify the result of a query. The select list can refer to any columns in the `FROM` clause, and combine them using expressions. As the output of a SQL query is a table, every expression in the `SELECT` clause also has a name. The expressions can be explicitly named using the `AS` clause (e.g., `expr AS name`). If a name is not provided by the user, the expressions are named automatically by the system.
+
+> Column names are case-insensitive. See the [Rules for Case Sensitivity]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#rules-for-case-sensitivity) for more details.
+
+### Star Expressions
+
+Select all columns from the table called `table_name`:
+
+```sql
+SELECT *
+FROM table_name;
+```
+
+Select all columns matching the given regular expression from the table:
+
+```sql
+SELECT COLUMNS('number\d+')
+FROM addresses;
+```
+
+The [star expression]({% link docs/archive/1.0/sql/expressions/star.md %}) is a special expression that expands to *multiple expressions* based on the contents of the `FROM` clause. In the simplest case, `*` expands to **all** expressions in the `FROM` clause. Columns can also be selected using regular expressions or lambda functions. See the [star expression page]({% link docs/archive/1.0/sql/expressions/star.md %}) for more details.
+
+### `DISTINCT` Clause
+
+Select all unique cities from the addresses table:
+
+```sql
+SELECT DISTINCT city
+FROM addresses;
+```
+
+The `DISTINCT` clause can be used to return **only** the unique rows in the result – so that any duplicate rows are filtered out.
+
+> Queries starting with `SELECT DISTINCT` run deduplication, which is an expensive operation. Therefore, only use `DISTINCT` if necessary.
+
+### `DISTINCT ON` Clause
+
+Select only the highest population city for each country:
+
+```sql
+SELECT DISTINCT ON(country) city, population
+FROM cities
+ORDER BY population DESC;
+```
+
+The `DISTINCT ON` clause returns only one row per unique value in the set of expressions as defined in the `ON` clause. If an `ORDER BY` clause is present, the row that is returned is the first row that is encountered *as per the `ORDER BY`* criteria. If an `ORDER BY` clause is not present, the first row that is encountered is not defined and can be any row in the table.
+
+> When querying large data sets, using `DISTINCT` on all columns can be expensive. Therefore, consider using `DISTINCT ON` on a column (or a set of columns) which guarantees a sufficient degree of uniqueness for your results. For example, using `DISTINCT ON` on the key column(s) of a table guarantees full uniqueness.
+
+### Aggregates
+
+Return the total number of rows in the addresses table:
+
+```sql
+SELECT count(*)
+FROM addresses;
+```
+
+Return the total number of rows in the addresses table grouped by city:
+
+```sql
+SELECT city, count(*)
+FROM addresses
+GROUP BY city;
+```
+
+[Aggregate functions]({% link docs/archive/1.0/sql/functions/aggregates.md %}) are special functions that *combine* multiple rows into a single value. When aggregate functions are present in the `SELECT` clause, the query is turned into an aggregate query. In an aggregate query, **all** expressions must either be part of an aggregate function, or part of a group (as specified by the [`GROUP BY clause`]({% link docs/archive/1.0/sql/query_syntax/groupby.md %})). ￼
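+
+As a brief illustration of this rule (a sketch, assuming the `addresses` table from the examples above), a bare column next to an aggregate must be listed in the `GROUP BY` clause:
+
+```sql
+SELECT city, count(*) FROM addresses;               -- error: city is neither aggregated nor grouped
+SELECT city, count(*) FROM addresses GROUP BY city; -- valid aggregate query
+```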
+ +### Window Functions + +Generate a `row_number` column containing incremental identifiers for each row: + +```sql +SELECT row_number() OVER () +FROM sales; +``` + +Compute the difference between the current amount, and the previous amount, by order of time: + +```sql +SELECT amount - lag(amount) OVER (ORDER BY time) +FROM sales; +``` + +[Window functions]({% link docs/archive/1.0/sql/functions/window_functions.md %}) are special functions that allow the computation of values relative to *other rows* in a result. Window functions are marked by the `OVER` clause which contains the *window specification*. The window specification defines the frame or context in which the window function is computed. See the [window functions page]({% link docs/archive/1.0/sql/functions/window_functions.md %}) for more information. + +### `unnest` Function + +Unnest an array by one level: + +```sql +SELECT unnest([1, 2, 3]); +``` + +Unnest a struct by one level: + +```sql +SELECT unnest({'a': 42, 'b': 84}); +``` + +The [`unnest`]({% link docs/archive/1.0/sql/query_syntax/unnest.md %}) function is a special function that can be used together with [arrays]({% link docs/archive/1.0/sql/data_types/array.md %}), [lists]({% link docs/archive/1.0/sql/data_types/list.md %}), or [structs]({% link docs/archive/1.0/sql/data_types/struct.md %}). The unnest function strips one level of nesting from the type. For example, `INTEGER[]` is transformed into `INTEGER`. `STRUCT(a INTEGER, b INTEGER)` is transformed into `a INTEGER, b INTEGER`. The unnest function can be used to transform nested types into regular scalar types, which makes them easier to operate on. \ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/setops.md b/docs/archive/1.0/sql/query_syntax/setops.md new file mode 100644 index 00000000000..8856b0c3f1a --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/setops.md @@ -0,0 +1,158 @@ +--- +layout: docu +railroad: query_syntax/setops.js +title: Set Operations +--- + +Set operations allow queries to be combined according to [set operation semantics](https://en.wikipedia.org/wiki/Set_(mathematics)#Basic_operations). Set operations refer to the [`UNION [ALL]`](#union), [`INTERSECT [ALL]`](#intersect) and [`EXCEPT [ALL]`](#except) clauses. The vanilla variants use set semantics, i.e., they eliminate duplicates, while the variants with `ALL` use bag semantics. + +Traditional set operations unify queries **by column position**, and require the to-be-combined queries to have the same number of input columns. If the columns are not of the same type, casts may be added. The result will use the column names from the first query. + +DuckDB also supports [`UNION [ALL] BY NAME`](#union-all-by-name), which joins columns by name instead of by position. `UNION BY NAME` does not require the inputs to have the same number of columns. `NULL` values will be added in case of missing columns. + +## `UNION` + +The `UNION` clause can be used to combine rows from multiple queries. The queries are required to return the same number of columns. [Implicit casting](https://duckdb.org/docs/sql/data_types/typecasting#implicit-casting) to one of the returned types is performed to combine columns of different types where necessary. If this is not possible, the `UNION` clause throws an error. + +### Vanilla `UNION` (Set Semantics) + +The vanilla `UNION` clause follows set semantics, therefore it performs duplicate elimination, i.e., only unique rows will be included in the result. 
+ +```sql +SELECT * FROM range(2) t1(x) +UNION +SELECT * FROM range(3) t2(x); +``` + +| x | +|--:| +| 2 | +| 1 | +| 0 | + +### `UNION ALL` (Bag Semantics) + +`UNION ALL` returns all rows of both queries following bag semantics, i.e., *without* duplicate elimination. + +```sql +SELECT * FROM range(2) t1(x) +UNION ALL +SELECT * FROM range(3) t2(x); +``` + +| x | +|--:| +| 0 | +| 1 | +| 0 | +| 1 | +| 2 | + +### `UNION [ALL] BY NAME` + +The `UNION [ALL] BY NAME` clause can be used to combine rows from different tables by name, instead of by position. `UNION BY NAME` does not require both queries to have the same number of columns. Any columns that are only found in one of the queries are filled with `NULL` values for the other query. + +Take the following tables for example: + +```sql +CREATE TABLE capitals (city VARCHAR, country VARCHAR); +INSERT INTO capitals VALUES + ('Amsterdam', 'NL'), + ('Berlin', 'Germany'); +CREATE TABLE weather (city VARCHAR, degrees INTEGER, date DATE); +INSERT INTO weather VALUES + ('Amsterdam', 10, '2022-10-14'), + ('Seattle', 8, '2022-10-12'); +``` + +```sql +SELECT * FROM capitals +UNION BY NAME +SELECT * FROM weather; +``` + +| city | country | degrees | date | +|-----------|---------|--------:|------------| +| Seattle | NULL | 8 | 2022-10-12 | +| Amsterdam | NL | NULL | NULL | +| Berlin | Germany | NULL | NULL | +| Amsterdam | NULL | 10 | 2022-10-14 | + +`UNION BY NAME` follows set semantics (therefore it performs duplicate elimination), whereas `UNION ALL BY NAME` follows bag semantics. + +## `INTERSECT` + +The `INTERSECT` clause can be used to select all rows that occur in the result of **both** queries. + +### Vanilla `INTERSECT` (Set Semantics) + +Vanilla `INTERSECT` performs duplicate elimination, so only unique rows are returned. + +```sql +SELECT * FROM range(2) t1(x) +INTERSECT +SELECT * FROM range(6) t2(x); +``` + +| x | +|--:| +| 0 | +| 1 | + +### `INTERSECT ALL` (Bag Semantics) + +`INTERSECT ALL` follows bag semantics, so duplicates are returned. + +```sql +SELECT unnest([5, 5, 6, 6, 6, 6, 7, 8]) AS x +INTERSECT ALL +SELECT unnest([5, 6, 6, 7, 7, 9]); +``` + +| x | +|--:| +| 5 | +| 6 | +| 6 | +| 7 | + +## `EXCEPT` + +The `EXCEPT` clause can be used to select all rows that **only** occur in the left query. + +### Vanilla `EXCEPT` (Set Semantics) + +Vanilla `EXCEPT` follows set semantics, therefore, it performs duplicate elimination, so only unique rows are returned. + +```sql +SELECT * FROM range(5) t1(x) +EXCEPT +SELECT * FROM range(2) t2(x); +``` + +| x | +|--:| +| 2 | +| 3 | +| 4 | + +### `EXCEPT ALL` (Bag Semantics) + +`EXCEPT ALL` uses bag semantics: + +```sql +SELECT unnest([5, 5, 6, 6, 6, 6, 7, 8]) AS x +EXCEPT ALL +SELECT unnest([5, 6, 6, 7, 7, 9]); +``` + +| x | +|--:| +| 5 | +| 8 | +| 6 | +| 6 | + +## Syntax + +
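+
+Note that an `ORDER BY` or `LIMIT` clause written after the last input query applies to the combined result of the set operation, not to that query alone. A brief sketch:
+
+```sql
+SELECT * FROM range(2) t1(x)
+UNION ALL
+SELECT * FROM range(3) t2(x)
+ORDER BY x
+LIMIT 3;
+```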
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/unnest.md b/docs/archive/1.0/sql/query_syntax/unnest.md new file mode 100644 index 00000000000..856cf309f0f --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/unnest.md @@ -0,0 +1,167 @@ +--- +layout: docu +title: Unnesting +--- + +## Examples + +Unnest a list, generating 3 rows (1, 2, 3): + +```sql +SELECT unnest([1, 2, 3]); +``` + +Unnesting a struct, generating two columns (a, b): + +```sql +SELECT unnest({'a': 42, 'b': 84}); +``` + +Recursive unnest of a list of structs: + +```sql +SELECT unnest([{'a': 42, 'b': 84}, {'a': 100, 'b': NULL}], recursive := true); +``` + +Limit depth of recursive unnest using `max_depth`: + +```sql +SELECT unnest([[[1, 2], [3, 4]], [[5, 6], [7, 8, 9], []], [[10, 11]]], max_depth := 2); +``` + +The `unnest` special function is used to unnest lists or structs by one level. The function can be used as a regular scalar function, but only in the `SELECT` clause. Invoking `unnest` with the `recursive` parameter will unnest lists and structs of multiple levels. The depth of unnesting can be limited using the `max_depth` parameter (which assumes `recursive` unnesting by default). + +### Unnesting Lists + +Unnest a list, generating 3 rows (1, 2, 3): + +```sql +SELECT unnest([1, 2, 3]); +``` + +Unnest a scalar list, generating 3 rows ((1, 10), (2, 11), (3, NULL)): + +```sql +SELECT unnest([1, 2, 3]), unnest([10, 11]); +``` + +Unnest a scalar list, generating 3 rows ((1, 10), (2, 10), (3, 10)): + +```sql +SELECT unnest([1, 2, 3]), 10; +``` + +Unnest a list column generated from a subquery: + +```sql +SELECT unnest(l) + 10 FROM (VALUES ([1, 2, 3]), ([4, 5])) tbl(l); +``` + +Empty result: + +```sql +SELECT unnest([]); +``` + +Empty result: + +```sql +SELECT unnest(NULL); +``` + +Using `unnest` on a list will emit one tuple per entry in the list. When `unnest` is combined with regular scalar expressions, those expressions are repeated for every entry in the list. When multiple lists are unnested in the same `SELECT` clause, the lists are unnested side-by-side. If one list is longer than the other, the shorter list will be padded with `NULL` values. + +An empty list and a `NULL` list will both unnest to zero elements. + +### Unnesting Structs + +Unnesting a struct, generating two columns (a, b): + +```sql +SELECT unnest({'a': 42, 'b': 84}); +``` + +Unnesting a struct, generating two columns (a, b): + +```sql +SELECT unnest({'a': 42, 'b': {'x': 84}}); +``` + +`unnest` on a struct will emit one column per entry in the struct. + +### Recursive Unnest + +Unnesting a list of lists recursively, generating 5 rows (1, 2, 3, 4, 5): + +```sql +SELECT unnest([[1, 2, 3], [4, 5]], recursive := true); +``` + +Unnesting a list of structs recursively, generating two rows of two columns (a, b): + +```sql +SELECT unnest([{'a': 42, 'b': 84}, {'a': 100, 'b': NULL}], recursive := true); +``` + +Unnesting a struct, generating two columns (a, b): + +```sql +SELECT unnest({'a': [1, 2, 3], 'b': 88}, recursive := true); +``` + +Calling `unnest` with the `recursive` setting will fully unnest lists, followed by fully unnesting structs. This can be useful to fully flatten columns that contain lists within lists, or lists of structs. Note that lists *within* structs are not unnested. + +### Setting the Maximum Depth of Unnesting + +The `max_depth` parameter allows limiting the maximum depth of recursive unnesting (which is assumed by default and does not have to be specified separately). 
+
+For example, unnesting to `max_depth` of 2 yields the following:
+
+```sql
+SELECT unnest([[[1, 2], [3, 4]], [[5, 6], [7, 8, 9], []], [[10, 11]]], max_depth := 2) AS x;
+```
+
+| x         |
+|-----------|
+| [1, 2]    |
+| [3, 4]    |
+| [5, 6]    |
+| [7, 8, 9] |
+| []        |
+| [10, 11]  |
+
+Meanwhile, unnesting to `max_depth` of 3 results in:
+
+```sql
+SELECT unnest([[[1, 2], [3, 4]], [[5, 6], [7, 8, 9], []], [[10, 11]]], max_depth := 3) AS x;
+```
+
+| x  |
+|---:|
+|  1 |
+|  2 |
+|  3 |
+|  4 |
+|  5 |
+|  6 |
+|  7 |
+|  8 |
+|  9 |
+| 10 |
+| 11 |
+
+### Keeping Track of List Entry Positions
+
+To keep track of each entry's position within the original list, `unnest` may be combined with [`generate_subscripts`]({% link docs/archive/1.0/sql/functions/nested.md %}#generate_subscripts):
+
+```sql
+SELECT unnest(l) as x, generate_subscripts(l, 1) AS index
+FROM (VALUES ([1, 2, 3]), ([4, 5])) tbl(l);
+```
+
+| x | index |
+|--:|------:|
+| 1 |     1 |
+| 2 |     2 |
+| 3 |     3 |
+| 4 |     1 |
+| 5 |     2 |
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/query_syntax/values.md b/docs/archive/1.0/sql/query_syntax/values.md
new file mode 100644
index 00000000000..4789b0e7fa3
--- /dev/null
+++ b/docs/archive/1.0/sql/query_syntax/values.md
@@ -0,0 +1,41 @@
+---
+layout: docu
+railroad: query_syntax/values.js
+title: VALUES Clause
+---
+
+The `VALUES` clause is used to specify a fixed number of rows. The `VALUES` clause can be used as a stand-alone statement, as part of the `FROM` clause, or as input to an `INSERT INTO` statement.
+
+## Examples
+
+Generate two rows and directly return them:
+
+```sql
+VALUES ('Amsterdam', 1), ('London', 2);
+```
+
+Generate two rows as part of a `FROM` clause, and rename the columns:
+
+```sql
+SELECT *
+FROM (VALUES ('Amsterdam', 1), ('London', 2)) cities(name, id);
+```
+
+Generate two rows and insert them into a table:
+
+```sql
+INSERT INTO cities
+VALUES ('Amsterdam', 1), ('London', 2);
+```
+
+Create a table directly from a `VALUES` clause:
+
+```sql
+CREATE TABLE cities AS
+    SELECT *
+    FROM (VALUES ('Amsterdam', 1), ('London', 2)) cities(name, id);
+```
+
+## Syntax
+
+￼
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/query_syntax/where.md b/docs/archive/1.0/sql/query_syntax/where.md
new file mode 100644
index 00000000000..75bf4a9fc6c
--- /dev/null
+++ b/docs/archive/1.0/sql/query_syntax/where.md
@@ -0,0 +1,37 @@
+---
+layout: docu
+railroad: query_syntax/where.js
+title: WHERE Clause
+---
+
+The `WHERE` clause specifies any filters to apply to the data. This allows you to select only a subset of the data in which you are interested. Logically the `WHERE` clause is applied immediately after the `FROM` clause.
+
+## Examples
+
+Select all rows where the `id` is equal to 3:
+
+```sql
+SELECT *
+FROM table_name
+WHERE id = 3;
+```
+
+Select all rows that match the given case-insensitive `LIKE` expression:
+
+```sql
+SELECT *
+FROM table_name
+WHERE name ILIKE '%mark%';
+```
+
+Select all rows that match the given composite expression:
+
+```sql
+SELECT *
+FROM table_name
+WHERE id = 3 OR id = 7;
+```
+
+## Syntax
+
+￼
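+
+The composite expression from the examples can also be written with an `IN` list, which is often more readable for longer value lists. A brief sketch:
+
+```sql
+SELECT *
+FROM table_name
+WHERE id IN (3, 7);
+```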
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/window.md b/docs/archive/1.0/sql/query_syntax/window.md new file mode 100644 index 00000000000..dac31a5f750 --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/window.md @@ -0,0 +1,11 @@ +--- +layout: docu +railroad: query_syntax/window.js +title: WINDOW Clause +--- + +The `WINDOW` clause allows you to specify named windows that can be used within [window functions]({% link docs/archive/1.0/sql/functions/window_functions.md %}). These are useful when you have multiple window functions, as they allow you to avoid repeating the same window clause. + +## Syntax + +
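+
+As a brief sketch (the `sales` table and its columns are placeholder names), a named window can be shared by several window functions:
+
+```sql
+SELECT
+    row_number() OVER w AS row_num,
+    sum(amount) OVER w AS running_total
+FROM sales
+WINDOW w AS (PARTITION BY region ORDER BY time);
+```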
\ No newline at end of file diff --git a/docs/archive/1.0/sql/query_syntax/with.md b/docs/archive/1.0/sql/query_syntax/with.md new file mode 100644 index 00000000000..784c589c0e0 --- /dev/null +++ b/docs/archive/1.0/sql/query_syntax/with.md @@ -0,0 +1,302 @@ +--- +layout: docu +railroad: query_syntax/with.js +title: WITH Clause +--- + +The `WITH` clause allows you to specify common table expressions (CTEs). Regular (non-recursive) common-table-expressions are essentially views that are limited in scope to a particular query. CTEs can reference each-other and can be nested. [Recursive CTEs](#recursive-ctes) can reference themselves. + +## Basic CTE Examples + +Create a CTE called “cte” and use it in the main query: + +```sql +WITH cte AS (SELECT 42 AS x) +SELECT * FROM cte; +``` + +| x | +|---:| +| 42 | + +Create two CTEs, where the second CTE references the first CTE: + +```sql +WITH + cte AS (SELECT 42 AS i), + cte2 AS (SELECT i * 100 AS x FROM cte) +SELECT * FROM cte2; +``` + +| x | +|-----:| +| 4200 | + +## Materialized CTEs + +By default, CTEs are inlined into the main query. Inlining can result in duplicate work, because the definition is copied for each reference. Take this query for example: + +```sql +WITH t(x) AS (⟨Q_t⟩) +SELECT * +FROM t AS t1, + t AS t2, + t AS t3; +``` + +Inlining duplicates the definition of `t` for each reference which results in the following query: + +```sql +SELECT * +FROM (⟨Q_t⟩) AS t1(x), + (⟨Q_t⟩) AS t2(x), + (⟨Q_t⟩) AS t3(x); +``` + +If `⟨Q_t⟩` is expensive, materializing it with the `MATERIALIZED` keyword can improve performance. In this case, `⟨Q_t⟩` is evaluated only once. + +```sql +WITH t(x) AS MATERIALIZED (⟨Q_t⟩) +SELECT * +FROM t AS t1, + t AS t2, + t AS t3; +``` + +## Recursive CTEs + +`WITH RECURSIVE` allows the definition of CTEs which can refer to themselves. Note that the query must be formulated in a way that ensures termination, otherwise, it may run into an infinite loop. + +### Example: Fibonacci Sequence + +`WITH RECURSIVE` can be used to make recursive calculations. For example, here is how `WITH RECURSIVE` could be used to calculate the first ten Fibonacci numbers: + +```sql +WITH RECURSIVE FibonacciNumbers (RecursionDepth, FibonacciNumber, NextNumber) AS ( + -- Base case + SELECT + 0 AS RecursionDepth, + 0 AS FibonacciNumber, + 1 AS NextNumber + UNION ALL + -- Recursive step + SELECT + fib.RecursionDepth + 1 AS RecursionDepth, + fib.NextNumber AS FibonacciNumber, + fib.FibonacciNumber + fib.NextNumber AS NextNumber + FROM + FibonacciNumbers fib + WHERE + fib.RecursionDepth + 1 < 10 + ) +SELECT + fn.RecursionDepth AS FibonacciNumberIndex, + fn.FibonacciNumber +FROM + FibonacciNumbers fn; +``` + +| FibonacciNumberIndex | FibonacciNumber | +|---------------------:|----------------:| +| 0 | 0 | +| 1 | 1 | +| 2 | 1 | +| 3 | 2 | +| 4 | 3 | +| 5 | 5 | +| 6 | 8 | +| 7 | 13 | +| 8 | 21 | +| 9 | 34 | + +### Example: Tree Traversal + +`WITH RECURSIVE` can be used to traverse trees. For example, take a hierarchy of tags: + +Example tree + +```sql +CREATE TABLE tag (id INTEGER, name VARCHAR, subclassof INTEGER); +INSERT INTO tag VALUES + (1, 'U2', 5), + (2, 'Blur', 5), + (3, 'Oasis', 5), + (4, '2Pac', 6), + (5, 'Rock', 7), + (6, 'Rap', 7), + (7, 'Music', 9), + (8, 'Movies', 9), + (9, 'Art', NULL); +``` + +The following query returns the path from the node `Oasis` to the root of the tree (`Art`). 
+ +```sql +WITH RECURSIVE tag_hierarchy(id, source, path) AS ( + SELECT id, name, [name] AS path + FROM tag + WHERE subclassof IS NULL + UNION ALL + SELECT tag.id, tag.name, list_prepend(tag.name, tag_hierarchy.path) + FROM tag, tag_hierarchy + WHERE tag.subclassof = tag_hierarchy.id + ) +SELECT path +FROM tag_hierarchy +WHERE source = 'Oasis'; +``` + +| path | +|---------------------------| +| [Oasis, Rock, Music, Art] | + +### Graph Traversal + +The `WITH RECURSIVE` clause can be used to express graph traversal on arbitrary graphs. However, if the graph has cycles, the query must perform cycle detection to prevent infinite loops. +One way to achieve this is to store the path of a traversal in a [list]({% link docs/archive/1.0/sql/data_types/list.md %}) and, before extending the path with a new edge, check whether its endpoint has been visited before (see the example later). + +Take the following directed graph from the [LDBC Graphalytics benchmark](https://arxiv.org/pdf/2011.15028.pdf): + +Example graph + +```sql +CREATE TABLE edge (node1id INTEGER, node2id INTEGER); +INSERT INTO edge VALUES + (1, 3), (1, 5), (2, 4), (2, 5), (2, 10), (3, 1), + (3, 5), (3, 8), (3, 10), (5, 3), (5, 4), (5, 8), + (6, 3), (6, 4), (7, 4), (8, 1), (9, 4); +``` + +Note that the graph contains directed cycles, e.g., between nodes 1, 2, and 5. + +#### Enumerate All Paths from a Node + +The following query returns **all paths** starting in node 1: + +```sql +WITH RECURSIVE paths(startNode, endNode, path) AS ( + SELECT -- Define the path as the first edge of the traversal + node1id AS startNode, + node2id AS endNode, + [node1id, node2id] AS path + FROM edge + WHERE startNode = 1 + UNION ALL + SELECT -- Concatenate new edge to the path + paths.startNode AS startNode, + node2id AS endNode, + array_append(path, node2id) AS path + FROM paths + JOIN edge ON paths.endNode = node1id + -- Prevent adding a repeated node to the path. + -- This ensures that no cycles occur. + WHERE list_position(paths.path, node2id) = 0 + ) +SELECT startNode, endNode, path +FROM paths +ORDER BY length(path), path; +``` + +| startNode | endNode | path | +|----------:|--------:|---------------| +| 1 | 3 | [1, 3] | +| 1 | 5 | [1, 5] | +| 1 | 5 | [1, 3, 5] | +| 1 | 8 | [1, 3, 8] | +| 1 | 10 | [1, 3, 10] | +| 1 | 3 | [1, 5, 3] | +| 1 | 4 | [1, 5, 4] | +| 1 | 8 | [1, 5, 8] | +| 1 | 4 | [1, 3, 5, 4] | +| 1 | 8 | [1, 3, 5, 8] | +| 1 | 8 | [1, 5, 3, 8] | +| 1 | 10 | [1, 5, 3, 10] | + +Note that the result of this query is not restricted to shortest paths, e.g., for node 5, the results include paths `[1, 5]` and `[1, 3, 5]`. + +#### Enumerate Unweighted Shortest Paths from a Node + +In most cases, enumerating all paths is not practical or feasible. Instead, only the **(unweighted) shortest paths** are of interest. To find these, the second half of the `WITH RECURSIVE` query should be adjusted such that it only includes a node if it has not yet been visited. This is implemented by using a subquery that checks if any of the previous paths includes the node: + +```sql +WITH RECURSIVE paths(startNode, endNode, path) AS ( + SELECT -- Define the path as the first edge of the traversal + node1id AS startNode, + node2id AS endNode, + [node1id, node2id] AS path + FROM edge + WHERE startNode = 1 + UNION ALL + SELECT -- Concatenate new edge to the path + paths.startNode AS startNode, + node2id AS endNode, + array_append(path, node2id) AS path + FROM paths + JOIN edge ON paths.endNode = node1id + -- Prevent adding a node that was visited previously by any path. 
+ -- This ensures that (1) no cycles occur and (2) only nodes that + -- were not visited by previous (shorter) paths are added to a path. + WHERE NOT EXISTS ( + FROM paths previous_paths + WHERE list_contains(previous_paths.path, node2id) + ) + ) +SELECT startNode, endNode, path +FROM paths +ORDER BY length(path), path; +``` + +| startNode | endNode | path | +|----------:|--------:|------------| +| 1 | 3 | [1, 3] | +| 1 | 5 | [1, 5] | +| 1 | 8 | [1, 3, 8] | +| 1 | 10 | [1, 3, 10] | +| 1 | 4 | [1, 5, 4] | +| 1 | 8 | [1, 5, 8] | + +#### Enumerate Unweighted Shortest Paths between Two Nodes + +`WITH RECURSIVE` can also be used to find **all (unweighted) shortest paths between two nodes**. To ensure that the recursive query is stopped as soon as we reach the end node, we use a [window function]({% link docs/archive/1.0/sql/functions/window_functions.md %}) which checks whether the end node is among the newly added nodes. + +The following query returns all unweighted shortest paths between nodes 1 (start node) and 8 (end node): + +```sql +WITH RECURSIVE paths(startNode, endNode, path, endReached) AS ( + SELECT -- Define the path as the first edge of the traversal + node1id AS startNode, + node2id AS endNode, + [node1id, node2id] AS path, + (node2id = 8) AS endReached + FROM edge + WHERE startNode = 1 + UNION ALL + SELECT -- Concatenate new edge to the path + paths.startNode AS startNode, + node2id AS endNode, + array_append(path, node2id) AS path, + max(CASE WHEN node2id = 8 THEN 1 ELSE 0 END) + OVER (ROWS BETWEEN UNBOUNDED PRECEDING + AND UNBOUNDED FOLLOWING) AS endReached + FROM paths + JOIN edge ON paths.endNode = node1id + WHERE NOT EXISTS ( + FROM paths previous_paths + WHERE list_contains(previous_paths.path, node2id) + ) + AND paths.endReached = 0 +) +SELECT startNode, endNode, path +FROM paths +WHERE endNode = 8 +ORDER BY length(path), path; +``` + +| startNode | endNode | path | +|----------:|--------:|-----------| +| 1 | 8 | [1, 3, 8] | +| 1 | 8 | [1, 5, 8] | + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/samples.md b/docs/archive/1.0/sql/samples.md new file mode 100644 index 00000000000..5f8286e0045 --- /dev/null +++ b/docs/archive/1.0/sql/samples.md @@ -0,0 +1,131 @@ +--- +layout: docu +railroad: statements/samples.js +title: Samples +--- + +Samples are used to randomly select a subset of a dataset. + +### Examples + +Select a sample of 5 rows from `tbl` using reservoir sampling: + +```sql +SELECT * +FROM tbl +USING SAMPLE 5; +``` + +Select a sample of 10% of the table using system sampling (cluster sampling): + +```sql +SELECT * +FROM tbl +USING SAMPLE 10%; +``` + +Select a sample of 10% of the table using bernoulli sampling: + +```sql +SELECT * +FROM tbl +USING SAMPLE 10 PERCENT (bernoulli); +``` + +Select a sample of 50 rows of the table using reservoir sampling with a fixed seed (100): + +```sql +SELECT * +FROM tbl +USING SAMPLE reservoir(50 ROWS) +REPEATABLE (100); +``` + +Select a sample of 20% of the table using system sampling with a fixed seed (377): + +```sql +SELECT * +FROM tbl +USING SAMPLE 20% (system, 377); +``` + +Select a sample of 20% of `tbl` **before** the join with `tbl2`: + +```sql +SELECT * +FROM tbl TABLESAMPLE reservoir(20%), tbl2 +WHERE tbl.i = tbl2.i; +``` + +Select a sample of 20% of `tbl` **after** the join with `tbl2`: + +```sql +SELECT * +FROM tbl, tbl2 +WHERE tbl.i = tbl2.i +USING SAMPLE reservoir(20%); +``` + +### Syntax + +
+
+Samples allow you to randomly extract a subset of a dataset. Samples are useful for exploring a dataset faster, as often you might not be interested in the exact answers to queries, but only in rough indications of what the data looks like and what is in the data. Samples allow you to get approximate answers to queries faster, as they reduce the amount of data that needs to pass through the query engine.
+
+DuckDB supports three different types of sampling methods: `reservoir`, `bernoulli` and `system`. By default, DuckDB uses `reservoir` sampling when an exact number of rows is sampled, and `system` sampling when a percentage is specified. The sampling methods are described in detail below.
+
+Samples require a *sample size*, which is an indication of how many elements will be sampled from the total population. Samples can either be given as a percentage (`10%`) or as a fixed number of rows (`10 rows`). All three sampling methods support sampling over a percentage, but **only** reservoir sampling supports sampling a fixed number of rows.
+
+Samples are probabilistic, that is to say, samples can be different between runs *unless* the seed is specifically specified. Specifying the seed *only* guarantees that the sample is the same if multi-threading is not enabled (i.e., `SET threads = 1`). In the case of multiple threads running over a sample, samples are not necessarily consistent even with a fixed seed.
+
+### `reservoir`
+
+Reservoir sampling is a stream sampling technique that selects a random sample by keeping a *reservoir* of size equal to the sample size, and randomly replacing elements as more elements come in. Reservoir sampling allows us to specify *exactly* how many elements we want in the resulting sample (by selecting the size of the reservoir). As a result, reservoir sampling *always* outputs the same amount of elements, unlike system and bernoulli sampling.
+
+Reservoir sampling is only recommended for small sample sizes, and is not recommended for use with percentages. That is because reservoir sampling needs to materialize the entire sample and randomly replace tuples within the materialized sample. The larger the sample size, the higher the performance hit incurred by this process.
+
+Reservoir sampling also incurs an additional performance penalty when multi-processing is used, since the reservoir is to be shared amongst the different threads to ensure unbiased sampling. This is not a big problem when the reservoir is very small, but becomes costly when the sample is large.
+
+> Bestpractice Avoid using reservoir sampling with large sample sizes if possible.
+> Reservoir sampling requires the entire sample to be materialized in memory.
+
+### `bernoulli`
+
+Bernoulli sampling can only be used when a sampling percentage is specified. It is rather straightforward: every tuple in the underlying table is included with a chance equal to the specified percentage. As a result, bernoulli sampling can return a different number of tuples even if the same percentage is specified. The amount of rows will generally be more or less equal to the specified percentage of the table, but there will be some variance.
+
+Because bernoulli sampling is completely independent (there is no shared state), there is no penalty for using bernoulli sampling together with multiple threads.
+
+### `system`
+
+System sampling is a variant of bernoulli sampling with one crucial difference: every *vector* is included with a chance equal to the sampling percentage. This is a form of cluster sampling. ￼
System sampling is more efficient than bernoulli sampling, as no per-tuple selections have to be performed. There is almost no extra overhead for using system sampling, whereas bernoulli sampling can add additional cost as it has to perform random number generation for every single tuple. + +System sampling is not suitable for smaller data sets as the granularity of the sampling is on the order of ~1000 tuples. That means that if system sampling is used for small data sets (e.g., 100 rows) either all the data will be filtered out, or all the data will be included. + +## Table Samples + +The `TABLESAMPLE` and `USING SAMPLE` clauses are identical in terms of syntax and effect, with one important difference: tablesamples sample directly from the table for which they are specified, whereas the sample clause samples after the entire from clause has been resolved. This is relevant when there are joins present in the query plan. + +The `TABLESAMPLE` clause is essentially equivalent to creating a subquery with the `USING SAMPLE` clause, i.e., the following two queries are identical: + +Sample 20% of `tbl` **before** the join: + +```sql +SELECT * FROM tbl TABLESAMPLE reservoir(20%), tbl2 WHERE tbl.i = tbl2.i; +``` + +Sample 20% of `tbl` **before** the join: + +```sql +SELECT * +FROM (SELECT * FROM tbl USING SAMPLE reservoir(20%)) tbl, tbl2 +WHERE tbl.i = tbl2.i; +``` + +Sample 20% **after** the join (i.e., sample 20% of the join result): + +```sql +SELECT * +FROM tbl, tbl2 +WHERE tbl.i = tbl2.i +USING SAMPLE reservoir(20%); +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/alter_table.md b/docs/archive/1.0/sql/statements/alter_table.md new file mode 100644 index 00000000000..0e22a19d299 --- /dev/null +++ b/docs/archive/1.0/sql/statements/alter_table.md @@ -0,0 +1,166 @@ +--- +layout: docu +railroad: statements/alter.js +title: ALTER TABLE Statement +--- + +The `ALTER TABLE` statement changes the schema of an existing table in the catalog. + +## Examples + +Add a new column with name `k` to the table `integers`, it will be filled with the default value NULL: + +```sql +ALTER TABLE integers ADD COLUMN k INTEGER; +``` + +Add a new column with name `l` to the table integers, it will be filled with the default value 10: + +```sql +ALTER TABLE integers ADD COLUMN l INTEGER DEFAULT 10; +``` + +Drop the column `k` from the table integers: + +```sql +ALTER TABLE integers DROP k; +``` + +Change the type of the column `i` to the type `VARCHAR` using a standard cast: + +```sql +ALTER TABLE integers ALTER i TYPE VARCHAR; +``` + +Change the type of the column `i` to the type `VARCHAR`, using the specified expression to convert the data for each row: + +```sql +ALTER TABLE integers ALTER i SET DATA TYPE VARCHAR USING concat(i, '_', j); +``` + +Set the default value of a column: + +```sql +ALTER TABLE integers ALTER COLUMN i SET DEFAULT 10; +``` + +Drop the default value of a column: + +```sql +ALTER TABLE integers ALTER COLUMN i DROP DEFAULT; +``` + +Make a column not nullable: + +```sql +ALTER TABLE t ALTER COLUMN x SET NOT NULL; +``` + +Drop the not null constraint: + +```sql +ALTER TABLE t ALTER COLUMN x DROP NOT NULL; +``` + +Rename a table: + +```sql +ALTER TABLE integers RENAME TO integers_old; +``` + +Rename a column of a table: + +```sql +ALTER TABLE integers RENAME i TO j; +``` + +## Syntax + +
+ +`ALTER TABLE` changes the schema of an existing table. All the changes made by `ALTER TABLE` fully respect the transactional semantics, i.e., they will not be visible to other transactions until committed, and can be fully reverted through a rollback. + +## `RENAME TABLE` + +Rename a table: + +```sql +ALTER TABLE integers RENAME TO integers_old; +``` + +The `RENAME TO` clause renames an entire table, changing its name in the schema. Note that any views that rely on the table are **not** automatically updated. + +## `RENAME COLUMN` + +Rename a column of a table: + +```sql +ALTER TABLE integers RENAME i TO j; +ALTER TABLE integers RENAME COLUMN j TO k; +``` + +The `RENAME COLUMN` clause renames a single column within a table. Any constraints that rely on this name (e.g., `CHECK` constraints) are automatically updated. However, note that any views that rely on this column name are **not** automatically updated. + +## `ADD COLUMN` + +Add a new column with name `k` to the table `integers`, it will be filled with the default value NULL: + +```sql +ALTER TABLE integers ADD COLUMN k INTEGER; +``` + +Add a new column with name `l` to the table integers, it will be filled with the default value 10: + +```sql +ALTER TABLE integers ADD COLUMN l INTEGER DEFAULT 10; +``` + +The `ADD COLUMN` clause can be used to add a new column of a specified type to a table. The new column will be filled with the specified default value, or `NULL` if none is specified. + +## `DROP COLUMN` + +Drop the column `k` from the table `integers`: + +```sql +ALTER TABLE integers DROP k; +``` + +The `DROP COLUMN` clause can be used to remove a column from a table. Note that columns can only be removed if they do not have any indexes that rely on them. This includes any indexes created as part of a `PRIMARY KEY` or `UNIQUE` constraint. Columns that are part of multi-column check constraints cannot be dropped either. + +## `ALTER TYPE` + +Change the type of the column `i` to the type `VARCHAR` using a standard cast: + +```sql +ALTER TABLE integers ALTER i TYPE VARCHAR; +``` + +Change the type of the column `i` to the type `VARCHAR`, using the specified expression to convert the data for each row: + +```sql +ALTER TABLE integers ALTER i SET DATA TYPE VARCHAR USING concat(i, '_', j); +``` + +The `SET DATA TYPE` clause changes the type of a column in a table. Any data present in the column is converted according to the provided expression in the `USING` clause, or, if the `USING` clause is absent, cast to the new data type. Note that columns can only have their type changed if they do not have any indexes that rely on them and are not part of any `CHECK` constraints. + +## `SET` / `DROP DEFAULT` + +Set the default value of a column: + +```sql +ALTER TABLE integers ALTER COLUMN i SET DEFAULT 10; +``` + +Drop the default value of a column: + +```sql +ALTER TABLE integers ALTER COLUMN i DROP DEFAULT; +``` + +The `SET/DROP DEFAULT` clause modifies the `DEFAULT` value of an existing column. Note that this does not modify any existing data in the column. Dropping the default is equivalent to setting the default value to NULL. + +> Warning At the moment DuckDB will not allow you to alter a table if there are any dependencies. That means that if you have an index on a column you will first need to drop the index, alter the table, and then recreate the index. Otherwise, you will get a `Dependency Error`. + +## `ADD` / `DROP CONSTRAINT` + +> The `ADD CONSTRAINT` and `DROP CONSTRAINT` clauses are not yet supported in DuckDB. 
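+
+A sketch of the workaround described in the warning above, assuming a hypothetical index `idx_i` on column `i` of the `integers` table:
+
+```sql
+DROP INDEX idx_i;
+ALTER TABLE integers ALTER i TYPE VARCHAR;
+CREATE INDEX idx_i ON integers (i);
+```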
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/alter_view.md b/docs/archive/1.0/sql/statements/alter_view.md new file mode 100644 index 00000000000..86f3c86b6b6 --- /dev/null +++ b/docs/archive/1.0/sql/statements/alter_view.md @@ -0,0 +1,16 @@ +--- +layout: docu +title: ALTER VIEW Statement +--- + +The `ALTER VIEW` statement changes the schema of an existing view in the catalog. + +## Examples + +Rename a view: + +```sql +ALTER VIEW v1 RENAME TO v2; +``` + +`ALTER VIEW` changes the schema of an existing table. All the changes made by `ALTER VIEW` fully respect the transactional semantics, i.e., they will not be visible to other transactions until committed, and can be fully reverted through a rollback. Note that other views that rely on the table are **not** automatically updated. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/analyze.md b/docs/archive/1.0/sql/statements/analyze.md new file mode 100644 index 00000000000..aac27e64fac --- /dev/null +++ b/docs/archive/1.0/sql/statements/analyze.md @@ -0,0 +1,16 @@ +--- +layout: docu +title: ANALYZE Statement +--- + +The `ANALYZE` statement recomputes the statistics on DuckDB's tables. + +## Usage + +The statistics recomputed by the `ANALYZE` statement are only used for [join order optimization](https://blobs.duckdb.org/papers/tom-ebergen-msc-thesis-join-order-optimization-with-almost-no-statistics.pdf). It is therefore recommended to recompute these statistics for improved join orders, especially after performing large updates (inserts and/or deletes). + +To recompute the statistics, run: + +```sql +ANALYZE; +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/attach.md b/docs/archive/1.0/sql/statements/attach.md new file mode 100644 index 00000000000..9b0f9db1a59 --- /dev/null +++ b/docs/archive/1.0/sql/statements/attach.md @@ -0,0 +1,227 @@ +--- +layout: docu +railroad: statements/attach.js +title: ATTACH/DETACH Statement +--- + +DuckDB allows attaching to and detaching from database files. + +## Examples + +Attach the database `file.db` with the alias inferred from the name (`file`): + +```sql +ATTACH 'file.db'; +``` + +Attach the database `file.db` with an explicit alias (`file_db`): + +```sql +ATTACH 'file.db' AS file_db; +``` + +Attach the database `file.db` in read only mode: + +```sql +ATTACH 'file.db' (READ_ONLY); +``` + +Attach a SQLite database for reading and writing (see the [`sqlite` extension]({% link docs/archive/1.0/extensions/sqlite.md %}) for more information): + +```sql +ATTACH 'sqlite_file.db' AS sqlite_db (TYPE SQLITE); +``` + +Attach the database `file.db` if inferred database alias `file` does not yet exist: + +```sql +ATTACH IF NOT EXISTS 'file.db'; +``` + +Attach the database `file.db` if explicit database alias `file_db` does not yet exist: + +```sql +ATTACH IF NOT EXISTS 'file.db' AS file_db; +``` + +Create a table in the attached database with alias `file`: + +```sql +CREATE TABLE file.new_table (i INTEGER); +``` + +Detach the database with alias `file`: + +```sql +DETACH file; +``` + +Show a list of all attached databases: + +```sql +SHOW DATABASES; +``` + +Change the default database that is used to the database `file`: + +```sql +USE file; +``` + +## Attach + +The `ATTACH` statement adds a new database file to the catalog that can be read from and written to. +Note that attachment definitions are not persisted between sessions: when a new session is launched, you have to re-attach to all databases. + +### Attach Syntax + +
+ +`ATTACH` allows DuckDB to operate on multiple database files, and allows for transfer of data between different database files. + +## Detach + +The `DETACH` statement allows previously attached database files to be closed and detached, releasing any locks held on the database file. + +Note that it is not possible to detach from the default database: if you would like to do so, issue the [`USE` statement]({% link docs/archive/1.0/sql/statements/use.md %}) to change the default database to another one. For example, if you are connected to a persistent database, you may change to an in-memory database by issuing: + +```sql +ATTACH ':memory:' AS memory_db; +USE memory_db; +``` + +> Warning Closing the connection, e.g., invoking the [`close()` function in Python]({% link docs/archive/1.0/api/python/dbapi.md %}#connection), does not release the locks held on the database files as the file handles are held by the main DuckDB instance (in Python's case, the `duckdb` module). + +### Detach Syntax + +
+ +## Name Qualification + +The fully qualified name of catalog objects contains the *catalog*, the *schema* and the *name* of the object. For example: + +Attach the database `new_db`: + +```sql +ATTACH 'new_db.db'; +``` + +Create the schema `my_schema` in the database `new_db`: + +```sql +CREATE SCHEMA new_db.my_schema; +``` + +Create the table `my_table` in the schema `my_schema`: + +```sql +CREATE TABLE new_db.my_schema.my_table (col INTEGER); +``` + +Refer to the column `col` inside the table `my_table`: + +```sql +SELECT new_db.my_schema.my_table.col FROM new_db.my_schema.my_table; +``` + +Note that often the fully qualified name is not required. When a name is not fully qualified, the system looks for which entries to reference using the *catalog search path*. The default catalog search path includes the system catalog, the temporary catalog and the initially attached database together with the `main` schema. + +Also note the rules on [identifiers and database names in particular]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#database-names). + +### Default Database and Schema + +When a table is created without any qualifications, the table is created in the default schema of the default database. The default database is the database that is launched when the system is created – and the default schema is `main`. + +Create the table `my_table` in the default database: + +```sql +CREATE TABLE my_table (col INTEGER); +``` + +### Changing the Default Database and Schema + +The default database and schema can be changed using the `USE` command. + +Set the default database schema to `new_db.main`: + +```sql +USE new_db; +``` + +Set the default database schema to `new_db.my_schema`: + +```sql +USE new_db.my_schema; +``` + +### Resolving Conflicts + +When providing only a single qualification, the system can interpret this as *either* a catalog *or* a schema, as long as there are no conflicts. For example: + +```sql +ATTACH 'new_db.db'; +CREATE SCHEMA my_schema; +``` + +Creates the table `new_db.main.tbl`: + +```sql +CREATE TABLE new_db.tbl (i INTEGER); +``` + +Creates the table `default_db.my_schema.tbl`: + +```sql +CREATE TABLE my_schema.tbl (i INTEGER); +``` + +If we create a conflict (i.e., we have both a schema and a catalog with the same name) the system requests that a fully qualified path is used instead: + +```sql +CREATE SCHEMA new_db; +CREATE TABLE new_db.tbl (i INTEGER); +``` + +```console +Error: Binder Error: Ambiguous reference to catalog or schema "new_db" - +use a fully qualified path like "memory.new_db" +``` + +### Changing the Catalog Search Path + +The catalog search path can be adjusted by setting the `search_path` configuration option, which uses a comma-separated list of values that will be on the search path. The following example demonstrates searching in two databases: + +```sql +ATTACH ':memory:' AS db1; +ATTACH ':memory:' AS db2; +CREATE table db1.tbl1 (i INTEGER); +CREATE table db2.tbl2 (j INTEGER); +``` + +Reference the tables using their fully qualified name: + +```sql +SELECT * FROM db1.tbl1; +SELECT * FROM db2.tbl2; +``` + +Or set the search path and reference the tables using their name: + +```sql +SET search_path = 'db1,db2'; +SELECT * FROM tbl1; +SELECT * FROM tbl2; +``` + +## Transactional Semantics + +When running queries on multiple databases, the system opens separate transactions per database. 
The transactions are started *lazily* by default – when a given database is referenced for the first time in a query, a transaction for that database will be started. `SET immediate_transaction_mode = true` can be used to change this behavior and eagerly start transactions in all attached databases instead.
+
+While multiple transactions can be active at a time, the system only supports *writing* to a single attached database in a single transaction. If you try to write to multiple attached databases in a single transaction, the following error will be thrown:
+
+```console
+Attempting to write to database "db2" in a transaction that has already modified database "db1" -
+a single transaction can only write to a single attached database.
+```
+
+The reason for this restriction is that the system does not maintain atomicity for transactions across attached databases. Transactions are only atomic *within* each database file. By restricting the global transaction to write to only a single database file, the atomicity guarantees are maintained.
\ No newline at end of file
diff --git a/docs/archive/1.0/sql/statements/call.md b/docs/archive/1.0/sql/statements/call.md
new file mode 100644
index 00000000000..21dc43939da
--- /dev/null
+++ b/docs/archive/1.0/sql/statements/call.md
@@ -0,0 +1,33 @@
+---
+layout: docu
+railroad: statements/call.js
+title: CALL Statement
+---
+
+The `CALL` statement invokes the given table function and returns the results.
+
+## Examples
+
+Invoke the `duckdb_functions` table function:
+
+```sql
+CALL duckdb_functions();
+```
+
+Invoke the `pragma_table_info` table function:
+
+```sql
+CALL pragma_table_info('pg_am');
+```
+
+Select only the functions where the name starts with `ST_`:
+
+```sql
+SELECT function_name, parameters, parameter_types, return_type
+FROM duckdb_functions()
+WHERE function_name LIKE 'ST_%';
+```
+
+## Syntax
+
+
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/checkpoint.md b/docs/archive/1.0/sql/statements/checkpoint.md new file mode 100644 index 00000000000..5100ce50fc9 --- /dev/null +++ b/docs/archive/1.0/sql/statements/checkpoint.md @@ -0,0 +1,50 @@ +--- +layout: docu +railroad: statements/checkpoint.js +title: CHECKPOINT Statement +--- + +The `CHECKPOINT` statement synchronizes data in the write-ahead log (WAL) to the database data file. For in-memory +databases this statement will succeed with no effect. + +## Examples + +Synchronize data in the default database: + +```sql +CHECKPOINT; +``` + +Synchronize data in the specified database: + +```sql +CHECKPOINT file_db; +``` + +Abort any in-progress transactions to synchronize the data: + +```sql +FORCE CHECKPOINT; +``` + +## Syntax + +
+ +Checkpoint operations happen automatically based on the WAL size (see [Configuration]({% link docs/archive/1.0/configuration/overview.md %})). This +statement is for manual checkpoint actions. + +## Behavior + +The default `CHECKPOINT` command will fail if there are any running transactions. Including `FORCE` will abort any +transactions and execute the checkpoint operation. + +Also see the related [`PRAGMA` option]({% link docs/archive/1.0/configuration/pragmas.md %}#force-checkpoint) for further behavior modification. + +### Reclaiming Space + +When performing a checkpoint (automatic or otherwise), the space occupied by deleted rows is partially reclaimed. Note that this does not remove all deleted rows, but rather merges row groups that have a significant amount of deletes together. In the current implementation this requires ~25% of rows to be deleted in adjacent row groups. + +When running in in-memory mode, checkpointing has no effect, hence it does not reclaim space after deletes in in-memory databases. + +> Warning The [`VACUUM` statement]({% link docs/archive/1.0/sql/statements/vacuum.md %}) does _not_ trigger vacuuming deletes and hence does not reclaim space. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/comment_on.md b/docs/archive/1.0/sql/statements/comment_on.md new file mode 100644 index 00000000000..103aaffe86c --- /dev/null +++ b/docs/archive/1.0/sql/statements/comment_on.md @@ -0,0 +1,127 @@ +--- +layout: docu +railroad: statements/comment.js +title: COMMENT ON Statement +--- + +The `COMMENT ON` statement allows adding metadata to catalog entries (tables, columns, etc.). +It follows the [PostgreSQL syntax](https://www.postgresql.org/docs/16/sql-comment.html). + +## Examples + +Create a comment on a `TABLE`: + +```sql +COMMENT ON TABLE test_table IS 'very nice table'; +``` + +Create a comment on a `COLUMN`: + +```sql +COMMENT ON COLUMN test_table.test_table_column IS 'very nice column'; +``` + +Create a comment on a `VIEW`: + +```sql +COMMENT ON VIEW test_view IS 'very nice view'; +``` + +Create a comment on an `INDEX`: + +```sql +COMMENT ON INDEX test_index IS 'very nice index'; +``` + +Create a comment on a `SEQUENCE`: + +```sql +COMMENT ON SEQUENCE test_sequence IS 'very nice sequence'; +``` + +Create a comment on a `TYPE`: + +```sql +COMMENT ON TYPE test_type IS 'very nice type'; +``` + +Create a comment on a `MACRO`: + +```sql +COMMENT ON MACRO test_macro IS 'very nice macro'; +``` + +Create a comment on a `MACRO TABLE`: + +```sql +COMMENT ON MACRO TABLE test_table_macro IS 'very nice table macro'; +``` + +To unset a comment, set it to `NULL`, e.g.: + +```sql +COMMENT ON TABLE test_table IS NULL; +``` + +## Reading Comments + +Comments can be read by querying the `comment` column of the respective [metadata functions]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}): + +List comments on `TABLE`s: + +```sql +SELECT comment FROM duckdb_tables(); +``` + +List comments on `COLUMN`s: + +```sql +SELECT comment FROM duckdb_columns(); +``` + +List comments on `VIEW`s: + +```sql +SELECT comment FROM duckdb_views(); +``` + +List comments on `INDEX`s: + +```sql +SELECT comment FROM duckdb_indexes(); +``` + +List comments on `SEQUENCE`s: + +```sql +SELECT comment FROM duckdb_sequences(); +``` + +List comments on `TYPE`s: + +```sql +SELECT comment FROM duckdb_types(); +``` + +List comments on `MACRO`s: + +```sql +SELECT comment FROM duckdb_functions(); +``` + +List comments on `MACRO TABLE`s: + +```sql +SELECT comment FROM 
duckdb_functions(); +``` + +## Limitations + +The `COMMENT ON` statement currently has the following limitations: + +* It is not possible to comment on schemas or databases. +* It is not possible to comment on things that have a dependency (e.g., a table with an index). + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/copy.md b/docs/archive/1.0/sql/statements/copy.md new file mode 100644 index 00000000000..67d009eb0d4 --- /dev/null +++ b/docs/archive/1.0/sql/statements/copy.md @@ -0,0 +1,350 @@ +--- +layout: docu +railroad: statements/copy.js +title: COPY Statement +--- + +## Examples + +Read a CSV file into the `lineitem` table, using auto-detected CSV options: + +```sql +COPY lineitem FROM 'lineitem.csv'; +``` + +Read a CSV file into the `lineitem` table, using manually specified CSV options: + +```sql +COPY lineitem FROM 'lineitem.csv' (DELIMITER '|'); +``` + +Read a Parquet file into the `lineitem` table: + +```sql +COPY lineitem FROM 'lineitem.pq' (FORMAT PARQUET); +``` + +Read a JSON file into the `lineitem` table, using auto-detected options: + +```sql +COPY lineitem FROM 'lineitem.json' (FORMAT JSON, AUTO_DETECT true); +``` + +Read a CSV file into the `lineitem` table, using double quotes: + +```sql +COPY lineitem FROM "lineitem.csv"; +``` + +Read a CSV file into the `lineitem` table, omitting quotes: + +```sql +COPY lineitem FROM lineitem.csv; +``` + +Write a table to a CSV file: + +```sql +COPY lineitem TO 'lineitem.csv' (FORMAT CSV, DELIMITER '|', HEADER); +``` + +Write a table to a CSV file, using double quotes: + +```sql +COPY lineitem TO "lineitem.csv"; +``` + +Write a table to a CSV file, omitting quotes: + +```sql +COPY lineitem TO lineitem.csv; +``` + +Write the result of a query to a Parquet file: + +```sql +COPY (SELECT l_orderkey, l_partkey FROM lineitem) TO 'lineitem.parquet' (COMPRESSION ZSTD); +``` + +Copy the entire content of database `db1` to database `db2`: + +```sql +COPY FROM DATABASE db1 TO db2; +``` + +Copy only the schema (catalog elements) but not any data: + +```sql +COPY FROM DATABASE db1 TO db2 (SCHEMA); +``` + +## Overview + +`COPY` moves data between DuckDB and external files. `COPY ... FROM` imports data into DuckDB from an external file. `COPY ... TO` writes data from DuckDB to an external file. The `COPY` command can be used for `CSV`, `PARQUET` and `JSON` files. + +## `COPY ... FROM` + +`COPY ... FROM` imports data from an external file into an existing table. The data is appended to whatever data is in the table already. The amount of columns inside the file must match the amount of columns in the table `table_name`, and the contents of the columns must be convertible to the column types of the table. In case this is not possible, an error will be thrown. + +If a list of columns is specified, `COPY` will only copy the data in the specified columns from the file. If there are any columns in the table that are not in the column list, `COPY ... 
FROM` will insert the default values for those columns + +Copy the contents of a comma-separated file `test.csv` without a header into the table `test`: + +```sql +COPY test FROM 'test.csv'; +``` + +Copy the contents of a comma-separated file with a header into the `category` table: + +```sql +COPY category FROM 'categories.csv' (HEADER); +``` + +Copy the contents of `lineitem.tbl` into the `lineitem` table, where the contents are delimited by a pipe character (`|`): + +```sql +COPY lineitem FROM 'lineitem.tbl' (DELIMITER '|'); +``` + +Copy the contents of `lineitem.tbl` into the `lineitem` table, where the delimiter, quote character, and presence of a header are automatically detected: + +```sql +COPY lineitem FROM 'lineitem.tbl' (AUTO_DETECT true); +``` + +Read the contents of a comma-separated file `names.csv` into the `name` column of the `category` table. Any other columns of this table are filled with their default value: + +```sql +COPY category(name) FROM 'names.csv'; +``` + +Read the contents of a Parquet file `lineitem.parquet` into the `lineitem` table: + +```sql +COPY lineitem FROM 'lineitem.parquet' (FORMAT PARQUET); +``` + +Read the contents of a newline-delimited JSON file `lineitem.ndjson` into the `lineitem` table: + +```sql +COPY lineitem FROM 'lineitem.ndjson' (FORMAT JSON); +``` + +Read the contents of a JSON file `lineitem.json` into the `lineitem` table: + +```sql +COPY lineitem FROM 'lineitem.json' (FORMAT JSON, ARRAY true); +``` + +### Syntax + +
+ +## `COPY ... TO` + +`COPY ... TO` exports data from DuckDB to an external CSV or Parquet file. It has mostly the same set of options as `COPY ... FROM`, however, in the case of `COPY ... TO` the options specify how the file should be written to disk. Any file created by `COPY ... TO` can be copied back into the database by using `COPY ... FROM` with a similar set of options. + +The `COPY ... TO` function can be called specifying either a table name, or a query. When a table name is specified, the contents of the entire table will be written into the resulting file. When a query is specified, the query is executed and the result of the query is written to the resulting file. + +Copy the contents of the `lineitem` table to a CSV file with a header: + +```sql +COPY lineitem TO 'lineitem.csv'; +``` + +Copy the contents of the `lineitem` table to the file `lineitem.tbl`, where the columns are delimited by a pipe character (`|`), including a header line: + +```sql +COPY lineitem TO 'lineitem.tbl' (DELIMITER '|'); +``` + +Use tab separators to create a TSV file without a header: + +```sql +COPY lineitem TO 'lineitem.tsv' (DELIMITER '\t', HEADER false); +``` + +Copy the l_orderkey column of the `lineitem` table to the file `orderkey.tbl`: + +```sql +COPY lineitem(l_orderkey) TO 'orderkey.tbl' (DELIMITER '|'); +``` + +Copy the result of a query to the file `query.csv`, including a header with column names: + +```sql +COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.csv' (DELIMITER ','); +``` + +Copy the result of a query to the Parquet file `query.parquet`: + +```sql +COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.parquet' (FORMAT PARQUET); +``` + +Copy the result of a query to the newline-delimited JSON file `query.ndjson`: + +```sql +COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.ndjson' (FORMAT JSON); +``` + +Copy the result of a query to the JSON file `query.json`: + +```sql +COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.json' (FORMAT JSON, ARRAY true); +``` + +### `COPY ... TO` Options + +Zero or more copy options may be provided as a part of the copy operation. The `WITH` specifier is optional, but if any options are specified, the parentheses are required. Parameter values can be passed in with or without wrapping in single quotes. + +Any option that is a Boolean can be enabled or disabled in multiple ways. You can write `true`, `ON`, or `1` to enable the option, and `false`, `OFF`, or `0` to disable it. The `BOOLEAN` value can also be omitted, e.g., by only passing `(HEADER)`, in which case `true` is assumed. + +The below options are applicable to all formats written with `COPY`. + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `FILE_SIZE_BYTES` | If this parameter is set, the `COPY` process creates a directory which will contain the exported files. If a file exceeds the set limit (specified as bytes such as `1000` or in human-readable format such as `1k`), the process creates a new file in the directory. This parameter works in combination with `PER_THREAD_OUTPUT`. Note that the size is used as an approximation, and files can be occasionally slightly over the limit. | `VARCHAR` or `BIGINT` | (empty) | +| `FORMAT` | Specifies the copy function to use. The default is selected from the file extension (e.g., `.parquet` results in a Parquet file being written/read). If the file extension is unknown `CSV` is selected. Available options are `CSV`, `PARQUET` and `JSON`. 
| `VARCHAR` | auto | +| `OVERWRITE_OR_IGNORE` | Whether or not to allow overwriting a directory if one already exists. Only has an effect when used with `partition_by`. | `BOOL` | `false` | +| `PARTITION_BY` | The columns to partition by using a Hive partitioning scheme, see the [partitioned writes section]({% link docs/archive/1.0/data/partitioning/partitioned_writes.md %}). | `VARCHAR[]` | (empty) | +| `PER_THREAD_OUTPUT` | Generate one file per thread, rather than one file in total. This allows for faster parallel writing. | `BOOL` | `false` | +| `USE_TMP_FILE` | Whether or not to write to a temporary file first if the original file exists (`target.csv.tmp`). This prevents overwriting an existing file with a broken file in case the writing is cancelled. | `BOOL` | `auto` | + +### Syntax + +
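+
+As a sketch of how the generic options above combine (the `lineitem` table and the output directory name are placeholders), the following writes one Parquet file per thread and starts a new file once a thread's current file exceeds roughly 128 MB:
+
+```sql
+COPY lineitem TO 'lineitem-out'
+    (FORMAT PARQUET, PER_THREAD_OUTPUT, FILE_SIZE_BYTES '128MB');
+```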
+ +## `COPY FROM DATABASE ... TO` + +The `COPY FROM DATABASE ... TO` statement copies the entire content from one attached database to another attached database. This includes the schema, including constraints, indexes, sequences, macros, and the data itself. + +```sql +ATTACH 'db1.db' AS db1; +CREATE TABLE db1.tbl AS SELECT 42 AS x, 3 AS y; +CREATE MACRO db1.two_x_plus_y(x, y) AS 2 * x + y; + +ATTACH 'db2.db' AS db2; +COPY FROM DATABASE db1 TO db2; +SELECT db2.two_x_plus_y(x, y) AS z FROM db2.tbl; +``` + +| z | +|---:| +| 87 | + +To only copy the **schema** of `db1` to `db2` but omit copying the data, add `SCHEMA` to the statement: + +```sql +COPY FROM DATABASE db1 TO db2 (SCHEMA); +``` + +### Syntax + +
+ +## Format-Specific Options + +### CSV Options + +The below options are applicable when writing `CSV` files. + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `COMPRESSION` | The compression type for the file. By default this will be detected automatically from the file extension (e.g., `file.csv.gz` will use gzip, `file.csv` will use `none`). Options are `none`, `gzip`, `zstd`. | `VARCHAR` | `auto` | +| `DATEFORMAT` | Specifies the date format to use when writing dates. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | (empty) | +| `DELIM` or `SEP` | The character that is written to separate columns within each row. | `VARCHAR` | `,` | +| `ESCAPE` | The character that should appear before a character that matches the `quote` value. | `VARCHAR` | `"` | +| `FORCE_QUOTE` | The list of columns to always add quotes to, even if not required. | `VARCHAR[]` | `[]` | +| `HEADER` | Whether or not to write a header for the CSV file. | `BOOL` | `true` | +| `NULLSTR` | The string that is written to represent a `NULL` value. | `VARCHAR` | (empty) | +| `QUOTE` | The quoting character to be used when a data value is quoted. | `VARCHAR` | `"` | +| `TIMESTAMPFORMAT` | Specifies the date format to use when writing timestamps. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | (empty) | + +### Parquet Options + +The below options are applicable when writing `Parquet` files. + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `COMPRESSION` | The compression format to use (`uncompressed`, `snappy`, `gzip` or `zstd`). | `VARCHAR` | `snappy` | +| `COMPRESSION_LEVEL` | Compression level, set between 1 (lowest compression, fastest) and 22 (highest compression, slowest). Only supported for zstd compression. | `BIGINT` | `3` | +| `FIELD_IDS` | The `field_id` for each column. Pass `auto` to attempt to infer automatically. | `STRUCT` | (empty) | +| `ROW_GROUP_SIZE_BYTES` | The target size of each row group. You can pass either a human-readable string, e.g., `2MB`, or an integer, i.e., the number of bytes. This option is only used when you have issued `SET preserve_insertion_order = false;`, otherwise, it is ignored. | `BIGINT` | `row_group_size * 1024` | +| `ROW_GROUP_SIZE` | The target size, i.e., number of rows, of each row group. 
| `BIGINT` | 122880 | + +Some examples of `FIELD_IDS` are: + +Assign `field_ids` automatically: + +```sql +COPY + (SELECT 128 AS i) + TO 'my.parquet' + (FIELD_IDS 'auto'); +``` + +Sets the `field_id` of column `i` to 42: + +```sql +COPY + (SELECT 128 AS i) + TO 'my.parquet' + (FIELD_IDS {i: 42}); +``` + +Sets the `field_id` of column `i` to 42, and column `j` to 43: + +```sql +COPY + (SELECT 128 AS i, 256 AS j) + TO 'my.parquet' + (FIELD_IDS {i: 42, j: 43}); +``` + +Sets the `field_id` of column `my_struct` to 43, and column `i` (nested inside `my_struct`) to 43: + +```sql +COPY + (SELECT {i: 128} AS my_struct) + TO 'my.parquet' + (FIELD_IDS {my_struct: {__duckdb_field_id: 42, i: 43}}); +``` + +Sets the `field_id` of column `my_list` to 42, and column `element` (default name of list child) to 43: + +```sql +COPY + (SELECT [128, 256] AS my_list) + TO 'my.parquet' + (FIELD_IDS {my_list: {__duckdb_field_id: 42, element: 43}}); +``` + +Sets the `field_id` of column `my_map` to 42, and columns `key` and `value` (default names of map children) to 43 and 44: + +```sql +COPY + (SELECT MAP {'key1' : 128, 'key2': 256} my_map) + TO 'my.parquet' + (FIELD_IDS {my_map: {__duckdb_field_id: 42, key: 43, value: 44}}); +``` + +### JSON Options + +The below options are applicable when writing `JSON` files. + +| Name | Description | Type | Default | +|:--|:-----|:-|:-| +| `ARRAY` | Whether to write a JSON array. If `true`, a JSON array of records is written, if `false`, newline-delimited JSON is written | `BOOL` | `false` | +| `COMPRESSION` | The compression type for the file. By default this will be detected automatically from the file extension (e.g., `file.json.gz` will use gzip, `file.json` will use `none`). Options are `none`, `gzip`, `zstd`. | `VARCHAR` | `auto` | +| `DATEFORMAT` | Specifies the date format to use when writing dates. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | (empty) | +| `TIMESTAMPFORMAT` | Specifies the date format to use when writing timestamps. See [Date Format]({% link docs/archive/1.0/sql/functions/dateformat.md %}) | `VARCHAR` | (empty) | + +## Limitations + +`COPY` does not support copying between tables. To copy between tables, use an [`INSERT statement`]({% link docs/archive/1.0/sql/statements/insert.md %}): + +```sql +INSERT INTO tbl2 + FROM tbl1; +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_index.md b/docs/archive/1.0/sql/statements/create_index.md new file mode 100644 index 00000000000..d9a72468bdc --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_index.md @@ -0,0 +1,80 @@ +--- +layout: docu +railroad: statements/indexes.js +title: CREATE INDEX Statement +--- + +## `CREATE INDEX` + +The `CREATE INDEX` statement constructs an index on the specified column(s) of the specified table. Compound indexes on multiple columns/expressions are supported. + +> Unidimensional indexes are supported, while multidimensional indexes are not yet supported. 
+ +### Examples + +Create a unique index `films_id_idx` on the column id of table `films`: + +```sql +CREATE UNIQUE INDEX films_id_idx ON films (id); +``` + +Create index `s_idx` that allows for duplicate values on column `revenue` of table `films`: + +```sql +CREATE INDEX s_idx ON films (revenue); +``` + +Create compound index `gy_idx` on `genre` and `year` columns: + +```sql +CREATE INDEX gy_idx ON films (genre, year); +``` + +Create index `i_index` on the expression of the sum of columns `j` and `k` from table `integers`: + +```sql +CREATE INDEX i_index ON integers ((j + k)); +``` + +### Parameters + +
+ +| Name | Description | +|:-|:-----| +| `UNIQUE` | Causes the system to check for duplicate values in the table when the index is created (if data already exist) and each time data is added. Attempts to insert or update data that would result in duplicate entries will generate an error. | +| `name` | The name of the index to be created. | +| `table` | The name of the table to be indexed. | +| `column` | The name of the column to be indexed. | +| `expression` | An expression based on one or more columns of the table. The expression usually must be written with surrounding parentheses, as shown in the syntax. However, the parentheses can be omitted if the expression has the form of a function call. | +| `index type` | Specified index type, see [Indexes]({% link docs/archive/1.0/sql/indexes.md %}). Optional. | +| `option` | Index option in the form of a boolean true value (e.g., `is_cool`) or a key-value pair (e.g., `my_option = 2`). Optional. | + +### Syntax + +
+ +## `DROP INDEX` + +`DROP INDEX` drops an existing index from the database system. + +### Examples + +Remove the index `title_idx`: + +```sql +DROP INDEX title_idx; +``` + +### Parameters + +
+ +| Name | Description | +|:---|:---| +| `IF EXISTS` | Do not throw an error if the index does not exist. | +| `name` | The name of an index to remove. | + +### Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_macro.md b/docs/archive/1.0/sql/statements/create_macro.md new file mode 100644 index 00000000000..ff52305f15b --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_macro.md @@ -0,0 +1,246 @@ +--- +layout: docu +railroad: statements/createmacro.js +title: CREATE MACRO Statement +--- + +The `CREATE MACRO` statement can create a scalar or table macro (function) in the catalog. +A macro may only be a single `SELECT` statement (similar to a `VIEW`), but it has the benefit of accepting parameters. +For a scalar macro, `CREATE MACRO` is followed by the name of the macro, and optionally parameters within a set of parentheses. The keyword `AS` is next, followed by the text of the macro. By design, a scalar macro may only return a single value. +For a table macro, the syntax is similar to a scalar macro except `AS` is replaced with `AS TABLE`. A table macro may return a table of arbitrary size and shape. + +> If a `MACRO` is temporary, it is only usable within the same database connection and is deleted when the connection is closed. + +## Examples + +### Scalar Macros + +Create a macro that adds two expressions (`a` and `b`): + +```sql +CREATE MACRO add(a, b) AS a + b; +``` + +Create a macro for a case expression: + +```sql +CREATE MACRO ifelse(a, b, c) AS CASE WHEN a THEN b ELSE c END; +``` + +Create a macro that does a subquery: + +```sql +CREATE MACRO one() AS (SELECT 1); +``` + +Create a macro with a common table expression. +Note that parameter names get priority over column names. To work around this, disambiguate using the table name. + +```sql +CREATE MACRO plus_one(a) AS (WITH cte AS (SELECT 1 AS a) SELECT cte.a + a FROM cte); +``` + +Macros are schema-dependent, and have an alias, `FUNCTION`: + +```sql +CREATE FUNCTION main.my_avg(x) AS sum(x) / count(x); +``` + +Create a macro with default constant parameters: + +```sql +CREATE MACRO add_default(a, b := 5) AS a + b; +``` + +Create a macro `arr_append` (with a functionality equivalent to `array_append`): + +```sql +CREATE MACRO arr_append(l, e) AS list_concat(l, list_value(e)); +``` + +### Table Macros + +Create a table macro without parameters: + +```sql +CREATE MACRO static_table() AS TABLE + SELECT 'Hello' AS column1, 'World' AS column2; +``` + +Create a table macro with parameters (that can be of any type): + +```sql +CREATE MACRO dynamic_table(col1_value, col2_value) AS TABLE + SELECT col1_value AS column1, col2_value AS column2; +``` + +Create a table macro that returns multiple rows. It will be replaced if it already exists, and it is temporary (will be automatically deleted when the connection ends): + +```sql +CREATE OR REPLACE TEMP MACRO dynamic_table(col1_value, col2_value) AS TABLE + SELECT col1_value AS column1, col2_value AS column2 + UNION ALL + SELECT 'Hello' AS col1_value, 456 AS col2_value; +``` + +Pass an argument as a list: + +```sql +CREATE MACRO get_users(i) AS TABLE + SELECT * FROM users WHERE uid IN (SELECT unnest(i)); +``` + +An example for how to use the `get_users` table macro is the following: + +```sql +CREATE TABLE users AS + SELECT * + FROM (VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Carl'), (4, 'Dan'), (5, 'Eve')) t(uid, name); +SELECT * FROM get_users([1, 5]); +``` + +## Syntax + +
+ +Macros allow you to create shortcuts for combinations of expressions. + +```sql +CREATE MACRO add(a) AS a + b; +``` + +```console +Binder Error: Referenced column "b" not found in FROM clause! +``` + +This works: + +```sql +CREATE MACRO add(a, b) AS a + b; +``` + +Usage example: + +```sql +SELECT add(1, 2) AS x; +``` + +| x | +|--:| +| 3 | + +However, this fails: + +```sql +SELECT add('hello', 3); +``` + +```console +Binder Error: Could not choose a best candidate function for the function call "+(STRING_LITERAL, INTEGER_LITERAL)". In order to select one, please add explicit type casts. + Candidate functions: + +(DATE, INTEGER) -> DATE + +(INTEGER, INTEGER) -> INTEGER +``` + +Macros can have default parameters. +Unlike some languages, default parameters must be named +when the macro is invoked. + +`b` is a default parameter: + +```sql +CREATE MACRO add_default(a, b := 5) AS a + b; +``` + +The following will result in 42: + +```sql +SELECT add_default(37); +``` + +The following will throw an error: + +```sql +SELECT add_default(40, 2); +``` + +```console +Binder Error: Macro function 'add_default(a)' requires a single positional argument, but 2 positional arguments were provided. +``` + +Default parameters must used by assigning them like the following: + +```sql +SELECT add_default(40, b := 2) AS x; +``` + +| x | +|---:| +| 42 | + +However, the following fails: + +```sql +SELECT add_default(b := 2, 40); +``` + +```console +Binder Error: Positional parameters cannot come after parameters with a default value! +``` + +The order of default parameters does not matter: + +```sql +CREATE MACRO triple_add(a, b := 5, c := 10) AS a + b + c; +``` + +```sql +SELECT triple_add(40, c := 1, b := 1) AS x; +``` + +| x | +|---:| +| 42 | + +When macros are used, they are expanded (i.e., replaced with the original expression), and the parameters within the expanded expression are replaced with the supplied arguments. Step by step: + +The `add` macro we defined above is used in a query: + +```sql +SELECT add(40, 2) AS x; +``` + +Internally, add is replaced with its definition of `a + b`: + +```sql +SELECT a + b; AS x +``` + +Then, the parameters are replaced by the supplied arguments: + +```sql +SELECT 40 + 2 AS x; +``` + +## Limitations + +### Using Named Parameters + +Currently, positional macro parameters can only be used positionally, and named parameters can only be used by supplying their name. Therefore, the following will not work: + +```sql +CREATE MACRO my_macro(a, b := 42) AS (a + b); +SELECT my_macro(32, 52); +``` + +```console +Error: Binder Error: Macro function 'my_macro(a)' requires a single positional argument, but 2 positional arguments were provided. +``` + +### Using Subquery Macros + +If a `MACRO` is defined as a subquery, it cannot be invoked in a table function. DuckDB will return the following error: + +```console +Binder Error: Table function cannot contain subqueries +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_schema.md b/docs/archive/1.0/sql/statements/create_schema.md new file mode 100644 index 00000000000..8bb3a07a60d --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_schema.md @@ -0,0 +1,40 @@ +--- +layout: docu +railroad: statements/createschema.js +title: CREATE SCHEMA Statement +--- + +The `CREATE SCHEMA` statement creates a schema in the catalog. The default schema is `main`. 
+
+## Examples
+
+Create a schema:
+
+```sql
+CREATE SCHEMA s1;
+```
+
+Create a schema if it does not exist yet:
+
+```sql
+CREATE SCHEMA IF NOT EXISTS s2;
+```
+
+Create tables in the schemas:
+
+```sql
+CREATE TABLE s1.t (id INTEGER PRIMARY KEY, other_id INTEGER);
+CREATE TABLE s2.t (id INTEGER PRIMARY KEY, j VARCHAR);
+```
+
+Compute a join between tables from two schemas:
+
+```sql
+SELECT *
+FROM s1.t s1t, s2.t s2t
+WHERE s1t.other_id = s2t.id;
+```
+
+## Syntax
+
+
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_secret.md b/docs/archive/1.0/sql/statements/create_secret.md new file mode 100644 index 00000000000..5924a960c67 --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_secret.md @@ -0,0 +1,15 @@ +--- +layout: docu +railroad: statements/secrets.js +title: CREATE SECRET Statement +--- + +The `CREATE SECRET` statement creates a new secret in the [Secrets Manager]({% link docs/archive/1.0/configuration/secrets_manager.md %}). + +### Syntax for `CREATE SECRET` + +
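+
+For example, a minimal sketch of creating a secret (assuming the S3 secret type provided by the `httpfs` extension; the name and all values are placeholders) might look like:
+
+```sql
+CREATE SECRET my_secret (
+    TYPE S3,
+    KEY_ID 'my_key_id',
+    SECRET 'my_secret_value',
+    REGION 'us-east-1'
+);
+```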
+ +### Syntax for `DROP SECRET` + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_sequence.md b/docs/archive/1.0/sql/statements/create_sequence.md new file mode 100644 index 00000000000..aade87ae51d --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_sequence.md @@ -0,0 +1,164 @@ +--- +layout: docu +railroad: statements/createsequence.js +title: CREATE SEQUENCE Statement +--- + +The `CREATE SEQUENCE` statement creates a new sequence number generator. + +### Examples + +Generate an ascending sequence starting from 1: + +```sql +CREATE SEQUENCE serial; +``` + +Generate sequence from a given start number: + +```sql +CREATE SEQUENCE serial START 101; +``` + +Generate odd numbers using `INCREMENT BY`: + +```sql +CREATE SEQUENCE serial START WITH 1 INCREMENT BY 2; +``` + +Generate a descending sequqnce starting from 99: + +```sql +CREATE SEQUENCE serial START WITH 99 INCREMENT BY -1 MAXVALUE 99; +``` + +By default, cycles are not allowed and will result in error, e.g.: + +```console +Sequence Error: nextval: reached maximum value of sequence "serial" (10) +``` + +```sql +CREATE SEQUENCE serial START WITH 1 MAXVALUE 10; +``` + +`CYCLE` allows cycling through the same sequence repeatedly: + +```sql +CREATE SEQUENCE serial START WITH 1 MAXVALUE 10 CYCLE; +``` + +### Creating and Dropping Sequences + +Sequences can be created and dropped similarly to other catalogue items. + +Overwrite an existing sequence: + +```sql +CREATE OR REPLACE SEQUENCE serial; +``` + +Only create sequence if no such sequence exists yet: + +```sql +CREATE SEQUENCE IF NOT EXISTS serial; +``` + +Remove sequence: + +```sql +DROP SEQUENCE serial; +``` + +Remove sequence if exists: + +```sql +DROP SEQUENCE IF EXISTS serial; +``` + +### Using Sequences for Primary Keys + +Sequences can provide an integer primary key for a table. For example: + +```sql +CREATE SEQUENCE id_sequence START 1; +CREATE TABLE tbl (id INTEGER DEFAULT nextval('id_sequence'), s VARCHAR); +INSERT INTO tbl (s) VALUES ('hello'), ('world'); +SELECT * FROM tbl; +``` + +The script results in the following table: + +| id | s | +|---:|-------| +| 1 | hello | +| 2 | world | + +Sequences can also be added using the [`ALTER TABLE` statement]({% link docs/archive/1.0/sql/statements/alter_table.md %}). The following example adds an `id` column and fills it with values generated by the sequence: + +```sql +CREATE TABLE tbl (s VARCHAR); +INSERT INTO tbl VALUES ('hello'), ('world'); +CREATE SEQUENCE id_sequence START 1; +ALTER TABLE tbl ADD COLUMN id INTEGER DEFAULT nextval('id_sequence'); +SELECT * FROM tbl; +``` + +This script results in the same table as the previous example. + +### Selecting the Next Value + +To select the next number from a sequence, use `nextval`: + +```sql +CREATE SEQUENCE serial START 1; +SELECT nextval('serial') AS nextval; +``` + +| nextval | +|--------:| +| 1 | + +Using this sequence in an `INSERT` command: + +```sql +INSERT INTO distributors VALUES (nextval('serial'), 'nothing'); +``` + +### Selecting the Current Value + +You may also view the current number from the sequence. Note that the `nextval` function must have already been called before calling `currval`, otherwise a Serialization Error (`sequence is not yet defined in this session`) will be thrown. + +```sql +CREATE SEQUENCE serial START 1; +SELECT nextval('serial') AS nextval; +SELECT currval('serial') AS currval; +``` + +| currval | +|--------:| +| 1 | + +### Syntax + +
+ +`CREATE SEQUENCE` creates a new sequence number generator. + +If a schema name is given then the sequence is created in the specified schema. Otherwise it is created in the current schema. Temporary sequences exist in a special schema, so a schema name may not be given when creating a temporary sequence. The sequence name must be distinct from the name of any other sequence in the same schema. + +After a sequence is created, you use the function `nextval` to operate on the sequence. + +## Parameters + +| Name | Description | +|:--|:-----| +| `CYCLE` or `NO CYCLE` | The `CYCLE` option allows the sequence to wrap around when the `maxvalue` or `minvalue` has been reached by an ascending or descending sequence respectively. If the limit is reached, the next number generated will be the `minvalue` or `maxvalue`, respectively. If `NO CYCLE` is specified, any calls to `nextval` after the sequence has reached its maximum value will return an error. If neither `CYCLE` or `NO CYCLE` are specified, `NO CYCLE` is the default. | +| `increment` | The optional clause `INCREMENT BY increment` specifies which value is added to the current sequence value to create a new value. A positive value will make an ascending sequence, a negative one a descending sequence. The default value is 1. | +| `maxvalue` | The optional clause `MAXVALUE maxvalue` determines the maximum value for the sequence. If this clause is not supplied or `NO MAXVALUE` is specified, then default values will be used. The defaults are 2^63 - 1 and -1 for ascending and descending sequences, respectively. | +| `minvalue` | The optional clause `MINVALUE minvalue` determines the minimum value a sequence can generate. If this clause is not supplied or `NO MINVALUE` is specified, then defaults will be used. The defaults are 1 and -(2^63 - 1) for ascending and descending sequences, respectively. | +| `name` | The name (optionally schema-qualified) of the sequence to be created. | +| `start` | The optional clause `START WITH start` allows the sequence to begin anywhere. The default starting value is `minvalue` for ascending sequences and `maxvalue` for descending ones. | +| `TEMPORARY` or `TEMP` | If specified, the sequence object is created only for this session, and is automatically dropped on session exit. Existing permanent sequences with the same name are not visible (in this session) while the temporary sequence exists, unless they are referenced with schema-qualified names. | + +> Sequences are based on `BIGINT` arithmetic, so the range cannot exceed the range of an eight-byte integer (-9223372036854775808 to 9223372036854775807). \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_table.md b/docs/archive/1.0/sql/statements/create_table.md new file mode 100644 index 00000000000..a1ae3f1edda --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_table.md @@ -0,0 +1,289 @@ +--- +layout: docu +railroad: statements/createtable.js +title: CREATE TABLE Statement +--- + +The `CREATE TABLE` statement creates a table in the catalog. 
+ +## Examples + +Create a table with two integer columns (`i` and `j`): + +```sql +CREATE TABLE t1 (i INTEGER, j INTEGER); +``` + +Create a table with a primary key: + +```sql +CREATE TABLE t1 (id INTEGER PRIMARY KEY, j VARCHAR); +``` + +Create a table with a composite primary key: + +```sql +CREATE TABLE t1 (id INTEGER, j VARCHAR, PRIMARY KEY (id, j)); +``` + +Create a table with various different types and constraints: + +```sql +CREATE TABLE t1 ( + i INTEGER NOT NULL, + decimalnr DOUBLE CHECK (decimalnr < 10), + date DATE UNIQUE, + time TIMESTAMP +); +``` + +Create table with `CREATE TABLE ... AS SELECT` (CTAS): + +```sql +CREATE TABLE t1 AS + SELECT 42 AS i, 84 AS j; +``` + +Create a table from a CSV file (automatically detecting column names and types): + +```sql +CREATE TABLE t1 AS + SELECT * + FROM read_csv('path/file.csv'); +``` + +We can use the `FROM`-first syntax to omit `SELECT *`: + +```sql +CREATE TABLE t1 AS + FROM read_csv('path/file.csv'); +``` + +Copy the schema of `t2` to `t1`: + +```sql +CREATE TABLE t1 AS + FROM t2 + LIMIT 0; +``` + +## Temporary Tables + +Temporary tables can be created using the `CREATE TEMP TABLE` or the `CREATE TEMPORARY TABLE` statement (see diagram below). +Temporary tables are session scoped (similar to PostgreSQL for example), meaning that only the specific connection that created them can access them, and once the connection to DuckDB is closed they will be automatically dropped. +Temporary tables reside in memory rather than on disk (even when connecting to a persistent DuckDB), but if the `temp_directory` [configuration]({% link docs/archive/1.0/configuration/overview.md %}) is set when connecting or with a `SET` command, data will be spilled to disk if memory becomes constrained. + +Create a temporary table from a CSV file (automatically detecting column names and types): + +```sql +CREATE TEMP TABLE t1 AS + SELECT * + FROM read_csv('path/file.csv'); +``` + +Allow temporary tables to off-load excess memory to disk: + +```sql +SET temp_directory = '/path/to/directory/'; +``` + +Temporary tables are part of the `temp.main` schema. While discouraged, their names can overlap with the names of the regular database tables. In these cases, use their fully qualified name, e.g., `temp.main.t1`, for disambiguation. + +## `CREATE OR REPLACE` + +The `CREATE OR REPLACE` syntax allows a new table to be created or for an existing table to be overwritten by the new table. This is shorthand for dropping the existing table and then creating the new one. + +Create a table with two integer columns (i and j) even if t1 already exists: + +```sql +CREATE OR REPLACE TABLE t1 (i INTEGER, j INTEGER); +``` + +## `IF NOT EXISTS` + +The `IF NOT EXISTS` syntax will only proceed with the creation of the table if it does not already exist. If the table already exists, no action will be taken and the existing table will remain in the database. + +Create a table with two integer columns (`i` and `j`) only if `t1` does not exist yet: + +```sql +CREATE TABLE IF NOT EXISTS t1 (i INTEGER, j INTEGER); +``` + +## `CREATE TABLE ... AS SELECT` (CTAS) + +DuckDB supports the `CREATE TABLE ... 
AS SELECT` syntax, also known as “CTAS”: + +```sql +CREATE TABLE nums AS + SELECT i + FROM range(0, 3) t(i); +``` + +This syntax can be used in combination with the [CSV reader]({% link docs/archive/1.0/data/csv/overview.md %}), the shorthand to read directly from CSV files without specifying a function, the [`FROM`-first syntax]({% link docs/archive/1.0/sql/query_syntax/from.md %}), and the [HTTP(S) support]({% link docs/archive/1.0/extensions/httpfs/https.md %}), yielding concise SQL commands such as the following: + +```sql +CREATE TABLE flights AS + FROM 'https://duckdb.org/data/flights.csv'; +``` + +The CTAS construct also works with the `OR REPLACE` modifier, yielding `CREATE OR REPLACE TABLE ... AS` statements: + +```sql +CREATE OR REPLACE TABLE flights AS + FROM 'https://duckdb.org/data/flights.csv'; +``` + +Note that it is not possible to create tables using CTAS statements with constraints (primary keys, check constraints, etc.). + +## Check Constraints + +A `CHECK` constraint is an expression that must be satisfied by the values of every row in the table. + +```sql +CREATE TABLE t1 ( + id INTEGER PRIMARY KEY, + percentage INTEGER CHECK (0 <= percentage AND percentage <= 100) +); +INSERT INTO t1 VALUES (1, 5); +INSERT INTO t1 VALUES (2, -1); +``` + +```console +Error: Constraint Error: CHECK constraint failed: t1 +``` + +```sql +INSERT INTO t1 VALUES (3, 101); +``` + +```console +Error: Constraint Error: CHECK constraint failed: t1 +``` + +```sql +CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER CHECK (x < y)); +INSERT INTO t2 VALUES (1, 5, 10); +INSERT INTO t2 VALUES (2, 5, 3); +``` + +```console +Error: Constraint Error: CHECK constraint failed: t2 +``` + +`CHECK` constraints can also be added as part of the `CONSTRAINTS` clause: + +```sql +CREATE TABLE t3 ( + id INTEGER PRIMARY KEY, + x INTEGER, + y INTEGER, + CONSTRAINT x_smaller_than_y CHECK (x < y) +); +INSERT INTO t3 VALUES (1, 5, 10); +INSERT INTO t3 VALUES (2, 5, 3); +``` + +```console +Error: Constraint Error: CHECK constraint failed: t3 +``` + +## Foreign Key Constraints + +A `FOREIGN KEY` is a column (or set of columns) that references another table's primary key. Foreign keys check referential integrity, i.e., the referred primary key must exist in the other table upon insertion. 
+ +```sql +CREATE TABLE t1 (id INTEGER PRIMARY KEY, j VARCHAR); +CREATE TABLE t2 ( + id INTEGER PRIMARY KEY, + t1_id INTEGER, + FOREIGN KEY (t1_id) REFERENCES t1 (id) +); +``` + +Example: + +```sql +INSERT INTO t1 VALUES (1, 'a'); +INSERT INTO t2 VALUES (1, 1); +INSERT INTO t2 VALUES (2, 2); +``` + +```console +Error: Constraint Error: Violates foreign key constraint because key "id: 2" does not exist in the referenced table +``` + +Foreign keys can be defined on composite primary keys: + +```sql +CREATE TABLE t3 (id INTEGER, j VARCHAR, PRIMARY KEY (id, j)); +CREATE TABLE t4 ( + id INTEGER PRIMARY KEY, t3_id INTEGER, t3_j VARCHAR, + FOREIGN KEY (t3_id, t3_j) REFERENCES t3(id, j) +); +``` + +Example: + +```sql +INSERT INTO t3 VALUES (1, 'a'); +INSERT INTO t4 VALUES (1, 1, 'a'); +INSERT INTO t4 VALUES (2, 1, 'b'); +``` + +```console +Error: Constraint Error: Violates foreign key constraint because key "id: 1, j: b" does not exist in the referenced table +``` + +Foreign keys can also be defined on unique columns: + +```sql +CREATE TABLE t5 (id INTEGER UNIQUE, j VARCHAR); +CREATE TABLE t6 ( + id INTEGER PRIMARY KEY, + t5_id INTEGER, + FOREIGN KEY (t5_id) REFERENCES t5(id) +); +``` + +### Limitations + +Foreign keys have the following limitations. + +Foreign keys with cascading deletes (`FOREIGN KEY ... REFERENCES ... ON DELETE CASCADE`) are not supported. + +Inserting into tables with self-referencing foreign keys is currently not supported and will result in the following error: + +```console +Constraint Error: Violates foreign key constraint because key "..." does not exist in the referenced table. +``` + +## Generated Columns + +The `[type] [GENERATED ALWAYS] AS (expr) [VIRTUAL|STORED]` syntax will create a generated column. The data in this kind of column is generated from its expression, which can reference other (regular or generated) columns of the table. Since they are produced by calculations, these columns can not be inserted into directly. + +DuckDB can infer the type of the generated column based on the expression's return type. This allows you to leave out the type when declaring a generated column. It is possible to explicitly set a type, but insertions into the referenced columns might fail if the type can not be cast to the type of the generated column. + +Generated columns come in two varieties: `VIRTUAL` and `STORED`. +The data of virtual generated columns is not stored on disk, instead it is computed from the expression every time the column is referenced (through a select statement). + +The data of stored generated columns is stored on disk and is computed every time the data of their dependencies change (through an `INSERT` / `UPDATE` / `DROP` statement). + +Currently, only the `VIRTUAL` kind is supported, and it is also the default option if the last field is left blank. + +The simplest syntax for a generated column: + +The type is derived from the expression, and the variant defaults to `VIRTUAL`: + +```sql +CREATE TABLE t1 (x FLOAT, two_x AS (2 * x)); +``` + +Fully specifying the same generated column for completeness: + +```sql +CREATE TABLE t1 (x FLOAT, two_x FLOAT GENERATED ALWAYS AS (2 * x) VIRTUAL); +``` + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_type.md b/docs/archive/1.0/sql/statements/create_type.md new file mode 100644 index 00000000000..6cd250b328f --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_type.md @@ -0,0 +1,45 @@ +--- +layout: docu +railroad: statements/createtype.js +title: CREATE TYPE Statement +--- + +The `CREATE TYPE` statement defines a new type in the catalog. + +## Examples + +Create a simple `ENUM` type: + +```sql +CREATE TYPE mood AS ENUM ('happy', 'sad', 'curious'); +``` + +Create a simple `STRUCT` type: + +```sql +CREATE TYPE many_things AS STRUCT(k INTEGER, l VARCHAR); +``` + +Create a simple `UNION` type: + +```sql +CREATE TYPE one_thing AS UNION(number INTEGER, string VARCHAR); +``` + +Create a type alias: + +```sql +CREATE TYPE x_index AS INTEGER; +``` + +## Syntax + +
+ +The `CREATE TYPE` clause defines a new data type available to this DuckDB instance. +These new types can then be inspected in the [`duckdb_types` table]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_types). + +## Limitations + +Extending types to support custom operators (such as the PostgreSQL `&&` operator) is not possible via plain SQL. +Instead, it requires adding additional C++ code. To do this, create an [extension]({% link docs/archive/1.0/extensions/overview.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/create_view.md b/docs/archive/1.0/sql/statements/create_view.md new file mode 100644 index 00000000000..3313aba78b3 --- /dev/null +++ b/docs/archive/1.0/sql/statements/create_view.md @@ -0,0 +1,43 @@ +--- +layout: docu +railroad: statements/createview.js +title: CREATE VIEW Statement +--- + +The `CREATE VIEW` statement defines a new view in the catalog. + +## Examples + +Create a simple view: + +```sql +CREATE VIEW v1 AS SELECT * FROM tbl; +``` + +Create a view or replace it if a view with that name already exists: + +```sql +CREATE OR REPLACE VIEW v1 AS SELECT 42; +``` + +Create a view and replace the column names: + +```sql +CREATE VIEW v1(a) AS SELECT 42; +``` + +The SQL query behind an existing view can be read using the [`duckdb_views()` function]({% link docs/archive/1.0/sql/meta/duckdb_table_functions.md %}#duckdb_views) like this: + +```sql +SELECT sql FROM duckdb_views() WHERE view_name = 'v1'; +``` + +## Syntax + +
+ +`CREATE VIEW` defines a view of a query. The view is not physically materialized. Instead, the query is run every time the view is referenced in a query. + +`CREATE OR REPLACE VIEW` is similar, but if a view of the same name already exists, it is replaced. + +If a schema name is given then the view is created in the specified schema. Otherwise, it is created in the current schema. Temporary views exist in a special schema, so a schema name cannot be given when creating a temporary view. The name of the view must be distinct from the name of any other view or table in the same schema. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/delete.md b/docs/archive/1.0/sql/statements/delete.md new file mode 100644 index 00000000000..27d7c40781a --- /dev/null +++ b/docs/archive/1.0/sql/statements/delete.md @@ -0,0 +1,41 @@ +--- +layout: docu +railroad: statements/delete.js +title: DELETE Statement +--- + +The `DELETE` statement removes rows from the table identified by the table-name. + +## Examples + +Remove the rows matching the condition `i = 2` from the database: + +```sql +DELETE FROM tbl WHERE i = 2; +``` + +Delete all rows in the table `tbl`: + +```sql +DELETE FROM tbl; +``` + +The `TRUNCATE` statement removes all rows from a table, acting as an alias for `DELETE FROM` without a `WHERE` clause: + +```sql +TRUNCATE tbl; +``` + +## Syntax + +
+ +The `DELETE` statement removes rows from the table identified by the table-name. + +If the `WHERE` clause is not present, all records in the table are deleted. If a `WHERE` clause is supplied, then only those rows for which the `WHERE` clause results in true are deleted. Rows for which the expression is false or NULL are retained. + +The `USING` clause allows deleting based on the content of other tables or subqueries. + +## Limitations on Reclaiming Memory and Disk Space + +Running `DELETE` does not mean space is reclaimed. In general, rows are only marked as deleted. DuckDB reclaims space upon [performing a `CHECKPOINT`]({% link docs/archive/1.0/sql/statements/checkpoint.md %}). [`VACUUM`]({% link docs/archive/1.0/sql/statements/vacuum.md %}) currently does not reclaim space. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/describe.md b/docs/archive/1.0/sql/statements/describe.md new file mode 100644 index 00000000000..542b502441b --- /dev/null +++ b/docs/archive/1.0/sql/statements/describe.md @@ -0,0 +1,26 @@ +--- +layout: docu +title: DESCRIBE Statement +--- + +The `DESCRIBE` statement shows the schema of a table, view or query. + +## Usage + +```sql +DESCRIBE tbl; +``` + +In order to summarize a query, prepend `DESCRIBE` to a query. + +```sql +DESCRIBE SELECT * FROM tbl; +``` + +## Alias + +The `SHOW` statement is an alias for `DESCRIBE`. + +## See Also + +For more examples, see the [guide on `DESCRIBE`]({% link docs/archive/1.0/guides/meta/describe.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/drop.md b/docs/archive/1.0/sql/statements/drop.md new file mode 100644 index 00000000000..8d960e85360 --- /dev/null +++ b/docs/archive/1.0/sql/statements/drop.md @@ -0,0 +1,136 @@ +--- +layout: docu +railroad: statements/drop.js +title: DROP Statement +--- + +The `DROP` statement removes a catalog entry added previously with the `CREATE` command. + +## Examples + +Delete the table with the name `tbl`: + +```sql +DROP TABLE tbl; +``` + +Drop the view with the name `v1`; do not throw an error if the view does not exist: + +```sql +DROP VIEW IF EXISTS v1; +``` + +Drop function `fn`: + +```sql +DROP FUNCTION fn; +``` + +Drop index `idx`: + +```sql +DROP INDEX idx; +``` + +Drop schema `sch`: + +```sql +DROP SCHEMA sch; +``` + +Drop sequence `seq`: + +```sql +DROP SEQUENCE seq; +``` + +Drop macro `mcr`: + +```sql +DROP MACRO mcr; +``` + +Drop macro table `mt`: + +```sql +DROP MACRO TABLE mt; +``` + +Drop type `typ`: + +```sql +DROP TYPE typ; +``` + +## Syntax + +
+ +## Dependencies of Dropped Objects + +DuckDB performs limited dependency tracking for some object types. +By default or if the `RESTRICT` clause is provided, the entry will not be dropped if there are any other objects that depend on it. +If the `CASCADE` clause is provided then all the objects that are dependent on the object will be dropped as well. + +```sql +CREATE SCHEMA myschema; +CREATE TABLE myschema.t1 (i INTEGER); +DROP SCHEMA myschema; +``` + +```console +Dependency Error: Cannot drop entry `myschema` because there are entries that depend on it. +Use DROP...CASCADE to drop all dependents. +``` + +The `CASCADE` modifier drops both myschema and `myschema.t1`: + +```sql +CREATE SCHEMA myschema; +CREATE TABLE myschema.t1 (i INTEGER); +DROP SCHEMA myschema CASCADE; +``` + +The following dependencies are tracked and thus will raise an error if the user tries to drop the depending object without the `CASCADE` modifier. + +| Depending object type | Dependant object type | +|--|--| +| `SCHEMA` | `FUNCTION` | +| `SCHEMA` | `INDEX` | +| `SCHEMA` | `MACRO TABLE` | +| `SCHEMA` | `MACRO` | +| `SCHEMA` | `SCHEMA` | +| `SCHEMA` | `SEQUENCE` | +| `SCHEMA` | `TABLE` | +| `SCHEMA` | `TYPE` | +| `SCHEMA` | `VIEW` | +| `TABLE` | `INDEX` | + +## Limitations + +### Dependencies on Views + +Currently, dependencies are not tracked for views. For example, if a view is created that references a table and the table is dropped, then the view will be in an invalid state: + +```sql +CREATE TABLE tbl (i INTEGER); +CREATE VIEW v AS + SELECT i FROM tbl; +DROP TABLE tbl RESTRICT; +SELECT * FROM v; +``` + +```console +Catalog Error: Table with name tbl does not exist! +``` + +## Limitations on Reclaiming Disk Space + +Running `DROP TABLE` should free the memory used by the table, but not always disk space. +Even if disk space does not decrease, the free blocks will be marked as `free`. +For example, if we have a 2 GB file and we drop a 1 GB table, the file might still be 2 GB, but it should have 1 GB of free blocks in it. +To check this, use the following `PRAGMA` and check the number of `free_blocks` in the output: + +```sql +PRAGMA database_size; +``` \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/export.md b/docs/archive/1.0/sql/statements/export.md new file mode 100644 index 00000000000..8a291d4d631 --- /dev/null +++ b/docs/archive/1.0/sql/statements/export.md @@ -0,0 +1,79 @@ +--- +layout: docu +railroad: statements/export.js +title: EXPORT/IMPORT DATABASE Statements +--- + +The `EXPORT DATABASE` command allows you to export the contents of the database to a specific directory. The `IMPORT DATABASE` command allows you to then read the contents again. 
+ +## Examples + +Export the database to the target directory 'target_directory' as CSV files: + +```sql +EXPORT DATABASE 'target_directory'; +``` + +Export to directory 'target_directory', using the given options for the CSV serialization: + +```sql +EXPORT DATABASE 'target_directory' (FORMAT CSV, DELIMITER '|'); +``` + +Export to directory 'target_directory', tables serialized as Parquet: + +```sql +EXPORT DATABASE 'target_directory' (FORMAT PARQUET); +``` + +Export to directory 'target_directory', tables serialized as Parquet, compressed with ZSTD, with a row_group_size of 100,000: + +```sql +EXPORT DATABASE 'target_directory' ( + FORMAT PARQUET, + COMPRESSION ZSTD, + ROW_GROUP_SIZE 100_000 +); +``` + +Reload the database again: + +```sql +IMPORT DATABASE 'source_directory'; +``` + +Alternatively, use a `PRAGMA`: + +```sql +PRAGMA import_database('source_directory'); +``` + +For details regarding the writing of Parquet files, see the [Parquet Files page in the Data Import section]({% link docs/archive/1.0/data/parquet/overview.md %}#writing-to-parquet-files) and the [`COPY` Statement page]({% link docs/archive/1.0/sql/statements/copy.md %}). + +## `EXPORT DATABASE` + +The `EXPORT DATABASE` command exports the full contents of the database – including schema information, tables, views and sequences – to a specific directory that can then be loaded again. The created directory will be structured as follows: + +```text +target_directory/schema.sql +target_directory/load.sql +target_directory/t_1.csv +... +target_directory/t_n.csv +``` + +The `schema.sql` file contains the schema statements that are found in the database. It contains any `CREATE SCHEMA`, `CREATE TABLE`, `CREATE VIEW` and `CREATE SEQUENCE` commands that are necessary to re-construct the database. + +The `load.sql` file contains a set of `COPY` statements that can be used to read the data from the CSV files again. The file contains a single `COPY` statement for every table found in the schema. + +### Syntax + +
+ +## `IMPORT DATABASE` + +The database can be reloaded by using the `IMPORT DATABASE` command again, or manually by running `schema.sql` followed by `load.sql` to re-load the data. + +### Syntax + +
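As a sketch of the manual route (assuming the export lives in `source_directory` and using the command line client's `.read` command), the two scripts can be replayed in order:

```text
.read source_directory/schema.sql
.read source_directory/load.sql
```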
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/insert.md b/docs/archive/1.0/sql/statements/insert.md new file mode 100644 index 00000000000..3971bb754fc --- /dev/null +++ b/docs/archive/1.0/sql/statements/insert.md @@ -0,0 +1,429 @@ +--- +layout: docu +railroad: statements/insert.js +title: INSERT Statement +--- + +The `INSERT` statement inserts new data into a table. + +### Examples + +Insert the values 1, 2, 3 into `tbl`: + +```sql +INSERT INTO tbl + VALUES (1), (2), (3); +``` + +Insert the result of a query into a table: + +```sql +INSERT INTO tbl + SELECT * FROM other_tbl; +``` + +Insert values into the `i` column, inserting the default value into other columns: + +```sql +INSERT INTO tbl (i) + VALUES (1), (2), (3); +``` + +Explicitly insert the default value into a column: + +```sql +INSERT INTO tbl (i) + VALUES (1), (DEFAULT), (3); +``` + +Assuming `tbl` has a primary key/unique constraint, do nothing on conflict: + +```sql +INSERT OR IGNORE INTO tbl (i) + VALUES (1); +``` + +Or update the table with the new values instead: + +```sql +INSERT OR REPLACE INTO tbl (i) + VALUES (1); +``` + +### Syntax + +
+ +`INSERT INTO` inserts new rows into a table. One can insert one or more rows specified by value expressions, or zero or more rows resulting from a query. + +## Insert Column Order + +It's possible to provide an optional insert column order, this can either be `BY POSITION` (the default) or `BY NAME`. +Each column not present in the explicit or implicit column list will be filled with a default value, either its declared default value or `NULL` if there is none. + +If the expression for any column is not of the correct data type, automatic type conversion will be attempted. + +### `INSERT INTO ... [BY POSITION]` + +The order that values are inserted into the columns of the table is determined by the order that the columns were declared in. +That is, the values supplied by the `VALUES` clause or query are associated with the column list left-to-right. +This is the default option, that can be explicitly specified using the `BY POSITION` option. +For example: + +```sql +CREATE TABLE tbl (a INTEGER, b INTEGER); +INSERT INTO tbl + VALUES (5, 42); +``` + +Specifying `BY POSITION` is optional and is equivalent to the default behavior: + +```sql +INSERT INTO tbl + BY POSITION + VALUES (5, 42); +``` + +To use a different order, column names can be provided as part of the target, for example: + +```sql +CREATE TABLE tbl (a INTEGER, b INTEGER); +INSERT INTO tbl (b, a) + VALUES (5, 42); +``` + +Adding `BY POSITION` results in the same behavior: + +```sql +INSERT INTO tbl + BY POSITION (b, a) + VALUES (5, 42); +``` + +This will insert `5` into `b` and `42` into `a`. + +### `INSERT INTO ... BY NAME` + +Using the `BY NAME` modifier, the names of the column list of the `SELECT` statement are matched against the column names of the table to determine the order that values should be inserted into the table. This allows inserting even in cases when the order of the columns in the table differs from the order of the values in the `SELECT` statement or certain columns are missing. + +For example: + +```sql +CREATE TABLE tbl (a INTEGER, b INTEGER); +INSERT INTO tbl BY NAME (SELECT 42 AS b, 32 AS a); +INSERT INTO tbl BY NAME (SELECT 22 AS b); +SELECT * FROM tbl; +``` + +| a | b | +|-----:|---:| +| 32 | 42 | +| NULL | 22 | + +It's important to note that when using `INSERT INTO ... BY NAME`, the column names specified in the `SELECT` statement must match the column names in the table. If a column name is misspelled or does not exist in the table, an error will occur. Columns that are missing from the `SELECT` statement will be filled with the default value. + +## `ON CONFLICT` Clause + +An `ON CONFLICT` clause can be used to perform a certain action on conflicts that arise from `UNIQUE` or `PRIMARY KEY` constraints. +An example for such a conflict is shown in the following example: + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl + VALUES (1, 42); +INSERT INTO tbl + VALUES (1, 84); +``` + +This raises as an error: + +```console +Constraint Error: Duplicate key "i: 1" violates primary key constraint. +``` + +The table will contain the row that was first inserted: + +```sql +SELECT * FROM tbl; +``` + +| i | j | +|--:|---:| +| 1 | 42 | + +These error messages can be avoided by explicitly handling conflicts. +DuckDB supports two such clauses: [`ON CONFLICT DO NOTHING`](#do-nothing-clause) and [`ON CONFLICT DO UPDATE SET ...`](#do-update-clause-upsert). + +### `DO NOTHING` Clause + +The `DO NOTHING` clause causes the error(s) to be ignored, and the values are not inserted or updated. 
+For example: + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl + VALUES (1, 42); +INSERT INTO tbl + VALUES (1, 84) + ON CONFLICT DO NOTHING; +``` + +These statements finish successfully and leaves the table with the row ``. + +#### Shorthand for `DO NOTHING` + +The `INSERT OR IGNORE INTO ...` statement is a shorter syntax alternative to `INSERT INTO ... ON CONFLICT DO NOTHING`. +For example, the following statements are equivalent: + +```sql +INSERT OR IGNORE INTO tbl + VALUES (1, 84); +INSERT INTO tbl + VALUES (1, 84) ON CONFLICT DO NOTHING; +``` + +### `DO UPDATE` Clause (Upsert) + +The `DO UPDATE` clause causes the `INSERT` to turn into an `UPDATE` on the conflicting row(s) instead. +The `SET` expressions that follow determine how these rows are updated. The expressions can use the special virtual table `EXCLUDED`, which contains the conflicting values for the row. +Optionally you can provide an additional `WHERE` clause that can exclude certain rows from the update. +The conflicts that don't meet this condition are ignored instead. + +Because we need a way to refer to both the **to-be-inserted** tuple and the **existing** tuple, we introduce the special `EXCLUDED` qualifier. +When the `EXCLUDED` qualifier is provided, the reference refers to the **to-be-inserted** tuple, otherwise, it refers to the **existing** tuple. +This special qualifier can be used within the `WHERE` clauses and `SET` expressions of the `ON CONFLICT` clause. + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl VALUES (1, 42); +INSERT INTO tbl VALUES (1, 52), (1, 62) ON CONFLICT DO UPDATE SET j = EXCLUDED.j; +``` + +#### Examples + +An example using `DO UPDATE` is the following: + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl + VALUES (1, 42); +INSERT INTO tbl + VALUES (1, 84) + ON CONFLICT DO UPDATE SET j = EXCLUDED.j; +SELECT * FROM tbl; +``` + +| i | j | +|--:|---:| +| 1 | 84 | + +Rearranging columns and using `BY NAME` is also possible: + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl + VALUES (1, 42); +INSERT INTO tbl (j, i) + VALUES (168, 1) + ON CONFLICT DO UPDATE SET j = EXCLUDED.j; +INSERT INTO tbl + BY NAME (SELECT 1 AS i, 336 AS j) + ON CONFLICT DO UPDATE SET j = EXCLUDED.j; +SELECT * FROM tbl; +``` + +| i | j | +|--:|----:| +| 1 | 336 | + +#### Shorthand + +The `INSERT OR REPLACE INTO ...` statement is a shorter syntax alternative to `INSERT INTO ... DO UPDATE SET c1 = EXCLUDED.c1, c2 = EXCLUDED.c2, ...`. +That is, it updates every column of the **existing** row to the new values of the **to-be-inserted** row. +For example, given the following input table: + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl + VALUES (1, 42); +``` + +These statements are equivalent: + +```sql +INSERT OR REPLACE INTO tbl + VALUES (1, 84); +INSERT INTO tbl + VALUES (1, 84) + ON CONFLICT DO UPDATE SET j = EXCLUDED.j; +INSERT INTO tbl (j, i) + VALUES (84, 1) + ON CONFLICT DO UPDATE SET j = EXCLUDED.j; +INSERT INTO tbl BY NAME + (SELECT 84 AS j, 1 AS i) + ON CONFLICT DO UPDATE SET j = EXCLUDED.j; +``` + +#### Limitations + +When the `ON CONFLICT ... DO UPDATE` clause is used and a conflict occurs, DuckDB internally assigns `NULL` values to the row's columns that are unaffected by the conflict, then re-assigns their values. If the affected columns use a `NOT NULL` constraint, this will trigger a `NOT NULL constraint failed` error. 
For example: + +```sql +CREATE TABLE t1 (id INTEGER PRIMARY KEY, val1 DOUBLE, val2 DOUBLE NOT NULL); +CREATE TABLE t2 (id INTEGER PRIMARY KEY, val1 DOUBLE); +INSERT INTO t1 + VALUES (1, 2, 3); +INSERT INTO t2 + VALUES (1, 5); + +INSERT INTO t1 BY NAME (SELECT id, val1 FROM t2) + ON CONFLICT DO UPDATE + SET val1 = EXCLUDED.val1; +``` + +This fails with the following error: + +```console +Constraint Error: NOT NULL constraint failed: t1.val2 +``` + +### Defining a Conflict Target + +A conflict target may be provided as `ON CONFLICT (conflict_target)`. This is a group of columns that an index or uniqueness/key constraint is defined on. If the conflict target is omitted, or `PRIMARY KEY` constraint(s) on the table are targeted. + +Specifying a conflict target is optional unless using a [`DO UPDATE`](#do-update-clause-upsert) and there are multiple unique/primary key constraints on the table. + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER UNIQUE, k INTEGER); +INSERT INTO tbl + VALUES (1, 20, 300); +SELECT * FROM tbl; +``` + +| i | j | k | +|--:|---:|----:| +| 1 | 20 | 300 | + +```sql +INSERT INTO tbl + VALUES (1, 40, 700) + ON CONFLICT (i) DO UPDATE SET k = 2 * EXCLUDED.k; +``` + +| i | j | k | +|--:|---:|-----:| +| 1 | 20 | 1400 | + +```sql +INSERT INTO tbl + VALUES (1, 20, 900) + ON CONFLICT (j) DO UPDATE SET k = 5 * EXCLUDED.k; +``` + +| i | j | k | +|--:|---:|-----:| +| 1 | 20 | 4500 | + +When a conflict target is provided, you can further filter this with a `WHERE` clause, that should be met by all conflicts. + +```sql +INSERT INTO tbl + VALUES (1, 40, 700) + ON CONFLICT (i) DO UPDATE SET k = 2 * EXCLUDED.k WHERE k < 100; +``` + +### Multiple Tuples Conflicting on the Same Key + +#### Limitations + +Currently, DuckDB’s `ON CONFLICT DO UPDATE` feature is limited to enforce constraints between committed and newly inserted (transaction-local) data. +In other words, having multiple tuples conflicting on the same key is not supported. +If the newly inserted data has duplicate rows, an error message will be thrown, or unexpected behavior can occur. +This also includes conflicts **only** within the newly inserted data. + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl + VALUES (1, 42); +INSERT INTO tbl + VALUES (1, 84), (1, 168) + ON CONFLICT DO UPDATE SET j = j + EXCLUDED.j; +``` + +This returns the following message. + +```console +Error: Invalid Input Error: ON CONFLICT DO UPDATE can not update the same row twice in the same command. +Ensure that no rows proposed for insertion within the same command have duplicate constrained values +``` + +To work around this, enforce uniqueness using [`DISTINCT ON`]({% link docs/archive/1.0/sql/query_syntax/select.md %}#distinct-on-clause). For example: + +```sql +CREATE TABLE tbl (i INTEGER PRIMARY KEY, j INTEGER); +INSERT INTO tbl + VALUES (1, 42); +INSERT INTO tbl + SELECT DISTINCT ON(i) i, j FROM VALUES (1, 84), (1, 168) AS t (i, j) + ON CONFLICT DO UPDATE SET j = j + EXCLUDED.j; +SELECT * FROM tbl; +``` + +| i | j | +|--:|----:| +| 1 | 126 | + +## `RETURNING` Clause + +The `RETURNING` clause may be used to return the contents of the rows that were inserted. This can be useful if some columns are calculated upon insert. For example, if the table contains an automatically incrementing primary key, then the `RETURNING` clause will include the automatically created primary key. This is also useful in the case of generated columns. 
+ +Some or all columns can be explicitly chosen to be returned and they may optionally be renamed using aliases. Arbitrary non-aggregating expressions may also be returned instead of simply returning a column. All columns can be returned using the `*` expression, and columns or expressions can be returned in addition to all columns returned by the `*`. + +For example: + +```sql +CREATE TABLE t1 (i INTEGER); +INSERT INTO t1 + SELECT 42 + RETURNING *; +``` + +
+ +| i | +|---:| +| 42 | + +A more complex example that includes an expression in the `RETURNING` clause: + +```sql +CREATE TABLE t2 (i INTEGER, j INTEGER); +INSERT INTO t2 + SELECT 2 AS i, 3 AS j + RETURNING *, i * j AS i_times_j; +``` + +
+ +| i | j | i_times_j | +|--:|--:|----------:| +| 2 | 3 | 6 | + +The next example shows a situation where the `RETURNING` clause is more helpful. First, a table is created with a primary key column. Then a sequence is created to allow for that primary key to be incremented as new rows are inserted. When we insert into the table, we do not already know the values generated by the sequence, so it is valuable to return them. For additional information, see the [`CREATE SEQUENCE` page]({% link docs/archive/1.0/sql/statements/create_sequence.md %}). + +```sql +CREATE TABLE t3 (i INTEGER PRIMARY KEY, j INTEGER); +CREATE SEQUENCE 't3_key'; +INSERT INTO t3 + SELECT nextval('t3_key') AS i, 42 AS j + UNION ALL + SELECT nextval('t3_key') AS i, 43 AS j + RETURNING *; +``` + +
+ +| i | j | +|--:|---:| +| 1 | 42 | +| 2 | 43 | \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/overview.md b/docs/archive/1.0/sql/statements/overview.md new file mode 100644 index 00000000000..97096b6cbb0 --- /dev/null +++ b/docs/archive/1.0/sql/statements/overview.md @@ -0,0 +1,4 @@ +--- +layout: docu +title: Statements Overview +--- \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/pivot.md b/docs/archive/1.0/sql/statements/pivot.md new file mode 100644 index 00000000000..ef932fb061b --- /dev/null +++ b/docs/archive/1.0/sql/statements/pivot.md @@ -0,0 +1,420 @@ +--- +blurb: The PIVOT statement allows values within a column to be separated into their + own columns. +layout: docu +railroad: statements/pivot.js +title: PIVOT Statement +--- + +The `PIVOT` statement allows distinct values within a column to be separated into their own columns. +The values within those new columns are calculated using an aggregate function on the subset of rows that match each distinct value. + +DuckDB implements both the SQL Standard `PIVOT` syntax and a simplified `PIVOT` syntax that automatically detects the columns to create while pivoting. +`PIVOT_WIDER` may also be used in place of the `PIVOT` keyword. + +> The [`UNPIVOT` statement]({% link docs/archive/1.0/sql/statements/unpivot.md %}) is the inverse of the `PIVOT` statement. + +## Simplified `PIVOT` Syntax + +The full syntax diagram is below, but the simplified `PIVOT` syntax can be summarized using spreadsheet pivot table naming conventions as: + +```sql +PIVOT ⟨dataset⟩ +ON ⟨columns⟩ +USING ⟨values⟩ +GROUP BY ⟨rows⟩ +ORDER BY ⟨columns_with_order_directions⟩ +LIMIT ⟨number_of_rows⟩; +``` + +The `ON`, `USING`, and `GROUP BY` clauses are each optional, but they may not all be omitted. + +### Example Data + +All examples use the dataset produced by the queries below: + +```sql +CREATE TABLE Cities ( + Country VARCHAR, Name VARCHAR, Year INTEGER, Population INTEGER +); +INSERT INTO Cities VALUES + ('NL', 'Amsterdam', 2000, 1005), + ('NL', 'Amsterdam', 2010, 1065), + ('NL', 'Amsterdam', 2020, 1158), + ('US', 'Seattle', 2000, 564), + ('US', 'Seattle', 2010, 608), + ('US', 'Seattle', 2020, 738), + ('US', 'New York City', 2000, 8015), + ('US', 'New York City', 2010, 8175), + ('US', 'New York City', 2020, 8772); +``` + +```sql +FROM Cities; +``` + +
+ +| Country | Name | Year | Population | +|---------|---------------|-----:|-----------:| +| NL | Amsterdam | 2000 | 1005 | +| NL | Amsterdam | 2010 | 1065 | +| NL | Amsterdam | 2020 | 1158 | +| US | Seattle | 2000 | 564 | +| US | Seattle | 2010 | 608 | +| US | Seattle | 2020 | 738 | +| US | New York City | 2000 | 8015 | +| US | New York City | 2010 | 8175 | +| US | New York City | 2020 | 8772 | + +### `PIVOT ON` and `USING` + +Use the `PIVOT` statement below to create a separate column for each year and calculate the total population in each. +The `ON` clause specifies which column(s) to split into separate columns. +It is equivalent to the columns parameter in a spreadsheet pivot table. + +The `USING` clause determines how to aggregate the values that are split into separate columns. +This is equivalent to the values parameter in a spreadsheet pivot table. +If the `USING` clause is not included, it defaults to `count(*)`. + +```sql +PIVOT Cities +ON Year +USING sum(Population); +``` + +
+ +| Country | Name | 2000 | 2010 | 2020 | +|---------|---------------|-----:|-----:|-----:| +| NL | Amsterdam | 1005 | 1065 | 1158 | +| US | Seattle | 564 | 608 | 738 | +| US | New York City | 8015 | 8175 | 8772 | + +In the above example, the `sum` aggregate is always operating on a single value. +If we only want to change the orientation of how the data is displayed without aggregating, use the `first` aggregate function. +In this example, we are pivoting numeric values, but the `first` function works very well for pivoting out a text column. +(This is something that is difficult to do in a spreadsheet pivot table, but easy in DuckDB!) + +This query produces a result that is identical to the one above: + +```sql +PIVOT Cities ON Year USING first(Population); +``` + +### `PIVOT ON`, `USING`, and `GROUP BY` + +By default, the `PIVOT` statement retains all columns not specified in the `ON` or `USING` clauses. +To include only certain columns and further aggregate, specify columns in the `GROUP BY` clause. +This is equivalent to the rows parameter of a spreadsheet pivot table. + +In the below example, the `Name` column is no longer included in the output, and the data is aggregated up to the `Country` level. + +```sql +PIVOT Cities +ON Year +USING sum(Population) +GROUP BY Country; +``` + +
+ +| Country | 2000 | 2010 | 2020 | +|---------|-----:|-----:|-----:| +| NL | 1005 | 1065 | 1158 | +| US | 8579 | 8783 | 9510 | + +### `IN` Filter for `ON` Clause + +To only create a separate column for specific values within a column in the `ON` clause, use an optional `IN` expression. +Let's say for example that we wanted to forget about the year 2020 for no particular reason... + +```sql +PIVOT Cities +ON Year IN (2000, 2010) +USING sum(Population) +GROUP BY Country; +``` + +
+ +| Country | 2000 | 2010 | +|---------|-----:|-----:| +| NL | 1005 | 1065 | +| US | 8579 | 8783 | + +### Multiple Expressions per Clause + +Multiple columns can be specified in the `ON` and `GROUP BY` clauses, and multiple aggregate expressions can be included in the `USING` clause. + +#### Multiple `ON` Columns and `ON` Expressions + +Multiple columns can be pivoted out into their own columns. +DuckDB will find the distinct values in each `ON` clause column and create one new column for all combinations of those values (a Cartesian product). + +In the below example, all combinations of unique countries and unique cities receive their own column. +Some combinations may not be present in the underlying data, so those columns are populated with `NULL` values. + +```sql +PIVOT Cities +ON Country, Name +USING sum(Population); +``` + +
+ +| Year | NL_Amsterdam | NL_New York City | NL_Seattle | US_Amsterdam | US_New York City | US_Seattle | +|-----:|-------------:|------------------|------------|--------------|-----------------:|-----------:| +| 2000 | 1005 | NULL | NULL | NULL | 8015 | 564 | +| 2010 | 1065 | NULL | NULL | NULL | 8175 | 608 | +| 2020 | 1158 | NULL | NULL | NULL | 8772 | 738 | + +To pivot only the combinations of values that are present in the underlying data, use an expression in the `ON` clause. +Multiple expressions and/or columns may be provided. + +Here, `Country` and `Name` are concatenated together and the resulting concatenations each receive their own column. +Any arbitrary non-aggregating expression may be used. +In this case, concatenating with an underscore is used to imitate the naming convention the `PIVOT` clause uses when multiple `ON` columns are provided (like in the prior example). + +```sql +PIVOT Cities ON Country || '_' || Name USING sum(Population); +``` + +
+ +| Year | NL_Amsterdam | US_New York City | US_Seattle | +|-----:|-------------:|-----------------:|-----------:| +| 2000 | 1005 | 8015 | 564 | +| 2010 | 1065 | 8175 | 608 | +| 2020 | 1158 | 8772 | 738 | + +#### Multiple `USING` Expressions + +An alias may also be included for each expression in the `USING` clause. +It will be appended to the generated column names after an underscore (`_`). +This makes the column naming convention much cleaner when multiple expressions are included in the `USING` clause. + +In this example, both the `sum` and `max` of the Population column are calculated for each year and are split into separate columns. + +```sql +PIVOT Cities +ON Year +USING sum(Population) AS total, max(Population) AS max +GROUP BY Country; +``` + +
+ +| Country | 2000_total | 2000_max | 2010_total | 2010_max | 2020_total | 2020_max | +|---------|-----------:|---------:|-----------:|---------:|-----------:|---------:| +| US | 8579 | 8015 | 8783 | 8175 | 9510 | 8772 | +| NL | 1005 | 1005 | 1065 | 1065 | 1158 | 1158 | + +#### Multiple `GROUP BY` Columns + +Multiple `GROUP BY` columns may also be provided. +Note that column names must be used rather than column positions (1, 2, etc.), and that expressions are not supported in the `GROUP BY` clause. + +```sql +PIVOT Cities +ON Year +USING sum(Population) +GROUP BY Country, Name; +``` + +
+ +| Country | Name | 2000 | 2010 | 2020 | +|---------|---------------|-----:|-----:|-----:| +| NL | Amsterdam | 1005 | 1065 | 1158 | +| US | Seattle | 564 | 608 | 738 | +| US | New York City | 8015 | 8175 | 8772 | + +### Using `PIVOT` within a `SELECT` Statement + +The `PIVOT` statement may be included within a `SELECT` statement as a CTE ([a Common Table Expression, or `WITH` clause]({% link docs/archive/1.0/sql/query_syntax/with.md %})), or a subquery. +This allows for a `PIVOT` to be used alongside other SQL logic, as well as for multiple `PIVOT`s to be used in one query. + +No `SELECT` is needed within the CTE, the `PIVOT` keyword can be thought of as taking its place. + +```sql +WITH pivot_alias AS ( + PIVOT Cities + ON Year + USING sum(Population) + GROUP BY Country +) +SELECT * FROM pivot_alias; +``` + +A `PIVOT` may be used in a subquery and must be wrapped in parentheses. +Note that this behavior is different than the SQL Standard Pivot, as illustrated in subsequent examples. + +```sql +SELECT * +FROM ( + PIVOT Cities + ON Year + USING sum(Population) + GROUP BY Country +) pivot_alias; +``` + +### Multiple `PIVOT` Statements + +Each `PIVOT` can be treated as if it were a `SELECT` node, so they can be joined together or manipulated in other ways. + +For example, if two `PIVOT` statements share the same `GROUP BY` expression, they can be joined together using the columns in the `GROUP BY` clause into a wider pivot. + +```sql +FROM (PIVOT Cities ON Year USING sum(Population) GROUP BY Country) year_pivot +JOIN (PIVOT Cities ON Name USING sum(Population) GROUP BY Country) name_pivot +USING (Country); +``` + +
+ +| Country | 2000 | 2010 | 2020 | Amsterdam | New York City | Seattle | +|---------|-----:|-----:|-----:|----------:|--------------:|--------:| +| NL | 1005 | 1065 | 1158 | 3228 | NULL | NULL | +| US | 8579 | 8783 | 9510 | NULL | 24962 | 1910 | + +## Internals + +Pivoting is implemented as a combination of SQL query re-writing and a dedicated `PhysicalPivot` operator for higher performance. +Each `PIVOT` is implemented as set of aggregations into lists and then the dedicated `PhysicalPivot` operator converts those lists into column names and values. +Additional pre-processing steps are required if the columns to be created when pivoting are detected dynamically (which occurs when the `IN` clause is not in use). + +DuckDB, like most SQL engines, requires that all column names and types be known at the start of a query. +In order to automatically detect the columns that should be created as a result of a `PIVOT` statement, it must be translated into multiple queries. +[`ENUM` types]({% link docs/archive/1.0/sql/data_types/enum.md %}) are used to find the distinct values that should become columns. +Each `ENUM` is then injected into one of the `PIVOT` statement's `IN` clauses. + +After the `IN` clauses have been populated with `ENUM`s, the query is re-written again into a set of aggregations into lists. + +For example: + +```sql +PIVOT Cities +ON Year +USING sum(Population); +``` + +is initially translated into: + +```sql +CREATE TEMPORARY TYPE __pivot_enum_0_0 AS ENUM ( + SELECT DISTINCT + Year::VARCHAR + FROM Cities + ORDER BY + Year + ); +PIVOT Cities +ON Year IN __pivot_enum_0_0 +USING sum(Population); +``` + +and finally translated into: + +```sql +SELECT Country, Name, list(Year), list(population_sum) +FROM ( + SELECT Country, Name, Year, sum(population) AS population_sum + FROM Cities + GROUP BY ALL +) +GROUP BY ALL; +``` + +This produces the result: + +
+ +| Country | Name | list("YEAR") | list(population_sum) | +|---------|---------------|--------------------|----------------------| +| NL | Amsterdam | [2000, 2010, 2020] | [1005, 1065, 1158] | +| US | Seattle | [2000, 2010, 2020] | [564, 608, 738] | +| US | New York City | [2000, 2010, 2020] | [8015, 8175, 8772] | + +The `PhysicalPivot` operator converts those lists into column names and values to return this result: + +
+ +| Country | Name | 2000 | 2010 | 2020 | +|---------|---------------|-----:|-----:|-----:| +| NL | Amsterdam | 1005 | 1065 | 1158 | +| US | Seattle | 564 | 608 | 738 | +| US | New York City | 8015 | 8175 | 8772 | + +## Simplified `PIVOT` Full Syntax Diagram + +Below is the full syntax diagram of the `PIVOT` statement. + +
+ +## SQL Standard `PIVOT` Syntax + +The full syntax diagram is below, but the SQL Standard `PIVOT` syntax can be summarized as: + +```sql +FROM ⟨dataset⟩ +PIVOT ( + ⟨values⟩ + FOR + ⟨column_1⟩ IN (⟨in_list⟩) + ⟨column_2⟩ IN (⟨in_list⟩) + ... + GROUP BY ⟨rows⟩ +); +``` +Unlike the simplified syntax, the `IN` clause must be specified for each column to be pivoted. +If you are interested in dynamic pivoting, the simplified syntax is recommended. + +Note that no commas separate the expressions in the `FOR` clause, but that `value` and `GROUP BY` expressions must be comma-separated! + +## Examples + +This example uses a single value expression, a single column expression, and a single row expression: + +```sql +FROM Cities +PIVOT ( + sum(Population) + FOR + Year IN (2000, 2010, 2020) + GROUP BY Country +); +``` + +
+ +| Country | 2000 | 2010 | 2020 | +|---------|-----:|-----:|-----:| +| NL | 1005 | 1065 | 1158 | +| US | 8579 | 8783 | 9510 | + +This example is somewhat contrived, but serves as an example of using multiple value expressions and multiple columns in the `FOR` clause. + +```sql +FROM Cities +PIVOT ( + sum(Population) AS total, + count(Population) AS count + FOR + Year IN (2000, 2010) + Country in ('NL', 'US') +); +``` + +| Name | 2000_NL_total | 2000_NL_count | 2000_US_total | 2000_US_count | 2010_NL_total | 2010_NL_count | 2010_US_total | 2010_US_count | +|--|-:|-:|-:|-:|-:|-:|-:|-:| +| Amsterdam | 1005 | 1 | NULL | 0 | 1065 | 1 | NULL | 0 | +| Seattle | NULL | 0 | 564 | 1 | NULL | 0 | 608 | 1 | +| New York City | NULL | 0 | 8015 | 1 | NULL | 0 | 8175 | 1 | + +### SQL Standard `PIVOT` Full Syntax Diagram + +Below is the full syntax diagram of the SQL Standard version of the `PIVOT` statement. + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/profiling.md b/docs/archive/1.0/sql/statements/profiling.md new file mode 100644 index 00000000000..4fed0fc6632 --- /dev/null +++ b/docs/archive/1.0/sql/statements/profiling.md @@ -0,0 +1,27 @@ +--- +layout: docu +title: Profiling Queries +--- + +DuckDB supports profiling queries via the `EXPLAIN` and `EXPLAIN ANALYZE` statements. + +## `EXPLAIN` + +To see the query plan of a query without executing it, run: + +```sql +EXPLAIN ⟨query⟩; +``` + +The output of `EXPLAIN` contains the estimated cardinalities for each operator. + +## `EXPLAIN ANALYZE` + +To profile a query, run: + +```sql +EXPLAIN ANALYZE ⟨query⟩; +``` + +The `EXPLAIN ANALYZE` statement runs the query, and shows the actual cardinalities for each operator, +as well as the cumulative wall-clock time spent in each operator. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/select.md b/docs/archive/1.0/sql/statements/select.md new file mode 100644 index 00000000000..7cdd2d89f2d --- /dev/null +++ b/docs/archive/1.0/sql/statements/select.md @@ -0,0 +1,170 @@ +--- +blurb: The SELECT statement retrieves rows from the database. +layout: docu +railroad: statements/select.js +title: SELECT Statement +--- + +The `SELECT` statement retrieves rows from the database. + +### Examples + +Select all columns from the table `tbl`: + +```sql +SELECT * FROM tbl; +``` + +Select the rows from `tbl`: + +```sql +SELECT j FROM tbl WHERE i = 3; +``` + +Perform an aggregate grouped by the column `i`: + +```sql +SELECT i, sum(j) FROM tbl GROUP BY i; +``` + +Select only the top 3 rows from the `tbl`: + +```sql +SELECT * FROM tbl ORDER BY i DESC LIMIT 3; +``` + +Join two tables together using the `USING` clause: + +```sql +SELECT * FROM t1 JOIN t2 USING (a, b); +``` + +Use column indexes to select the first and third column from the table `tbl`: + +```sql +SELECT #1, #3 FROM tbl; +``` + +Select all unique cities from the addresses table: + +```sql +SELECT DISTINCT city FROM addresses; +``` + +Return a `STRUCT` by using a row variable: + +```sql +SELECT d +FROM (SELECT 1 AS a, 2 AS b) d; +``` + +### Syntax + +The `SELECT` statement retrieves rows from the database. The canonical order of a `SELECT` statement is as follows, with less common clauses being indented: + +```sql +SELECT ⟨select_list⟩ +FROM ⟨tables⟩ + USING SAMPLE ⟨sample_expression⟩ +WHERE ⟨condition⟩ +GROUP BY ⟨groups⟩ +HAVING ⟨group_filter⟩ + WINDOW ⟨window_expression⟩ + QUALIFY ⟨qualify_filter⟩ +ORDER BY ⟨order_expression⟩ +LIMIT ⟨n⟩; +``` + +Optionally, the `SELECT` statement can be prefixed with a [`WITH` clause]({% link docs/archive/1.0/sql/query_syntax/with.md %}). + +As the `SELECT` statement is so complex, we have split up the syntax diagrams into several parts. The full syntax diagram can be found at the bottom of the page. + +## `SELECT` Clause + +
+ +The [`SELECT` clause]({% link docs/archive/1.0/sql/query_syntax/select.md %}) specifies the list of columns that will be returned by the query. While it appears first in the statement, *logically* the expressions here are executed only at the end. The `SELECT` clause can contain arbitrary expressions that transform the output, as well as aggregates and window functions. The `DISTINCT` keyword ensures that only unique tuples are returned. + +> Column names are case-insensitive. See the [Rules for Case Sensitivity]({% link docs/archive/1.0/sql/dialect/keywords_and_identifiers.md %}#rules-for-case-sensitivity) for more details. + +## `FROM` Clause + +
+ +The [`FROM` clause]({% link docs/archive/1.0/sql/query_syntax/from.md %}) specifies the *source* of the data on which the remainder of the query should operate. Logically, the `FROM` clause is where the query starts execution. The `FROM` clause can contain a single table, a combination of multiple tables that are joined together, or another `SELECT` query inside a subquery node. + +## `SAMPLE` Clause + +
+ +[The `SAMPLE` clause]({% link docs/archive/1.0/sql/query_syntax/sample.md %}) allows you to run the query on a sample from the base table. This can significantly speed up processing of queries, at the expense of accuracy in the result. Samples can also be used to quickly see a snapshot of the data when exploring a data set. The `SAMPLE` clause is applied right after anything in the `FROM` clause (i.e., after any joins, but before the `WHERE` clause or any aggregates). See the [Samples]({% link docs/archive/1.0/sql/samples.md %}) page for more information. + +## `WHERE` Clause + +
+ +[The `WHERE` clause]({% link docs/archive/1.0/sql/query_syntax/where.md %}) specifies any filters to apply to the data. This allows you to select only a subset of the data in which you are interested. Logically the `WHERE` clause is applied immediately after the `FROM` clause. + +## `GROUP BY` and `HAVING` Clauses + +
+ +[The `GROUP BY` clause]({% link docs/archive/1.0/sql/query_syntax/groupby.md %}) specifies which grouping columns should be used to perform any aggregations in the `SELECT` clause. If the `GROUP BY` clause is specified, the query is always an aggregate query, even if no aggregations are present in the `SELECT` clause. + +## `WINDOW` Clause + +
+ +[The `WINDOW` clause]({% link docs/archive/1.0/sql/query_syntax/window.md %}) allows you to specify named windows that can be used within window functions. These are useful when you have multiple window functions, as they allow you to avoid repeating the same window clause. + +## `QUALIFY` Clause + +
+ +[The `QUALIFY` clause]({% link docs/archive/1.0/sql/query_syntax/qualify.md %}) is used to filter the result of [`WINDOW` functions]({% link docs/archive/1.0/sql/functions/window_functions.md %}). + +## `ORDER BY`, `LIMIT` and `OFFSET` Clauses + +
+ +[`ORDER BY`]({% link docs/archive/1.0/sql/query_syntax/orderby.md %}), [`LIMIT` and `OFFSET`]({% link docs/archive/1.0/sql/query_syntax/limit.md %}) are output modifiers. +Logically they are applied at the very end of the query. +The `ORDER BY` clause sorts the rows on the sorting criteria in either ascending or descending order. +The `LIMIT` clause restricts the number of rows fetched, while the `OFFSET` clause indicates at which position to start reading the values. + +## `VALUES` List + +
+ +[A `VALUES` list]({% link docs/archive/1.0/sql/query_syntax/values.md %}) is a set of values that is supplied instead of a `SELECT` statement. + +## Row IDs + +For each table, the [`rowid` pseudocolumn](https://docs.oracle.com/cd/B19306_01/server.102/b14200/pseudocolumns008.htm) returns the row identifiers based on the physical storage. + +```sql +CREATE TABLE t (id INTEGER, content VARCHAR); +INSERT INTO t VALUES (42, 'hello'), (43, 'world'); +SELECT rowid, id, content FROM t; +``` + +| rowid | id | content | +|------:|---:|---------| +| 0 | 42 | hello | +| 1 | 43 | world | + +In the current storage, these identifiers are contiguous unsigned integers (0, 1, ...) if no rows were deleted. Deletions introduce gaps in the rowids which may be reclaimed later. Therefore, it is strongly recommended *not to use rowids as identifiers*. + +> Tip The `rowid` values are stable within a transaction. + +> If there is a user-defined column named `rowid`, it shadows the `rowid` pseudocolumn. + +## Common Table Expressions + +
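As a minimal illustration, a common table expression gives a name to a subquery that the rest of the statement can then refer to; here it reuses the table `t` created in the Row IDs example above:

```sql
WITH recent AS (
    SELECT id, content FROM t WHERE id > 42
)
SELECT * FROM recent;
```

See the [`WITH` clause]({% link docs/archive/1.0/sql/query_syntax/with.md %}) page for details.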
+ +## Full Syntax Diagram + +Below is the full syntax diagram of the `SELECT` statement: + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/set.md b/docs/archive/1.0/sql/statements/set.md new file mode 100644 index 00000000000..8667ee61945 --- /dev/null +++ b/docs/archive/1.0/sql/statements/set.md @@ -0,0 +1,77 @@ +--- +layout: docu +railroad: statements/set.js +title: SET/RESET Statements +--- + +The `SET` statement modifies the provided DuckDB configuration option at the specified scope. + +## Examples + +Update the `memory_limit` configuration value: + +```sql +SET memory_limit = '10GB'; +``` + +Configure the system to use `1` thread: + +```sql +SET threads = 1; +``` + +Or use the `TO` keyword: + +```sql +SET threads TO 1; +``` + +Change configuration option to default value: + +```sql +RESET threads; +``` + +Retrieve configuration value: + +```sql +SELECT current_setting('threads'); +``` + +Set the default catalog search path globally: + +```sql +SET GLOBAL search_path = 'db1,db2' +``` + +Set the default collation for the session: + +```sql +SET SESSION default_collation = 'nocase'; +``` + +## Syntax + +
+ +`SET` updates a DuckDB configuration option to the provided value. + +## `RESET` + +
+ +The `RESET` statement changes the given DuckDB configuration option to the default value. + +## Scopes + +Configuration options can have different scopes: + +* `GLOBAL`: Configuration value is used (or reset) across the entire DuckDB instance. +* `SESSION`: Configuration value is used (or reset) only for the current session attached to a DuckDB instance. +* `LOCAL`: Not yet implemented. + +When not specified, the default scope for the configuration option is used. For most options this is `GLOBAL`. + +## Configuration + +See the [Configuration]({% link docs/archive/1.0/configuration/overview.md %}) page for the full list of configuration options. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/summarize.md b/docs/archive/1.0/sql/statements/summarize.md new file mode 100644 index 00000000000..7a6bf1f5f1b --- /dev/null +++ b/docs/archive/1.0/sql/statements/summarize.md @@ -0,0 +1,22 @@ +--- +layout: docu +title: SUMMARIZE Statement +--- + +The `SUMMARIZE` statement returns summary statistics for a table, view or a query. + +## Usage + +```sql +SUMMARIZE tbl; +``` + +In order to summarize a query, prepend `SUMMARIZE` to a query. + +```sql +SUMMARIZE SELECT * FROM tbl; +``` + +## See Also + +For more examples, see the [guide on `SUMMARIZE`]({% link docs/archive/1.0/guides/meta/summarize.md %}). \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/transactions.md b/docs/archive/1.0/sql/statements/transactions.md new file mode 100644 index 00000000000..a954b735646 --- /dev/null +++ b/docs/archive/1.0/sql/statements/transactions.md @@ -0,0 +1,71 @@ +--- +layout: docu +title: Transaction Management +--- + +DuckDB supports [ACID database transactions](https://en.wikipedia.org/wiki/Database_transaction). +Transactions provide isolation, i.e., changes made by a transaction are not visible from concurrent transactions until it is committed. +A transaction can also be aborted, which discards any changes it made so far. + +## Statements + +DuckDB provides the following statements for transaction management. + +### Starting a Transaction + +To start a transaction, run: + +```sql +BEGIN TRANSACTION; +``` + +### Committing a Transaction + +You can commit a transaction to make it visible to other transactions and to write it to persistent storage (if using DuckDB in persistent mode). +To commit a transaction, run: + +```sql +COMMIT; +``` + +If you are not in an active transaction, the `COMMIT` statement will fail. + +### Rolling Back a Transaction + +You can abort a transaction. +This operation, also known as rolling back, will discard any changes the transaction made to the database. +To abort a transaction, run: + +```sql +ROLLBACK; +``` + +You can also use the abort command, which has an identical behavior: + +```sql +ABORT; +``` + +If you are not in an active transaction, the `ROLLBACK` and `ABORT` statements will fail. + +### Example + +We illustrate the use of transactions through a simple example. + +```sql +CREATE TABLE person (name VARCHAR, age BIGINT); + +BEGIN TRANSACTION; +INSERT INTO person VALUES ('Ada', 52); +COMMIT; + +BEGIN TRANSACTION; +DELETE FROM person WHERE name = 'Ada'; +INSERT INTO person VALUES ('Bruce', 39); +ROLLBACK; + +SELECT * FROM person; +``` + +The first transaction (inserting “Ada”) was committed but the second (deleting “Ada” and inserting “Bruce”) was aborted. +Therefore, the resulting table will only contain `<'Ada', 52>`. 
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/unpivot.md b/docs/archive/1.0/sql/statements/unpivot.md new file mode 100644 index 00000000000..f8d5256fae2 --- /dev/null +++ b/docs/archive/1.0/sql/statements/unpivot.md @@ -0,0 +1,374 @@ +--- +blurb: The UNPIVOT statement allows columns to be stacked into rows that indicate + the prior column name and value. +layout: docu +railroad: statements/unpivot.js +title: UNPIVOT Statement +--- + +The `UNPIVOT` statement allows multiple columns to be stacked into fewer columns. +In the basic case, multiple columns are stacked into two columns: a `NAME` column (which contains the name of the source column) and a `VALUE` column (which contains the value from the source column). + +DuckDB implements both the SQL Standard `UNPIVOT` syntax and a simplified `UNPIVOT` syntax. +Both can utilize a [`COLUMNS` expression]({% link docs/archive/1.0/sql/expressions/star.md %}#columns) to automatically detect the columns to unpivot. +`PIVOT_LONGER` may also be used in place of the `UNPIVOT` keyword. + +> The [`PIVOT` statement]({% link docs/archive/1.0/sql/statements/pivot.md %}) is the inverse of the `UNPIVOT` statement. + +## Simplified `UNPIVOT` Syntax + +The full syntax diagram is below, but the simplified `UNPIVOT` syntax can be summarized using spreadsheet pivot table naming conventions as: + +```sql +UNPIVOT ⟨dataset⟩ +ON ⟨column(s)⟩ +INTO + NAME ⟨name-column-name⟩ + VALUE ⟨value-column-name(s)⟩ +ORDER BY ⟨column(s)-with-order-direction(s)⟩ +LIMIT ⟨number-of-rows⟩; +``` + +### Example Data + +All examples use the dataset produced by the queries below: + +```sql +CREATE OR REPLACE TABLE monthly_sales + (empid INTEGER, dept TEXT, Jan INTEGER, Feb INTEGER, Mar INTEGER, Apr INTEGER, May INTEGER, Jun INTEGER); +INSERT INTO monthly_sales VALUES + (1, 'electronics', 1, 2, 3, 4, 5, 6), + (2, 'clothes', 10, 20, 30, 40, 50, 60), + (3, 'cars', 100, 200, 300, 400, 500, 600); +``` + +```sql +FROM monthly_sales; +``` + +
+ +| empid | dept | Jan | Feb | Mar | Apr | May | Jun | +|------:|-------------|----:|----:|----:|----:|----:|----:| +| 1 | electronics | 1 | 2 | 3 | 4 | 5 | 6 | +| 2 | clothes | 10 | 20 | 30 | 40 | 50 | 60 | +| 3 | cars | 100 | 200 | 300 | 400 | 500 | 600 | + + + +### `UNPIVOT` Manually + +The most typical `UNPIVOT` transformation is to take already pivoted data and re-stack it into a column each for the name and value. +In this case, all months will be stacked into a `month` column and a `sales` column. + +```sql +UNPIVOT monthly_sales +ON jan, feb, mar, apr, may, jun +INTO + NAME month + VALUE sales; +``` + +
+ +| empid | dept | month | sales | +|------:|-------------|-------|------:| +| 1 | electronics | Jan | 1 | +| 1 | electronics | Feb | 2 | +| 1 | electronics | Mar | 3 | +| 1 | electronics | Apr | 4 | +| 1 | electronics | May | 5 | +| 1 | electronics | Jun | 6 | +| 2 | clothes | Jan | 10 | +| 2 | clothes | Feb | 20 | +| 2 | clothes | Mar | 30 | +| 2 | clothes | Apr | 40 | +| 2 | clothes | May | 50 | +| 2 | clothes | Jun | 60 | +| 3 | cars | Jan | 100 | +| 3 | cars | Feb | 200 | +| 3 | cars | Mar | 300 | +| 3 | cars | Apr | 400 | +| 3 | cars | May | 500 | +| 3 | cars | Jun | 600 | + +### `UNPIVOT` Dynamically Using Columns Expression + +In many cases, the number of columns to unpivot is not easy to predetermine ahead of time. +In the case of this dataset, the query above would have to change each time a new month is added. +The [`COLUMNS` expression]({% link docs/archive/1.0/sql/expressions/star.md %}#columns-expression) can be used to select all columns that are not `empid` or `dept`. +This enables dynamic unpivoting that will work regardless of how many months are added. +The query below returns identical results to the one above. + +```sql +UNPIVOT monthly_sales +ON COLUMNS(* EXCLUDE (empid, dept)) +INTO + NAME month + VALUE sales; +``` + +
+ +| empid | dept | month | sales | +|------:|-------------|-------|------:| +| 1 | electronics | Jan | 1 | +| 1 | electronics | Feb | 2 | +| 1 | electronics | Mar | 3 | +| 1 | electronics | Apr | 4 | +| 1 | electronics | May | 5 | +| 1 | electronics | Jun | 6 | +| 2 | clothes | Jan | 10 | +| 2 | clothes | Feb | 20 | +| 2 | clothes | Mar | 30 | +| 2 | clothes | Apr | 40 | +| 2 | clothes | May | 50 | +| 2 | clothes | Jun | 60 | +| 3 | cars | Jan | 100 | +| 3 | cars | Feb | 200 | +| 3 | cars | Mar | 300 | +| 3 | cars | Apr | 400 | +| 3 | cars | May | 500 | +| 3 | cars | Jun | 600 | + +### `UNPIVOT` into Multiple Value Columns + +The `UNPIVOT` statement has additional flexibility: more than 2 destination columns are supported. +This can be useful when the goal is to reduce the extent to which a dataset is pivoted, but not completely stack all pivoted columns. +To demonstrate this, the query below will generate a dataset with a separate column for the number of each month within the quarter (month 1, 2, or 3), and a separate row for each quarter. +Since there are fewer quarters than months, this does make the dataset longer, but not as long as the above. + +To accomplish this, multiple sets of columns are included in the `ON` clause. +The `q1` and `q2` aliases are optional. +The number of columns in each set of columns in the `ON` clause must match the number of columns in the `VALUE` clause. + +```sql +UNPIVOT monthly_sales + ON (jan, feb, mar) AS q1, (apr, may, jun) AS q2 + INTO + NAME quarter + VALUE month_1_sales, month_2_sales, month_3_sales; +``` + +
+ +| empid | dept | quarter | month_1_sales | month_2_sales | month_3_sales | +|------:|-------------|---------|--------------:|--------------:|--------------:| +| 1 | electronics | q1 | 1 | 2 | 3 | +| 1 | electronics | q2 | 4 | 5 | 6 | +| 2 | clothes | q1 | 10 | 20 | 30 | +| 2 | clothes | q2 | 40 | 50 | 60 | +| 3 | cars | q1 | 100 | 200 | 300 | +| 3 | cars | q2 | 400 | 500 | 600 | + +### Using `UNPIVOT` within a `SELECT` Statement + +The `UNPIVOT` statement may be included within a `SELECT` statement as a CTE ([a Common Table Expression, or WITH clause]({% link docs/archive/1.0/sql/query_syntax/with.md %})), or a subquery. +This allows for an `UNPIVOT` to be used alongside other SQL logic, as well as for multiple `UNPIVOT`s to be used in one query. + +No `SELECT` is needed within the CTE, the `UNPIVOT` keyword can be thought of as taking its place. + +```sql +WITH unpivot_alias AS ( + UNPIVOT monthly_sales + ON COLUMNS(* EXCLUDE (empid, dept)) + INTO + NAME month + VALUE sales +) +SELECT * FROM unpivot_alias; +``` + +An `UNPIVOT` may be used in a subquery and must be wrapped in parentheses. +Note that this behavior is different than the SQL Standard Unpivot, as illustrated in subsequent examples. + +```sql +SELECT * +FROM ( + UNPIVOT monthly_sales + ON COLUMNS(* EXCLUDE (empid, dept)) + INTO + NAME month + VALUE sales +) unpivot_alias; +``` + +### Expressions within `UNPIVOT` Statements + +DuckDB allows expressions within the `UNPIVOT` statements, provided that they only involve a single column. These can be used to perform computations as well as [explicit casts]({% link docs/archive/1.0/sql/data_types/typecasting.md %}#explicit-casting). For example: + +```sql +UNPIVOT + (SELECT 42 as col1, 'woot' as col2) + ON + (col1 * 2)::VARCHAR, + col2; +``` + +| name | value | +|------|-------| +| col1 | 84 | +| col2 | woot | + +### Internals + +Unpivoting is implemented entirely as rewrites into SQL queries. +Each `UNPIVOT` is implemented as set of `unnest` functions, operating on a list of the column names and a list of the column values. +If dynamically unpivoting, the `COLUMNS` expression is evaluated first to calculate the column list. + +For example: + +```sql +UNPIVOT monthly_sales +ON jan, feb, mar, apr, may, jun +INTO + NAME month + VALUE sales; +``` + +is translated into: + +```sql +SELECT + empid, + dept, + unnest(['jan', 'feb', 'mar', 'apr', 'may', 'jun']) AS month, + unnest(["jan", "feb", "mar", "apr", "may", "jun"]) AS sales +FROM monthly_sales; +``` + +Note the single quotes to build a list of text strings to populate `month`, and the double quotes to pull the column values for use in `sales`. +This produces the same result as the initial example: + +
+ +| empid | dept | month | sales | +|------:|-------------|-------|------:| +| 1 | electronics | jan | 1 | +| 1 | electronics | feb | 2 | +| 1 | electronics | mar | 3 | +| 1 | electronics | apr | 4 | +| 1 | electronics | may | 5 | +| 1 | electronics | jun | 6 | +| 2 | clothes | jan | 10 | +| 2 | clothes | feb | 20 | +| 2 | clothes | mar | 30 | +| 2 | clothes | apr | 40 | +| 2 | clothes | may | 50 | +| 2 | clothes | jun | 60 | +| 3 | cars | jan | 100 | +| 3 | cars | feb | 200 | +| 3 | cars | mar | 300 | +| 3 | cars | apr | 400 | +| 3 | cars | may | 500 | +| 3 | cars | jun | 600 | + +### Simplified `UNPIVOT` Full Syntax Diagram + +Below is the full syntax diagram of the `UNPIVOT` statement. + +
+ +## SQL Standard `UNPIVOT` Syntax + +The full syntax diagram is below, but the SQL Standard `UNPIVOT` syntax can be summarized as: + +```sql +FROM [dataset] +UNPIVOT [INCLUDE NULLS] ( + [value-column-name(s)] + FOR [name-column-name] IN [column(s)] +); +``` + +Note that only one column can be included in the `name-column-name` expression. + +### SQL Standard `UNPIVOT` Manually + +To complete the basic `UNPIVOT` operation using the SQL standard syntax, only a few additions are needed. + +```sql +FROM monthly_sales UNPIVOT ( + sales + FOR month IN (jan, feb, mar, apr, may, jun) +); +``` + +
+ +| empid | dept | month | sales | +|------:|-------------|-------|------:| +| 1 | electronics | Jan | 1 | +| 1 | electronics | Feb | 2 | +| 1 | electronics | Mar | 3 | +| 1 | electronics | Apr | 4 | +| 1 | electronics | May | 5 | +| 1 | electronics | Jun | 6 | +| 2 | clothes | Jan | 10 | +| 2 | clothes | Feb | 20 | +| 2 | clothes | Mar | 30 | +| 2 | clothes | Apr | 40 | +| 2 | clothes | May | 50 | +| 2 | clothes | Jun | 60 | +| 3 | cars | Jan | 100 | +| 3 | cars | Feb | 200 | +| 3 | cars | Mar | 300 | +| 3 | cars | Apr | 400 | +| 3 | cars | May | 500 | +| 3 | cars | Jun | 600 | + +### SQL Standard `UNPIVOT` Dynamically Using the `COLUMNS` Expression + +The [`COLUMNS` expression]({% link docs/archive/1.0/sql/expressions/star.md %}#columns) can be used to determine the `IN` list of columns dynamically. +This will continue to work even if additional `month` columns are added to the dataset. +It produces the same result as the query above. + +```sql +FROM monthly_sales UNPIVOT ( + sales + FOR month IN (columns(* EXCLUDE (empid, dept))) +); +``` + +### SQL Standard `UNPIVOT` into Multiple Value Columns + +The `UNPIVOT` statement has additional flexibility: more than 2 destination columns are supported. +This can be useful when the goal is to reduce the extent to which a dataset is pivoted, but not completely stack all pivoted columns. +To demonstrate this, the query below will generate a dataset with a separate column for the number of each month within the quarter (month 1, 2, or 3), and a separate row for each quarter. +Since there are fewer quarters than months, this does make the dataset longer, but not as long as the above. + +To accomplish this, multiple columns are included in the `value-column-name` portion of the `UNPIVOT` statement. +Multiple sets of columns are included in the `IN` clause. +The `q1` and `q2` aliases are optional. +The number of columns in each set of columns in the `IN` clause must match the number of columns in the `value-column-name` portion. + +```sql +FROM monthly_sales +UNPIVOT ( + (month_1_sales, month_2_sales, month_3_sales) + FOR quarter IN ( + (jan, feb, mar) AS q1, + (apr, may, jun) AS q2 + ) +); +``` + +
+ +| empid | dept | quarter | month_1_sales | month_2_sales | month_3_sales | +|------:|-------------|---------|--------------:|--------------:|--------------:| +| 1 | electronics | q1 | 1 | 2 | 3 | +| 1 | electronics | q2 | 4 | 5 | 6 | +| 2 | clothes | q1 | 10 | 20 | 30 | +| 2 | clothes | q2 | 40 | 50 | 60 | +| 3 | cars | q1 | 100 | 200 | 300 | +| 3 | cars | q2 | 400 | 500 | 600 | + +### SQL Standard `UNPIVOT` Full Syntax Diagram + +Below is the full syntax diagram of the SQL Standard version of the `UNPIVOT` statement. + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/update.md b/docs/archive/1.0/sql/statements/update.md new file mode 100644 index 00000000000..93fe8ced8ca --- /dev/null +++ b/docs/archive/1.0/sql/statements/update.md @@ -0,0 +1,138 @@ +--- +layout: docu +railroad: statements/update.js +title: UPDATE Statement +--- + +The `UPDATE` statement modifies the values of rows in a table. + +## Examples + +For every row where `i` is `NULL`, set the value to 0 instead: + +```sql +UPDATE tbl +SET i = 0 +WHERE i IS NULL; +``` + +Set all values of `i` to 1 and all values of `j` to 2: + +```sql +UPDATE tbl +SET i = 1, j = 2; +``` + +## Syntax + +
+ +`UPDATE` changes the values of the specified columns in all rows that satisfy the condition. Only the columns to be modified need be mentioned in the `SET` clause; columns not explicitly modified retain their previous values. + +## Update from Other Table + +A table can be updated based upon values from another table. This can be done by specifying a table in a `FROM` clause, or using a sub-select statement. Both approaches have the benefit of completing the `UPDATE` operation in bulk for increased performance. + +```sql +CREATE OR REPLACE TABLE original AS + SELECT 1 AS key, 'original value' AS value + UNION ALL + SELECT 2 AS key, 'original value 2' AS value; + +CREATE OR REPLACE TABLE new AS + SELECT 1 AS key, 'new value' AS value + UNION ALL + SELECT 2 AS key, 'new value 2' AS value; + +SELECT * +FROM original; +``` + +
+ +| key | value | +|-----|------------------| +| 1 | original value | +| 2 | original value 2 | + +```sql +UPDATE original + SET value = new.value + FROM new + WHERE original.key = new.key; +``` + +Or: + +```sql +UPDATE original + SET value = ( + SELECT + new.value + FROM new + WHERE original.key = new.key + ); +``` + +```sql +SELECT * +FROM original; +``` + +
+ +| key | value | +|-----|-------------| +| 1 | new value | +| 2 | new value 2 | + +## Update from Same Table + +The only difference between this case and the above is that a different table alias must be specified on both the target table and the source table. +In this example `AS true_original` and `AS new` are both required. + +```sql +UPDATE original AS true_original + SET value = ( + SELECT + new.value || ' a change!' AS value + FROM original AS new + WHERE true_original.key = new.key + ); +``` + +## Update Using Joins + +To select the rows to update, `UPDATE` statements can use the `FROM` clause and express joins via the `WHERE` clause. For example: + +```sql +CREATE TABLE city (name VARCHAR, revenue BIGINT, country_code VARCHAR); +CREATE TABLE country (code VARCHAR, name VARCHAR); +INSERT INTO city VALUES ('Paris', 700, 'FR'), ('Lyon', 200, 'FR'), ('Brussels', 400, 'BE'); +INSERT INTO country VALUES ('FR', 'France'), ('BE', 'Belgium'); +``` + +To increase the revenue of all cities in France, join the `city` and the `country` tables, and filter on the latter: + +```sql +UPDATE city +SET revenue = revenue + 100 +FROM country +WHERE city.country_code = country.code + AND country.name = 'France'; +``` + +```sql +SELECT * +FROM city; +``` + +| name | revenue | country_code | +|----------|--------:|--------------| +| Paris | 800 | FR | +| Lyon | 300 | FR | +| Brussels | 400 | BE | + +## Upsert (Insert or Update) + +See the [Insert documentation]({% link docs/archive/1.0/sql/statements/insert.md %}#on-conflict-clause) for details. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/use.md b/docs/archive/1.0/sql/statements/use.md new file mode 100644 index 00000000000..f3a9521d9ff --- /dev/null +++ b/docs/archive/1.0/sql/statements/use.md @@ -0,0 +1,24 @@ +--- +layout: docu +railroad: statements/use.js +title: USE Statement +--- + +The `USE` statement selects a database and optional schema to use as the default. + +## Examples + +```sql +--- Sets the 'memory' database as the default +USE memory; +--- Sets the 'duck.main' database and schema as the default +USE duck.main; +``` + +## Syntax + +
+ +The `USE` statement sets a default database or database/schema combination to use for +future operations. For instance, tables created without providing a fully qualified +table name will be created in the default database. \ No newline at end of file diff --git a/docs/archive/1.0/sql/statements/vacuum.md b/docs/archive/1.0/sql/statements/vacuum.md new file mode 100644 index 00000000000..a51cec02834 --- /dev/null +++ b/docs/archive/1.0/sql/statements/vacuum.md @@ -0,0 +1,42 @@ +--- +layout: docu +railroad: statements/vacuum.js +title: VACUUM Statement +--- + +The `VACUUM` statement alone does nothing and is at present provided for PostgreSQL-compatibility. +The `VACUUM ANALYZE` statement recomputes table statistics if they have become stale due to table updates or deletions. + +## Examples + +No-op: + +```sql +VACUUM; +``` + +Rebuild database statistics: + +```sql +VACUUM ANALYZE; +``` + +Rebuild statistics for the table and column: + +```sql +VACUUM ANALYZE memory.main.my_table(my_column); +``` + +Not supported: + +```sql +VACUUM FULL; -- error +``` + +## Reclaiming Space + +To reclaim space after deleting rows, use the [`CHECKPOINT` statement]({% link docs/archive/1.0/sql/statements/checkpoint.md %}). + +## Syntax + +
\ No newline at end of file diff --git a/docs/archive/1.0/sql/tutorial/css/bootstrap.min.css b/docs/archive/1.0/sql/tutorial/css/bootstrap.min.css new file mode 100644 index 00000000000..ed3905e0e0c --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/css/bootstrap.min.css @@ -0,0 +1,6 @@ +/*! + * Bootstrap v3.3.7 (http://getbootstrap.com) + * Copyright 2011-2016 Twitter, Inc. + * Licensed under MIT (https://github.com/twbs/bootstrap/blob/master/LICENSE) + *//*! normalize.css v3.0.3 | MIT License | github.com/necolas/normalize.css */html{font-family:sans-serif;-webkit-text-size-adjust:100%;-ms-text-size-adjust:100%}body{margin:0}article,aside,details,figcaption,figure,footer,header,hgroup,main,menu,nav,section,summary{display:block}audio,canvas,progress,video{display:inline-block;vertical-align:baseline}audio:not([controls]){display:none;height:0}[hidden],template{display:none}a{background-color:transparent}a:active,a:hover{outline:0}abbr[title]{border-bottom:1px dotted}b,strong{font-weight:700}dfn{font-style:italic}h1{margin:.67em 0;font-size:2em}mark{color:#000;background:#ff0}small{font-size:80%}sub,sup{position:relative;font-size:75%;line-height:0;vertical-align:baseline}sup{top:-.5em}sub{bottom:-.25em}img{border:0}svg:not(:root){overflow:hidden}figure{margin:1em 40px}hr{height:0;-webkit-box-sizing:content-box;-moz-box-sizing:content-box;box-sizing:content-box}pre{overflow:auto}code,kbd,pre,samp{font-family:monospace,monospace;font-size:1em}button,input,optgroup,select,textarea{margin:0;font:inherit;color:inherit}button{overflow:visible}button,select{text-transform:none}button,html input[type=button],input[type=reset],input[type=submit]{-webkit-appearance:button;cursor:pointer}button[disabled],html input[disabled]{cursor:default}button::-moz-focus-inner,input::-moz-focus-inner{padding:0;border:0}input{line-height:normal}input[type=checkbox],input[type=radio]{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box;padding:0}input[type=number]::-webkit-inner-spin-button,input[type=number]::-webkit-outer-spin-button{height:auto}input[type=search]{-webkit-box-sizing:content-box;-moz-box-sizing:content-box;box-sizing:content-box;-webkit-appearance:textfield}input[type=search]::-webkit-search-cancel-button,input[type=search]::-webkit-search-decoration{-webkit-appearance:none}fieldset{padding:.35em .625em .75em;margin:0 2px;border:1px solid silver}legend{padding:0;border:0}textarea{overflow:auto}optgroup{font-weight:700}table{border-spacing:0;border-collapse:collapse}td,th{padding:0}/*! 
Source: https://github.com/h5bp/html5-boilerplate/blob/master/src/css/main.css */@media print{*,:after,:before{color:#000!important;text-shadow:none!important;background:0 0!important;-webkit-box-shadow:none!important;box-shadow:none!important}a,a:visited{text-decoration:underline}a[href]:after{content:" (" attr(href) ")"}abbr[title]:after{content:" (" attr(title) ")"}a[href^="javascript:"]:after,a[href^="#"]:after{content:""}blockquote,pre{border:1px solid #999;page-break-inside:avoid}thead{display:table-header-group}img,tr{page-break-inside:avoid}img{max-width:100%!important}h2,h3,p{orphans:3;widows:3}h2,h3{page-break-after:avoid}.navbar{display:none}.btn>.caret,.dropup>.btn>.caret{border-top-color:#000!important}.label{border:1px solid #000}.table{border-collapse:collapse!important}.table td,.table th{background-color:#fff!important}.table-bordered td,.table-bordered th{border:1px solid #ddd!important}}@font-face{font-family:'Glyphicons Halflings';src:url(../fonts/glyphicons-halflings-regular.eot);src:url(../fonts/glyphicons-halflings-regular.eot?#iefix) format('embedded-opentype'),url(../fonts/glyphicons-halflings-regular.woff2) format('woff2'),url(../fonts/glyphicons-halflings-regular.woff) format('woff'),url(../fonts/glyphicons-halflings-regular.ttf) format('truetype'),url(../fonts/glyphicons-halflings-regular.svg#glyphicons_halflingsregular) format('svg')}.glyphicon{position:relative;top:1px;display:inline-block;font-family:'Glyphicons Halflings';font-style:normal;font-weight:400;line-height:1;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale}.glyphicon-asterisk:before{content:"\002a"}.glyphicon-plus:before{content:"\002b"}.glyphicon-eur:before,.glyphicon-euro:before{content:"\20ac"}.glyphicon-minus:before{content:"\2212"}.glyphicon-cloud:before{content:"\2601"}.glyphicon-envelope:before{content:"\2709"}.glyphicon-pencil:before{content:"\270f"}.glyphicon-glass:before{content:"\e001"}.glyphicon-music:before{content:"\e002"}.glyphicon-search:before{content:"\e003"}.glyphicon-heart:before{content:"\e005"}.glyphicon-star:before{content:"\e006"}.glyphicon-star-empty:before{content:"\e007"}.glyphicon-user:before{content:"\e008"}.glyphicon-film:before{content:"\e009"}.glyphicon-th-large:before{content:"\e010"}.glyphicon-th:before{content:"\e011"}.glyphicon-th-list:before{content:"\e012"}.glyphicon-ok:before{content:"\e013"}.glyphicon-remove:before{content:"\e014"}.glyphicon-zoom-in:before{content:"\e015"}.glyphicon-zoom-out:before{content:"\e016"}.glyphicon-off:before{content:"\e017"}.glyphicon-signal:before{content:"\e018"}.glyphicon-cog:before{content:"\e019"}.glyphicon-trash:before{content:"\e020"}.glyphicon-home:before{content:"\e021"}.glyphicon-file:before{content:"\e022"}.glyphicon-time:before{content:"\e023"}.glyphicon-road:before{content:"\e024"}.glyphicon-download-alt:before{content:"\e025"}.glyphicon-download:before{content:"\e026"}.glyphicon-upload:before{content:"\e027"}.glyphicon-inbox:before{content:"\e028"}.glyphicon-play-circle:before{content:"\e029"}.glyphicon-repeat:before{content:"\e030"}.glyphicon-refresh:before{content:"\e031"}.glyphicon-list-alt:before{content:"\e032"}.glyphicon-lock:before{content:"\e033"}.glyphicon-flag:before{content:"\e034"}.glyphicon-headphones:before{content:"\e035"}.glyphicon-volume-off:before{content:"\e036"}.glyphicon-volume-down:before{content:"\e037"}.glyphicon-volume-up:before{content:"\e038"}.glyphicon-qrcode:before{content:"\e039"}.glyphicon-barcode:before{content:"\e040"}.glyphicon-tag:before{content:"\e041"}.glyphicon
-tags:before{content:"\e042"}.glyphicon-book:before{content:"\e043"}.glyphicon-bookmark:before{content:"\e044"}.glyphicon-print:before{content:"\e045"}.glyphicon-camera:before{content:"\e046"}.glyphicon-font:before{content:"\e047"}.glyphicon-bold:before{content:"\e048"}.glyphicon-italic:before{content:"\e049"}.glyphicon-text-height:before{content:"\e050"}.glyphicon-text-width:before{content:"\e051"}.glyphicon-align-left:before{content:"\e052"}.glyphicon-align-center:before{content:"\e053"}.glyphicon-align-right:before{content:"\e054"}.glyphicon-align-justify:before{content:"\e055"}.glyphicon-list:before{content:"\e056"}.glyphicon-indent-left:before{content:"\e057"}.glyphicon-indent-right:before{content:"\e058"}.glyphicon-facetime-video:before{content:"\e059"}.glyphicon-picture:before{content:"\e060"}.glyphicon-map-marker:before{content:"\e062"}.glyphicon-adjust:before{content:"\e063"}.glyphicon-tint:before{content:"\e064"}.glyphicon-edit:before{content:"\e065"}.glyphicon-share:before{content:"\e066"}.glyphicon-check:before{content:"\e067"}.glyphicon-move:before{content:"\e068"}.glyphicon-step-backward:before{content:"\e069"}.glyphicon-fast-backward:before{content:"\e070"}.glyphicon-backward:before{content:"\e071"}.glyphicon-play:before{content:"\e072"}.glyphicon-pause:before{content:"\e073"}.glyphicon-stop:before{content:"\e074"}.glyphicon-forward:before{content:"\e075"}.glyphicon-fast-forward:before{content:"\e076"}.glyphicon-step-forward:before{content:"\e077"}.glyphicon-eject:before{content:"\e078"}.glyphicon-chevron-left:before{content:"\e079"}.glyphicon-chevron-right:before{content:"\e080"}.glyphicon-plus-sign:before{content:"\e081"}.glyphicon-minus-sign:before{content:"\e082"}.glyphicon-remove-sign:before{content:"\e083"}.glyphicon-ok-sign:before{content:"\e084"}.glyphicon-question-sign:before{content:"\e085"}.glyphicon-info-sign:before{content:"\e086"}.glyphicon-screenshot:before{content:"\e087"}.glyphicon-remove-circle:before{content:"\e088"}.glyphicon-ok-circle:before{content:"\e089"}.glyphicon-ban-circle:before{content:"\e090"}.glyphicon-arrow-left:before{content:"\e091"}.glyphicon-arrow-right:before{content:"\e092"}.glyphicon-arrow-up:before{content:"\e093"}.glyphicon-arrow-down:before{content:"\e094"}.glyphicon-share-alt:before{content:"\e095"}.glyphicon-resize-full:before{content:"\e096"}.glyphicon-resize-small:before{content:"\e097"}.glyphicon-exclamation-sign:before{content:"\e101"}.glyphicon-gift:before{content:"\e102"}.glyphicon-leaf:before{content:"\e103"}.glyphicon-fire:before{content:"\e104"}.glyphicon-eye-open:before{content:"\e105"}.glyphicon-eye-close:before{content:"\e106"}.glyphicon-warning-sign:before{content:"\e107"}.glyphicon-plane:before{content:"\e108"}.glyphicon-calendar:before{content:"\e109"}.glyphicon-random:before{content:"\e110"}.glyphicon-comment:before{content:"\e111"}.glyphicon-magnet:before{content:"\e112"}.glyphicon-chevron-up:before{content:"\e113"}.glyphicon-chevron-down:before{content:"\e114"}.glyphicon-retweet:before{content:"\e115"}.glyphicon-shopping-cart:before{content:"\e116"}.glyphicon-folder-close:before{content:"\e117"}.glyphicon-folder-open:before{content:"\e118"}.glyphicon-resize-vertical:before{content:"\e119"}.glyphicon-resize-horizontal:before{content:"\e120"}.glyphicon-hdd:before{content:"\e121"}.glyphicon-bullhorn:before{content:"\e122"}.glyphicon-bell:before{content:"\e123"}.glyphicon-certificate:before{content:"\e124"}.glyphicon-thumbs-up:before{content:"\e125"}.glyphicon-thumbs-down:before{content:"\e126"}.glyphicon-hand-right:be
fore{content:"\e127"}.glyphicon-hand-left:before{content:"\e128"}.glyphicon-hand-up:before{content:"\e129"}.glyphicon-hand-down:before{content:"\e130"}.glyphicon-circle-arrow-right:before{content:"\e131"}.glyphicon-circle-arrow-left:before{content:"\e132"}.glyphicon-circle-arrow-up:before{content:"\e133"}.glyphicon-circle-arrow-down:before{content:"\e134"}.glyphicon-globe:before{content:"\e135"}.glyphicon-wrench:before{content:"\e136"}.glyphicon-tasks:before{content:"\e137"}.glyphicon-filter:before{content:"\e138"}.glyphicon-briefcase:before{content:"\e139"}.glyphicon-fullscreen:before{content:"\e140"}.glyphicon-dashboard:before{content:"\e141"}.glyphicon-paperclip:before{content:"\e142"}.glyphicon-heart-empty:before{content:"\e143"}.glyphicon-link:before{content:"\e144"}.glyphicon-phone:before{content:"\e145"}.glyphicon-pushpin:before{content:"\e146"}.glyphicon-usd:before{content:"\e148"}.glyphicon-gbp:before{content:"\e149"}.glyphicon-sort:before{content:"\e150"}.glyphicon-sort-by-alphabet:before{content:"\e151"}.glyphicon-sort-by-alphabet-alt:before{content:"\e152"}.glyphicon-sort-by-order:before{content:"\e153"}.glyphicon-sort-by-order-alt:before{content:"\e154"}.glyphicon-sort-by-attributes:before{content:"\e155"}.glyphicon-sort-by-attributes-alt:before{content:"\e156"}.glyphicon-unchecked:before{content:"\e157"}.glyphicon-expand:before{content:"\e158"}.glyphicon-collapse-down:before{content:"\e159"}.glyphicon-collapse-up:before{content:"\e160"}.glyphicon-log-in:before{content:"\e161"}.glyphicon-flash:before{content:"\e162"}.glyphicon-log-out:before{content:"\e163"}.glyphicon-new-window:before{content:"\e164"}.glyphicon-record:before{content:"\e165"}.glyphicon-save:before{content:"\e166"}.glyphicon-open:before{content:"\e167"}.glyphicon-saved:before{content:"\e168"}.glyphicon-import:before{content:"\e169"}.glyphicon-export:before{content:"\e170"}.glyphicon-send:before{content:"\e171"}.glyphicon-floppy-disk:before{content:"\e172"}.glyphicon-floppy-saved:before{content:"\e173"}.glyphicon-floppy-remove:before{content:"\e174"}.glyphicon-floppy-save:before{content:"\e175"}.glyphicon-floppy-open:before{content:"\e176"}.glyphicon-credit-card:before{content:"\e177"}.glyphicon-transfer:before{content:"\e178"}.glyphicon-cutlery:before{content:"\e179"}.glyphicon-header:before{content:"\e180"}.glyphicon-compressed:before{content:"\e181"}.glyphicon-earphone:before{content:"\e182"}.glyphicon-phone-alt:before{content:"\e183"}.glyphicon-tower:before{content:"\e184"}.glyphicon-stats:before{content:"\e185"}.glyphicon-sd-video:before{content:"\e186"}.glyphicon-hd-video:before{content:"\e187"}.glyphicon-subtitles:before{content:"\e188"}.glyphicon-sound-stereo:before{content:"\e189"}.glyphicon-sound-dolby:before{content:"\e190"}.glyphicon-sound-5-1:before{content:"\e191"}.glyphicon-sound-6-1:before{content:"\e192"}.glyphicon-sound-7-1:before{content:"\e193"}.glyphicon-copyright-mark:before{content:"\e194"}.glyphicon-registration-mark:before{content:"\e195"}.glyphicon-cloud-download:before{content:"\e197"}.glyphicon-cloud-upload:before{content:"\e198"}.glyphicon-tree-conifer:before{content:"\e199"}.glyphicon-tree-deciduous:before{content:"\e200"}.glyphicon-cd:before{content:"\e201"}.glyphicon-save-file:before{content:"\e202"}.glyphicon-open-file:before{content:"\e203"}.glyphicon-level-up:before{content:"\e204"}.glyphicon-copy:before{content:"\e205"}.glyphicon-paste:before{content:"\e206"}.glyphicon-alert:before{content:"\e209"}.glyphicon-equalizer:before{content:"\e210"}.glyphicon-king:before{content:"\e211
"}.glyphicon-queen:before{content:"\e212"}.glyphicon-pawn:before{content:"\e213"}.glyphicon-bishop:before{content:"\e214"}.glyphicon-knight:before{content:"\e215"}.glyphicon-baby-formula:before{content:"\e216"}.glyphicon-tent:before{content:"\26fa"}.glyphicon-blackboard:before{content:"\e218"}.glyphicon-bed:before{content:"\e219"}.glyphicon-apple:before{content:"\f8ff"}.glyphicon-erase:before{content:"\e221"}.glyphicon-hourglass:before{content:"\231b"}.glyphicon-lamp:before{content:"\e223"}.glyphicon-duplicate:before{content:"\e224"}.glyphicon-piggy-bank:before{content:"\e225"}.glyphicon-scissors:before{content:"\e226"}.glyphicon-bitcoin:before{content:"\e227"}.glyphicon-btc:before{content:"\e227"}.glyphicon-xbt:before{content:"\e227"}.glyphicon-yen:before{content:"\00a5"}.glyphicon-jpy:before{content:"\00a5"}.glyphicon-ruble:before{content:"\20bd"}.glyphicon-rub:before{content:"\20bd"}.glyphicon-scale:before{content:"\e230"}.glyphicon-ice-lolly:before{content:"\e231"}.glyphicon-ice-lolly-tasted:before{content:"\e232"}.glyphicon-education:before{content:"\e233"}.glyphicon-option-horizontal:before{content:"\e234"}.glyphicon-option-vertical:before{content:"\e235"}.glyphicon-menu-hamburger:before{content:"\e236"}.glyphicon-modal-window:before{content:"\e237"}.glyphicon-oil:before{content:"\e238"}.glyphicon-grain:before{content:"\e239"}.glyphicon-sunglasses:before{content:"\e240"}.glyphicon-text-size:before{content:"\e241"}.glyphicon-text-color:before{content:"\e242"}.glyphicon-text-background:before{content:"\e243"}.glyphicon-object-align-top:before{content:"\e244"}.glyphicon-object-align-bottom:before{content:"\e245"}.glyphicon-object-align-horizontal:before{content:"\e246"}.glyphicon-object-align-left:before{content:"\e247"}.glyphicon-object-align-vertical:before{content:"\e248"}.glyphicon-object-align-right:before{content:"\e249"}.glyphicon-triangle-right:before{content:"\e250"}.glyphicon-triangle-left:before{content:"\e251"}.glyphicon-triangle-bottom:before{content:"\e252"}.glyphicon-triangle-top:before{content:"\e253"}.glyphicon-console:before{content:"\e254"}.glyphicon-superscript:before{content:"\e255"}.glyphicon-subscript:before{content:"\e256"}.glyphicon-menu-left:before{content:"\e257"}.glyphicon-menu-right:before{content:"\e258"}.glyphicon-menu-down:before{content:"\e259"}.glyphicon-menu-up:before{content:"\e260"}*{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}:after,:before{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}html{font-size:10px;-webkit-tap-highlight-color:rgba(0,0,0,0)}body{font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:14px;line-height:1.42857143;color:#333;background-color:#fff}button,input,select,textarea{font-family:inherit;font-size:inherit;line-height:inherit}a{color:#337ab7;text-decoration:none}a:focus,a:hover{color:#23527c;text-decoration:underline}a:focus{outline:5px auto -webkit-focus-ring-color;outline-offset:-2px}figure{margin:0}img{vertical-align:middle}.carousel-inner>.item>a>img,.carousel-inner>.item>img,.img-responsive,.thumbnail a>img,.thumbnail>img{display:block;max-width:100%;height:auto}.img-rounded{border-radius:6px}.img-thumbnail{display:inline-block;max-width:100%;height:auto;padding:4px;line-height:1.42857143;background-color:#fff;border:1px solid #ddd;border-radius:4px;-webkit-transition:all .2s ease-in-out;-o-transition:all .2s ease-in-out;transition:all .2s ease-in-out}.img-circle{border-radius:50%}hr{margin-top:20px;margin-bottom:20px;border:0;border-top:1px 
solid #eee}.sr-only{position:absolute;width:1px;height:1px;padding:0;margin:-1px;overflow:hidden;clip:rect(0,0,0,0);border:0}.sr-only-focusable:active,.sr-only-focusable:focus{position:static;width:auto;height:auto;margin:0;overflow:visible;clip:auto}[role=button]{cursor:pointer}.h1,.h2,.h3,.h4,.h5,.h6,h1,h2,h3,h4,h5,h6{font-family:inherit;font-weight:500;line-height:1.1;color:inherit}.h1 .small,.h1 small,.h2 .small,.h2 small,.h3 .small,.h3 small,.h4 .small,.h4 small,.h5 .small,.h5 small,.h6 .small,.h6 small,h1 .small,h1 small,h2 .small,h2 small,h3 .small,h3 small,h4 .small,h4 small,h5 .small,h5 small,h6 .small,h6 small{font-weight:400;line-height:1;color:#777}.h1,.h2,.h3,h1,h2,h3{margin-top:20px;margin-bottom:10px}.h1 .small,.h1 small,.h2 .small,.h2 small,.h3 .small,.h3 small,h1 .small,h1 small,h2 .small,h2 small,h3 .small,h3 small{font-size:65%}.h4,.h5,.h6,h4,h5,h6{margin-top:10px;margin-bottom:10px}.h4 .small,.h4 small,.h5 .small,.h5 small,.h6 .small,.h6 small,h4 .small,h4 small,h5 .small,h5 small,h6 .small,h6 small{font-size:75%}.h1,h1{font-size:36px}.h2,h2{font-size:30px}.h3,h3{font-size:24px}.h4,h4{font-size:18px}.h5,h5{font-size:14px}.h6,h6{font-size:12px}p{margin:0 0 10px}.lead{margin-bottom:20px;font-size:16px;font-weight:300;line-height:1.4}@media (min-width:768px){.lead{font-size:21px}}.small,small{font-size:85%}.mark,mark{padding:.2em;background-color:#fcf8e3}.text-left{text-align:left}.text-right{text-align:right}.text-center{text-align:center}.text-justify{text-align:justify}.text-nowrap{white-space:nowrap}.text-lowercase{text-transform:lowercase}.text-uppercase{text-transform:uppercase}.text-capitalize{text-transform:capitalize}.text-muted{color:#777}.text-primary{color:#337ab7}a.text-primary:focus,a.text-primary:hover{color:#286090}.text-success{color:#3c763d}a.text-success:focus,a.text-success:hover{color:#2b542c}.text-info{color:#31708f}a.text-info:focus,a.text-info:hover{color:#245269}.text-warning{color:#8a6d3b}a.text-warning:focus,a.text-warning:hover{color:#66512c}.text-danger{color:#a94442}a.text-danger:focus,a.text-danger:hover{color:#843534}.bg-primary{color:#fff;background-color:#337ab7}a.bg-primary:focus,a.bg-primary:hover{background-color:#286090}.bg-success{background-color:#dff0d8}a.bg-success:focus,a.bg-success:hover{background-color:#c1e2b3}.bg-info{background-color:#d9edf7}a.bg-info:focus,a.bg-info:hover{background-color:#afd9ee}.bg-warning{background-color:#fcf8e3}a.bg-warning:focus,a.bg-warning:hover{background-color:#f7ecb5}.bg-danger{background-color:#f2dede}a.bg-danger:focus,a.bg-danger:hover{background-color:#e4b9b9}.page-header{padding-bottom:9px;margin:40px 0 20px;border-bottom:1px solid #eee}ol,ul{margin-top:0;margin-bottom:10px}ol ol,ol ul,ul ol,ul ul{margin-bottom:0}.list-unstyled{padding-left:0;list-style:none}.list-inline{padding-left:0;margin-left:-5px;list-style:none}.list-inline>li{display:inline-block;padding-right:5px;padding-left:5px}dl{margin-top:0;margin-bottom:20px}dd,dt{line-height:1.42857143}dt{font-weight:700}dd{margin-left:0}@media (min-width:768px){.dl-horizontal dt{float:left;width:160px;overflow:hidden;clear:left;text-align:right;text-overflow:ellipsis;white-space:nowrap}.dl-horizontal dd{margin-left:180px}}abbr[data-original-title],abbr[title]{cursor:help;border-bottom:1px dotted #777}.initialism{font-size:90%;text-transform:uppercase}blockquote{padding:10px 20px;margin:0 0 20px;font-size:17.5px;border-left:5px solid #eee}blockquote ol:last-child,blockquote p:last-child,blockquote ul:last-child{margin-bottom:0}blockquote 
.small,blockquote footer,blockquote small{display:block;font-size:80%;line-height:1.42857143;color:#777}blockquote .small:before,blockquote footer:before,blockquote small:before{content:'\2014 \00A0'}.blockquote-reverse,blockquote.pull-right{padding-right:15px;padding-left:0;text-align:right;border-right:5px solid #eee;border-left:0}.blockquote-reverse .small:before,.blockquote-reverse footer:before,.blockquote-reverse small:before,blockquote.pull-right .small:before,blockquote.pull-right footer:before,blockquote.pull-right small:before{content:''}.blockquote-reverse .small:after,.blockquote-reverse footer:after,.blockquote-reverse small:after,blockquote.pull-right .small:after,blockquote.pull-right footer:after,blockquote.pull-right small:after{content:'\00A0 \2014'}address{margin-bottom:20px;font-style:normal;line-height:1.42857143}code,kbd,pre,samp{font-family:Menlo,Monaco,Consolas,"Courier New",monospace}code{padding:2px 4px;font-size:90%;color:#c7254e;background-color:#f9f2f4;border-radius:4px}kbd{padding:2px 4px;font-size:90%;color:#fff;background-color:#333;border-radius:3px;-webkit-box-shadow:inset 0 -1px 0 rgba(0,0,0,.25);box-shadow:inset 0 -1px 0 rgba(0,0,0,.25)}kbd kbd{padding:0;font-size:100%;font-weight:700;-webkit-box-shadow:none;box-shadow:none}pre{display:block;padding:9.5px;margin:0 0 10px;font-size:13px;line-height:1.42857143;color:#333;word-break:break-all;word-wrap:break-word;background-color:#f5f5f5;border:1px solid #ccc;border-radius:4px}pre code{padding:0;font-size:inherit;color:inherit;white-space:pre-wrap;background-color:transparent;border-radius:0}.pre-scrollable{max-height:340px;overflow-y:scroll}.container{padding-right:15px;padding-left:15px;margin-right:auto;margin-left:auto}@media (min-width:768px){.container{width:750px}}@media (min-width:992px){.container{width:970px}}@media 
(min-width:1200px){.container{width:1170px}}.container-fluid{padding-right:15px;padding-left:15px;margin-right:auto;margin-left:auto}.row{margin-right:-15px;margin-left:-15px}.col-lg-1,.col-lg-10,.col-lg-11,.col-lg-12,.col-lg-2,.col-lg-3,.col-lg-4,.col-lg-5,.col-lg-6,.col-lg-7,.col-lg-8,.col-lg-9,.col-md-1,.col-md-10,.col-md-11,.col-md-12,.col-md-2,.col-md-3,.col-md-4,.col-md-5,.col-md-6,.col-md-7,.col-md-8,.col-md-9,.col-sm-1,.col-sm-10,.col-sm-11,.col-sm-12,.col-sm-2,.col-sm-3,.col-sm-4,.col-sm-5,.col-sm-6,.col-sm-7,.col-sm-8,.col-sm-9,.col-xs-1,.col-xs-10,.col-xs-11,.col-xs-12,.col-xs-2,.col-xs-3,.col-xs-4,.col-xs-5,.col-xs-6,.col-xs-7,.col-xs-8,.col-xs-9{position:relative;min-height:1px;padding-right:15px;padding-left:15px}.col-xs-1,.col-xs-10,.col-xs-11,.col-xs-12,.col-xs-2,.col-xs-3,.col-xs-4,.col-xs-5,.col-xs-6,.col-xs-7,.col-xs-8,.col-xs-9{float:left}.col-xs-12{width:100%}.col-xs-11{width:91.66666667%}.col-xs-10{width:83.33333333%}.col-xs-9{width:75%}.col-xs-8{width:66.66666667%}.col-xs-7{width:58.33333333%}.col-xs-6{width:50%}.col-xs-5{width:41.66666667%}.col-xs-4{width:33.33333333%}.col-xs-3{width:25%}.col-xs-2{width:16.66666667%}.col-xs-1{width:8.33333333%}.col-xs-pull-12{right:100%}.col-xs-pull-11{right:91.66666667%}.col-xs-pull-10{right:83.33333333%}.col-xs-pull-9{right:75%}.col-xs-pull-8{right:66.66666667%}.col-xs-pull-7{right:58.33333333%}.col-xs-pull-6{right:50%}.col-xs-pull-5{right:41.66666667%}.col-xs-pull-4{right:33.33333333%}.col-xs-pull-3{right:25%}.col-xs-pull-2{right:16.66666667%}.col-xs-pull-1{right:8.33333333%}.col-xs-pull-0{right:auto}.col-xs-push-12{left:100%}.col-xs-push-11{left:91.66666667%}.col-xs-push-10{left:83.33333333%}.col-xs-push-9{left:75%}.col-xs-push-8{left:66.66666667%}.col-xs-push-7{left:58.33333333%}.col-xs-push-6{left:50%}.col-xs-push-5{left:41.66666667%}.col-xs-push-4{left:33.33333333%}.col-xs-push-3{left:25%}.col-xs-push-2{left:16.66666667%}.col-xs-push-1{left:8.33333333%}.col-xs-push-0{left:auto}.col-xs-offset-12{margin-left:100%}.col-xs-offset-11{margin-left:91.66666667%}.col-xs-offset-10{margin-left:83.33333333%}.col-xs-offset-9{margin-left:75%}.col-xs-offset-8{margin-left:66.66666667%}.col-xs-offset-7{margin-left:58.33333333%}.col-xs-offset-6{margin-left:50%}.col-xs-offset-5{margin-left:41.66666667%}.col-xs-offset-4{margin-left:33.33333333%}.col-xs-offset-3{margin-left:25%}.col-xs-offset-2{margin-left:16.66666667%}.col-xs-offset-1{margin-left:8.33333333%}.col-xs-offset-0{margin-left:0}@media 
(min-width:768px){.col-sm-1,.col-sm-10,.col-sm-11,.col-sm-12,.col-sm-2,.col-sm-3,.col-sm-4,.col-sm-5,.col-sm-6,.col-sm-7,.col-sm-8,.col-sm-9{float:left}.col-sm-12{width:100%}.col-sm-11{width:91.66666667%}.col-sm-10{width:83.33333333%}.col-sm-9{width:75%}.col-sm-8{width:66.66666667%}.col-sm-7{width:58.33333333%}.col-sm-6{width:50%}.col-sm-5{width:41.66666667%}.col-sm-4{width:33.33333333%}.col-sm-3{width:25%}.col-sm-2{width:16.66666667%}.col-sm-1{width:8.33333333%}.col-sm-pull-12{right:100%}.col-sm-pull-11{right:91.66666667%}.col-sm-pull-10{right:83.33333333%}.col-sm-pull-9{right:75%}.col-sm-pull-8{right:66.66666667%}.col-sm-pull-7{right:58.33333333%}.col-sm-pull-6{right:50%}.col-sm-pull-5{right:41.66666667%}.col-sm-pull-4{right:33.33333333%}.col-sm-pull-3{right:25%}.col-sm-pull-2{right:16.66666667%}.col-sm-pull-1{right:8.33333333%}.col-sm-pull-0{right:auto}.col-sm-push-12{left:100%}.col-sm-push-11{left:91.66666667%}.col-sm-push-10{left:83.33333333%}.col-sm-push-9{left:75%}.col-sm-push-8{left:66.66666667%}.col-sm-push-7{left:58.33333333%}.col-sm-push-6{left:50%}.col-sm-push-5{left:41.66666667%}.col-sm-push-4{left:33.33333333%}.col-sm-push-3{left:25%}.col-sm-push-2{left:16.66666667%}.col-sm-push-1{left:8.33333333%}.col-sm-push-0{left:auto}.col-sm-offset-12{margin-left:100%}.col-sm-offset-11{margin-left:91.66666667%}.col-sm-offset-10{margin-left:83.33333333%}.col-sm-offset-9{margin-left:75%}.col-sm-offset-8{margin-left:66.66666667%}.col-sm-offset-7{margin-left:58.33333333%}.col-sm-offset-6{margin-left:50%}.col-sm-offset-5{margin-left:41.66666667%}.col-sm-offset-4{margin-left:33.33333333%}.col-sm-offset-3{margin-left:25%}.col-sm-offset-2{margin-left:16.66666667%}.col-sm-offset-1{margin-left:8.33333333%}.col-sm-offset-0{margin-left:0}}@media (min-width:992px){.col-md-1,.col-md-10,.col-md-11,.col-md-12,.col-md-2,.col-md-3,.col-md-4,.col-md-5,.col-md-6,.col-md-7,.col-md-8,.col-md-9{float:left}.col-md-12{width:100%}.col-md-11{width:91.66666667%}.col-md-10{width:83.33333333%}.col-md-9{width:75%}.col-md-8{width:66.66666667%}.col-md-7{width:58.33333333%}.col-md-6{width:50%}.col-md-5{width:41.66666667%}.col-md-4{width:33.33333333%}.col-md-3{width:25%}.col-md-2{width:16.66666667%}.col-md-1{width:8.33333333%}.col-md-pull-12{right:100%}.col-md-pull-11{right:91.66666667%}.col-md-pull-10{right:83.33333333%}.col-md-pull-9{right:75%}.col-md-pull-8{right:66.66666667%}.col-md-pull-7{right:58.33333333%}.col-md-pull-6{right:50%}.col-md-pull-5{right:41.66666667%}.col-md-pull-4{right:33.33333333%}.col-md-pull-3{right:25%}.col-md-pull-2{right:16.66666667%}.col-md-pull-1{right:8.33333333%}.col-md-pull-0{right:auto}.col-md-push-12{left:100%}.col-md-push-11{left:91.66666667%}.col-md-push-10{left:83.33333333%}.col-md-push-9{left:75%}.col-md-push-8{left:66.66666667%}.col-md-push-7{left:58.33333333%}.col-md-push-6{left:50%}.col-md-push-5{left:41.66666667%}.col-md-push-4{left:33.33333333%}.col-md-push-3{left:25%}.col-md-push-2{left:16.66666667%}.col-md-push-1{left:8.33333333%}.col-md-push-0{left:auto}.col-md-offset-12{margin-left:100%}.col-md-offset-11{margin-left:91.66666667%}.col-md-offset-10{margin-left:83.33333333%}.col-md-offset-9{margin-left:75%}.col-md-offset-8{margin-left:66.66666667%}.col-md-offset-7{margin-left:58.33333333%}.col-md-offset-6{margin-left:50%}.col-md-offset-5{margin-left:41.66666667%}.col-md-offset-4{margin-left:33.33333333%}.col-md-offset-3{margin-left:25%}.col-md-offset-2{margin-left:16.66666667%}.col-md-offset-1{margin-left:8.33333333%}.col-md-offset-0{margin-left:0}}@media 
(min-width:1200px){.col-lg-1,.col-lg-10,.col-lg-11,.col-lg-12,.col-lg-2,.col-lg-3,.col-lg-4,.col-lg-5,.col-lg-6,.col-lg-7,.col-lg-8,.col-lg-9{float:left}.col-lg-12{width:100%}.col-lg-11{width:91.66666667%}.col-lg-10{width:83.33333333%}.col-lg-9{width:75%}.col-lg-8{width:66.66666667%}.col-lg-7{width:58.33333333%}.col-lg-6{width:50%}.col-lg-5{width:41.66666667%}.col-lg-4{width:33.33333333%}.col-lg-3{width:25%}.col-lg-2{width:16.66666667%}.col-lg-1{width:8.33333333%}.col-lg-pull-12{right:100%}.col-lg-pull-11{right:91.66666667%}.col-lg-pull-10{right:83.33333333%}.col-lg-pull-9{right:75%}.col-lg-pull-8{right:66.66666667%}.col-lg-pull-7{right:58.33333333%}.col-lg-pull-6{right:50%}.col-lg-pull-5{right:41.66666667%}.col-lg-pull-4{right:33.33333333%}.col-lg-pull-3{right:25%}.col-lg-pull-2{right:16.66666667%}.col-lg-pull-1{right:8.33333333%}.col-lg-pull-0{right:auto}.col-lg-push-12{left:100%}.col-lg-push-11{left:91.66666667%}.col-lg-push-10{left:83.33333333%}.col-lg-push-9{left:75%}.col-lg-push-8{left:66.66666667%}.col-lg-push-7{left:58.33333333%}.col-lg-push-6{left:50%}.col-lg-push-5{left:41.66666667%}.col-lg-push-4{left:33.33333333%}.col-lg-push-3{left:25%}.col-lg-push-2{left:16.66666667%}.col-lg-push-1{left:8.33333333%}.col-lg-push-0{left:auto}.col-lg-offset-12{margin-left:100%}.col-lg-offset-11{margin-left:91.66666667%}.col-lg-offset-10{margin-left:83.33333333%}.col-lg-offset-9{margin-left:75%}.col-lg-offset-8{margin-left:66.66666667%}.col-lg-offset-7{margin-left:58.33333333%}.col-lg-offset-6{margin-left:50%}.col-lg-offset-5{margin-left:41.66666667%}.col-lg-offset-4{margin-left:33.33333333%}.col-lg-offset-3{margin-left:25%}.col-lg-offset-2{margin-left:16.66666667%}.col-lg-offset-1{margin-left:8.33333333%}.col-lg-offset-0{margin-left:0}}table{background-color:transparent}caption{padding-top:8px;padding-bottom:8px;color:#777;text-align:left}th{text-align:left}.table{width:100%;max-width:100%;margin-bottom:20px}.table>tbody>tr>td,.table>tbody>tr>th,.table>tfoot>tr>td,.table>tfoot>tr>th,.table>thead>tr>td,.table>thead>tr>th{padding:8px;line-height:1.42857143;vertical-align:top;border-top:1px solid #ddd}.table>thead>tr>th{vertical-align:bottom;border-bottom:2px solid #ddd}.table>caption+thead>tr:first-child>td,.table>caption+thead>tr:first-child>th,.table>colgroup+thead>tr:first-child>td,.table>colgroup+thead>tr:first-child>th,.table>thead:first-child>tr:first-child>td,.table>thead:first-child>tr:first-child>th{border-top:0}.table>tbody+tbody{border-top:2px solid #ddd}.table .table{background-color:#fff}.table-condensed>tbody>tr>td,.table-condensed>tbody>tr>th,.table-condensed>tfoot>tr>td,.table-condensed>tfoot>tr>th,.table-condensed>thead>tr>td,.table-condensed>thead>tr>th{padding:5px}.table-bordered{border:1px solid #ddd}.table-bordered>tbody>tr>td,.table-bordered>tbody>tr>th,.table-bordered>tfoot>tr>td,.table-bordered>tfoot>tr>th,.table-bordered>thead>tr>td,.table-bordered>thead>tr>th{border:1px solid #ddd}.table-bordered>thead>tr>td,.table-bordered>thead>tr>th{border-bottom-width:2px}.table-striped>tbody>tr:nth-of-type(odd){background-color:#f9f9f9}.table-hover>tbody>tr:hover{background-color:#f5f5f5}table col[class*=col-]{position:static;display:table-column;float:none}table td[class*=col-],table 
th[class*=col-]{position:static;display:table-cell;float:none}.table>tbody>tr.active>td,.table>tbody>tr.active>th,.table>tbody>tr>td.active,.table>tbody>tr>th.active,.table>tfoot>tr.active>td,.table>tfoot>tr.active>th,.table>tfoot>tr>td.active,.table>tfoot>tr>th.active,.table>thead>tr.active>td,.table>thead>tr.active>th,.table>thead>tr>td.active,.table>thead>tr>th.active{background-color:#f5f5f5}.table-hover>tbody>tr.active:hover>td,.table-hover>tbody>tr.active:hover>th,.table-hover>tbody>tr:hover>.active,.table-hover>tbody>tr>td.active:hover,.table-hover>tbody>tr>th.active:hover{background-color:#e8e8e8}.table>tbody>tr.success>td,.table>tbody>tr.success>th,.table>tbody>tr>td.success,.table>tbody>tr>th.success,.table>tfoot>tr.success>td,.table>tfoot>tr.success>th,.table>tfoot>tr>td.success,.table>tfoot>tr>th.success,.table>thead>tr.success>td,.table>thead>tr.success>th,.table>thead>tr>td.success,.table>thead>tr>th.success{background-color:#dff0d8}.table-hover>tbody>tr.success:hover>td,.table-hover>tbody>tr.success:hover>th,.table-hover>tbody>tr:hover>.success,.table-hover>tbody>tr>td.success:hover,.table-hover>tbody>tr>th.success:hover{background-color:#d0e9c6}.table>tbody>tr.info>td,.table>tbody>tr.info>th,.table>tbody>tr>td.info,.table>tbody>tr>th.info,.table>tfoot>tr.info>td,.table>tfoot>tr.info>th,.table>tfoot>tr>td.info,.table>tfoot>tr>th.info,.table>thead>tr.info>td,.table>thead>tr.info>th,.table>thead>tr>td.info,.table>thead>tr>th.info{background-color:#d9edf7}.table-hover>tbody>tr.info:hover>td,.table-hover>tbody>tr.info:hover>th,.table-hover>tbody>tr:hover>.info,.table-hover>tbody>tr>td.info:hover,.table-hover>tbody>tr>th.info:hover{background-color:#c4e3f3}.table>tbody>tr.warning>td,.table>tbody>tr.warning>th,.table>tbody>tr>td.warning,.table>tbody>tr>th.warning,.table>tfoot>tr.warning>td,.table>tfoot>tr.warning>th,.table>tfoot>tr>td.warning,.table>tfoot>tr>th.warning,.table>thead>tr.warning>td,.table>thead>tr.warning>th,.table>thead>tr>td.warning,.table>thead>tr>th.warning{background-color:#fcf8e3}.table-hover>tbody>tr.warning:hover>td,.table-hover>tbody>tr.warning:hover>th,.table-hover>tbody>tr:hover>.warning,.table-hover>tbody>tr>td.warning:hover,.table-hover>tbody>tr>th.warning:hover{background-color:#faf2cc}.table>tbody>tr.danger>td,.table>tbody>tr.danger>th,.table>tbody>tr>td.danger,.table>tbody>tr>th.danger,.table>tfoot>tr.danger>td,.table>tfoot>tr.danger>th,.table>tfoot>tr>td.danger,.table>tfoot>tr>th.danger,.table>thead>tr.danger>td,.table>thead>tr.danger>th,.table>thead>tr>td.danger,.table>thead>tr>th.danger{background-color:#f2dede}.table-hover>tbody>tr.danger:hover>td,.table-hover>tbody>tr.danger:hover>th,.table-hover>tbody>tr:hover>.danger,.table-hover>tbody>tr>td.danger:hover,.table-hover>tbody>tr>th.danger:hover{background-color:#ebcccc}.table-responsive{min-height:.01%;overflow-x:auto}@media screen and (max-width:767px){.table-responsive{width:100%;margin-bottom:15px;overflow-y:hidden;-ms-overflow-style:-ms-autohiding-scrollbar;border:1px solid 
#ddd}.table-responsive>.table{margin-bottom:0}.table-responsive>.table>tbody>tr>td,.table-responsive>.table>tbody>tr>th,.table-responsive>.table>tfoot>tr>td,.table-responsive>.table>tfoot>tr>th,.table-responsive>.table>thead>tr>td,.table-responsive>.table>thead>tr>th{white-space:nowrap}.table-responsive>.table-bordered{border:0}.table-responsive>.table-bordered>tbody>tr>td:first-child,.table-responsive>.table-bordered>tbody>tr>th:first-child,.table-responsive>.table-bordered>tfoot>tr>td:first-child,.table-responsive>.table-bordered>tfoot>tr>th:first-child,.table-responsive>.table-bordered>thead>tr>td:first-child,.table-responsive>.table-bordered>thead>tr>th:first-child{border-left:0}.table-responsive>.table-bordered>tbody>tr>td:last-child,.table-responsive>.table-bordered>tbody>tr>th:last-child,.table-responsive>.table-bordered>tfoot>tr>td:last-child,.table-responsive>.table-bordered>tfoot>tr>th:last-child,.table-responsive>.table-bordered>thead>tr>td:last-child,.table-responsive>.table-bordered>thead>tr>th:last-child{border-right:0}.table-responsive>.table-bordered>tbody>tr:last-child>td,.table-responsive>.table-bordered>tbody>tr:last-child>th,.table-responsive>.table-bordered>tfoot>tr:last-child>td,.table-responsive>.table-bordered>tfoot>tr:last-child>th{border-bottom:0}}fieldset{min-width:0;padding:0;margin:0;border:0}legend{display:block;width:100%;padding:0;margin-bottom:20px;font-size:21px;line-height:inherit;color:#333;border:0;border-bottom:1px solid #e5e5e5}label{display:inline-block;max-width:100%;margin-bottom:5px;font-weight:700}input[type=search]{-webkit-box-sizing:border-box;-moz-box-sizing:border-box;box-sizing:border-box}input[type=checkbox],input[type=radio]{margin:4px 0 0;margin-top:1px\9;line-height:normal}input[type=file]{display:block}input[type=range]{display:block;width:100%}select[multiple],select[size]{height:auto}input[type=file]:focus,input[type=checkbox]:focus,input[type=radio]:focus{outline:5px auto -webkit-focus-ring-color;outline-offset:-2px}output{display:block;padding-top:7px;font-size:14px;line-height:1.42857143;color:#555}.form-control{display:block;width:100%;height:34px;padding:6px 12px;font-size:14px;line-height:1.42857143;color:#555;background-color:#fff;background-image:none;border:1px solid #ccc;border-radius:4px;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075);box-shadow:inset 0 1px 1px rgba(0,0,0,.075);-webkit-transition:border-color ease-in-out .15s,-webkit-box-shadow ease-in-out .15s;-o-transition:border-color ease-in-out .15s,box-shadow ease-in-out .15s;transition:border-color ease-in-out .15s,box-shadow ease-in-out .15s}.form-control:focus{border-color:#66afe9;outline:0;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 8px rgba(102,175,233,.6);box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 8px rgba(102,175,233,.6)}.form-control::-moz-placeholder{color:#999;opacity:1}.form-control:-ms-input-placeholder{color:#999}.form-control::-webkit-input-placeholder{color:#999}.form-control::-ms-expand{background-color:transparent;border:0}.form-control[disabled],.form-control[readonly],fieldset[disabled] .form-control{background-color:#eee;opacity:1}.form-control[disabled],fieldset[disabled] .form-control{cursor:not-allowed}textarea.form-control{height:auto}input[type=search]{-webkit-appearance:none}@media screen and (-webkit-min-device-pixel-ratio:0){input[type=date].form-control,input[type=time].form-control,input[type=datetime-local].form-control,input[type=month].form-control{line-height:34px}.input-group-sm input[type=date],.input-group-sm 
input[type=time],.input-group-sm input[type=datetime-local],.input-group-sm input[type=month],input[type=date].input-sm,input[type=time].input-sm,input[type=datetime-local].input-sm,input[type=month].input-sm{line-height:30px}.input-group-lg input[type=date],.input-group-lg input[type=time],.input-group-lg input[type=datetime-local],.input-group-lg input[type=month],input[type=date].input-lg,input[type=time].input-lg,input[type=datetime-local].input-lg,input[type=month].input-lg{line-height:46px}}.form-group{margin-bottom:15px}.checkbox,.radio{position:relative;display:block;margin-top:10px;margin-bottom:10px}.checkbox label,.radio label{min-height:20px;padding-left:20px;margin-bottom:0;font-weight:400;cursor:pointer}.checkbox input[type=checkbox],.checkbox-inline input[type=checkbox],.radio input[type=radio],.radio-inline input[type=radio]{position:absolute;margin-top:4px\9;margin-left:-20px}.checkbox+.checkbox,.radio+.radio{margin-top:-5px}.checkbox-inline,.radio-inline{position:relative;display:inline-block;padding-left:20px;margin-bottom:0;font-weight:400;vertical-align:middle;cursor:pointer}.checkbox-inline+.checkbox-inline,.radio-inline+.radio-inline{margin-top:0;margin-left:10px}fieldset[disabled] input[type=checkbox],fieldset[disabled] input[type=radio],input[type=checkbox].disabled,input[type=checkbox][disabled],input[type=radio].disabled,input[type=radio][disabled]{cursor:not-allowed}.checkbox-inline.disabled,.radio-inline.disabled,fieldset[disabled] .checkbox-inline,fieldset[disabled] .radio-inline{cursor:not-allowed}.checkbox.disabled label,.radio.disabled label,fieldset[disabled] .checkbox label,fieldset[disabled] .radio label{cursor:not-allowed}.form-control-static{min-height:34px;padding-top:7px;padding-bottom:7px;margin-bottom:0}.form-control-static.input-lg,.form-control-static.input-sm{padding-right:0;padding-left:0}.input-sm{height:30px;padding:5px 10px;font-size:12px;line-height:1.5;border-radius:3px}select.input-sm{height:30px;line-height:30px}select[multiple].input-sm,textarea.input-sm{height:auto}.form-group-sm .form-control{height:30px;padding:5px 10px;font-size:12px;line-height:1.5;border-radius:3px}.form-group-sm select.form-control{height:30px;line-height:30px}.form-group-sm select[multiple].form-control,.form-group-sm textarea.form-control{height:auto}.form-group-sm .form-control-static{height:30px;min-height:32px;padding:6px 10px;font-size:12px;line-height:1.5}.input-lg{height:46px;padding:10px 16px;font-size:18px;line-height:1.3333333;border-radius:6px}select.input-lg{height:46px;line-height:46px}select[multiple].input-lg,textarea.input-lg{height:auto}.form-group-lg .form-control{height:46px;padding:10px 16px;font-size:18px;line-height:1.3333333;border-radius:6px}.form-group-lg select.form-control{height:46px;line-height:46px}.form-group-lg select[multiple].form-control,.form-group-lg textarea.form-control{height:auto}.form-group-lg .form-control-static{height:46px;min-height:38px;padding:11px 16px;font-size:18px;line-height:1.3333333}.has-feedback{position:relative}.has-feedback .form-control{padding-right:42.5px}.form-control-feedback{position:absolute;top:0;right:0;z-index:2;display:block;width:34px;height:34px;line-height:34px;text-align:center;pointer-events:none}.form-group-lg .form-control+.form-control-feedback,.input-group-lg+.form-control-feedback,.input-lg+.form-control-feedback{width:46px;height:46px;line-height:46px}.form-group-sm 
.form-control+.form-control-feedback,.input-group-sm+.form-control-feedback,.input-sm+.form-control-feedback{width:30px;height:30px;line-height:30px}.has-success .checkbox,.has-success .checkbox-inline,.has-success .control-label,.has-success .help-block,.has-success .radio,.has-success .radio-inline,.has-success.checkbox label,.has-success.checkbox-inline label,.has-success.radio label,.has-success.radio-inline label{color:#3c763d}.has-success .form-control{border-color:#3c763d;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075);box-shadow:inset 0 1px 1px rgba(0,0,0,.075)}.has-success .form-control:focus{border-color:#2b542c;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 6px #67b168;box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 6px #67b168}.has-success .input-group-addon{color:#3c763d;background-color:#dff0d8;border-color:#3c763d}.has-success .form-control-feedback{color:#3c763d}.has-warning .checkbox,.has-warning .checkbox-inline,.has-warning .control-label,.has-warning .help-block,.has-warning .radio,.has-warning .radio-inline,.has-warning.checkbox label,.has-warning.checkbox-inline label,.has-warning.radio label,.has-warning.radio-inline label{color:#8a6d3b}.has-warning .form-control{border-color:#8a6d3b;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075);box-shadow:inset 0 1px 1px rgba(0,0,0,.075)}.has-warning .form-control:focus{border-color:#66512c;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 6px #c0a16b;box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 6px #c0a16b}.has-warning .input-group-addon{color:#8a6d3b;background-color:#fcf8e3;border-color:#8a6d3b}.has-warning .form-control-feedback{color:#8a6d3b}.has-error .checkbox,.has-error .checkbox-inline,.has-error .control-label,.has-error .help-block,.has-error .radio,.has-error .radio-inline,.has-error.checkbox label,.has-error.checkbox-inline label,.has-error.radio label,.has-error.radio-inline label{color:#a94442}.has-error .form-control{border-color:#a94442;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075);box-shadow:inset 0 1px 1px rgba(0,0,0,.075)}.has-error .form-control:focus{border-color:#843534;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 6px #ce8483;box-shadow:inset 0 1px 1px rgba(0,0,0,.075),0 0 6px #ce8483}.has-error .input-group-addon{color:#a94442;background-color:#f2dede;border-color:#a94442}.has-error .form-control-feedback{color:#a94442}.has-feedback label~.form-control-feedback{top:25px}.has-feedback label.sr-only~.form-control-feedback{top:0}.help-block{display:block;margin-top:5px;margin-bottom:10px;color:#737373}@media (min-width:768px){.form-inline .form-group{display:inline-block;margin-bottom:0;vertical-align:middle}.form-inline .form-control{display:inline-block;width:auto;vertical-align:middle}.form-inline .form-control-static{display:inline-block}.form-inline .input-group{display:inline-table;vertical-align:middle}.form-inline .input-group .form-control,.form-inline .input-group .input-group-addon,.form-inline .input-group .input-group-btn{width:auto}.form-inline .input-group>.form-control{width:100%}.form-inline .control-label{margin-bottom:0;vertical-align:middle}.form-inline .checkbox,.form-inline .radio{display:inline-block;margin-top:0;margin-bottom:0;vertical-align:middle}.form-inline .checkbox label,.form-inline .radio label{padding-left:0}.form-inline .checkbox input[type=checkbox],.form-inline .radio input[type=radio]{position:relative;margin-left:0}.form-inline .has-feedback .form-control-feedback{top:0}}.form-horizontal .checkbox,.form-horizontal 
.checkbox-inline,.form-horizontal .radio,.form-horizontal .radio-inline{padding-top:7px;margin-top:0;margin-bottom:0}.form-horizontal .checkbox,.form-horizontal .radio{min-height:27px}.form-horizontal .form-group{margin-right:-15px;margin-left:-15px}@media (min-width:768px){.form-horizontal .control-label{padding-top:7px;margin-bottom:0;text-align:right}}.form-horizontal .has-feedback .form-control-feedback{right:15px}@media (min-width:768px){.form-horizontal .form-group-lg .control-label{padding-top:11px;font-size:18px}}@media (min-width:768px){.form-horizontal .form-group-sm .control-label{padding-top:6px;font-size:12px}}.btn{display:inline-block;padding:6px 12px;margin-bottom:0;font-size:14px;font-weight:400;line-height:1.42857143;text-align:center;white-space:nowrap;vertical-align:middle;-ms-touch-action:manipulation;touch-action:manipulation;cursor:pointer;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none;background-image:none;border:1px solid transparent;border-radius:4px}.btn.active.focus,.btn.active:focus,.btn.focus,.btn:active.focus,.btn:active:focus,.btn:focus{outline:5px auto -webkit-focus-ring-color;outline-offset:-2px}.btn.focus,.btn:focus,.btn:hover{color:#333;text-decoration:none}.btn.active,.btn:active{background-image:none;outline:0;-webkit-box-shadow:inset 0 3px 5px rgba(0,0,0,.125);box-shadow:inset 0 3px 5px rgba(0,0,0,.125)}.btn.disabled,.btn[disabled],fieldset[disabled] .btn{cursor:not-allowed;filter:alpha(opacity=65);-webkit-box-shadow:none;box-shadow:none;opacity:.65}a.btn.disabled,fieldset[disabled] a.btn{pointer-events:none}.btn-default{color:#333;background-color:#fff;border-color:#ccc}.btn-default.focus,.btn-default:focus{color:#333;background-color:#e6e6e6;border-color:#8c8c8c}.btn-default:hover{color:#333;background-color:#e6e6e6;border-color:#adadad}.btn-default.active,.btn-default:active,.open>.dropdown-toggle.btn-default{color:#333;background-color:#e6e6e6;border-color:#adadad}.btn-default.active.focus,.btn-default.active:focus,.btn-default.active:hover,.btn-default:active.focus,.btn-default:active:focus,.btn-default:active:hover,.open>.dropdown-toggle.btn-default.focus,.open>.dropdown-toggle.btn-default:focus,.open>.dropdown-toggle.btn-default:hover{color:#333;background-color:#d4d4d4;border-color:#8c8c8c}.btn-default.active,.btn-default:active,.open>.dropdown-toggle.btn-default{background-image:none}.btn-default.disabled.focus,.btn-default.disabled:focus,.btn-default.disabled:hover,.btn-default[disabled].focus,.btn-default[disabled]:focus,.btn-default[disabled]:hover,fieldset[disabled] .btn-default.focus,fieldset[disabled] .btn-default:focus,fieldset[disabled] .btn-default:hover{background-color:#fff;border-color:#ccc}.btn-default 
.badge{color:#fff;background-color:#333}.btn-primary{color:#fff;background-color:#337ab7;border-color:#2e6da4}.btn-primary.focus,.btn-primary:focus{color:#fff;background-color:#286090;border-color:#122b40}.btn-primary:hover{color:#fff;background-color:#286090;border-color:#204d74}.btn-primary.active,.btn-primary:active,.open>.dropdown-toggle.btn-primary{color:#fff;background-color:#286090;border-color:#204d74}.btn-primary.active.focus,.btn-primary.active:focus,.btn-primary.active:hover,.btn-primary:active.focus,.btn-primary:active:focus,.btn-primary:active:hover,.open>.dropdown-toggle.btn-primary.focus,.open>.dropdown-toggle.btn-primary:focus,.open>.dropdown-toggle.btn-primary:hover{color:#fff;background-color:#204d74;border-color:#122b40}.btn-primary.active,.btn-primary:active,.open>.dropdown-toggle.btn-primary{background-image:none}.btn-primary.disabled.focus,.btn-primary.disabled:focus,.btn-primary.disabled:hover,.btn-primary[disabled].focus,.btn-primary[disabled]:focus,.btn-primary[disabled]:hover,fieldset[disabled] .btn-primary.focus,fieldset[disabled] .btn-primary:focus,fieldset[disabled] .btn-primary:hover{background-color:#337ab7;border-color:#2e6da4}.btn-primary .badge{color:#337ab7;background-color:#fff}.btn-success{color:#fff;background-color:#5cb85c;border-color:#4cae4c}.btn-success.focus,.btn-success:focus{color:#fff;background-color:#449d44;border-color:#255625}.btn-success:hover{color:#fff;background-color:#449d44;border-color:#398439}.btn-success.active,.btn-success:active,.open>.dropdown-toggle.btn-success{color:#fff;background-color:#449d44;border-color:#398439}.btn-success.active.focus,.btn-success.active:focus,.btn-success.active:hover,.btn-success:active.focus,.btn-success:active:focus,.btn-success:active:hover,.open>.dropdown-toggle.btn-success.focus,.open>.dropdown-toggle.btn-success:focus,.open>.dropdown-toggle.btn-success:hover{color:#fff;background-color:#398439;border-color:#255625}.btn-success.active,.btn-success:active,.open>.dropdown-toggle.btn-success{background-image:none}.btn-success.disabled.focus,.btn-success.disabled:focus,.btn-success.disabled:hover,.btn-success[disabled].focus,.btn-success[disabled]:focus,.btn-success[disabled]:hover,fieldset[disabled] .btn-success.focus,fieldset[disabled] .btn-success:focus,fieldset[disabled] .btn-success:hover{background-color:#5cb85c;border-color:#4cae4c}.btn-success .badge{color:#5cb85c;background-color:#fff}.btn-info{color:#fff;background-color:#5bc0de;border-color:#46b8da}.btn-info.focus,.btn-info:focus{color:#fff;background-color:#31b0d5;border-color:#1b6d85}.btn-info:hover{color:#fff;background-color:#31b0d5;border-color:#269abc}.btn-info.active,.btn-info:active,.open>.dropdown-toggle.btn-info{color:#fff;background-color:#31b0d5;border-color:#269abc}.btn-info.active.focus,.btn-info.active:focus,.btn-info.active:hover,.btn-info:active.focus,.btn-info:active:focus,.btn-info:active:hover,.open>.dropdown-toggle.btn-info.focus,.open>.dropdown-toggle.btn-info:focus,.open>.dropdown-toggle.btn-info:hover{color:#fff;background-color:#269abc;border-color:#1b6d85}.btn-info.active,.btn-info:active,.open>.dropdown-toggle.btn-info{background-image:none}.btn-info.disabled.focus,.btn-info.disabled:focus,.btn-info.disabled:hover,.btn-info[disabled].focus,.btn-info[disabled]:focus,.btn-info[disabled]:hover,fieldset[disabled] .btn-info.focus,fieldset[disabled] .btn-info:focus,fieldset[disabled] .btn-info:hover{background-color:#5bc0de;border-color:#46b8da}.btn-info 
.badge{color:#5bc0de;background-color:#fff}.btn-warning{color:#fff;background-color:#f0ad4e;border-color:#eea236}.btn-warning.focus,.btn-warning:focus{color:#fff;background-color:#ec971f;border-color:#985f0d}.btn-warning:hover{color:#fff;background-color:#ec971f;border-color:#d58512}.btn-warning.active,.btn-warning:active,.open>.dropdown-toggle.btn-warning{color:#fff;background-color:#ec971f;border-color:#d58512}.btn-warning.active.focus,.btn-warning.active:focus,.btn-warning.active:hover,.btn-warning:active.focus,.btn-warning:active:focus,.btn-warning:active:hover,.open>.dropdown-toggle.btn-warning.focus,.open>.dropdown-toggle.btn-warning:focus,.open>.dropdown-toggle.btn-warning:hover{color:#fff;background-color:#d58512;border-color:#985f0d}.btn-warning.active,.btn-warning:active,.open>.dropdown-toggle.btn-warning{background-image:none}.btn-warning.disabled.focus,.btn-warning.disabled:focus,.btn-warning.disabled:hover,.btn-warning[disabled].focus,.btn-warning[disabled]:focus,.btn-warning[disabled]:hover,fieldset[disabled] .btn-warning.focus,fieldset[disabled] .btn-warning:focus,fieldset[disabled] .btn-warning:hover{background-color:#f0ad4e;border-color:#eea236}.btn-warning .badge{color:#f0ad4e;background-color:#fff}.btn-danger{color:#fff;background-color:#d9534f;border-color:#d43f3a}.btn-danger.focus,.btn-danger:focus{color:#fff;background-color:#c9302c;border-color:#761c19}.btn-danger:hover{color:#fff;background-color:#c9302c;border-color:#ac2925}.btn-danger.active,.btn-danger:active,.open>.dropdown-toggle.btn-danger{color:#fff;background-color:#c9302c;border-color:#ac2925}.btn-danger.active.focus,.btn-danger.active:focus,.btn-danger.active:hover,.btn-danger:active.focus,.btn-danger:active:focus,.btn-danger:active:hover,.open>.dropdown-toggle.btn-danger.focus,.open>.dropdown-toggle.btn-danger:focus,.open>.dropdown-toggle.btn-danger:hover{color:#fff;background-color:#ac2925;border-color:#761c19}.btn-danger.active,.btn-danger:active,.open>.dropdown-toggle.btn-danger{background-image:none}.btn-danger.disabled.focus,.btn-danger.disabled:focus,.btn-danger.disabled:hover,.btn-danger[disabled].focus,.btn-danger[disabled]:focus,.btn-danger[disabled]:hover,fieldset[disabled] .btn-danger.focus,fieldset[disabled] .btn-danger:focus,fieldset[disabled] .btn-danger:hover{background-color:#d9534f;border-color:#d43f3a}.btn-danger .badge{color:#d9534f;background-color:#fff}.btn-link{font-weight:400;color:#337ab7;border-radius:0}.btn-link,.btn-link.active,.btn-link:active,.btn-link[disabled],fieldset[disabled] .btn-link{background-color:transparent;-webkit-box-shadow:none;box-shadow:none}.btn-link,.btn-link:active,.btn-link:focus,.btn-link:hover{border-color:transparent}.btn-link:focus,.btn-link:hover{color:#23527c;text-decoration:underline;background-color:transparent}.btn-link[disabled]:focus,.btn-link[disabled]:hover,fieldset[disabled] .btn-link:focus,fieldset[disabled] .btn-link:hover{color:#777;text-decoration:none}.btn-group-lg>.btn,.btn-lg{padding:10px 16px;font-size:18px;line-height:1.3333333;border-radius:6px}.btn-group-sm>.btn,.btn-sm{padding:5px 10px;font-size:12px;line-height:1.5;border-radius:3px}.btn-group-xs>.btn,.btn-xs{padding:1px 5px;font-size:12px;line-height:1.5;border-radius:3px}.btn-block{display:block;width:100%}.btn-block+.btn-block{margin-top:5px}input[type=button].btn-block,input[type=reset].btn-block,input[type=submit].btn-block{width:100%}.fade{opacity:0;-webkit-transition:opacity .15s linear;-o-transition:opacity .15s linear;transition:opacity .15s 
linear}.fade.in{opacity:1}.collapse{display:none}.collapse.in{display:block}tr.collapse.in{display:table-row}tbody.collapse.in{display:table-row-group}.collapsing{position:relative;height:0;overflow:hidden;-webkit-transition-timing-function:ease;-o-transition-timing-function:ease;transition-timing-function:ease;-webkit-transition-duration:.35s;-o-transition-duration:.35s;transition-duration:.35s;-webkit-transition-property:height,visibility;-o-transition-property:height,visibility;transition-property:height,visibility}.caret{display:inline-block;width:0;height:0;margin-left:2px;vertical-align:middle;border-top:4px dashed;border-top:4px solid\9;border-right:4px solid transparent;border-left:4px solid transparent}.dropdown,.dropup{position:relative}.dropdown-toggle:focus{outline:0}.dropdown-menu{position:absolute;top:100%;left:0;z-index:1000;display:none;float:left;min-width:160px;padding:5px 0;margin:2px 0 0;font-size:14px;text-align:left;list-style:none;background-color:#fff;-webkit-background-clip:padding-box;background-clip:padding-box;border:1px solid #ccc;border:1px solid rgba(0,0,0,.15);border-radius:4px;-webkit-box-shadow:0 6px 12px rgba(0,0,0,.175);box-shadow:0 6px 12px rgba(0,0,0,.175)}.dropdown-menu.pull-right{right:0;left:auto}.dropdown-menu .divider{height:1px;margin:9px 0;overflow:hidden;background-color:#e5e5e5}.dropdown-menu>li>a{display:block;padding:3px 20px;clear:both;font-weight:400;line-height:1.42857143;color:#333;white-space:nowrap}.dropdown-menu>li>a:focus,.dropdown-menu>li>a:hover{color:#262626;text-decoration:none;background-color:#f5f5f5}.dropdown-menu>.active>a,.dropdown-menu>.active>a:focus,.dropdown-menu>.active>a:hover{color:#fff;text-decoration:none;background-color:#337ab7;outline:0}.dropdown-menu>.disabled>a,.dropdown-menu>.disabled>a:focus,.dropdown-menu>.disabled>a:hover{color:#777}.dropdown-menu>.disabled>a:focus,.dropdown-menu>.disabled>a:hover{text-decoration:none;cursor:not-allowed;background-color:transparent;background-image:none;filter:progid:DXImageTransform.Microsoft.gradient(enabled=false)}.open>.dropdown-menu{display:block}.open>a{outline:0}.dropdown-menu-right{right:0;left:auto}.dropdown-menu-left{right:auto;left:0}.dropdown-header{display:block;padding:3px 20px;font-size:12px;line-height:1.42857143;color:#777;white-space:nowrap}.dropdown-backdrop{position:fixed;top:0;right:0;bottom:0;left:0;z-index:990}.pull-right>.dropdown-menu{right:0;left:auto}.dropup .caret,.navbar-fixed-bottom .dropdown .caret{content:"";border-top:0;border-bottom:4px dashed;border-bottom:4px solid\9}.dropup .dropdown-menu,.navbar-fixed-bottom .dropdown .dropdown-menu{top:auto;bottom:100%;margin-bottom:2px}@media (min-width:768px){.navbar-right .dropdown-menu{right:0;left:auto}.navbar-right .dropdown-menu-left{right:auto;left:0}}.btn-group,.btn-group-vertical{position:relative;display:inline-block;vertical-align:middle}.btn-group-vertical>.btn,.btn-group>.btn{position:relative;float:left}.btn-group-vertical>.btn.active,.btn-group-vertical>.btn:active,.btn-group-vertical>.btn:focus,.btn-group-vertical>.btn:hover,.btn-group>.btn.active,.btn-group>.btn:active,.btn-group>.btn:focus,.btn-group>.btn:hover{z-index:2}.btn-group .btn+.btn,.btn-group .btn+.btn-group,.btn-group .btn-group+.btn,.btn-group .btn-group+.btn-group{margin-left:-1px}.btn-toolbar{margin-left:-5px}.btn-toolbar .btn,.btn-toolbar .btn-group,.btn-toolbar 
.input-group{float:left}.btn-toolbar>.btn,.btn-toolbar>.btn-group,.btn-toolbar>.input-group{margin-left:5px}.btn-group>.btn:not(:first-child):not(:last-child):not(.dropdown-toggle){border-radius:0}.btn-group>.btn:first-child{margin-left:0}.btn-group>.btn:first-child:not(:last-child):not(.dropdown-toggle){border-top-right-radius:0;border-bottom-right-radius:0}.btn-group>.btn:last-child:not(:first-child),.btn-group>.dropdown-toggle:not(:first-child){border-top-left-radius:0;border-bottom-left-radius:0}.btn-group>.btn-group{float:left}.btn-group>.btn-group:not(:first-child):not(:last-child)>.btn{border-radius:0}.btn-group>.btn-group:first-child:not(:last-child)>.btn:last-child,.btn-group>.btn-group:first-child:not(:last-child)>.dropdown-toggle{border-top-right-radius:0;border-bottom-right-radius:0}.btn-group>.btn-group:last-child:not(:first-child)>.btn:first-child{border-top-left-radius:0;border-bottom-left-radius:0}.btn-group .dropdown-toggle:active,.btn-group.open .dropdown-toggle{outline:0}.btn-group>.btn+.dropdown-toggle{padding-right:8px;padding-left:8px}.btn-group>.btn-lg+.dropdown-toggle{padding-right:12px;padding-left:12px}.btn-group.open .dropdown-toggle{-webkit-box-shadow:inset 0 3px 5px rgba(0,0,0,.125);box-shadow:inset 0 3px 5px rgba(0,0,0,.125)}.btn-group.open .dropdown-toggle.btn-link{-webkit-box-shadow:none;box-shadow:none}.btn .caret{margin-left:0}.btn-lg .caret{border-width:5px 5px 0;border-bottom-width:0}.dropup .btn-lg .caret{border-width:0 5px 5px}.btn-group-vertical>.btn,.btn-group-vertical>.btn-group,.btn-group-vertical>.btn-group>.btn{display:block;float:none;width:100%;max-width:100%}.btn-group-vertical>.btn-group>.btn{float:none}.btn-group-vertical>.btn+.btn,.btn-group-vertical>.btn+.btn-group,.btn-group-vertical>.btn-group+.btn,.btn-group-vertical>.btn-group+.btn-group{margin-top:-1px;margin-left:0}.btn-group-vertical>.btn:not(:first-child):not(:last-child){border-radius:0}.btn-group-vertical>.btn:first-child:not(:last-child){border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:0;border-bottom-left-radius:0}.btn-group-vertical>.btn:last-child:not(:first-child){border-top-left-radius:0;border-top-right-radius:0;border-bottom-right-radius:4px;border-bottom-left-radius:4px}.btn-group-vertical>.btn-group:not(:first-child):not(:last-child)>.btn{border-radius:0}.btn-group-vertical>.btn-group:first-child:not(:last-child)>.btn:last-child,.btn-group-vertical>.btn-group:first-child:not(:last-child)>.dropdown-toggle{border-bottom-right-radius:0;border-bottom-left-radius:0}.btn-group-vertical>.btn-group:last-child:not(:first-child)>.btn:first-child{border-top-left-radius:0;border-top-right-radius:0}.btn-group-justified{display:table;width:100%;table-layout:fixed;border-collapse:separate}.btn-group-justified>.btn,.btn-group-justified>.btn-group{display:table-cell;float:none;width:1%}.btn-group-justified>.btn-group .btn{width:100%}.btn-group-justified>.btn-group .dropdown-menu{left:auto}[data-toggle=buttons]>.btn input[type=checkbox],[data-toggle=buttons]>.btn input[type=radio],[data-toggle=buttons]>.btn-group>.btn input[type=checkbox],[data-toggle=buttons]>.btn-group>.btn input[type=radio]{position:absolute;clip:rect(0,0,0,0);pointer-events:none}.input-group{position:relative;display:table;border-collapse:separate}.input-group[class*=col-]{float:none;padding-right:0;padding-left:0}.input-group .form-control{position:relative;z-index:2;float:left;width:100%;margin-bottom:0}.input-group 
.form-control:focus{z-index:3}.input-group-lg>.form-control,.input-group-lg>.input-group-addon,.input-group-lg>.input-group-btn>.btn{height:46px;padding:10px 16px;font-size:18px;line-height:1.3333333;border-radius:6px}select.input-group-lg>.form-control,select.input-group-lg>.input-group-addon,select.input-group-lg>.input-group-btn>.btn{height:46px;line-height:46px}select[multiple].input-group-lg>.form-control,select[multiple].input-group-lg>.input-group-addon,select[multiple].input-group-lg>.input-group-btn>.btn,textarea.input-group-lg>.form-control,textarea.input-group-lg>.input-group-addon,textarea.input-group-lg>.input-group-btn>.btn{height:auto}.input-group-sm>.form-control,.input-group-sm>.input-group-addon,.input-group-sm>.input-group-btn>.btn{height:30px;padding:5px 10px;font-size:12px;line-height:1.5;border-radius:3px}select.input-group-sm>.form-control,select.input-group-sm>.input-group-addon,select.input-group-sm>.input-group-btn>.btn{height:30px;line-height:30px}select[multiple].input-group-sm>.form-control,select[multiple].input-group-sm>.input-group-addon,select[multiple].input-group-sm>.input-group-btn>.btn,textarea.input-group-sm>.form-control,textarea.input-group-sm>.input-group-addon,textarea.input-group-sm>.input-group-btn>.btn{height:auto}.input-group .form-control,.input-group-addon,.input-group-btn{display:table-cell}.input-group .form-control:not(:first-child):not(:last-child),.input-group-addon:not(:first-child):not(:last-child),.input-group-btn:not(:first-child):not(:last-child){border-radius:0}.input-group-addon,.input-group-btn{width:1%;white-space:nowrap;vertical-align:middle}.input-group-addon{padding:6px 12px;font-size:14px;font-weight:400;line-height:1;color:#555;text-align:center;background-color:#eee;border:1px solid #ccc;border-radius:4px}.input-group-addon.input-sm{padding:5px 10px;font-size:12px;border-radius:3px}.input-group-addon.input-lg{padding:10px 16px;font-size:18px;border-radius:6px}.input-group-addon input[type=checkbox],.input-group-addon input[type=radio]{margin-top:0}.input-group .form-control:first-child,.input-group-addon:first-child,.input-group-btn:first-child>.btn,.input-group-btn:first-child>.btn-group>.btn,.input-group-btn:first-child>.dropdown-toggle,.input-group-btn:last-child>.btn-group:not(:last-child)>.btn,.input-group-btn:last-child>.btn:not(:last-child):not(.dropdown-toggle){border-top-right-radius:0;border-bottom-right-radius:0}.input-group-addon:first-child{border-right:0}.input-group .form-control:last-child,.input-group-addon:last-child,.input-group-btn:first-child>.btn-group:not(:first-child)>.btn,.input-group-btn:first-child>.btn:not(:first-child),.input-group-btn:last-child>.btn,.input-group-btn:last-child>.btn-group>.btn,.input-group-btn:last-child>.dropdown-toggle{border-top-left-radius:0;border-bottom-left-radius:0}.input-group-addon:last-child{border-left:0}.input-group-btn{position:relative;font-size:0;white-space:nowrap}.input-group-btn>.btn{position:relative}.input-group-btn>.btn+.btn{margin-left:-1px}.input-group-btn>.btn:active,.input-group-btn>.btn:focus,.input-group-btn>.btn:hover{z-index:2}.input-group-btn:first-child>.btn,.input-group-btn:first-child>.btn-group{margin-right:-1px}.input-group-btn:last-child>.btn,.input-group-btn:last-child>.btn-group{z-index:2;margin-left:-1px}.nav{padding-left:0;margin-bottom:0;list-style:none}.nav>li{position:relative;display:block}.nav>li>a{position:relative;display:block;padding:10px 
15px}.nav>li>a:focus,.nav>li>a:hover{text-decoration:none;background-color:#eee}.nav>li.disabled>a{color:#777}.nav>li.disabled>a:focus,.nav>li.disabled>a:hover{color:#777;text-decoration:none;cursor:not-allowed;background-color:transparent}.nav .open>a,.nav .open>a:focus,.nav .open>a:hover{background-color:#eee;border-color:#337ab7}.nav .nav-divider{height:1px;margin:9px 0;overflow:hidden;background-color:#e5e5e5}.nav>li>a>img{max-width:none}.nav-tabs{border-bottom:1px solid #ddd}.nav-tabs>li{float:left;margin-bottom:-1px}.nav-tabs>li>a{margin-right:2px;line-height:1.42857143;border:1px solid transparent;border-radius:4px 4px 0 0}.nav-tabs>li>a:hover{border-color:#eee #eee #ddd}.nav-tabs>li.active>a,.nav-tabs>li.active>a:focus,.nav-tabs>li.active>a:hover{color:#555;cursor:default;background-color:#fff;border:1px solid #ddd;border-bottom-color:transparent}.nav-tabs.nav-justified{width:100%;border-bottom:0}.nav-tabs.nav-justified>li{float:none}.nav-tabs.nav-justified>li>a{margin-bottom:5px;text-align:center}.nav-tabs.nav-justified>.dropdown .dropdown-menu{top:auto;left:auto}@media (min-width:768px){.nav-tabs.nav-justified>li{display:table-cell;width:1%}.nav-tabs.nav-justified>li>a{margin-bottom:0}}.nav-tabs.nav-justified>li>a{margin-right:0;border-radius:4px}.nav-tabs.nav-justified>.active>a,.nav-tabs.nav-justified>.active>a:focus,.nav-tabs.nav-justified>.active>a:hover{border:1px solid #ddd}@media (min-width:768px){.nav-tabs.nav-justified>li>a{border-bottom:1px solid #ddd;border-radius:4px 4px 0 0}.nav-tabs.nav-justified>.active>a,.nav-tabs.nav-justified>.active>a:focus,.nav-tabs.nav-justified>.active>a:hover{border-bottom-color:#fff}}.nav-pills>li{float:left}.nav-pills>li>a{border-radius:4px}.nav-pills>li+li{margin-left:2px}.nav-pills>li.active>a,.nav-pills>li.active>a:focus,.nav-pills>li.active>a:hover{color:#fff;background-color:#337ab7}.nav-stacked>li{float:none}.nav-stacked>li+li{margin-top:2px;margin-left:0}.nav-justified{width:100%}.nav-justified>li{float:none}.nav-justified>li>a{margin-bottom:5px;text-align:center}.nav-justified>.dropdown .dropdown-menu{top:auto;left:auto}@media (min-width:768px){.nav-justified>li{display:table-cell;width:1%}.nav-justified>li>a{margin-bottom:0}}.nav-tabs-justified{border-bottom:0}.nav-tabs-justified>li>a{margin-right:0;border-radius:4px}.nav-tabs-justified>.active>a,.nav-tabs-justified>.active>a:focus,.nav-tabs-justified>.active>a:hover{border:1px solid #ddd}@media (min-width:768px){.nav-tabs-justified>li>a{border-bottom:1px solid #ddd;border-radius:4px 4px 0 0}.nav-tabs-justified>.active>a,.nav-tabs-justified>.active>a:focus,.nav-tabs-justified>.active>a:hover{border-bottom-color:#fff}}.tab-content>.tab-pane{display:none}.tab-content>.active{display:block}.nav-tabs .dropdown-menu{margin-top:-1px;border-top-left-radius:0;border-top-right-radius:0}.navbar{position:relative;min-height:50px;margin-bottom:20px;border:1px solid transparent}@media (min-width:768px){.navbar{border-radius:4px}}@media (min-width:768px){.navbar-header{float:left}}.navbar-collapse{padding-right:15px;padding-left:15px;overflow-x:visible;-webkit-overflow-scrolling:touch;border-top:1px solid transparent;-webkit-box-shadow:inset 0 1px 0 rgba(255,255,255,.1);box-shadow:inset 0 1px 0 rgba(255,255,255,.1)}.navbar-collapse.in{overflow-y:auto}@media 
(min-width:768px){.navbar-collapse{width:auto;border-top:0;-webkit-box-shadow:none;box-shadow:none}.navbar-collapse.collapse{display:block!important;height:auto!important;padding-bottom:0;overflow:visible!important}.navbar-collapse.in{overflow-y:visible}.navbar-fixed-bottom .navbar-collapse,.navbar-fixed-top .navbar-collapse,.navbar-static-top .navbar-collapse{padding-right:0;padding-left:0}}.navbar-fixed-bottom .navbar-collapse,.navbar-fixed-top .navbar-collapse{max-height:340px}@media (max-device-width:480px) and (orientation:landscape){.navbar-fixed-bottom .navbar-collapse,.navbar-fixed-top .navbar-collapse{max-height:200px}}.container-fluid>.navbar-collapse,.container-fluid>.navbar-header,.container>.navbar-collapse,.container>.navbar-header{margin-right:-15px;margin-left:-15px}@media (min-width:768px){.container-fluid>.navbar-collapse,.container-fluid>.navbar-header,.container>.navbar-collapse,.container>.navbar-header{margin-right:0;margin-left:0}}.navbar-static-top{z-index:1000;border-width:0 0 1px}@media (min-width:768px){.navbar-static-top{border-radius:0}}.navbar-fixed-bottom,.navbar-fixed-top{position:fixed;right:0;left:0;z-index:1030}@media (min-width:768px){.navbar-fixed-bottom,.navbar-fixed-top{border-radius:0}}.navbar-fixed-top{top:0;border-width:0 0 1px}.navbar-fixed-bottom{bottom:0;margin-bottom:0;border-width:1px 0 0}.navbar-brand{float:left;height:50px;padding:15px 15px;font-size:18px;line-height:20px}.navbar-brand:focus,.navbar-brand:hover{text-decoration:none}.navbar-brand>img{display:block}@media (min-width:768px){.navbar>.container .navbar-brand,.navbar>.container-fluid .navbar-brand{margin-left:-15px}}.navbar-toggle{position:relative;float:right;padding:9px 10px;margin-top:8px;margin-right:15px;margin-bottom:8px;background-color:transparent;background-image:none;border:1px solid transparent;border-radius:4px}.navbar-toggle:focus{outline:0}.navbar-toggle .icon-bar{display:block;width:22px;height:2px;border-radius:1px}.navbar-toggle .icon-bar+.icon-bar{margin-top:4px}@media (min-width:768px){.navbar-toggle{display:none}}.navbar-nav{margin:7.5px -15px}.navbar-nav>li>a{padding-top:10px;padding-bottom:10px;line-height:20px}@media (max-width:767px){.navbar-nav .open .dropdown-menu{position:static;float:none;width:auto;margin-top:0;background-color:transparent;border:0;-webkit-box-shadow:none;box-shadow:none}.navbar-nav .open .dropdown-menu .dropdown-header,.navbar-nav .open .dropdown-menu>li>a{padding:5px 15px 5px 25px}.navbar-nav .open .dropdown-menu>li>a{line-height:20px}.navbar-nav .open .dropdown-menu>li>a:focus,.navbar-nav .open .dropdown-menu>li>a:hover{background-image:none}}@media (min-width:768px){.navbar-nav{float:left;margin:0}.navbar-nav>li{float:left}.navbar-nav>li>a{padding-top:15px;padding-bottom:15px}}.navbar-form{padding:10px 15px;margin-top:8px;margin-right:-15px;margin-bottom:8px;margin-left:-15px;border-top:1px solid transparent;border-bottom:1px solid transparent;-webkit-box-shadow:inset 0 1px 0 rgba(255,255,255,.1),0 1px 0 rgba(255,255,255,.1);box-shadow:inset 0 1px 0 rgba(255,255,255,.1),0 1px 0 rgba(255,255,255,.1)}@media (min-width:768px){.navbar-form .form-group{display:inline-block;margin-bottom:0;vertical-align:middle}.navbar-form .form-control{display:inline-block;width:auto;vertical-align:middle}.navbar-form .form-control-static{display:inline-block}.navbar-form .input-group{display:inline-table;vertical-align:middle}.navbar-form .input-group .form-control,.navbar-form .input-group .input-group-addon,.navbar-form .input-group 
.input-group-btn{width:auto}.navbar-form .input-group>.form-control{width:100%}.navbar-form .control-label{margin-bottom:0;vertical-align:middle}.navbar-form .checkbox,.navbar-form .radio{display:inline-block;margin-top:0;margin-bottom:0;vertical-align:middle}.navbar-form .checkbox label,.navbar-form .radio label{padding-left:0}.navbar-form .checkbox input[type=checkbox],.navbar-form .radio input[type=radio]{position:relative;margin-left:0}.navbar-form .has-feedback .form-control-feedback{top:0}}@media (max-width:767px){.navbar-form .form-group{margin-bottom:5px}.navbar-form .form-group:last-child{margin-bottom:0}}@media (min-width:768px){.navbar-form{width:auto;padding-top:0;padding-bottom:0;margin-right:0;margin-left:0;border:0;-webkit-box-shadow:none;box-shadow:none}}.navbar-nav>li>.dropdown-menu{margin-top:0;border-top-left-radius:0;border-top-right-radius:0}.navbar-fixed-bottom .navbar-nav>li>.dropdown-menu{margin-bottom:0;border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:0;border-bottom-left-radius:0}.navbar-btn{margin-top:8px;margin-bottom:8px}.navbar-btn.btn-sm{margin-top:10px;margin-bottom:10px}.navbar-btn.btn-xs{margin-top:14px;margin-bottom:14px}.navbar-text{margin-top:15px;margin-bottom:15px}@media (min-width:768px){.navbar-text{float:left;margin-right:15px;margin-left:15px}}@media (min-width:768px){.navbar-left{float:left!important}.navbar-right{float:right!important;margin-right:-15px}.navbar-right~.navbar-right{margin-right:0}}.navbar-default{background-color:#f8f8f8;border-color:#e7e7e7}.navbar-default .navbar-brand{color:#777}.navbar-default .navbar-brand:focus,.navbar-default .navbar-brand:hover{color:#5e5e5e;background-color:transparent}.navbar-default .navbar-text{color:#777}.navbar-default .navbar-nav>li>a{color:#777}.navbar-default .navbar-nav>li>a:focus,.navbar-default .navbar-nav>li>a:hover{color:#333;background-color:transparent}.navbar-default .navbar-nav>.active>a,.navbar-default .navbar-nav>.active>a:focus,.navbar-default .navbar-nav>.active>a:hover{color:#555;background-color:#e7e7e7}.navbar-default .navbar-nav>.disabled>a,.navbar-default .navbar-nav>.disabled>a:focus,.navbar-default .navbar-nav>.disabled>a:hover{color:#ccc;background-color:transparent}.navbar-default .navbar-toggle{border-color:#ddd}.navbar-default .navbar-toggle:focus,.navbar-default .navbar-toggle:hover{background-color:#ddd}.navbar-default .navbar-toggle .icon-bar{background-color:#888}.navbar-default .navbar-collapse,.navbar-default .navbar-form{border-color:#e7e7e7}.navbar-default .navbar-nav>.open>a,.navbar-default .navbar-nav>.open>a:focus,.navbar-default .navbar-nav>.open>a:hover{color:#555;background-color:#e7e7e7}@media (max-width:767px){.navbar-default .navbar-nav .open .dropdown-menu>li>a{color:#777}.navbar-default .navbar-nav .open .dropdown-menu>li>a:focus,.navbar-default .navbar-nav .open .dropdown-menu>li>a:hover{color:#333;background-color:transparent}.navbar-default .navbar-nav .open .dropdown-menu>.active>a,.navbar-default .navbar-nav .open .dropdown-menu>.active>a:focus,.navbar-default .navbar-nav .open .dropdown-menu>.active>a:hover{color:#555;background-color:#e7e7e7}.navbar-default .navbar-nav .open .dropdown-menu>.disabled>a,.navbar-default .navbar-nav .open .dropdown-menu>.disabled>a:focus,.navbar-default .navbar-nav .open .dropdown-menu>.disabled>a:hover{color:#ccc;background-color:transparent}}.navbar-default .navbar-link{color:#777}.navbar-default .navbar-link:hover{color:#333}.navbar-default .btn-link{color:#777}.navbar-default 
.btn-link:focus,.navbar-default .btn-link:hover{color:#333}.navbar-default .btn-link[disabled]:focus,.navbar-default .btn-link[disabled]:hover,fieldset[disabled] .navbar-default .btn-link:focus,fieldset[disabled] .navbar-default .btn-link:hover{color:#ccc}.navbar-inverse{background-color:#222;border-color:#080808}.navbar-inverse .navbar-brand{color:#9d9d9d}.navbar-inverse .navbar-brand:focus,.navbar-inverse .navbar-brand:hover{color:#fff;background-color:transparent}.navbar-inverse .navbar-text{color:#9d9d9d}.navbar-inverse .navbar-nav>li>a{color:#9d9d9d}.navbar-inverse .navbar-nav>li>a:focus,.navbar-inverse .navbar-nav>li>a:hover{color:#fff;background-color:transparent}.navbar-inverse .navbar-nav>.active>a,.navbar-inverse .navbar-nav>.active>a:focus,.navbar-inverse .navbar-nav>.active>a:hover{color:#fff;background-color:#080808}.navbar-inverse .navbar-nav>.disabled>a,.navbar-inverse .navbar-nav>.disabled>a:focus,.navbar-inverse .navbar-nav>.disabled>a:hover{color:#444;background-color:transparent}.navbar-inverse .navbar-toggle{border-color:#333}.navbar-inverse .navbar-toggle:focus,.navbar-inverse .navbar-toggle:hover{background-color:#333}.navbar-inverse .navbar-toggle .icon-bar{background-color:#fff}.navbar-inverse .navbar-collapse,.navbar-inverse .navbar-form{border-color:#101010}.navbar-inverse .navbar-nav>.open>a,.navbar-inverse .navbar-nav>.open>a:focus,.navbar-inverse .navbar-nav>.open>a:hover{color:#fff;background-color:#080808}@media (max-width:767px){.navbar-inverse .navbar-nav .open .dropdown-menu>.dropdown-header{border-color:#080808}.navbar-inverse .navbar-nav .open .dropdown-menu .divider{background-color:#080808}.navbar-inverse .navbar-nav .open .dropdown-menu>li>a{color:#9d9d9d}.navbar-inverse .navbar-nav .open .dropdown-menu>li>a:focus,.navbar-inverse .navbar-nav .open .dropdown-menu>li>a:hover{color:#fff;background-color:transparent}.navbar-inverse .navbar-nav .open .dropdown-menu>.active>a,.navbar-inverse .navbar-nav .open .dropdown-menu>.active>a:focus,.navbar-inverse .navbar-nav .open .dropdown-menu>.active>a:hover{color:#fff;background-color:#080808}.navbar-inverse .navbar-nav .open .dropdown-menu>.disabled>a,.navbar-inverse .navbar-nav .open .dropdown-menu>.disabled>a:focus,.navbar-inverse .navbar-nav .open .dropdown-menu>.disabled>a:hover{color:#444;background-color:transparent}}.navbar-inverse .navbar-link{color:#9d9d9d}.navbar-inverse .navbar-link:hover{color:#fff}.navbar-inverse .btn-link{color:#9d9d9d}.navbar-inverse .btn-link:focus,.navbar-inverse .btn-link:hover{color:#fff}.navbar-inverse .btn-link[disabled]:focus,.navbar-inverse .btn-link[disabled]:hover,fieldset[disabled] .navbar-inverse .btn-link:focus,fieldset[disabled] .navbar-inverse .btn-link:hover{color:#444}.breadcrumb{padding:8px 15px;margin-bottom:20px;list-style:none;background-color:#f5f5f5;border-radius:4px}.breadcrumb>li{display:inline-block}.breadcrumb>li+li:before{padding:0 5px;color:#ccc;content:"/\00a0"}.breadcrumb>.active{color:#777}.pagination{display:inline-block;padding-left:0;margin:20px 0;border-radius:4px}.pagination>li{display:inline}.pagination>li>a,.pagination>li>span{position:relative;float:left;padding:6px 12px;margin-left:-1px;line-height:1.42857143;color:#337ab7;text-decoration:none;background-color:#fff;border:1px solid 
#ddd}.pagination>li:first-child>a,.pagination>li:first-child>span{margin-left:0;border-top-left-radius:4px;border-bottom-left-radius:4px}.pagination>li:last-child>a,.pagination>li:last-child>span{border-top-right-radius:4px;border-bottom-right-radius:4px}.pagination>li>a:focus,.pagination>li>a:hover,.pagination>li>span:focus,.pagination>li>span:hover{z-index:2;color:#23527c;background-color:#eee;border-color:#ddd}.pagination>.active>a,.pagination>.active>a:focus,.pagination>.active>a:hover,.pagination>.active>span,.pagination>.active>span:focus,.pagination>.active>span:hover{z-index:3;color:#fff;cursor:default;background-color:#337ab7;border-color:#337ab7}.pagination>.disabled>a,.pagination>.disabled>a:focus,.pagination>.disabled>a:hover,.pagination>.disabled>span,.pagination>.disabled>span:focus,.pagination>.disabled>span:hover{color:#777;cursor:not-allowed;background-color:#fff;border-color:#ddd}.pagination-lg>li>a,.pagination-lg>li>span{padding:10px 16px;font-size:18px;line-height:1.3333333}.pagination-lg>li:first-child>a,.pagination-lg>li:first-child>span{border-top-left-radius:6px;border-bottom-left-radius:6px}.pagination-lg>li:last-child>a,.pagination-lg>li:last-child>span{border-top-right-radius:6px;border-bottom-right-radius:6px}.pagination-sm>li>a,.pagination-sm>li>span{padding:5px 10px;font-size:12px;line-height:1.5}.pagination-sm>li:first-child>a,.pagination-sm>li:first-child>span{border-top-left-radius:3px;border-bottom-left-radius:3px}.pagination-sm>li:last-child>a,.pagination-sm>li:last-child>span{border-top-right-radius:3px;border-bottom-right-radius:3px}.pager{padding-left:0;margin:20px 0;text-align:center;list-style:none}.pager li{display:inline}.pager li>a,.pager li>span{display:inline-block;padding:5px 14px;background-color:#fff;border:1px solid #ddd;border-radius:15px}.pager li>a:focus,.pager li>a:hover{text-decoration:none;background-color:#eee}.pager .next>a,.pager .next>span{float:right}.pager .previous>a,.pager .previous>span{float:left}.pager .disabled>a,.pager .disabled>a:focus,.pager .disabled>a:hover,.pager .disabled>span{color:#777;cursor:not-allowed;background-color:#fff}.label{display:inline;padding:.2em .6em .3em;font-size:75%;font-weight:700;line-height:1;color:#fff;text-align:center;white-space:nowrap;vertical-align:baseline;border-radius:.25em}a.label:focus,a.label:hover{color:#fff;text-decoration:none;cursor:pointer}.label:empty{display:none}.btn .label{position:relative;top:-1px}.label-default{background-color:#777}.label-default[href]:focus,.label-default[href]:hover{background-color:#5e5e5e}.label-primary{background-color:#337ab7}.label-primary[href]:focus,.label-primary[href]:hover{background-color:#286090}.label-success{background-color:#5cb85c}.label-success[href]:focus,.label-success[href]:hover{background-color:#449d44}.label-info{background-color:#5bc0de}.label-info[href]:focus,.label-info[href]:hover{background-color:#31b0d5}.label-warning{background-color:#f0ad4e}.label-warning[href]:focus,.label-warning[href]:hover{background-color:#ec971f}.label-danger{background-color:#d9534f}.label-danger[href]:focus,.label-danger[href]:hover{background-color:#c9302c}.badge{display:inline-block;min-width:10px;padding:3px 7px;font-size:12px;font-weight:700;line-height:1;color:#fff;text-align:center;white-space:nowrap;vertical-align:middle;background-color:#777;border-radius:10px}.badge:empty{display:none}.btn .badge{position:relative;top:-1px}.btn-group-xs>.btn .badge,.btn-xs .badge{top:0;padding:1px 
5px}a.badge:focus,a.badge:hover{color:#fff;text-decoration:none;cursor:pointer}.list-group-item.active>.badge,.nav-pills>.active>a>.badge{color:#337ab7;background-color:#fff}.list-group-item>.badge{float:right}.list-group-item>.badge+.badge{margin-right:5px}.nav-pills>li>a>.badge{margin-left:3px}.jumbotron{padding-top:30px;padding-bottom:30px;margin-bottom:30px;color:inherit;background-color:#eee}.jumbotron .h1,.jumbotron h1{color:inherit}.jumbotron p{margin-bottom:15px;font-size:21px;font-weight:200}.jumbotron>hr{border-top-color:#d5d5d5}.container .jumbotron,.container-fluid .jumbotron{padding-right:15px;padding-left:15px;border-radius:6px}.jumbotron .container{max-width:100%}@media screen and (min-width:768px){.jumbotron{padding-top:48px;padding-bottom:48px}.container .jumbotron,.container-fluid .jumbotron{padding-right:60px;padding-left:60px}.jumbotron .h1,.jumbotron h1{font-size:63px}}.thumbnail{display:block;padding:4px;margin-bottom:20px;line-height:1.42857143;background-color:#fff;border:1px solid #ddd;border-radius:4px;-webkit-transition:border .2s ease-in-out;-o-transition:border .2s ease-in-out;transition:border .2s ease-in-out}.thumbnail a>img,.thumbnail>img{margin-right:auto;margin-left:auto}a.thumbnail.active,a.thumbnail:focus,a.thumbnail:hover{border-color:#337ab7}.thumbnail .caption{padding:9px;color:#333}.alert{padding:15px;margin-bottom:20px;border:1px solid transparent;border-radius:4px}.alert h4{margin-top:0;color:inherit}.alert .alert-link{font-weight:700}.alert>p,.alert>ul{margin-bottom:0}.alert>p+p{margin-top:5px}.alert-dismissable,.alert-dismissible{padding-right:35px}.alert-dismissable .close,.alert-dismissible .close{position:relative;top:-2px;right:-21px;color:inherit}.alert-success{color:#3c763d;background-color:#dff0d8;border-color:#d6e9c6}.alert-success hr{border-top-color:#c9e2b3}.alert-success .alert-link{color:#2b542c}.alert-info{color:#31708f;background-color:#d9edf7;border-color:#bce8f1}.alert-info hr{border-top-color:#a6e1ec}.alert-info .alert-link{color:#245269}.alert-warning{color:#8a6d3b;background-color:#fcf8e3;border-color:#faebcc}.alert-warning hr{border-top-color:#f7e1b5}.alert-warning .alert-link{color:#66512c}.alert-danger{color:#a94442;background-color:#f2dede;border-color:#ebccd1}.alert-danger hr{border-top-color:#e4b9c0}.alert-danger .alert-link{color:#843534}@-webkit-keyframes progress-bar-stripes{from{background-position:40px 0}to{background-position:0 0}}@-o-keyframes progress-bar-stripes{from{background-position:40px 0}to{background-position:0 0}}@keyframes progress-bar-stripes{from{background-position:40px 0}to{background-position:0 0}}.progress{height:20px;margin-bottom:20px;overflow:hidden;background-color:#f5f5f5;border-radius:4px;-webkit-box-shadow:inset 0 1px 2px rgba(0,0,0,.1);box-shadow:inset 0 1px 2px rgba(0,0,0,.1)}.progress-bar{float:left;width:0;height:100%;font-size:12px;line-height:20px;color:#fff;text-align:center;background-color:#337ab7;-webkit-box-shadow:inset 0 -1px 0 rgba(0,0,0,.15);box-shadow:inset 0 -1px 0 rgba(0,0,0,.15);-webkit-transition:width .6s ease;-o-transition:width .6s ease;transition:width .6s ease}.progress-bar-striped,.progress-striped .progress-bar{background-image:-webkit-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:-o-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 
75%,transparent);background-image:linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);-webkit-background-size:40px 40px;background-size:40px 40px}.progress-bar.active,.progress.active .progress-bar{-webkit-animation:progress-bar-stripes 2s linear infinite;-o-animation:progress-bar-stripes 2s linear infinite;animation:progress-bar-stripes 2s linear infinite}.progress-bar-success{background-color:#5cb85c}.progress-striped .progress-bar-success{background-image:-webkit-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:-o-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent)}.progress-bar-info{background-color:#5bc0de}.progress-striped .progress-bar-info{background-image:-webkit-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:-o-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent)}.progress-bar-warning{background-color:#f0ad4e}.progress-striped .progress-bar-warning{background-image:-webkit-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:-o-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent)}.progress-bar-danger{background-color:#d9534f}.progress-striped .progress-bar-danger{background-image:-webkit-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:-o-linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 75%,transparent);background-image:linear-gradient(45deg,rgba(255,255,255,.15) 25%,transparent 25%,transparent 50%,rgba(255,255,255,.15) 50%,rgba(255,255,255,.15) 75%,transparent 
75%,transparent)}.media{margin-top:15px}.media:first-child{margin-top:0}.media,.media-body{overflow:hidden;zoom:1}.media-body{width:10000px}.media-object{display:block}.media-object.img-thumbnail{max-width:none}.media-right,.media>.pull-right{padding-left:10px}.media-left,.media>.pull-left{padding-right:10px}.media-body,.media-left,.media-right{display:table-cell;vertical-align:top}.media-middle{vertical-align:middle}.media-bottom{vertical-align:bottom}.media-heading{margin-top:0;margin-bottom:5px}.media-list{padding-left:0;list-style:none}.list-group{padding-left:0;margin-bottom:20px}.list-group-item{position:relative;display:block;padding:10px 15px;margin-bottom:-1px;background-color:#fff;border:1px solid #ddd}.list-group-item:first-child{border-top-left-radius:4px;border-top-right-radius:4px}.list-group-item:last-child{margin-bottom:0;border-bottom-right-radius:4px;border-bottom-left-radius:4px}a.list-group-item,button.list-group-item{color:#555}a.list-group-item .list-group-item-heading,button.list-group-item .list-group-item-heading{color:#333}a.list-group-item:focus,a.list-group-item:hover,button.list-group-item:focus,button.list-group-item:hover{color:#555;text-decoration:none;background-color:#f5f5f5}button.list-group-item{width:100%;text-align:left}.list-group-item.disabled,.list-group-item.disabled:focus,.list-group-item.disabled:hover{color:#777;cursor:not-allowed;background-color:#eee}.list-group-item.disabled .list-group-item-heading,.list-group-item.disabled:focus .list-group-item-heading,.list-group-item.disabled:hover .list-group-item-heading{color:inherit}.list-group-item.disabled .list-group-item-text,.list-group-item.disabled:focus .list-group-item-text,.list-group-item.disabled:hover .list-group-item-text{color:#777}.list-group-item.active,.list-group-item.active:focus,.list-group-item.active:hover{z-index:2;color:#fff;background-color:#337ab7;border-color:#337ab7}.list-group-item.active .list-group-item-heading,.list-group-item.active .list-group-item-heading>.small,.list-group-item.active .list-group-item-heading>small,.list-group-item.active:focus .list-group-item-heading,.list-group-item.active:focus .list-group-item-heading>.small,.list-group-item.active:focus .list-group-item-heading>small,.list-group-item.active:hover .list-group-item-heading,.list-group-item.active:hover .list-group-item-heading>.small,.list-group-item.active:hover .list-group-item-heading>small{color:inherit}.list-group-item.active .list-group-item-text,.list-group-item.active:focus .list-group-item-text,.list-group-item.active:hover .list-group-item-text{color:#c7ddef}.list-group-item-success{color:#3c763d;background-color:#dff0d8}a.list-group-item-success,button.list-group-item-success{color:#3c763d}a.list-group-item-success .list-group-item-heading,button.list-group-item-success .list-group-item-heading{color:inherit}a.list-group-item-success:focus,a.list-group-item-success:hover,button.list-group-item-success:focus,button.list-group-item-success:hover{color:#3c763d;background-color:#d0e9c6}a.list-group-item-success.active,a.list-group-item-success.active:focus,a.list-group-item-success.active:hover,button.list-group-item-success.active,button.list-group-item-success.active:focus,button.list-group-item-success.active:hover{color:#fff;background-color:#3c763d;border-color:#3c763d}.list-group-item-info{color:#31708f;background-color:#d9edf7}a.list-group-item-info,button.list-group-item-info{color:#31708f}a.list-group-item-info .list-group-item-heading,button.list-group-item-info 
.list-group-item-heading{color:inherit}a.list-group-item-info:focus,a.list-group-item-info:hover,button.list-group-item-info:focus,button.list-group-item-info:hover{color:#31708f;background-color:#c4e3f3}a.list-group-item-info.active,a.list-group-item-info.active:focus,a.list-group-item-info.active:hover,button.list-group-item-info.active,button.list-group-item-info.active:focus,button.list-group-item-info.active:hover{color:#fff;background-color:#31708f;border-color:#31708f}.list-group-item-warning{color:#8a6d3b;background-color:#fcf8e3}a.list-group-item-warning,button.list-group-item-warning{color:#8a6d3b}a.list-group-item-warning .list-group-item-heading,button.list-group-item-warning .list-group-item-heading{color:inherit}a.list-group-item-warning:focus,a.list-group-item-warning:hover,button.list-group-item-warning:focus,button.list-group-item-warning:hover{color:#8a6d3b;background-color:#faf2cc}a.list-group-item-warning.active,a.list-group-item-warning.active:focus,a.list-group-item-warning.active:hover,button.list-group-item-warning.active,button.list-group-item-warning.active:focus,button.list-group-item-warning.active:hover{color:#fff;background-color:#8a6d3b;border-color:#8a6d3b}.list-group-item-danger{color:#a94442;background-color:#f2dede}a.list-group-item-danger,button.list-group-item-danger{color:#a94442}a.list-group-item-danger .list-group-item-heading,button.list-group-item-danger .list-group-item-heading{color:inherit}a.list-group-item-danger:focus,a.list-group-item-danger:hover,button.list-group-item-danger:focus,button.list-group-item-danger:hover{color:#a94442;background-color:#ebcccc}a.list-group-item-danger.active,a.list-group-item-danger.active:focus,a.list-group-item-danger.active:hover,button.list-group-item-danger.active,button.list-group-item-danger.active:focus,button.list-group-item-danger.active:hover{color:#fff;background-color:#a94442;border-color:#a94442}.list-group-item-heading{margin-top:0;margin-bottom:5px}.list-group-item-text{margin-bottom:0;line-height:1.3}.panel{margin-bottom:20px;background-color:#fff;border:1px solid transparent;border-radius:4px;-webkit-box-shadow:0 1px 1px rgba(0,0,0,.05);box-shadow:0 1px 1px rgba(0,0,0,.05)}.panel-body{padding:15px}.panel-heading{padding:10px 15px;border-bottom:1px solid transparent;border-top-left-radius:3px;border-top-right-radius:3px}.panel-heading>.dropdown .dropdown-toggle{color:inherit}.panel-title{margin-top:0;margin-bottom:0;font-size:16px;color:inherit}.panel-title>.small,.panel-title>.small>a,.panel-title>a,.panel-title>small,.panel-title>small>a{color:inherit}.panel-footer{padding:10px 15px;background-color:#f5f5f5;border-top:1px solid #ddd;border-bottom-right-radius:3px;border-bottom-left-radius:3px}.panel>.list-group,.panel>.panel-collapse>.list-group{margin-bottom:0}.panel>.list-group .list-group-item,.panel>.panel-collapse>.list-group .list-group-item{border-width:1px 0;border-radius:0}.panel>.list-group:first-child .list-group-item:first-child,.panel>.panel-collapse>.list-group:first-child .list-group-item:first-child{border-top:0;border-top-left-radius:3px;border-top-right-radius:3px}.panel>.list-group:last-child .list-group-item:last-child,.panel>.panel-collapse>.list-group:last-child .list-group-item:last-child{border-bottom:0;border-bottom-right-radius:3px;border-bottom-left-radius:3px}.panel>.panel-heading+.panel-collapse>.list-group .list-group-item:first-child{border-top-left-radius:0;border-top-right-radius:0}.panel-heading+.list-group 
.list-group-item:first-child{border-top-width:0}.list-group+.panel-footer{border-top-width:0}.panel>.panel-collapse>.table,.panel>.table,.panel>.table-responsive>.table{margin-bottom:0}.panel>.panel-collapse>.table caption,.panel>.table caption,.panel>.table-responsive>.table caption{padding-right:15px;padding-left:15px}.panel>.table-responsive:first-child>.table:first-child,.panel>.table:first-child{border-top-left-radius:3px;border-top-right-radius:3px}.panel>.table-responsive:first-child>.table:first-child>tbody:first-child>tr:first-child,.panel>.table-responsive:first-child>.table:first-child>thead:first-child>tr:first-child,.panel>.table:first-child>tbody:first-child>tr:first-child,.panel>.table:first-child>thead:first-child>tr:first-child{border-top-left-radius:3px;border-top-right-radius:3px}.panel>.table-responsive:first-child>.table:first-child>tbody:first-child>tr:first-child td:first-child,.panel>.table-responsive:first-child>.table:first-child>tbody:first-child>tr:first-child th:first-child,.panel>.table-responsive:first-child>.table:first-child>thead:first-child>tr:first-child td:first-child,.panel>.table-responsive:first-child>.table:first-child>thead:first-child>tr:first-child th:first-child,.panel>.table:first-child>tbody:first-child>tr:first-child td:first-child,.panel>.table:first-child>tbody:first-child>tr:first-child th:first-child,.panel>.table:first-child>thead:first-child>tr:first-child td:first-child,.panel>.table:first-child>thead:first-child>tr:first-child th:first-child{border-top-left-radius:3px}.panel>.table-responsive:first-child>.table:first-child>tbody:first-child>tr:first-child td:last-child,.panel>.table-responsive:first-child>.table:first-child>tbody:first-child>tr:first-child th:last-child,.panel>.table-responsive:first-child>.table:first-child>thead:first-child>tr:first-child td:last-child,.panel>.table-responsive:first-child>.table:first-child>thead:first-child>tr:first-child th:last-child,.panel>.table:first-child>tbody:first-child>tr:first-child td:last-child,.panel>.table:first-child>tbody:first-child>tr:first-child th:last-child,.panel>.table:first-child>thead:first-child>tr:first-child td:last-child,.panel>.table:first-child>thead:first-child>tr:first-child th:last-child{border-top-right-radius:3px}.panel>.table-responsive:last-child>.table:last-child,.panel>.table:last-child{border-bottom-right-radius:3px;border-bottom-left-radius:3px}.panel>.table-responsive:last-child>.table:last-child>tbody:last-child>tr:last-child,.panel>.table-responsive:last-child>.table:last-child>tfoot:last-child>tr:last-child,.panel>.table:last-child>tbody:last-child>tr:last-child,.panel>.table:last-child>tfoot:last-child>tr:last-child{border-bottom-right-radius:3px;border-bottom-left-radius:3px}.panel>.table-responsive:last-child>.table:last-child>tbody:last-child>tr:last-child td:first-child,.panel>.table-responsive:last-child>.table:last-child>tbody:last-child>tr:last-child th:first-child,.panel>.table-responsive:last-child>.table:last-child>tfoot:last-child>tr:last-child td:first-child,.panel>.table-responsive:last-child>.table:last-child>tfoot:last-child>tr:last-child th:first-child,.panel>.table:last-child>tbody:last-child>tr:last-child td:first-child,.panel>.table:last-child>tbody:last-child>tr:last-child th:first-child,.panel>.table:last-child>tfoot:last-child>tr:last-child td:first-child,.panel>.table:last-child>tfoot:last-child>tr:last-child 
th:first-child{border-bottom-left-radius:3px}.panel>.table-responsive:last-child>.table:last-child>tbody:last-child>tr:last-child td:last-child,.panel>.table-responsive:last-child>.table:last-child>tbody:last-child>tr:last-child th:last-child,.panel>.table-responsive:last-child>.table:last-child>tfoot:last-child>tr:last-child td:last-child,.panel>.table-responsive:last-child>.table:last-child>tfoot:last-child>tr:last-child th:last-child,.panel>.table:last-child>tbody:last-child>tr:last-child td:last-child,.panel>.table:last-child>tbody:last-child>tr:last-child th:last-child,.panel>.table:last-child>tfoot:last-child>tr:last-child td:last-child,.panel>.table:last-child>tfoot:last-child>tr:last-child th:last-child{border-bottom-right-radius:3px}.panel>.panel-body+.table,.panel>.panel-body+.table-responsive,.panel>.table+.panel-body,.panel>.table-responsive+.panel-body{border-top:1px solid #ddd}.panel>.table>tbody:first-child>tr:first-child td,.panel>.table>tbody:first-child>tr:first-child th{border-top:0}.panel>.table-bordered,.panel>.table-responsive>.table-bordered{border:0}.panel>.table-bordered>tbody>tr>td:first-child,.panel>.table-bordered>tbody>tr>th:first-child,.panel>.table-bordered>tfoot>tr>td:first-child,.panel>.table-bordered>tfoot>tr>th:first-child,.panel>.table-bordered>thead>tr>td:first-child,.panel>.table-bordered>thead>tr>th:first-child,.panel>.table-responsive>.table-bordered>tbody>tr>td:first-child,.panel>.table-responsive>.table-bordered>tbody>tr>th:first-child,.panel>.table-responsive>.table-bordered>tfoot>tr>td:first-child,.panel>.table-responsive>.table-bordered>tfoot>tr>th:first-child,.panel>.table-responsive>.table-bordered>thead>tr>td:first-child,.panel>.table-responsive>.table-bordered>thead>tr>th:first-child{border-left:0}.panel>.table-bordered>tbody>tr>td:last-child,.panel>.table-bordered>tbody>tr>th:last-child,.panel>.table-bordered>tfoot>tr>td:last-child,.panel>.table-bordered>tfoot>tr>th:last-child,.panel>.table-bordered>thead>tr>td:last-child,.panel>.table-bordered>thead>tr>th:last-child,.panel>.table-responsive>.table-bordered>tbody>tr>td:last-child,.panel>.table-responsive>.table-bordered>tbody>tr>th:last-child,.panel>.table-responsive>.table-bordered>tfoot>tr>td:last-child,.panel>.table-responsive>.table-bordered>tfoot>tr>th:last-child,.panel>.table-responsive>.table-bordered>thead>tr>td:last-child,.panel>.table-responsive>.table-bordered>thead>tr>th:last-child{border-right:0}.panel>.table-bordered>tbody>tr:first-child>td,.panel>.table-bordered>tbody>tr:first-child>th,.panel>.table-bordered>thead>tr:first-child>td,.panel>.table-bordered>thead>tr:first-child>th,.panel>.table-responsive>.table-bordered>tbody>tr:first-child>td,.panel>.table-responsive>.table-bordered>tbody>tr:first-child>th,.panel>.table-responsive>.table-bordered>thead>tr:first-child>td,.panel>.table-responsive>.table-bordered>thead>tr:first-child>th{border-bottom:0}.panel>.table-bordered>tbody>tr:last-child>td,.panel>.table-bordered>tbody>tr:last-child>th,.panel>.table-bordered>tfoot>tr:last-child>td,.panel>.table-bordered>tfoot>tr:last-child>th,.panel>.table-responsive>.table-bordered>tbody>tr:last-child>td,.panel>.table-responsive>.table-bordered>tbody>tr:last-child>th,.panel>.table-responsive>.table-bordered>tfoot>tr:last-child>td,.panel>.table-responsive>.table-bordered>tfoot>tr:last-child>th{border-bottom:0}.panel>.table-responsive{margin-bottom:0;border:0}.panel-group{margin-bottom:20px}.panel-group .panel{margin-bottom:0;border-radius:4px}.panel-group 
.panel+.panel{margin-top:5px}.panel-group .panel-heading{border-bottom:0}.panel-group .panel-heading+.panel-collapse>.list-group,.panel-group .panel-heading+.panel-collapse>.panel-body{border-top:1px solid #ddd}.panel-group .panel-footer{border-top:0}.panel-group .panel-footer+.panel-collapse .panel-body{border-bottom:1px solid #ddd}.panel-default{border-color:#ddd}.panel-default>.panel-heading{color:#333;background-color:#f5f5f5;border-color:#ddd}.panel-default>.panel-heading+.panel-collapse>.panel-body{border-top-color:#ddd}.panel-default>.panel-heading .badge{color:#f5f5f5;background-color:#333}.panel-default>.panel-footer+.panel-collapse>.panel-body{border-bottom-color:#ddd}.panel-primary{border-color:#337ab7}.panel-primary>.panel-heading{color:#fff;background-color:#337ab7;border-color:#337ab7}.panel-primary>.panel-heading+.panel-collapse>.panel-body{border-top-color:#337ab7}.panel-primary>.panel-heading .badge{color:#337ab7;background-color:#fff}.panel-primary>.panel-footer+.panel-collapse>.panel-body{border-bottom-color:#337ab7}.panel-success{border-color:#d6e9c6}.panel-success>.panel-heading{color:#3c763d;background-color:#dff0d8;border-color:#d6e9c6}.panel-success>.panel-heading+.panel-collapse>.panel-body{border-top-color:#d6e9c6}.panel-success>.panel-heading .badge{color:#dff0d8;background-color:#3c763d}.panel-success>.panel-footer+.panel-collapse>.panel-body{border-bottom-color:#d6e9c6}.panel-info{border-color:#bce8f1}.panel-info>.panel-heading{color:#31708f;background-color:#d9edf7;border-color:#bce8f1}.panel-info>.panel-heading+.panel-collapse>.panel-body{border-top-color:#bce8f1}.panel-info>.panel-heading .badge{color:#d9edf7;background-color:#31708f}.panel-info>.panel-footer+.panel-collapse>.panel-body{border-bottom-color:#bce8f1}.panel-warning{border-color:#faebcc}.panel-warning>.panel-heading{color:#8a6d3b;background-color:#fcf8e3;border-color:#faebcc}.panel-warning>.panel-heading+.panel-collapse>.panel-body{border-top-color:#faebcc}.panel-warning>.panel-heading .badge{color:#fcf8e3;background-color:#8a6d3b}.panel-warning>.panel-footer+.panel-collapse>.panel-body{border-bottom-color:#faebcc}.panel-danger{border-color:#ebccd1}.panel-danger>.panel-heading{color:#a94442;background-color:#f2dede;border-color:#ebccd1}.panel-danger>.panel-heading+.panel-collapse>.panel-body{border-top-color:#ebccd1}.panel-danger>.panel-heading .badge{color:#f2dede;background-color:#a94442}.panel-danger>.panel-footer+.panel-collapse>.panel-body{border-bottom-color:#ebccd1}.embed-responsive{position:relative;display:block;height:0;padding:0;overflow:hidden}.embed-responsive .embed-responsive-item,.embed-responsive embed,.embed-responsive iframe,.embed-responsive object,.embed-responsive video{position:absolute;top:0;bottom:0;left:0;width:100%;height:100%;border:0}.embed-responsive-16by9{padding-bottom:56.25%}.embed-responsive-4by3{padding-bottom:75%}.well{min-height:20px;padding:19px;margin-bottom:20px;background-color:#f5f5f5;border:1px solid #e3e3e3;border-radius:4px;-webkit-box-shadow:inset 0 1px 1px rgba(0,0,0,.05);box-shadow:inset 0 1px 1px rgba(0,0,0,.05)}.well blockquote{border-color:#ddd;border-color:rgba(0,0,0,.15)}.well-lg{padding:24px;border-radius:6px}.well-sm{padding:9px;border-radius:3px}.close{float:right;font-size:21px;font-weight:700;line-height:1;color:#000;text-shadow:0 1px 0 
#fff;filter:alpha(opacity=20);opacity:.2}.close:focus,.close:hover{color:#000;text-decoration:none;cursor:pointer;filter:alpha(opacity=50);opacity:.5}button.close{-webkit-appearance:none;padding:0;cursor:pointer;background:0 0;border:0}.modal-open{overflow:hidden}.modal{position:fixed;top:0;right:0;bottom:0;left:0;z-index:1050;display:none;overflow:hidden;-webkit-overflow-scrolling:touch;outline:0}.modal.fade .modal-dialog{-webkit-transition:-webkit-transform .3s ease-out;-o-transition:-o-transform .3s ease-out;transition:transform .3s ease-out;-webkit-transform:translate(0,-25%);-ms-transform:translate(0,-25%);-o-transform:translate(0,-25%);transform:translate(0,-25%)}.modal.in .modal-dialog{-webkit-transform:translate(0,0);-ms-transform:translate(0,0);-o-transform:translate(0,0);transform:translate(0,0)}.modal-open .modal{overflow-x:hidden;overflow-y:auto}.modal-dialog{position:relative;width:auto;margin:10px}.modal-content{position:relative;background-color:#fff;-webkit-background-clip:padding-box;background-clip:padding-box;border:1px solid #999;border:1px solid rgba(0,0,0,.2);border-radius:6px;outline:0;-webkit-box-shadow:0 3px 9px rgba(0,0,0,.5);box-shadow:0 3px 9px rgba(0,0,0,.5)}.modal-backdrop{position:fixed;top:0;right:0;bottom:0;left:0;z-index:1040;background-color:#000}.modal-backdrop.fade{filter:alpha(opacity=0);opacity:0}.modal-backdrop.in{filter:alpha(opacity=50);opacity:.5}.modal-header{padding:15px;border-bottom:1px solid #e5e5e5}.modal-header .close{margin-top:-2px}.modal-title{margin:0;line-height:1.42857143}.modal-body{position:relative;padding:15px}.modal-footer{padding:15px;text-align:right;border-top:1px solid #e5e5e5}.modal-footer .btn+.btn{margin-bottom:0;margin-left:5px}.modal-footer .btn-group .btn+.btn{margin-left:-1px}.modal-footer .btn-block+.btn-block{margin-left:0}.modal-scrollbar-measure{position:absolute;top:-9999px;width:50px;height:50px;overflow:scroll}@media (min-width:768px){.modal-dialog{width:600px;margin:30px auto}.modal-content{-webkit-box-shadow:0 5px 15px rgba(0,0,0,.5);box-shadow:0 5px 15px rgba(0,0,0,.5)}.modal-sm{width:300px}}@media (min-width:992px){.modal-lg{width:900px}}.tooltip{position:absolute;z-index:1070;display:block;font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:12px;font-style:normal;font-weight:400;line-height:1.42857143;text-align:left;text-align:start;text-decoration:none;text-shadow:none;text-transform:none;letter-spacing:normal;word-break:normal;word-spacing:normal;word-wrap:normal;white-space:normal;filter:alpha(opacity=0);opacity:0;line-break:auto}.tooltip.in{filter:alpha(opacity=90);opacity:.9}.tooltip.top{padding:5px 0;margin-top:-3px}.tooltip.right{padding:0 5px;margin-left:3px}.tooltip.bottom{padding:5px 0;margin-top:3px}.tooltip.left{padding:0 5px;margin-left:-3px}.tooltip-inner{max-width:200px;padding:3px 8px;color:#fff;text-align:center;background-color:#000;border-radius:4px}.tooltip-arrow{position:absolute;width:0;height:0;border-color:transparent;border-style:solid}.tooltip.top .tooltip-arrow{bottom:0;left:50%;margin-left:-5px;border-width:5px 5px 0;border-top-color:#000}.tooltip.top-left .tooltip-arrow{right:5px;bottom:0;margin-bottom:-5px;border-width:5px 5px 0;border-top-color:#000}.tooltip.top-right .tooltip-arrow{bottom:0;left:5px;margin-bottom:-5px;border-width:5px 5px 0;border-top-color:#000}.tooltip.right .tooltip-arrow{top:50%;left:0;margin-top:-5px;border-width:5px 5px 5px 0;border-right-color:#000}.tooltip.left .tooltip-arrow{top:50%;right:0;margin-top:-5px;border-width:5px 0 5px 
5px;border-left-color:#000}.tooltip.bottom .tooltip-arrow{top:0;left:50%;margin-left:-5px;border-width:0 5px 5px;border-bottom-color:#000}.tooltip.bottom-left .tooltip-arrow{top:0;right:5px;margin-top:-5px;border-width:0 5px 5px;border-bottom-color:#000}.tooltip.bottom-right .tooltip-arrow{top:0;left:5px;margin-top:-5px;border-width:0 5px 5px;border-bottom-color:#000}.popover{position:absolute;top:0;left:0;z-index:1060;display:none;max-width:276px;padding:1px;font-family:"Helvetica Neue",Helvetica,Arial,sans-serif;font-size:14px;font-style:normal;font-weight:400;line-height:1.42857143;text-align:left;text-align:start;text-decoration:none;text-shadow:none;text-transform:none;letter-spacing:normal;word-break:normal;word-spacing:normal;word-wrap:normal;white-space:normal;background-color:#fff;-webkit-background-clip:padding-box;background-clip:padding-box;border:1px solid #ccc;border:1px solid rgba(0,0,0,.2);border-radius:6px;-webkit-box-shadow:0 5px 10px rgba(0,0,0,.2);box-shadow:0 5px 10px rgba(0,0,0,.2);line-break:auto}.popover.top{margin-top:-10px}.popover.right{margin-left:10px}.popover.bottom{margin-top:10px}.popover.left{margin-left:-10px}.popover-title{padding:8px 14px;margin:0;font-size:14px;background-color:#f7f7f7;border-bottom:1px solid #ebebeb;border-radius:5px 5px 0 0}.popover-content{padding:9px 14px}.popover>.arrow,.popover>.arrow:after{position:absolute;display:block;width:0;height:0;border-color:transparent;border-style:solid}.popover>.arrow{border-width:11px}.popover>.arrow:after{content:"";border-width:10px}.popover.top>.arrow{bottom:-11px;left:50%;margin-left:-11px;border-top-color:#999;border-top-color:rgba(0,0,0,.25);border-bottom-width:0}.popover.top>.arrow:after{bottom:1px;margin-left:-10px;content:" ";border-top-color:#fff;border-bottom-width:0}.popover.right>.arrow{top:50%;left:-11px;margin-top:-11px;border-right-color:#999;border-right-color:rgba(0,0,0,.25);border-left-width:0}.popover.right>.arrow:after{bottom:-10px;left:1px;content:" ";border-right-color:#fff;border-left-width:0}.popover.bottom>.arrow{top:-11px;left:50%;margin-left:-11px;border-top-width:0;border-bottom-color:#999;border-bottom-color:rgba(0,0,0,.25)}.popover.bottom>.arrow:after{top:1px;margin-left:-10px;content:" ";border-top-width:0;border-bottom-color:#fff}.popover.left>.arrow{top:50%;right:-11px;margin-top:-11px;border-right-width:0;border-left-color:#999;border-left-color:rgba(0,0,0,.25)}.popover.left>.arrow:after{right:1px;bottom:-10px;content:" ";border-right-width:0;border-left-color:#fff}.carousel{position:relative}.carousel-inner{position:relative;width:100%;overflow:hidden}.carousel-inner>.item{position:relative;display:none;-webkit-transition:.6s ease-in-out left;-o-transition:.6s ease-in-out left;transition:.6s ease-in-out left}.carousel-inner>.item>a>img,.carousel-inner>.item>img{line-height:1}@media all and (transform-3d),(-webkit-transform-3d){.carousel-inner>.item{-webkit-transition:-webkit-transform .6s ease-in-out;-o-transition:-o-transform .6s ease-in-out;transition:transform .6s 
ease-in-out;-webkit-backface-visibility:hidden;backface-visibility:hidden;-webkit-perspective:1000px;perspective:1000px}.carousel-inner>.item.active.right,.carousel-inner>.item.next{left:0;-webkit-transform:translate3d(100%,0,0);transform:translate3d(100%,0,0)}.carousel-inner>.item.active.left,.carousel-inner>.item.prev{left:0;-webkit-transform:translate3d(-100%,0,0);transform:translate3d(-100%,0,0)}.carousel-inner>.item.active,.carousel-inner>.item.next.left,.carousel-inner>.item.prev.right{left:0;-webkit-transform:translate3d(0,0,0);transform:translate3d(0,0,0)}}.carousel-inner>.active,.carousel-inner>.next,.carousel-inner>.prev{display:block}.carousel-inner>.active{left:0}.carousel-inner>.next,.carousel-inner>.prev{position:absolute;top:0;width:100%}.carousel-inner>.next{left:100%}.carousel-inner>.prev{left:-100%}.carousel-inner>.next.left,.carousel-inner>.prev.right{left:0}.carousel-inner>.active.left{left:-100%}.carousel-inner>.active.right{left:100%}.carousel-control{position:absolute;top:0;bottom:0;left:0;width:15%;font-size:20px;color:#fff;text-align:center;text-shadow:0 1px 2px rgba(0,0,0,.6);background-color:rgba(0,0,0,0);filter:alpha(opacity=50);opacity:.5}.carousel-control.left{background-image:-webkit-linear-gradient(left,rgba(0,0,0,.5) 0,rgba(0,0,0,.0001) 100%);background-image:-o-linear-gradient(left,rgba(0,0,0,.5) 0,rgba(0,0,0,.0001) 100%);background-image:-webkit-gradient(linear,left top,right top,from(rgba(0,0,0,.5)),to(rgba(0,0,0,.0001)));background-image:linear-gradient(to right,rgba(0,0,0,.5) 0,rgba(0,0,0,.0001) 100%);filter:progid:DXImageTransform.Microsoft.gradient(startColorstr='#80000000', endColorstr='#00000000', GradientType=1);background-repeat:repeat-x}.carousel-control.right{right:0;left:auto;background-image:-webkit-linear-gradient(left,rgba(0,0,0,.0001) 0,rgba(0,0,0,.5) 100%);background-image:-o-linear-gradient(left,rgba(0,0,0,.0001) 0,rgba(0,0,0,.5) 100%);background-image:-webkit-gradient(linear,left top,right top,from(rgba(0,0,0,.0001)),to(rgba(0,0,0,.5)));background-image:linear-gradient(to right,rgba(0,0,0,.0001) 0,rgba(0,0,0,.5) 100%);filter:progid:DXImageTransform.Microsoft.gradient(startColorstr='#00000000', endColorstr='#80000000', GradientType=1);background-repeat:repeat-x}.carousel-control:focus,.carousel-control:hover{color:#fff;text-decoration:none;filter:alpha(opacity=90);outline:0;opacity:.9}.carousel-control .glyphicon-chevron-left,.carousel-control .glyphicon-chevron-right,.carousel-control .icon-next,.carousel-control .icon-prev{position:absolute;top:50%;z-index:5;display:inline-block;margin-top:-10px}.carousel-control .glyphicon-chevron-left,.carousel-control .icon-prev{left:50%;margin-left:-10px}.carousel-control .glyphicon-chevron-right,.carousel-control .icon-next{right:50%;margin-right:-10px}.carousel-control .icon-next,.carousel-control .icon-prev{width:20px;height:20px;font-family:serif;line-height:1}.carousel-control .icon-prev:before{content:'\2039'}.carousel-control .icon-next:before{content:'\203a'}.carousel-indicators{position:absolute;bottom:10px;left:50%;z-index:15;width:60%;padding-left:0;margin-left:-30%;text-align:center;list-style:none}.carousel-indicators li{display:inline-block;width:10px;height:10px;margin:1px;text-indent:-999px;cursor:pointer;background-color:#000\9;background-color:rgba(0,0,0,0);border:1px solid #fff;border-radius:10px}.carousel-indicators 
.active{width:12px;height:12px;margin:0;background-color:#fff}.carousel-caption{position:absolute;right:15%;bottom:20px;left:15%;z-index:10;padding-top:20px;padding-bottom:20px;color:#fff;text-align:center;text-shadow:0 1px 2px rgba(0,0,0,.6)}.carousel-caption .btn{text-shadow:none}@media screen and (min-width:768px){.carousel-control .glyphicon-chevron-left,.carousel-control .glyphicon-chevron-right,.carousel-control .icon-next,.carousel-control .icon-prev{width:30px;height:30px;margin-top:-10px;font-size:30px}.carousel-control .glyphicon-chevron-left,.carousel-control .icon-prev{margin-left:-10px}.carousel-control .glyphicon-chevron-right,.carousel-control .icon-next{margin-right:-10px}.carousel-caption{right:20%;left:20%;padding-bottom:30px}.carousel-indicators{bottom:20px}}.btn-group-vertical>.btn-group:after,.btn-group-vertical>.btn-group:before,.btn-toolbar:after,.btn-toolbar:before,.clearfix:after,.clearfix:before,.container-fluid:after,.container-fluid:before,.container:after,.container:before,.dl-horizontal dd:after,.dl-horizontal dd:before,.form-horizontal .form-group:after,.form-horizontal .form-group:before,.modal-footer:after,.modal-footer:before,.modal-header:after,.modal-header:before,.nav:after,.nav:before,.navbar-collapse:after,.navbar-collapse:before,.navbar-header:after,.navbar-header:before,.navbar:after,.navbar:before,.pager:after,.pager:before,.panel-body:after,.panel-body:before,.row:after,.row:before{display:table;content:" "}.btn-group-vertical>.btn-group:after,.btn-toolbar:after,.clearfix:after,.container-fluid:after,.container:after,.dl-horizontal dd:after,.form-horizontal .form-group:after,.modal-footer:after,.modal-header:after,.nav:after,.navbar-collapse:after,.navbar-header:after,.navbar:after,.pager:after,.panel-body:after,.row:after{clear:both}.center-block{display:block;margin-right:auto;margin-left:auto}.pull-right{float:right!important}.pull-left{float:left!important}.hide{display:none!important}.show{display:block!important}.invisible{visibility:hidden}.text-hide{font:0/0 a;color:transparent;text-shadow:none;background-color:transparent;border:0}.hidden{display:none!important}.affix{position:fixed}@-ms-viewport{width:device-width}.visible-lg,.visible-md,.visible-sm,.visible-xs{display:none!important}.visible-lg-block,.visible-lg-inline,.visible-lg-inline-block,.visible-md-block,.visible-md-inline,.visible-md-inline-block,.visible-sm-block,.visible-sm-inline,.visible-sm-inline-block,.visible-xs-block,.visible-xs-inline,.visible-xs-inline-block{display:none!important}@media (max-width:767px){.visible-xs{display:block!important}table.visible-xs{display:table!important}tr.visible-xs{display:table-row!important}td.visible-xs,th.visible-xs{display:table-cell!important}}@media (max-width:767px){.visible-xs-block{display:block!important}}@media (max-width:767px){.visible-xs-inline{display:inline!important}}@media (max-width:767px){.visible-xs-inline-block{display:inline-block!important}}@media (min-width:768px) and (max-width:991px){.visible-sm{display:block!important}table.visible-sm{display:table!important}tr.visible-sm{display:table-row!important}td.visible-sm,th.visible-sm{display:table-cell!important}}@media (min-width:768px) and (max-width:991px){.visible-sm-block{display:block!important}}@media (min-width:768px) and (max-width:991px){.visible-sm-inline{display:inline!important}}@media (min-width:768px) and (max-width:991px){.visible-sm-inline-block{display:inline-block!important}}@media (min-width:992px) and 
(max-width:1199px){.visible-md{display:block!important}table.visible-md{display:table!important}tr.visible-md{display:table-row!important}td.visible-md,th.visible-md{display:table-cell!important}}@media (min-width:992px) and (max-width:1199px){.visible-md-block{display:block!important}}@media (min-width:992px) and (max-width:1199px){.visible-md-inline{display:inline!important}}@media (min-width:992px) and (max-width:1199px){.visible-md-inline-block{display:inline-block!important}}@media (min-width:1200px){.visible-lg{display:block!important}table.visible-lg{display:table!important}tr.visible-lg{display:table-row!important}td.visible-lg,th.visible-lg{display:table-cell!important}}@media (min-width:1200px){.visible-lg-block{display:block!important}}@media (min-width:1200px){.visible-lg-inline{display:inline!important}}@media (min-width:1200px){.visible-lg-inline-block{display:inline-block!important}}@media (max-width:767px){.hidden-xs{display:none!important}}@media (min-width:768px) and (max-width:991px){.hidden-sm{display:none!important}}@media (min-width:992px) and (max-width:1199px){.hidden-md{display:none!important}}@media (min-width:1200px){.hidden-lg{display:none!important}}.visible-print{display:none!important}@media print{.visible-print{display:block!important}table.visible-print{display:table!important}tr.visible-print{display:table-row!important}td.visible-print,th.visible-print{display:table-cell!important}}.visible-print-block{display:none!important}@media print{.visible-print-block{display:block!important}}.visible-print-inline{display:none!important}@media print{.visible-print-inline{display:inline!important}}.visible-print-inline-block{display:none!important}@media print{.visible-print-inline-block{display:inline-block!important}}@media print{.hidden-print{display:none!important}} +/*# sourceMappingURL=bootstrap.min.css.map */ \ No newline at end of file diff --git a/docs/archive/1.0/sql/tutorial/css/codemirror.css b/docs/archive/1.0/sql/tutorial/css/codemirror.css new file mode 100644 index 00000000000..2a6a262282b --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/css/codemirror.css @@ -0,0 +1,341 @@ +/* BASICS */ + +.CodeMirror { + /* Set height, width, borders, and global font properties here */ + font-family: monospace; + height: 300px; + color: black; +} + +/* PADDING */ + +.CodeMirror-lines { + padding: 4px 0; /* Vertical padding around content */ +} +.CodeMirror pre { + padding: 0 4px; /* Horizontal padding of content */ +} + +.CodeMirror-scrollbar-filler, .CodeMirror-gutter-filler { + background-color: white; /* The little square between H and V scrollbars */ +} + +/* GUTTER */ + +.CodeMirror-gutters { + border-right: 1px solid #ddd; + background-color: #f7f7f7; + white-space: nowrap; +} +.CodeMirror-linenumbers {} +.CodeMirror-linenumber { + padding: 0 3px 0 5px; + min-width: 20px; + text-align: right; + color: #999; + white-space: nowrap; +} + +.CodeMirror-guttermarker { color: black; } +.CodeMirror-guttermarker-subtle { color: #999; } + +/* CURSOR */ + +.CodeMirror-cursor { + border-left: 1px solid black; + border-right: none; + width: 0; +} +/* Shown when moving in bi-directional text */ +.CodeMirror div.CodeMirror-secondarycursor { + border-left: 1px solid silver; +} +.cm-fat-cursor .CodeMirror-cursor { + width: auto; + border: 0 !important; + background: #7e7; +} +.cm-fat-cursor div.CodeMirror-cursors { + z-index: 1; +} + +.cm-animate-fat-cursor { + width: auto; + border: 0; + -webkit-animation: blink 1.06s steps(1) infinite; + -moz-animation: blink 1.06s 
steps(1) infinite; + animation: blink 1.06s steps(1) infinite; + background-color: #7e7; +} +@-moz-keyframes blink { + 0% {} + 50% { background-color: transparent; } + 100% {} +} +@-webkit-keyframes blink { + 0% {} + 50% { background-color: transparent; } + 100% {} +} +@keyframes blink { + 0% {} + 50% { background-color: transparent; } + 100% {} +} + +/* Can style cursor different in overwrite (non-insert) mode */ +.CodeMirror-overwrite .CodeMirror-cursor {} + +.cm-tab { display: inline-block; text-decoration: inherit; } + +.CodeMirror-rulers { + position: absolute; + left: 0; right: 0; top: -50px; bottom: -20px; + overflow: hidden; +} +.CodeMirror-ruler { + border-left: 1px solid #ccc; + top: 0; bottom: 0; + position: absolute; +} + +/* DEFAULT THEME */ + +.cm-s-default .cm-header {color: blue;} +.cm-s-default .cm-quote {color: #090;} +.cm-negative {color: #d44;} +.cm-positive {color: #292;} +.cm-header, .cm-strong {font-weight: bold;} +.cm-em {font-style: italic;} +.cm-link {text-decoration: underline;} +.cm-strikethrough {text-decoration: line-through;} + +.cm-s-default .cm-keyword {color: #708;} +.cm-s-default .cm-atom {color: #219;} +.cm-s-default .cm-number {color: #164;} +.cm-s-default .cm-def {color: #00f;} +.cm-s-default .cm-variable, +.cm-s-default .cm-punctuation, +.cm-s-default .cm-property, +.cm-s-default .cm-operator {} +.cm-s-default .cm-variable-2 {color: #05a;} +.cm-s-default .cm-variable-3 {color: #085;} +.cm-s-default .cm-comment {color: #a50;} +.cm-s-default .cm-string {color: #a11;} +.cm-s-default .cm-string-2 {color: #f50;} +.cm-s-default .cm-meta {color: #555;} +.cm-s-default .cm-qualifier {color: #555;} +.cm-s-default .cm-builtin {color: #30a;} +.cm-s-default .cm-bracket {color: #997;} +.cm-s-default .cm-tag {color: #170;} +.cm-s-default .cm-attribute {color: #00c;} +.cm-s-default .cm-hr {color: #999;} +.cm-s-default .cm-link {color: #00c;} + +.cm-s-default .cm-error {color: #f00;} +.cm-invalidchar {color: #f00;} + +.CodeMirror-composing { border-bottom: 2px solid; } + +/* Default styles for common addons */ + +div.CodeMirror span.CodeMirror-matchingbracket {color: #0f0;} +div.CodeMirror span.CodeMirror-nonmatchingbracket {color: #f22;} +.CodeMirror-matchingtag { background: rgba(255, 150, 0, .3); } +.CodeMirror-activeline-background {background: #e8f2ff;} + +/* STOP */ + +/* The rest of this file contains styles related to the mechanics of + the editor. You probably shouldn't touch them. */ + +.CodeMirror { + position: relative; + overflow: hidden; + background: white; +} + +.CodeMirror-scroll { + overflow: scroll !important; /* Things will break if this is overridden */ + /* 30px is the magic margin used to hide the element's real scrollbars */ + /* See overflow: hidden in .CodeMirror */ + margin-bottom: -30px; margin-right: -30px; + padding-bottom: 30px; + height: 100%; + outline: none; /* Prevent dragging from highlighting the element */ + position: relative; +} +.CodeMirror-sizer { + position: relative; + border-right: 30px solid transparent; +} + +/* The fake, visible scrollbars. Used to force redraw during scrolling + before actual scrolling happens, thus preventing shaking and + flickering artifacts. 
*/ +.CodeMirror-vscrollbar, .CodeMirror-hscrollbar, .CodeMirror-scrollbar-filler, .CodeMirror-gutter-filler { + position: absolute; + z-index: 6; + display: none; +} +.CodeMirror-vscrollbar { + right: 0; top: 0; + overflow-x: hidden; + overflow-y: scroll; +} +.CodeMirror-hscrollbar { + bottom: 0; left: 0; + overflow-y: hidden; + overflow-x: scroll; +} +.CodeMirror-scrollbar-filler { + right: 0; bottom: 0; +} +.CodeMirror-gutter-filler { + left: 0; bottom: 0; +} + +.CodeMirror-gutters { + position: absolute; left: 0; top: 0; + min-height: 100%; + z-index: 3; +} +.CodeMirror-gutter { + white-space: normal; + height: 100%; + display: inline-block; + vertical-align: top; + margin-bottom: -30px; +} +.CodeMirror-gutter-wrapper { + position: absolute; + z-index: 4; + background: none !important; + border: none !important; +} +.CodeMirror-gutter-background { + position: absolute; + top: 0; bottom: 0; + z-index: 4; +} +.CodeMirror-gutter-elt { + position: absolute; + cursor: default; + z-index: 4; +} +.CodeMirror-gutter-wrapper { + -webkit-user-select: none; + -moz-user-select: none; + user-select: none; +} + +.CodeMirror-lines { + cursor: text; + min-height: 1px; /* prevents collapsing before first draw */ +} +.CodeMirror pre { + /* Reset some styles that the rest of the page might have set */ + -moz-border-radius: 0; -webkit-border-radius: 0; border-radius: 0; + border-width: 0; + background: transparent; + font-family: inherit; + font-size: inherit; + margin: 0; + white-space: pre; + word-wrap: normal; + line-height: inherit; + color: inherit; + z-index: 2; + position: relative; + overflow: visible; + -webkit-tap-highlight-color: transparent; + -webkit-font-variant-ligatures: contextual; + font-variant-ligatures: contextual; +} +.CodeMirror-wrap pre { + word-wrap: break-word; + white-space: pre-wrap; + word-break: normal; +} + +.CodeMirror-linebackground { + position: absolute; + left: 0; right: 0; top: 0; bottom: 0; + z-index: 0; +} + +.CodeMirror-linewidget { + position: relative; + z-index: 2; + overflow: auto; +} + +.CodeMirror-widget {} + +.CodeMirror-code { + outline: none; +} + +/* Force content-box sizing for the elements where we expect it */ +.CodeMirror-scroll, +.CodeMirror-sizer, +.CodeMirror-gutter, +.CodeMirror-gutters, +.CodeMirror-linenumber { + -moz-box-sizing: content-box; + box-sizing: content-box; +} + +.CodeMirror-measure { + position: absolute; + width: 100%; + height: 0; + overflow: hidden; + visibility: hidden; +} + +.CodeMirror-cursor { + position: absolute; + pointer-events: none; +} +.CodeMirror-measure pre { position: static; } + +div.CodeMirror-cursors { + visibility: hidden; + position: relative; + z-index: 3; +} +div.CodeMirror-dragcursors { + visibility: visible; +} + +.CodeMirror-focused div.CodeMirror-cursors { + visibility: visible; +} + +.CodeMirror-selected { background: #d9d9d9; } +.CodeMirror-focused .CodeMirror-selected { background: #d7d4f0; } +.CodeMirror-crosshair { cursor: crosshair; } +.CodeMirror-line::selection, .CodeMirror-line > span::selection, .CodeMirror-line > span > span::selection { background: #d7d4f0; } +.CodeMirror-line::-moz-selection, .CodeMirror-line > span::-moz-selection, .CodeMirror-line > span > span::-moz-selection { background: #d7d4f0; } + +.cm-searching { + background: #ffa; + background: rgba(255, 255, 0, .4); +} + +/* Used to force a border model for a node */ +.cm-force-border { padding-right: .1px; } + +@media print { + /* Hide the cursor when printing */ + .CodeMirror div.CodeMirror-cursors { + visibility: hidden; + } +} + 
+/* See issue #2901 */ +.cm-tab-wrap-hack:after { content: ''; } + +/* Help users use markselection to safely style text background */ +span.CodeMirror-selectedtext { background: none; } diff --git a/docs/archive/1.0/sql/tutorial/css/docs.min.css b/docs/archive/1.0/sql/tutorial/css/docs.min.css new file mode 100644 index 00000000000..74563b9c19b --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/css/docs.min.css @@ -0,0 +1,11 @@ +/*! + * IE10 viewport hack for Surface/desktop Windows 8 bug + * Copyright 2014-2015 Twitter, Inc. + * Licensed under MIT (https://github.com/twbs/bootstrap/blob/master/LICENSE) + */@-ms-viewport{width:device-width}@-o-viewport{width:device-width}@viewport{width:device-width}.hll{background-color:#ffc}.c{color:#999}.err{color:#A00;background-color:#FAA}.k{color:#069}.o{color:#555}.cm{color:#999}.cp{color:#099}.c1{color:#999}.cs{color:#999}.gd{background-color:#FCC;border:1px solid #C00}.ge{font-style:italic}.gr{color:red}.gh{color:#030}.gi{background-color:#CFC;border:1px solid #0C0}.go{color:#AAA}.gp{color:#009}.gu{color:#030}.gt{color:#9C6}.kc{color:#069}.kd{color:#069}.kn{color:#069}.kp{color:#069}.kr{color:#069}.kt{color:#078}.m{color:#F60}.s{color:#d44950}.na{color:#4f9fcf}.nb{color:#366}.nc{color:#0A8}.no{color:#360}.nd{color:#99F}.ni{color:#999}.ne{color:#C00}.nf{color:#C0F}.nl{color:#99F}.nn{color:#0CF}.nt{color:#2f6f9f}.nv{color:#033}.ow{color:#000}.w{color:#bbb}.mf{color:#F60}.mh{color:#F60}.mi{color:#F60}.mo{color:#F60}.sb{color:#C30}.sc{color:#C30}.sd{color:#C30;font-style:italic}.s2{color:#C30}.se{color:#C30}.sh{color:#C30}.si{color:#A00}.sx{color:#C30}.sr{color:#3AA}.s1{color:#C30}.ss{color:#FC3}.bp{color:#366}.vc{color:#033}.vg{color:#033}.vi{color:#033}.il{color:#F60}.css .nt+.nt,.css .o,.css .o+.nt{color:#999}/*! + * Bootstrap Docs (http://getbootstrap.com) + * Copyright 2011-2016 Twitter, Inc. + * Licensed under the Creative Commons Attribution 3.0 Unported License. For + * details, see https://creativecommons.org/licenses/by/3.0/. 
+ */body{position:relative}.table code{font-size:13px;font-weight:400}h2 code,h3 code,h4 code{background-color:inherit}.btn-outline{color:#563d7c;background-color:transparent;border-color:#563d7c}.btn-outline:active,.btn-outline:focus,.btn-outline:hover{color:#fff;background-color:#563d7c;border-color:#563d7c}.btn-outline-inverse{color:#fff;background-color:transparent;border-color:#cdbfe3}.btn-outline-inverse:active,.btn-outline-inverse:focus,.btn-outline-inverse:hover{color:#563d7c;text-shadow:none;background-color:#fff;border-color:#fff}.bs-docs-booticon{display:block;font-weight:500;color:#fff;text-align:center;cursor:default;background-color:#563d7c;border-radius:15%}.bs-docs-booticon-sm{width:30px;height:30px;font-size:20px;line-height:28px}.bs-docs-booticon-lg{width:144px;height:144px;font-size:108px;line-height:140px}.bs-docs-booticon-inverse{color:#563d7c;background-color:#fff}.bs-docs-booticon-outline{background-color:transparent;border:1px solid #cdbfe3}#skippy{display:block;padding:1em;color:#fff;background-color:#6f5499;outline:0}#skippy .skiplink-text{padding:.5em;outline:1px dotted}#content:focus{outline:0}.bs-docs-nav{margin-bottom:0;background-color:#fff;border-bottom:0}.bs-home-nav .bs-nav-b{display:none}.bs-docs-nav .navbar-brand,.bs-docs-nav .navbar-nav>li>a{font-weight:500;color:#563d7c}.bs-docs-nav .navbar-nav>.active>a,.bs-docs-nav .navbar-nav>.active>a:hover,.bs-docs-nav .navbar-nav>li>a:hover{color:#463265;background-color:#f9f9f9}.bs-docs-nav .navbar-toggle .icon-bar{background-color:#563d7c}.bs-docs-nav .navbar-header .navbar-toggle{border-color:#fff}.bs-docs-nav .navbar-header .navbar-toggle:focus,.bs-docs-nav .navbar-header .navbar-toggle:hover{background-color:#f9f9f9;border-color:#f9f9f9}.bs-docs-footer{padding-top:50px;padding-bottom:50px;margin-top:100px;color:#99979c;text-align:center;background-color:#2a2730}.bs-docs-footer a{color:#fff}.bs-docs-footer-links{padding-left:0;margin-bottom:20px}.bs-docs-footer-links li{display:inline-block}.bs-docs-footer-links li+li{margin-left:15px}@media (min-width:768px){.bs-docs-footer{text-align:left}.bs-docs-footer p{margin-bottom:0}}.bs-docs-header,.bs-docs-masthead{position:relative;padding:30px 0;color:#cdbfe3;text-align:center;text-shadow:0 1px 0 rgba(0,0,0,.1);background-color:#6f5499;background-image:-webkit-gradient(linear,left top,left bottom,from(#563d7c),to(#6f5499));background-image:-webkit-linear-gradient(top,#563d7c 0,#6f5499 100%);background-image:-o-linear-gradient(top,#563d7c 0,#6f5499 100%);background-image:linear-gradient(to bottom,#563d7c 0,#6f5499 100%);filter:progid:DXImageTransform.Microsoft.gradient(startColorstr='#563d7c', endColorstr='#6F5499', GradientType=0);background-repeat:repeat-x}.bs-docs-masthead .bs-docs-booticon{margin:0 auto 30px}.bs-docs-masthead h1{font-weight:300;line-height:1;color:#fff}.bs-docs-masthead .lead{margin:0 auto 30px;font-size:20px;color:#fff}.bs-docs-masthead .version{margin-top:-15px;margin-bottom:30px;color:#9783b9}.bs-docs-masthead .btn{width:100%;padding:15px 30px;font-size:20px}@media (min-width:480px){.bs-docs-masthead .btn{width:auto}}@media (min-width:768px){.bs-docs-masthead{padding:80px 0}.bs-docs-masthead h1{font-size:60px}.bs-docs-masthead .lead{font-size:24px}}@media (min-width:992px){.bs-docs-masthead .lead{width:80%;font-size:30px}}.bs-docs-header{margin-bottom:40px;font-size:20px}.bs-docs-header h1{margin-top:0;color:#fff}.bs-docs-header p{margin-bottom:0;font-weight:300;line-height:1.4}.bs-docs-header .container{position:relative}@media 
(min-width:768px){.bs-docs-header{padding-top:60px;padding-bottom:60px;font-size:24px;text-align:left}.bs-docs-header h1{font-size:60px;line-height:1}}@media (min-width:992px){.bs-docs-header h1,.bs-docs-header p{margin-right:380px}}.carbonad{width:auto!important;height:auto!important;padding:20px!important;margin:30px -15px -31px!important;overflow:hidden;font-size:13px!important;line-height:16px!important;text-align:left;background:0 0!important;border:solid #866ab3!important;border-width:1px 0!important}.carbonad-img{margin:0!important}.carbonad-tag,.carbonad-text{display:block!important;float:none!important;width:auto!important;height:auto!important;margin-left:145px!important;font-family:"Helvetica Neue",Helvetica,Arial,sans-serif!important}.carbonad-text{padding-top:0!important}.carbonad-tag{color:inherit!important;text-align:left!important}.carbonad-tag a,.carbonad-text a{color:#fff!important}.carbonad #azcarbon>img{display:none}@media (min-width:480px){.carbonad{width:330px!important;margin:20px auto!important;border-width:1px!important;border-radius:4px}.bs-docs-masthead .carbonad{margin:50px auto 0!important}}@media (min-width:768px){.carbonad{margin-right:0!important;margin-left:0!important}}@media (min-width:992px){.carbonad{position:absolute;top:0;right:15px;width:330px!important;padding:15px!important;margin:0!important}.bs-docs-masthead .carbonad{position:static}}.bs-docs-featurette{padding-top:40px;padding-bottom:40px;font-size:16px;line-height:1.5;color:#555;text-align:center;background-color:#fff;border-bottom:1px solid #e5e5e5}.bs-docs-featurette+.bs-docs-footer{margin-top:0;border-top:0}.bs-docs-featurette-title{margin-bottom:5px;font-size:30px;font-weight:400;color:#333}.half-rule{width:100px;margin:40px auto}.bs-docs-featurette h3{margin-bottom:5px;font-weight:400;color:#333}.bs-docs-featurette-img{display:block;margin-bottom:20px;color:#333}.bs-docs-featurette-img:hover{color:#337ab7;text-decoration:none}.bs-docs-featurette-img img{display:block;margin-bottom:15px}@media (min-width:480px){.bs-docs-featurette .img-responsive{margin-top:30px}}@media (min-width:768px){.bs-docs-featurette{padding-top:100px;padding-bottom:100px}.bs-docs-featurette-title{font-size:40px}.bs-docs-featurette .lead{max-width:80%;margin-right:auto;margin-left:auto}.bs-docs-featurette .img-responsive{margin-top:0}}.bs-docs-featured-sites{margin-right:-1px;margin-left:-1px}.bs-docs-featured-sites .col-xs-6{padding:1px}.bs-docs-featured-sites .img-responsive{margin-top:0}@media (min-width:768px){.bs-docs-featured-sites .col-sm-3:first-child img{border-top-left-radius:4px;border-bottom-left-radius:4px}.bs-docs-featured-sites .col-sm-3:last-child img{border-top-right-radius:4px;border-bottom-right-radius:4px}}.bs-examples .thumbnail{margin-bottom:10px}.bs-examples h4{margin-bottom:5px}.bs-examples p{margin-bottom:20px}@media (max-width:480px){.bs-examples{margin-right:-10px;margin-left:-10px}.bs-examples>[class^=col-]{padding-right:10px;padding-left:10px}}.bs-docs-sidebar.affix{position:static}@media (min-width:768px){.bs-docs-sidebar{padding-left:20px}}.bs-docs-sidenav{margin-top:20px;margin-bottom:20px}.bs-docs-sidebar .nav>li>a{display:block;padding:4px 20px;font-size:13px;font-weight:500;color:#767676}.bs-docs-sidebar .nav>li>a:focus,.bs-docs-sidebar .nav>li>a:hover{padding-left:19px;color:#563d7c;text-decoration:none;background-color:transparent;border-left:1px solid #563d7c}.bs-docs-sidebar .nav>.active:focus>a,.bs-docs-sidebar .nav>.active:hover>a,.bs-docs-sidebar 
.nav>.active>a{padding-left:18px;font-weight:700;color:#563d7c;background-color:transparent;border-left:2px solid #563d7c}.bs-docs-sidebar .nav .nav{display:none;padding-bottom:10px}.bs-docs-sidebar .nav .nav>li>a{padding-top:1px;padding-bottom:1px;padding-left:30px;font-size:12px;font-weight:400}.bs-docs-sidebar .nav .nav>li>a:focus,.bs-docs-sidebar .nav .nav>li>a:hover{padding-left:29px}.bs-docs-sidebar .nav .nav>.active:focus>a,.bs-docs-sidebar .nav .nav>.active:hover>a,.bs-docs-sidebar .nav .nav>.active>a{padding-left:28px;font-weight:500}.back-to-top,.bs-docs-theme-toggle{display:none;padding:4px 10px;margin-top:10px;margin-left:10px;font-size:12px;font-weight:500;color:#999}.back-to-top:hover,.bs-docs-theme-toggle:hover{color:#563d7c;text-decoration:none}.bs-docs-theme-toggle{margin-top:0}@media (min-width:768px){.back-to-top,.bs-docs-theme-toggle{display:block}}@media (min-width:992px){.bs-docs-sidebar .nav>.active>ul{display:block}.bs-docs-sidebar.affix,.bs-docs-sidebar.affix-bottom{width:213px}.bs-docs-sidebar.affix{position:fixed;top:20px}.bs-docs-sidebar.affix-bottom{position:absolute}.bs-docs-sidebar.affix .bs-docs-sidenav,.bs-docs-sidebar.affix-bottom .bs-docs-sidenav{margin-top:0;margin-bottom:0}}@media (min-width:1200px){.bs-docs-sidebar.affix,.bs-docs-sidebar.affix-bottom{width:263px}}.bs-docs-section{margin-bottom:60px}.bs-docs-section:last-child{margin-bottom:0}h1[id]{padding-top:20px;margin-top:0}.bs-callout{padding:20px;margin:20px 0;border:1px solid #eee;border-left-width:5px;border-radius:3px}.bs-callout h4{margin-top:0;margin-bottom:5px}.bs-callout p:last-child{margin-bottom:0}.bs-callout code{border-radius:3px}.bs-callout+.bs-callout{margin-top:-5px}.bs-callout-danger{border-left-color:#ce4844}.bs-callout-danger h4{color:#ce4844}.bs-callout-warning{border-left-color:#aa6708}.bs-callout-warning h4{color:#aa6708}.bs-callout-info{border-left-color:#1b809e}.bs-callout-info h4{color:#1b809e}.color-swatches{margin:0 -5px;overflow:hidden}.color-swatch{float:left;width:60px;height:60px;margin:0 5px;border-radius:3px}@media (min-width:768px){.color-swatch{width:100px;height:100px}}.color-swatches .gray-darker{background-color:#222}.color-swatches .gray-dark{background-color:#333}.color-swatches .gray{background-color:#555}.color-swatches .gray-light{background-color:#999}.color-swatches .gray-lighter{background-color:#eee}.color-swatches .brand-primary{background-color:#337ab7}.color-swatches .brand-success{background-color:#5cb85c}.color-swatches .brand-warning{background-color:#f0ad4e}.color-swatches .brand-danger{background-color:#d9534f}.color-swatches .brand-info{background-color:#5bc0de}.color-swatches .bs-purple{background-color:#563d7c}.color-swatches .bs-purple-light{background-color:#c7bfd3}.color-swatches .bs-purple-lighter{background-color:#e5e1ea}.color-swatches .bs-gray{background-color:#f9f9f9}.bs-team .team-member{line-height:32px;color:#555}.bs-team .team-member:hover{color:#333;text-decoration:none}.bs-team .github-btn{float:right;width:180px;height:20px;margin-top:6px;border:none}.bs-team img{float:left;width:32px;margin-right:10px;border-radius:4px}.bs-docs-browser-bugs td p{margin-bottom:0}.bs-docs-browser-bugs th:first-child{width:18%}.show-grid{margin-bottom:15px}.show-grid [class^=col-]{padding-top:10px;padding-bottom:10px;background-color:#eee;background-color:rgba(86,61,124,.15);border:1px solid #ddd;border:1px solid rgba(86,61,124,.2)}.bs-example{position:relative;padding:45px 15px 15px;margin:0 -15px 15px;border-color:#e5e5e5 #eee 
#eee;border-style:solid;border-width:1px 0;-webkit-box-shadow:inset 0 3px 6px rgba(0,0,0,.05);box-shadow:inset 0 3px 6px rgba(0,0,0,.05)}.bs-example:after{position:absolute;top:15px;left:15px;font-size:12px;font-weight:700;color:#959595;text-transform:uppercase;letter-spacing:1px;content:"Example"}.bs-example-padded-bottom{padding-bottom:24px}.bs-example+.highlight,.bs-example+.zero-clipboard+.highlight{margin:-15px -15px 15px;border-width:0 0 1px;border-radius:0}@media (min-width:768px){.bs-example{margin-right:0;margin-left:0;background-color:#fff;border-color:#ddd;border-width:1px;border-radius:4px 4px 0 0;-webkit-box-shadow:none;box-shadow:none}.bs-example+.highlight,.bs-example+.zero-clipboard+.highlight{margin-top:-16px;margin-right:0;margin-left:0;border-width:1px;border-bottom-right-radius:4px;border-bottom-left-radius:4px}.bs-example-standalone{border-radius:4px}}.bs-example .container{width:auto}.bs-example>.alert:last-child,.bs-example>.form-control:last-child,.bs-example>.jumbotron:last-child,.bs-example>.list-group:last-child,.bs-example>.navbar:last-child,.bs-example>.panel:last-child,.bs-example>.progress:last-child,.bs-example>.table-responsive:last-child>.table,.bs-example>.table:last-child,.bs-example>.well:last-child,.bs-example>blockquote:last-child,.bs-example>ol:last-child,.bs-example>p:last-child,.bs-example>ul:last-child{margin-bottom:0}.bs-example>p>.close{float:none}.bs-example-type .table .type-info{color:#767676;vertical-align:middle}.bs-example-type .table td{padding:15px 0;border-color:#eee}.bs-example-type .table tr:first-child td{border-top:0}.bs-example-type h1,.bs-example-type h2,.bs-example-type h3,.bs-example-type h4,.bs-example-type h5,.bs-example-type h6{margin:0}.bs-example-bg-classes p{padding:15px}.bs-example>.img-circle,.bs-example>.img-rounded,.bs-example>.img-thumbnail{margin:5px}.bs-example>.table-responsive>.table{background-color:#fff}.bs-example>.btn,.bs-example>.btn-group{margin-top:5px;margin-bottom:5px}.bs-example>.btn-toolbar+.btn-toolbar{margin-top:10px}.bs-example-control-sizing input[type=text]+input[type=text],.bs-example-control-sizing select{margin-top:10px}.bs-example-form .input-group{margin-bottom:10px}.bs-example>textarea.form-control{resize:vertical}.bs-example>.list-group{max-width:400px}.bs-example .navbar:last-child{margin-bottom:0}.bs-navbar-bottom-example,.bs-navbar-top-example{z-index:1;padding:0;overflow:hidden}.bs-navbar-bottom-example .navbar-header,.bs-navbar-top-example .navbar-header{margin-left:0}.bs-navbar-bottom-example .navbar-fixed-bottom,.bs-navbar-top-example .navbar-fixed-top{position:relative;margin-right:0;margin-left:0}.bs-navbar-top-example{padding-bottom:45px}.bs-navbar-top-example:after{top:auto;bottom:15px}.bs-navbar-top-example .navbar-fixed-top{top:-1px}.bs-navbar-bottom-example{padding-top:45px}.bs-navbar-bottom-example .navbar-fixed-bottom{bottom:-1px}.bs-navbar-bottom-example .navbar{margin-bottom:0}@media (min-width:768px){.bs-navbar-bottom-example .navbar-fixed-bottom,.bs-navbar-top-example .navbar-fixed-top{position:absolute}}.bs-example .pagination{margin-top:10px;margin-bottom:10px}.bs-example>.pager{margin-top:0}.bs-example-modal{background-color:#f5f5f5}.bs-example-modal .modal{position:relative;top:auto;right:auto;bottom:auto;left:auto;z-index:1;display:block}.bs-example-modal 
.modal-dialog{left:auto;margin-right:auto;margin-left:auto}.bs-example>.dropdown>.dropdown-toggle{float:left}.bs-example>.dropdown>.dropdown-menu{position:static;display:block;margin-bottom:5px;clear:left}.bs-example-tabs .nav-tabs{margin-bottom:15px}.bs-example-tooltips{text-align:center}.bs-example-tooltips>.btn{margin-top:5px;margin-bottom:5px}.bs-example-tooltip .tooltip{position:relative;display:inline-block;margin:10px 20px;opacity:1}.bs-example-popover{padding-bottom:24px;background-color:#f9f9f9}.bs-example-popover .popover{position:relative;display:block;float:left;width:260px;margin:20px}.scrollspy-example{position:relative;height:200px;margin-top:10px;overflow:auto}.bs-example>.nav-pills-stacked-example{max-width:300px}#collapseExample .well{margin-bottom:0}.bs-events-table>tbody>tr>td:first-child,.bs-events-table>thead>tr>th:first-child{white-space:nowrap}.bs-events-table>thead>tr>th:first-child{width:150px}.js-options-table>thead>tr>th:nth-child(1),.js-options-table>thead>tr>th:nth-child(2){width:100px}.js-options-table>thead>tr>th:nth-child(3){width:50px}.highlight{padding:9px 14px;margin-bottom:14px;background-color:#f7f7f9;border:1px solid #e1e1e8;border-radius:4px}.highlight pre{padding:0;margin-top:0;margin-bottom:0;word-break:normal;white-space:nowrap;background-color:transparent;border:0}.highlight pre code{font-size:inherit;color:#333}.highlight pre code:first-child{display:inline-block;padding-right:45px}.table-responsive .highlight pre{white-space:normal}.bs-table th small,.responsive-utilities th small{display:block;font-weight:400;color:#999}.responsive-utilities tbody th{font-weight:400}.responsive-utilities td{text-align:center}.responsive-utilities td.is-visible{color:#468847;background-color:#dff0d8!important}.responsive-utilities td.is-hidden{color:#ccc;background-color:#f9f9f9!important}.responsive-utilities-test{margin-top:5px}.responsive-utilities-test .col-xs-6{margin-bottom:10px}.responsive-utilities-test span{display:block;padding:15px 10px;font-size:14px;font-weight:700;line-height:1.1;text-align:center;border-radius:4px}.hidden-on .col-xs-6 .hidden-lg,.hidden-on .col-xs-6 .hidden-md,.hidden-on .col-xs-6 .hidden-sm,.hidden-on .col-xs-6 .hidden-xs,.visible-on .col-xs-6 .hidden-lg,.visible-on .col-xs-6 .hidden-md,.visible-on .col-xs-6 .hidden-sm,.visible-on .col-xs-6 .hidden-xs{color:#999;border:1px solid #ddd}.hidden-on .col-xs-6 .visible-lg-block,.hidden-on .col-xs-6 .visible-md-block,.hidden-on .col-xs-6 .visible-sm-block,.hidden-on .col-xs-6 .visible-xs-block,.visible-on .col-xs-6 .visible-lg-block,.visible-on .col-xs-6 .visible-md-block,.visible-on .col-xs-6 .visible-sm-block,.visible-on .col-xs-6 .visible-xs-block{color:#468847;background-color:#dff0d8;border:1px solid #d6e9c6}.bs-glyphicons{margin:0 -10px 20px;overflow:hidden}.bs-glyphicons-list{padding-left:0;list-style:none}.bs-glyphicons li{float:left;width:25%;height:115px;padding:10px;font-size:10px;line-height:1.4;text-align:center;background-color:#f9f9f9;border:1px solid #fff}.bs-glyphicons .glyphicon{margin-top:5px;margin-bottom:10px;font-size:24px}.bs-glyphicons .glyphicon-class{display:block;text-align:center;word-wrap:break-word}.bs-glyphicons li:hover{color:#fff;background-color:#563d7c}@media (min-width:768px){.bs-glyphicons{margin-right:0;margin-left:0}.bs-glyphicons li{width:12.5%;font-size:12px}}.bs-customizer .toggle{float:right;margin-top:25px}.bs-customizer label{margin-top:10px;font-weight:500;color:#555}.bs-customizer 
h2{padding-top:30px;margin-top:0;margin-bottom:5px}.bs-customizer h3{margin-bottom:0}.bs-customizer h4{margin-top:15px;margin-bottom:0}.bs-customizer .bs-callout h4{margin-top:0;margin-bottom:5px}.bs-customizer input[type=text]{font-family:Menlo,Monaco,Consolas,"Courier New",monospace;background-color:#fafafa}.bs-customizer .help-block{margin-bottom:5px;font-size:12px}#less-section label{font-weight:400}.bs-customize-download .btn-outline{padding:20px}.bs-customizer-alert{position:fixed;top:0;right:0;left:0;z-index:1030;padding:15px 0;color:#fff;background-color:#d9534f;border-bottom:1px solid #b94441;-webkit-box-shadow:inset 0 1px 0 rgba(255,255,255,.25);box-shadow:inset 0 1px 0 rgba(255,255,255,.25)}.bs-customizer-alert .close{margin-top:-4px;font-size:24px}.bs-customizer-alert p{margin-bottom:0}.bs-customizer-alert .glyphicon{margin-right:5px}.bs-customizer-alert pre{margin:10px 0 0;color:#fff;background-color:#a83c3a;border-color:#973634;-webkit-box-shadow:inset 0 2px 4px rgba(0,0,0,.05),0 1px 0 rgba(255,255,255,.1);box-shadow:inset 0 2px 4px rgba(0,0,0,.05),0 1px 0 rgba(255,255,255,.1)}.bs-dropzone{position:relative;padding:20px;margin-bottom:20px;color:#777;text-align:center;border:2px dashed #eee;border-radius:4px}.bs-dropzone .import-header{margin-bottom:5px}.bs-dropzone .glyphicon-download-alt{font-size:40px}.bs-dropzone hr{width:100px}.bs-dropzone .lead{margin-bottom:10px;font-weight:400;color:#333}#import-manual-trigger{cursor:pointer}.bs-dropzone p:last-child{margin-bottom:0}.bs-brand-logos{display:table;width:100%;margin-bottom:15px;overflow:hidden;color:#563d7c;background-color:#f9f9f9;border-radius:4px}.bs-brand-item{padding:60px 0;text-align:center}.bs-brand-item+.bs-brand-item{border-top:1px solid #fff}.bs-brand-logos .inverse{color:#fff;background-color:#563d7c}.bs-brand-item h1,.bs-brand-item h3{margin-top:0;margin-bottom:0}.bs-brand-item .bs-docs-booticon{margin-right:auto;margin-left:auto}.bs-brand-item .glyphicon{width:30px;height:30px;margin:10px auto -10px;line-height:30px;color:#fff;border-radius:50%}.bs-brand-item .glyphicon-ok{background-color:#5cb85c}.bs-brand-item .glyphicon-remove{background-color:#d9534f}@media (min-width:768px){.bs-brand-item{display:table-cell;width:1%}.bs-brand-item+.bs-brand-item{border-top:0;border-left:1px solid #fff}.bs-brand-item h1{font-size:60px}}.zero-clipboard{position:relative;display:none}.btn-clipboard{position:absolute;top:0;right:0;z-index:10;display:block;padding:5px 8px;font-size:12px;color:#767676;cursor:pointer;background-color:#fff;border:1px solid #e1e1e8;border-radius:0 4px 0 4px}.btn-clipboard-hover{color:#fff;background-color:#563d7c;border-color:#563d7c}@media (min-width:768px){.zero-clipboard{display:block}.bs-example+.zero-clipboard .btn-clipboard{top:-16px;border-top-right-radius:0}}.anchorjs-link{color:inherit}@media (max-width:480px){.anchorjs-link{display:none}}:hover>.anchorjs-link{opacity:.75;-webkit-transition:color .16s linear;-o-transition:color .16s linear;transition:color .16s linear}.anchorjs-link:focus,:hover>.anchorjs-link:hover{text-decoration:none;opacity:1}#focusedInput{border-color:#ccc;border-color:rgba(82,168,236,.8);outline:0;outline:thin dotted\9;-webkit-box-shadow:0 0 8px rgba(82,168,236,.6);box-shadow:0 0 8px rgba(82,168,236,.6)}.v4-tease{display:block;padding:15px 20px;font-weight:700;color:#fff;text-align:center;background-color:#0275d8}.v4-tease:hover{color:#fff;text-decoration:none;background-color:#0269c2}@media print{a[href]:after{content:""!important}} +/*# 
sourceMappingURL=docs.min.css.map */ \ No newline at end of file diff --git a/docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.eot b/docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.eot new file mode 100644 index 0000000000000000000000000000000000000000..b93a4953fff68df523aa7656497ee339d6026d64 GIT binary patch literal 20127 zcma%hV{j!vx9y2-`@~L8?1^pLwlPU2wr$&<*tR|KBoo`2;LUg6eW-eW-tKDb)vH%` z^`A!Vd<6hNSRMcX|Cb;E|1qflDggj6Kmr)xA10^t-vIc3*Z+F{r%|K(GyE^?|I{=9 zNq`(c8=wS`0!RZy0g3{M(8^tv41d}oRU?8#IBFtJy*9zAN5dcxqGlMZGL>GG%R#)4J zDJ2;)4*E1pyHia%>lMv3X7Q`UoFyoB@|xvh^)kOE3)IL&0(G&i;g08s>c%~pHkN&6 z($7!kyv|A2DsV2mq-5Ku)D#$Kn$CzqD-wm5Q*OtEOEZe^&T$xIb0NUL}$)W)Ck`6oter6KcQG9Zcy>lXip)%e&!lQgtQ*N`#abOlytt!&i3fo)cKV zP0BWmLxS1gQv(r_r|?9>rR0ZeEJPx;Vi|h1!Eo*dohr&^lJgqJZns>&vexP@fs zkPv93Nyw$-kM5Mw^{@wPU47Y1dSkiHyl3dtHLwV&6Tm1iv{ve;sYA}Z&kmH802s9Z zyJEn+cfl7yFu#1^#DbtP7k&aR06|n{LnYFYEphKd@dJEq@)s#S)UA&8VJY@S2+{~> z(4?M();zvayyd^j`@4>xCqH|Au>Sfzb$mEOcD7e4z8pPVRTiMUWiw;|gXHw7LS#U< zsT(}Z5SJ)CRMXloh$qPnK77w_)ctHmgh}QAe<2S{DU^`!uwptCoq!Owz$u6bF)vnb zL`bM$%>baN7l#)vtS3y6h*2?xCk z>w+s)@`O4(4_I{L-!+b%)NZcQ&ND=2lyP+xI#9OzsiY8$c)ys-MI?TG6 zEP6f=vuLo!G>J7F4v|s#lJ+7A`^nEQScH3e?B_jC&{sj>m zYD?!1z4nDG_Afi$!J(<{>z{~Q)$SaXWjj~%ZvF152Hd^VoG14rFykR=_TO)mCn&K$ z-TfZ!vMBvnToyBoKRkD{3=&=qD|L!vb#jf1f}2338z)e)g>7#NPe!FoaY*jY{f)Bf>ohk-K z4{>fVS}ZCicCqgLuYR_fYx2;*-4k>kffuywghn?15s1dIOOYfl+XLf5w?wtU2Og*f z%X5x`H55F6g1>m~%F`655-W1wFJtY>>qNSdVT`M`1Mlh!5Q6#3j={n5#za;!X&^OJ zgq;d4UJV-F>gg?c3Y?d=kvn3eV)Jb^ zO5vg0G0yN0%}xy#(6oTDSVw8l=_*2k;zTP?+N=*18H5wp`s90K-C67q{W3d8vQGmr zhpW^>1HEQV2TG#8_P_0q91h8QgHT~8=-Ij5snJ3cj?Jn5_66uV=*pq(j}yHnf$Ft;5VVC?bz%9X31asJeQF2jEa47H#j` zk&uxf3t?g!tltVP|B#G_UfDD}`<#B#iY^i>oDd-LGF}A@Fno~dR72c&hs6bR z2F}9(i8+PR%R|~FV$;Ke^Q_E_Bc;$)xN4Ti>Lgg4vaip!%M z06oxAF_*)LH57w|gCW3SwoEHwjO{}}U=pKhjKSZ{u!K?1zm1q? 
zXyA6y@)}_sONiJopF}_}(~}d4FDyp|(@w}Vb;Fl5bZL%{1`}gdw#i{KMjp2@Fb9pg ziO|u7qP{$kxH$qh8%L+)AvwZNgUT6^zsZq-MRyZid{D?t`f|KzSAD~C?WT3d0rO`0 z=qQ6{)&UXXuHY{9g|P7l_nd-%eh}4%VVaK#Nik*tOu9lBM$<%FS@`NwGEbP0&;Xbo zObCq=y%a`jSJmx_uTLa{@2@}^&F4c%z6oe-TN&idjv+8E|$FHOvBqg5hT zMB=7SHq`_-E?5g=()*!V>rIa&LcX(RU}aLm*38U_V$C_g4)7GrW5$GnvTwJZdBmy6 z*X)wi3=R8L=esOhY0a&eH`^fSpUHV8h$J1|o^3fKO|9QzaiKu>yZ9wmRkW?HTkc<*v7i*ylJ#u#j zD1-n&{B`04oG>0Jn{5PKP*4Qsz{~`VVA3578gA+JUkiPc$Iq!^K|}*p_z3(-c&5z@ zKxmdNpp2&wg&%xL3xZNzG-5Xt7jnI@{?c z25=M>-VF|;an2Os$Nn%HgQz7m(ujC}Ii0Oesa(y#8>D+P*_m^X##E|h$M6tJr%#=P zWP*)Px>7z`E~U^2LNCNiy%Z7!!6RI%6fF@#ZY3z`CK91}^J$F!EB0YF1je9hJKU7!S5MnXV{+#K;y zF~s*H%p@vj&-ru7#(F2L+_;IH46X(z{~HTfcThqD%b{>~u@lSc<+f5#xgt9L7$gSK ziDJ6D*R%4&YeUB@yu@4+&70MBNTnjRyqMRd+@&lU#rV%0t3OmouhC`mkN}pL>tXin zY*p)mt=}$EGT2E<4Q>E2`6)gZ`QJhGDNpI}bZL9}m+R>q?l`OzFjW?)Y)P`fUH(_4 zCb?sm1=DD0+Q5v}BW#0n5;Nm(@RTEa3(Y17H2H67La+>ptQHJ@WMy2xRQT$|7l`8c zYHCxYw2o-rI?(fR2-%}pbs$I%w_&LPYE{4bo}vRoAW>3!SY_zH3`ofx3F1PsQ?&iq z*BRG>?<6%z=x#`NhlEq{K~&rU7Kc7Y-90aRnoj~rVoKae)L$3^z*Utppk?I`)CX&& zZ^@Go9fm&fN`b`XY zt0xE5aw4t@qTg_k=!-5LXU+_~DlW?53!afv6W(k@FPPX-`nA!FBMp7b!ODbL1zh58 z*69I}P_-?qSLKj}JW7gP!la}K@M}L>v?rDD!DY-tu+onu9kLoJz20M4urX_xf2dfZ zORd9Zp&28_ff=wdMpXi%IiTTNegC}~RLkdYjA39kWqlA?jO~o1`*B&85Hd%VPkYZT z48MPe62;TOq#c%H(`wX5(Bu>nlh4Fbd*Npasdhh?oRy8a;NB2(eb}6DgwXtx=n}fE zx67rYw=(s0r?EsPjaya}^Qc-_UT5|*@|$Q}*|>V3O~USkIe6a0_>vd~6kHuP8=m}_ zo2IGKbv;yA+TBtlCpnw)8hDn&eq?26gN$Bh;SdxaS04Fsaih_Cfb98s39xbv)=mS0 z6M<@pM2#pe32w*lYSWG>DYqB95XhgAA)*9dOxHr{t)er0Xugoy)!Vz#2C3FaUMzYl zCxy{igFB901*R2*F4>grPF}+G`;Yh zGi@nRjWyG3mR(BVOeBPOF=_&}2IWT%)pqdNAcL{eP`L*^FDv#Rzql5U&Suq_X%JfR_lC!S|y|xd5mQ0{0!G#9hV46S~A` z0B!{yI-4FZEtol5)mNWXcX(`x&Pc*&gh4k{w%0S#EI>rqqlH2xv7mR=9XNCI$V#NG z4wb-@u{PfQP;tTbzK>(DF(~bKp3;L1-A*HS!VB)Ae>Acnvde15Anb`h;I&0)aZBS6 z55ZS7mL5Wp!LCt45^{2_70YiI_Py=X{I3>$Px5Ez0ahLQ+ z9EWUWSyzA|+g-Axp*Lx-M{!ReQO07EG7r4^)K(xbj@%ZU=0tBC5shl)1a!ifM5OkF z0w2xQ-<+r-h1fi7B6waX15|*GGqfva)S)dVcgea`lQ~SQ$KXPR+(3Tn2I2R<0 z9tK`L*pa^+*n%>tZPiqt{_`%v?Bb7CR-!GhMON_Fbs0$#|H}G?rW|{q5fQhvw!FxI zs-5ZK>hAbnCS#ZQVi5K0X3PjL1JRdQO+&)*!oRCqB{wen60P6!7bGiWn@vD|+E@Xq zb!!_WiU^I|@1M}Hz6fN-m04x=>Exm{b@>UCW|c8vC`aNbtA@KCHujh^2RWZC}iYhL^<*Z93chIBJYU&w>$CGZDRcHuIgF&oyesDZ#&mA;?wxx4Cm#c0V$xYG?9OL(Smh}#fFuX(K;otJmvRP{h ze^f-qv;)HKC7geB92_@3a9@MGijS(hNNVd%-rZ;%@F_f7?Fjinbe1( zn#jQ*jKZTqE+AUTEd3y6t>*=;AO##cmdwU4gc2&rT8l`rtKW2JF<`_M#p>cj+)yCG zgKF)y8jrfxTjGO&ccm8RU>qn|HxQ7Z#sUo$q)P5H%8iBF$({0Ya51-rA@!It#NHN8MxqK zrYyl_&=}WVfQ?+ykV4*@F6)=u_~3BebR2G2>>mKaEBPmSW3(qYGGXj??m3L zHec{@jWCsSD8`xUy0pqT?Sw0oD?AUK*WxZn#D>-$`eI+IT)6ki>ic}W)t$V32^ITD zR497@LO}S|re%A+#vdv-?fXsQGVnP?QB_d0cGE+U84Q=aM=XrOwGFN3`Lpl@P0fL$ zKN1PqOwojH*($uaQFh8_)H#>Acl&UBSZ>!2W1Dinei`R4dJGX$;~60X=|SG6#jci} z&t4*dVDR*;+6Y(G{KGj1B2!qjvDYOyPC}%hnPbJ@g(4yBJrViG1#$$X75y+Ul1{%x zBAuD}Q@w?MFNqF-m39FGpq7RGI?%Bvyyig&oGv)lR>d<`Bqh=p>urib5DE;u$c|$J zwim~nPb19t?LJZsm{<(Iyyt@~H!a4yywmHKW&=1r5+oj*Fx6c89heW@(2R`i!Uiy* zp)=`Vr8sR!)KChE-6SEIyi(dvG3<1KoVt>kGV=zZiG7LGonH1+~yOK-`g0)r#+O|Q>)a`I2FVW%wr3lhO(P{ksNQuR!G_d zeTx(M!%brW_vS9?IF>bzZ2A3mWX-MEaOk^V|4d38{1D|KOlZSjBKrj7Fgf^>JyL0k zLoI$adZJ0T+8i_Idsuj}C;6jgx9LY#Ukh;!8eJ^B1N}q=Gn4onF*a2vY7~`x$r@rJ z`*hi&Z2lazgu{&nz>gjd>#eq*IFlXed(%$s5!HRXKNm zDZld+DwDI`O6hyn2uJ)F^{^;ESf9sjJ)wMSKD~R=DqPBHyP!?cGAvL<1|7K-(=?VO zGcKcF1spUa+ki<`6K#@QxOTsd847N8WSWztG~?~ z!gUJn>z0O=_)VCE|56hkT~n5xXTp}Ucx$Ii%bQ{5;-a4~I2e|{l9ur#*ghd*hSqO= z)GD@ev^w&5%k}YYB~!A%3*XbPPU-N6&3Lp1LxyP@|C<{qcn&?l54+zyMk&I3YDT|E z{lXH-e?C{huu<@~li+73lMOk&k)3s7Asn$t6!PtXJV!RkA`qdo4|OC_a?vR!kE_}k 
zK5R9KB%V@R7gt@9=TGL{=#r2gl!@3G;k-6sXp&E4u20DgvbY$iE**Xqj3TyxK>3AU z!b9}NXuINqt>Htt6fXIy5mj7oZ{A&$XJ&thR5ySE{mkxq_YooME#VCHm2+3D!f`{) zvR^WSjy_h4v^|!RJV-RaIT2Ctv=)UMMn@fAgjQV$2G+4?&dGA8vK35c-8r)z9Qqa=%k(FU)?iec14<^olkOU3p zF-6`zHiDKPafKK^USUU+D01>C&Wh{{q?>5m zGQp|z*+#>IIo=|ae8CtrN@@t~uLFOeT{}vX(IY*;>wAU=u1Qo4c+a&R);$^VCr>;! zv4L{`lHgc9$BeM)pQ#XA_(Q#=_iSZL4>L~8Hx}NmOC$&*Q*bq|9Aq}rWgFnMDl~d*;7c44GipcpH9PWaBy-G$*MI^F0 z?Tdxir1D<2ui+Q#^c4?uKvq=p>)lq56=Eb|N^qz~w7rsZu)@E4$;~snz+wIxi+980O6M#RmtgLYh@|2}9BiHSpTs zacjGKvwkUwR3lwTSsCHlwb&*(onU;)$yvdhikonn|B44JMgs*&Lo!jn`6AE>XvBiO z*LKNX3FVz9yLcsnmL!cRVO_qv=yIM#X|u&}#f%_?Tj0>8)8P_0r0!AjWNw;S44tst zv+NXY1{zRLf9OYMr6H-z?4CF$Y%MdbpFIN@a-LEnmkcOF>h16cH_;A|e)pJTuCJ4O zY7!4FxT4>4aFT8a92}84>q0&?46h>&0Vv0p>u~k&qd5$C1A6Q$I4V(5X~6{15;PD@ ze6!s9xh#^QI`J+%8*=^(-!P!@9%~buBmN2VSAp@TOo6}C?az+ALP8~&a0FWZk*F5N z^8P8IREnN`N0i@>O0?{i-FoFShYbUB`D7O4HB`Im2{yzXmyrg$k>cY6A@>bf7i3n0 z5y&cf2#`zctT>dz+hNF&+d3g;2)U!#vsb-%LC+pqKRTiiSn#FH#e!bVwR1nAf*TG^ z!RKcCy$P>?Sfq6n<%M{T0I8?p@HlgwC!HoWO>~mT+X<{Ylm+$Vtj9};H3$EB}P2wR$3y!TO#$iY8eO-!}+F&jMu4%E6S>m zB(N4w9O@2=<`WNJay5PwP8javDp~o~xkSbd4t4t8)9jqu@bHmJHq=MV~Pt|(TghCA}fhMS?s-{klV>~=VrT$nsp7mf{?cze~KKOD4 z_1Y!F)*7^W+BBTt1R2h4f1X4Oy2%?=IMhZU8c{qk3xI1=!na*Sg<=A$?K=Y=GUR9@ zQ(ylIm4Lgm>pt#%p`zHxok%vx_=8Fap1|?OM02|N%X-g5_#S~sT@A!x&8k#wVI2lo z1Uyj{tDQRpb*>c}mjU^gYA9{7mNhFAlM=wZkXcA#MHXWMEs^3>p9X)Oa?dx7b%N*y zLz@K^%1JaArjgri;8ptNHwz1<0y8tcURSbHsm=26^@CYJ3hwMaEvC7 z3Wi-@AaXIQ)%F6#i@%M>?Mw7$6(kW@?et@wbk-APcvMCC{>iew#vkZej8%9h0JSc? zCb~K|!9cBU+))^q*co(E^9jRl7gR4Jihyqa(Z(P&ID#TPyysVNL7(^;?Gan!OU>au zN}miBc&XX-M$mSv%3xs)bh>Jq9#aD_l|zO?I+p4_5qI0Ms*OZyyxA`sXcyiy>-{YN zA70%HmibZYcHW&YOHk6S&PQ+$rJ3(utuUra3V0~@=_~QZy&nc~)AS>v&<6$gErZC3 zcbC=eVkV4Vu0#}E*r=&{X)Kgq|8MGCh(wsH4geLj@#8EGYa})K2;n z{1~=ghoz=9TSCxgzr5x3@sQZZ0FZ+t{?klSI_IZa16pSx6*;=O%n!uXVZ@1IL;JEV zfOS&yyfE9dtS*^jmgt6>jQDOIJM5Gx#Y2eAcC3l^lmoJ{o0T>IHpECTbfYgPI4#LZq0PKqnPCD}_ zyKxz;(`fE0z~nA1s?d{X2!#ZP8wUHzFSOoTWQrk%;wCnBV_3D%3@EC|u$Ao)tO|AO z$4&aa!wbf}rbNcP{6=ajgg(`p5kTeu$ji20`zw)X1SH*x zN?T36{d9TY*S896Ijc^!35LLUByY4QO=ARCQ#MMCjudFc7s!z%P$6DESz%zZ#>H|i zw3Mc@v4~{Eke;FWs`5i@ifeYPh-Sb#vCa#qJPL|&quSKF%sp8*n#t?vIE7kFWjNFh zJC@u^bRQ^?ra|%39Ux^Dn4I}QICyDKF0mpe+Bk}!lFlqS^WpYm&xwIYxUoS-rJ)N9 z1Tz*6Rl9;x`4lwS1cgW^H_M*)Dt*DX*W?ArBf?-t|1~ge&S}xM0K;U9Ibf{okZHf~ z#4v4qc6s6Zgm8iKch5VMbQc~_V-ZviirnKCi*ouN^c_2lo&-M;YSA>W>>^5tlXObg zacX$k0=9Tf$Eg+#9k6yV(R5-&F{=DHP8!yvSQ`Y~XRnUx@{O$-bGCksk~3&qH^dqX zkf+ZZ?Nv5u>LBM@2?k%k&_aUb5Xjqf#!&7%zN#VZwmv65ezo^Y4S#(ed0yUn4tFOB zh1f1SJ6_s?a{)u6VdwUC!Hv=8`%T9(^c`2hc9nt$(q{Dm2X)dK49ba+KEheQ;7^0) ziFKw$%EHy_B1)M>=yK^=Z$U-LT36yX>EKT zvD8IAom2&2?bTmX@_PBR4W|p?6?LQ+&UMzXxqHC5VHzf@Eb1u)kwyfy+NOM8Wa2y@ zNNDL0PE$F;yFyf^jy&RGwDXQwYw6yz>OMWvJt98X@;yr!*RQDBE- zE*l*u=($Zi1}0-Y4lGaK?J$yQjgb+*ljUvNQ!;QYAoCq@>70=sJ{o{^21^?zT@r~hhf&O;Qiq+ ziGQQLG*D@5;LZ%09mwMiE4Q{IPUx-emo*;a6#DrmWr(zY27d@ezre)Z1BGZdo&pXn z+);gOFelKDmnjq#8dL7CTiVH)dHOqWi~uE|NM^QI3EqxE6+_n>IW67~UB#J==QOGF zp_S)c8TJ}uiaEiaER}MyB(grNn=2m&0yztA=!%3xUREyuG_jmadN*D&1nxvjZ6^+2 zORi7iX1iPi$tKasppaR9$a3IUmrrX)m*)fg1>H+$KpqeB*G>AQV((-G{}h=qItj|d zz~{5@{?&Dab6;0c7!!%Se>w($RmlG7Jlv_zV3Ru8b2rugY0MVPOOYGlokI7%nhIy& z-B&wE=lh2dtD!F?noD{z^O1~Tq4MhxvchzuT_oF3-t4YyA*MJ*n&+1X3~6quEN z@m~aEp=b2~mP+}TUP^FmkRS_PDMA{B zaSy(P=$T~R!yc^Ye0*pl5xcpm_JWI;@-di+nruhqZ4gy7cq-)I&s&Bt3BkgT(Zdjf zTvvv0)8xzntEtp4iXm}~cT+pi5k{w{(Z@l2XU9lHr4Vy~3ycA_T?V(QS{qwt?v|}k z_ST!s;C4!jyV5)^6xC#v!o*uS%a-jQ6< z)>o?z7=+zNNtIz1*F_HJ(w@=`E+T|9TqhC(g7kKDc8z~?RbKQ)LRMn7A1p*PcX2YR zUAr{);~c7I#3Ssv<0i-Woj0&Z4a!u|@Xt2J1>N-|ED<3$o2V?OwL4oQ%$@!zLamVz 
zB)K&Ik^~GOmDAa143{I4?XUk1<3-k{<%?&OID&>Ud%z*Rkt*)mko0RwC2=qFf-^OV z=d@47?tY=A;=2VAh0mF(3x;!#X!%{|vn;U2XW{(nu5b&8kOr)Kop3-5_xnK5oO_3y z!EaIb{r%D{7zwtGgFVri4_!yUIGwR(xEV3YWSI_+E}Gdl>TINWsIrfj+7DE?xp+5^ zlr3pM-Cbse*WGKOd3+*Qen^*uHk)+EpH-{u@i%y}Z!YSid<}~kA*IRSk|nf+I1N=2 zIKi+&ej%Al-M5`cP^XU>9A(m7G>58>o|}j0ZWbMg&x`*$B9j#Rnyo0#=BMLdo%=ks zLa3(2EinQLXQ(3zDe7Bce%Oszu%?8PO648TNst4SMFvj=+{b%)ELyB!0`B?9R6aO{i-63|s@|raSQGL~s)9R#J#duFaTSZ2M{X z1?YuM*a!!|jP^QJ(hAisJuPOM`8Y-Hzl~%d@latwj}t&0{DNNC+zJARnuQfiN`HQ# z?boY_2?*q;Qk)LUB)s8(Lz5elaW56p&fDH*AWAq7Zrbeq1!?FBGYHCnFgRu5y1jwD zc|yBz+UW|X`zDsc{W~8m$sh@VVnZD$lLnKlq@Hg^;ky!}ZuPdKNi2BI70;hrpvaA4+Q_+K)I@|)q1N-H zrycZU`*YUW``Qi^`bDX-j7j^&bO+-Xg$cz2#i##($uyW{Nl&{DK{=lLWV3|=<&si||2)l=8^8_z+Vho-#5LB0EqQ3v5U#*DF7 zxT)1j^`m+lW}p$>WSIG1eZ>L|YR-@Feu!YNWiw*IZYh03mq+2QVtQ}1ezRJM?0PA< z;mK(J5@N8>u@<6Y$QAHWNE};rR|)U_&bv8dsnsza7{=zD1VBcxrALqnOf-qW(zzTn zTAp|pEo#FsQ$~*$j|~Q;$Zy&Liu9OM;VF@#_&*nL!N2hH!Q6l*OeTxq!l>dEc{;Hw zCQni{iN%jHU*C;?M-VUaXxf0FEJ_G=C8)C-wD!DvhY+qQ#FT3}Th8;GgV&AV94F`D ztT6=w_Xm8)*)dBnDkZd~UWL|W=Glu!$hc|1w7_7l!3MAt95oIp4Xp{M%clu&TXehO z+L-1#{mjkpTF@?|w1P98OCky~S%@OR&o75P&ZHvC}Y=(2_{ib(-Al_7aZ^U?s34#H}= zGfFi5%KnFVCKtdO^>Htpb07#BeCXMDO8U}crpe1Gm`>Q=6qB4i=nLoLZ%p$TY=OcP z)r}Et-Ed??u~f09d3Nx3bS@ja!fV(Dfa5lXxRs#;8?Y8G+Qvz+iv7fiRkL3liip}) z&G0u8RdEC9c$$rdU53=MH`p!Jn|DHjhOxHK$tW_pw9wCTf0Eo<){HoN=zG!!Gq4z4 z7PwGh)VNPXW-cE#MtofE`-$9~nmmj}m zlzZscQ2+Jq%gaB9rMgVJkbhup0Ggpb)&L01T=%>n7-?v@I8!Q(p&+!fd+Y^Pu9l+u zek(_$^HYFVRRIFt@0Fp52g5Q#I`tC3li`;UtDLP*rA{-#Yoa5qp{cD)QYhldihWe+ zG~zuaqLY~$-1sjh2lkbXCX;lq+p~!2Z=76cvuQe*Fl>IFwpUBP+d^&E4BGc{m#l%Kuo6#{XGoRyFc%Hqhf|%nYd<;yiC>tyEyk z4I+a`(%%Ie=-*n z-{mg=j&t12)LH3R?@-B1tEb7FLMePI1HK0`Ae@#)KcS%!Qt9p4_fmBl5zhO10n401 zBSfnfJ;?_r{%R)hh}BBNSl=$BiAKbuWrNGQUZ)+0=Mt&5!X*D@yGCSaMNY&@`;^a4 z;v=%D_!K!WXV1!3%4P-M*s%V2b#2jF2bk!)#2GLVuGKd#vNpRMyg`kstw0GQ8@^k^ zuqK5uR<>FeRZ#3{%!|4X!hh7hgirQ@Mwg%%ez8pF!N$xhMNQN((yS(F2-OfduxxKE zxY#7O(VGfNuLv-ImAw5+h@gwn%!ER;*Q+001;W7W^waWT%@(T+5k!c3A-j)a8y11t zx4~rSN0s$M8HEOzkcWW4YbKK9GQez2XJ|Nq?TFy;jmGbg;`m&%U4hIiarKmdTHt#l zL=H;ZHE?fYxKQQXKnC+K!TAU}r086{4m}r()-QaFmU(qWhJlc$eas&y?=H9EYQy8N$8^bni9TpDp zkA^WRs?KgYgjxX4T6?`SMs$`s3vlut(YU~f2F+id(Rf_)$BIMibk9lACI~LA+i7xn z%-+=DHV*0TCTJp~-|$VZ@g2vmd*|2QXV;HeTzt530KyK>v&253N1l}bP_J#UjLy4) zBJili9#-ey8Kj(dxmW^ctorxd;te|xo)%46l%5qE-YhAjP`Cc03vT)vV&GAV%#Cgb zX~2}uWNvh`2<*AuxuJpq>SyNtZwzuU)r@@dqC@v=Ocd(HnnzytN+M&|Qi#f4Q8D=h ziE<3ziFW%+!yy(q{il8H44g^5{_+pH60Mx5Z*FgC_3hKxmeJ+wVuX?T#ZfOOD3E4C zRJsj#wA@3uvwZwHKKGN{{Ag+8^cs?S4N@6(Wkd$CkoCst(Z&hp+l=ffZ?2m%%ffI3 zdV7coR`R+*dPbNx=*ivWeNJK=Iy_vKd`-_Hng{l?hmp=|T3U&epbmgXXWs9ySE|=G zeQ|^ioL}tveN{s72_&h+F+W;G}?;?_s@h5>DX(rp#eaZ!E=NivgLI zWykLKev+}sHH41NCRm7W>K+_qdoJ8x9o5Cf!)|qLtF7Izxk*p|fX8UqEY)_sI_45O zL2u>x=r5xLE%s|d%MO>zU%KV6QKFiEeo12g#bhei4!Hm+`~Fo~4h|BJ)%ENxy9)Up zOxupSf1QZWun=)gF{L0YWJ<(r0?$bPFANrmphJ>kG`&7E+RgrWQi}ZS#-CQJ*i#8j zM_A0?w@4Mq@xvk^>QSvEU|VYQoVI=TaOrsLTa`RZfe8{9F~mM{L+C`9YP9?OknLw| zmkvz>cS6`pF0FYeLdY%>u&XpPj5$*iYkj=m7wMzHqzZ5SG~$i_^f@QEPEC+<2nf-{ zE7W+n%)q$!5@2pBuXMxhUSi*%F>e_g!$T-_`ovjBh(3jK9Q^~OR{)}!0}vdTE^M+m z9QWsA?xG>EW;U~5gEuKR)Ubfi&YWnXV;3H6Zt^NE725*`;lpSK4HS1sN?{~9a4JkD z%}23oAovytUKfRN87XTH2c=kq1)O5(fH_M3M-o{{@&~KD`~TRot-gqg7Q2U2o-iiF}K>m?CokhmODaLB z1p6(6JYGntNOg(s!(>ZU&lzDf+Ur)^Lirm%*}Z>T)9)fAZ9>k(kvnM;ab$ptA=hoh zVgsVaveXbMpm{|4*d<0>?l_JUFOO8A3xNLQOh%nVXjYI6X8h?a@6kDe5-m&;M0xqx z+1U$s>(P9P)f0!{z%M@E7|9nn#IWgEx6A6JNJ(7dk`%6$3@!C!l;JK-p2?gg+W|d- ziEzgk$w7k48NMqg$CM*4O~Abj3+_yUKTyK1p6GDsGEs;}=E_q>^LI-~pym$qhXPJf z2`!PJDp4l(TTm#|n@bN!j;-FFOM__eLl!6{*}z=)UAcGYloj?bv!-XY1TA6Xz;82J 
zLRaF{8ayzGa|}c--}|^xh)xgX>6R(sZD|Z|qX50gu=d`gEwHqC@WYU7{%<5VOnf9+ zB@FX?|UL%`8EIAe!*UdYl|6wRz6Y>(#8x92$#y}wMeE|ZM2X*c}dKJ^4NIf;Fm zNwzq%QcO?$NR-7`su!*$dlIKo2y(N;qgH@1|8QNo$0wbyyJ2^}$iZ>M{BhBjTdMjK z>gPEzgX4;g3$rU?jvDeOq`X=>)zdt|jk1Lv3u~bjHI=EGLfIR&+K3ldcc4D&Um&04 z3^F*}WaxR(ZyaB>DlmF_UP@+Q*h$&nsOB#gwLt{1#F4i-{A5J@`>B9@{^i?g_Ce&O z<<}_We-RUFU&&MHa1#t56u_oM(Ljn7djja!T|gcxSoR=)@?owC*NkDarpBj=W4}=i1@)@L|C) zQKA+o<(pMVp*Su(`zBC0l1yTa$MRfQ#uby|$mlOMs=G`4J|?apMzKei%jZql#gP@IkOaOjB7MJM=@1j(&!jNnyVkn5;4lvro1!vq ztXiV8HYj5%)r1PPpIOj)f!>pc^3#LvfZ(hz}C@-3R(Cx7R427*Fwd!XO z4~j&IkPHcBm0h_|iG;ZNrYdJ4HI!$rSyo&sibmwIgm1|J#g6%>=ML1r!kcEhm(XY& zD@mIJt;!O%WP7CE&wwE3?1-dt;RTHdm~LvP7K`ccWXkZ0kfFa2S;wGtx_a}S2lslw z$<4^Jg-n#Ypc(3t2N67Juasu=h)j&UNTPNDil4MQMTlnI81kY46uMH5B^U{~nmc6+ z9>(lGhhvRK9ITfpAD!XQ&BPphL3p8B4PVBN0NF6U49;ZA0Tr75AgGw7(S=Yio+xg_ zepZ*?V#KD;sHH+15ix&yCs0eSB-Z%D%uujlXvT#V$Rz@$+w!u#3GIo*AwMI#Bm^oO zLr1e}k5W~G0xaO!C%Mb{sarxWZ4%Dn9vG`KHmPC9GWZwOOm11XJp#o0-P-${3m4g( z6~)X9FXw%Xm~&99tj>a-ri})ZcnsfJtc10F@t9xF5vq6E)X!iUXHq-ohlO`gQdS&k zZl})3k||u)!_=nNlvMbz%AuIr89l#I$;rG}qvDGiK?xTd5HzMQkw*p$YvFLGyQM!J zNC^gD!kP{A84nGosi~@MLKqWQNacfs7O$dkZtm4-BZ~iA8xWZPkTK!HpA5zr!9Z&+icfAJ1)NWkTd!-9`NWU>9uXXUr;`Js#NbKFgrNhTcY4GNv*71}}T zFJh?>=EcbUd2<|fiL+H=wMw8hbX6?+_cl4XnCB#ddwdG>bki* zt*&6Dy&EIPluL@A3_;R%)shA-tDQA1!Tw4ffBRyy;2n)vm_JV06(4Or&QAOKNZB5f(MVC}&_!B>098R{Simr!UG}?CW1Ah+X+0#~0`X)od zLYablwmFxN21L))!_zc`IfzWi`5>MxPe(DmjjO1}HHt7TJtAW+VXHt!aKZk>y6PoMsbDXRJnov;D~Ur~2R_7(Xr)aa%wJwZhS3gr7IGgt%@;`jpL@gyc6bGCVx!9CE7NgIbUNZ!Ur1RHror0~ zr(j$^yM4j`#c2KxSP61;(Tk^pe7b~}LWj~SZC=MEpdKf;B@on9=?_n|R|0q;Y*1_@ z>nGq>)&q!;u-8H)WCwtL&7F4vbnnfSAlK1mwnRq2&gZrEr!b1MA z(3%vAbh3aU-IX`d7b@q`-WiT6eitu}ZH9x#d&qx}?CtDuAXak%5<-P!{a`V=$|XmJ zUn@4lX6#ulB@a=&-9HG)a>KkH=jE7>&S&N~0X0zD=Q=t|7w;kuh#cU=NN7gBGbQTT z;?bdSt8V&IIi}sDTzA0dkU}Z-Qvg;RDe8v>468p3*&hbGT1I3hi9hh~Z(!H}{+>eUyF)H&gdrX=k$aB%J6I;6+^^kn1mL+E+?A!A}@xV(Qa@M%HD5C@+-4Mb4lI=Xp=@9+^x+jhtOc zYgF2aVa(uSR*n(O)e6tf3JEg2xs#dJfhEmi1iOmDYWk|wXNHU?g23^IGKB&yHnsm7 zm_+;p?YpA#N*7vXCkeN2LTNG`{QDa#U3fcFz7SB)83=<8rF)|udrEbrZL$o6W?oDR zQx!178Ih9B#D9Ko$H(jD{4MME&<|6%MPu|TfOc#E0B}!j^MMpV69D#h2`vsEQ{(?c zJ3Lh!3&=yS5fWL~;1wCZ?)%nmK`Eqgcu)O6rD^3%ijcxL50^z?OI(LaVDvfL0#zjZ z2?cPvC$QCzpxpt5jMFp05OxhK0F!Q`rPhDi5)y=-0C} zIM~ku&S@pl1&0=jl+rlS<4`riV~LC-#pqNde@44MB(j%)On$0Ko(@q?4`1?4149Z_ zZi!5aU@2vM$dHR6WSZpj+VboK+>u-CbNi7*lw4K^ZxxM#24_Yc`jvb9NPVi75L+MlM^U~`;a7`4H0L|TYK>%hfEfXLsu1JGM zbh|8{wuc7ucV+`Ys1kqxsj`dajwyM;^X^`)#<+a~$WFy8b2t_RS{8yNYKKlnv+>vB zX(QTf$kqrJ;%I@EwEs{cIcH@Z3|#^S@M+5jsP<^`@8^I4_8MlBb`~cE^n+{{;qW2q z=p1=&+fUo%T{GhVX@;56kH8K_%?X=;$OTYqW1L*)hzelm^$*?_K;9JyIWhsn4SK(| zSmXLTUE8VQX{se#8#Rj*lz`xHtT<61V~fb;WZUpu(M)f#;I+2_zR+)y5Jv?l`CxAinx|EY!`IJ*x9_gf_k&Gx2alL!hK zUWj1T_pk|?iv}4EP#PZvYD_-LpzU!NfcLL%fK&r$W8O1KH9c2&GV~N#T$kaXGvAOl)|T zuF9%6(i=Y3q?X%VK-D2YIYFPH3f|g$TrXW->&^Ab`WT z7>Oo!u1u40?jAJ8Hy`bv}qbgs8)cF0&qeVjD?e+3Ggn1Im>K77ZSpbU*08 zfZkIFcv?y)!*B{|>nx@cE{KoutP+seQU?bCGE`tS0GKUO3PN~t=2u7q_6$l;uw^4c zVu^f{uaqsZ{*a-N?2B8ngrLS8E&s6}Xtv9rR9C^b`@q8*iH)pFzf1|kCfiLw6u{Z%aC z!X^5CzF6qofFJgklJV3oc|Qc2XdFl+y5M9*P8}A>Kh{ zWRgRwMSZ(?Jw;m%0etU5BsWT-Dj-5F;Q$OQJrQd+lv`i6>MhVo^p*^w6{~=fhe|bN z*37oV0kji)4an^%3ABbg5RC;CS50@PV5_hKfXjYx+(DqQdKC^JIEMo6X66$qDdLRc z!YJPSKnbY`#Ht6`g@xGzJmKzzn|abYbP+_Q(v?~~ z96%cd{E0BCsH^0HaWt{y(Cuto4VE7jhB1Z??#UaU(*R&Eo+J`UN+8mcb51F|I|n*J zJCZ3R*OdyeS9hWkc_mA7-br>3Tw=CX2bl(=TpVt#WP8Bg^vE_9bP&6ccAf3lFMgr` z{3=h@?Ftb$RTe&@IQtiJfV;O&4fzh)e1>7seG; z=%mA4@c7{aXeJnhEg2J@Bm;=)j=O=cl#^NNkQ<{r;Bm|8Hg}bJ-S^g4`|itx)~!LN zXtL}?f1Hs6UQ+f0-X6&TBCW=A4>bU0{rv8C4T!(wD-h>VCK4YJk`6C9$by!fxOYw- zV#n+0{E(0ttq_#16B} 
[GIT binary patch data omitted]

literal 0
HcmV?d00001

diff --git a/docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.woff2 b/docs/archive/1.0/sql/tutorial/fonts/glyphicons-halflings-regular.woff2
new file mode 100644
index 0000000000000000000000000000000000000000..64539b54c3751a6d9adb44c8e3a45ba5a73b77f0
GIT binary patch
literal 18028
[binary patch data omitted]

literal 0
HcmV?d00001

diff --git a/docs/archive/1.0/sql/tutorial/index.html b/docs/archive/1.0/sql/tutorial/index.html
new file mode 100644
index 00000000000..cf25dc374ea
--- /dev/null
+++ b/docs/archive/1.0/sql/tutorial/index.html
@@ -0,0 +1,517 @@
[HTML head omitted: page title "VU DB Course - SQL", stylesheet and script includes]
+
+
+
+

SQL (Structured Query Language) Querying

+ + + +

Introduction

+ +

SQL is a declarative data manipulation language. Simple SELECT queries follow the general form of SELECT [projection] FROM [tables] WHERE [selection].

+
    +
  • [projection] can be column names from the tables, constants or computations between them. A * stands for "all columns".
  • +
  • [tables] can be one or multiple table names.
  • +
  • [selection] is a set of logical data filter conditions combined with AND or OR
  • + +
+

Most of the query parts are optional. A very simple query is SELECT 42, which results in a single-row, single-column result set containing the value 42. A more common simple query is SELECT * FROM sometable, which retrieves all rows from sometable. In the following, we will slowly expand from these simple queries.
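To preview how the three parts fit together, here is a small sketch against the voyages table introduced just below (the filter value is purely illustrative):

    SELECT boatname, tonnage
    FROM voyages
    WHERE tonnage > 500;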

+

For the more formally inclined, you can look at SQLite's formal grammar for SELECT queries

+

The sample data used below is based on the material published in the book J.R. Bruijn, F.S. Gaastra and I. Schaar, Dutch-Asiatic Shipping in the 17th and 18th Centuries, which gives an account of the trips made to the East and the ships that returned safely (or were wrecked on the way). This is the main table, voyages:

+
+ +
+

The exercises here will work as follows: for every task, you write a SQL query into the text box and press the "play" button to the left of it. This will execute the query and check the result. The following query is already given, so you can directly run it. Try it; you will see the query result. If you get the result right, the task box turns green and the result is recorded.

+
+ + + +

Basic Queries

+ + +

Projections

+

Projections determine the columns that are retrieved from the database. Columns are usually data, but can also be the result of computation on the data or constants. Several columns can be specified in the query, separated by ,. For example, SELECT boatname, master FROM voyages retrieves the two columns boatname and master from the table voyages.

+
+ For the first task, write a query below that retrieves the boat name and the departure date from the table voyages. +
+
+

We can also modify the data we retrieve. For example, the (pointless) query SELECT number, number + 1 FROM voyages will add 1 to every value in the number column. A quite large amount of computation is possible here, comparable to common programming languages.

+
+ Write a query below that retrieves the boat name and the tonnage multiplied by 2 from the table voyages. +
+
+

In the previous task, you can see that the column name of the result of the computation on the tonnage column is tonnage*2. This is sometimes quite inconvenient, especially when continuing to use that column in some computation. For this reason, columns and computation results can be renamed using the AS keyword. For example, SELECT boatname AS shipname FROM voyages will retrieve the boatname column, but name it shipname.

+
+ Write a query below that retrieves the boat name and the tonnage multiplied by 2 renamed to tonnage_times_two from the table voyages. +
+
+ + +

Selections

+

To determine which rows are retrieved, we can filter the table with some criteria in the WHERE clause. For example, the query SELECT boatname FROM voyages WHERE number > 8395 will only retrieve voyages where the value for the number column is greater than 8395. Note that the number column is not mentioned in the projection list; this is common.

+
+ Write a query below that retrieves the boat name for boats with a tonnage of less than 150 from the table voyages. +
+
+

You can specify multiple such filter criteria and connect them with Boolean logic using AND and OR. For example, the query SELECT boatname FROM voyages WHERE number > 8395 AND type_of_boat = 'fluit' will only retrieve rows where both conditions are met. +

+
+ Write a query below that retrieves the boat name only for boats with a tonnage of less than 150 and for ships departing from the harbour of Batavia from the table voyages. Note that string constants (Batavia) need to be quoted with single quotes in SQL, like so: 'Batavia'. +
+
+

One special case for selections is NULL values. Those typically cannot be filtered out with (non-)equality checks (because NULL != NULL); instead, there is special syntax. In particular, WHERE departure_date IS NULL selects rows where departure_date has the value NULL, and WHERE departure_date IS NOT NULL does the opposite.
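As a sketch of how these NULL predicates look in a complete query (the column choice is just illustrative):

    SELECT boatname, departure_date
    FROM voyages
    WHERE departure_date IS NOT NULL;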

+

Another special case is string pattern matching. Sometimes, we do not want to compare strings for exact equality, but to use pattern matching instead (much like regular expressions). SQL uses the LIKE keyword for this. Patterns can contain two special "magic" characters, % and _. % matches an arbitrary number (zero or more) of characters, _ matches a single arbitrary character. For example, if one wanted to retrieve all boat names from the voyages table where the boat name starts with D, the query would be SELECT boatname FROM voyages WHERE boatname LIKE 'D%'. If we wanted to retrieve all five-character boat names starting with D, we would use LIKE 'D____'. There is also the ILIKE (Borat) variant, which is case-insensitive.
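Pattern matches combine with other filter conditions like any other predicate; here is a sketch that deliberately uses a different pattern than the task below:

    SELECT boatname, master
    FROM voyages
    WHERE master IS NOT NULL
      AND boatname LIKE 'D%';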

+
Write a query below that retrieves the boat name and the master only for boats where the type of boat is not NULL and the boat name consists of five characters ending with AI.
+
+ + +

Output Modifiers

+

Sometimes we want to change the result set independently from the query. Two common tasks are limiting the number of rows that are retrieved (keyword LIMIT) and changing the ordering of the result set (keyword ORDER BY). Using LIMIT is generally a good idea since databases can grow to billions of records and they will duly output all of them if so instructed.

+

+ Note that the relational model does not specify an explicit order of rows, hence it is important to specify an order when using LIMIT. For example, to retrieve the first five boat names in alphabetical order, we can use the query SELECT boatname FROM voyages ORDER BY boatname LIMIT 5;. We can order by multiple columns and use the DESC keyword to invert the order. +
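For instance, a sketch that orders by two columns, one of them descending:

    SELECT boatname, departure_date
    FROM voyages
    ORDER BY departure_date DESC, boatname
    LIMIT 5;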

+
+ Write a query below that retrieves the top six boat names ordered by tonnage. +
+
+ + +

All Combined

+

Now, let's combine all of the above in a single query.
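As a reminder of how the clauses are ordered when they all appear together, here is a sketch (the filter values are purely illustrative and this is not the answer to the task below):

    SELECT boatname, tonnage / 2 AS half_tonnage
    FROM voyages
    WHERE departure_harbour = 'Batavia' AND tonnage > 300
    ORDER BY departure_date DESC
    LIMIT 5;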

+
Write a query below that retrieves the boat name and the tonnage divided by 2 renamed to half_tonnage from the table voyages, for boats built after 1788 whose chamber code is A. The result should be ordered by departure date and only include the top 3 results.
+
+ + + +

Aggregation Queries

+

Often, we are not interested in "raw" data, but in aggregates like averages or counts. These can be expressed in SQL by adding an additional GROUP BY clause after the WHERE clause of the query. In addition, the projection list is modified by adding aggregation functions and groups. The commonly supported aggregate functions are COUNT, MIN, MAX and AVG.
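As a preview of the shape such queries take (the details are explained in the subsections below; the column choices are illustrative):

    SELECT chamber, AVG(tonnage) AS avg_tonnage
    FROM voyages
    WHERE tonnage IS NOT NULL
    GROUP BY chamber;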

+ + +

Ungrouped Aggregation

+

+ The easiest case of aggregating values is aggregating without a GROUP BY clause. In this case, the grouping is implicit and all rows fall into a single group. For example, the query SELECT MIN(number) FROM voyages will compute the smallest value for the column number. +

+
+ Write a query below that retrieves the maximum tonnage as the column max_tonnage from the voyages table. +
+
+ + +

Single-Column Groups

+

The next step of aggregation is the aggregation by a group determined by the values in the data. For example, the query SELECT type_of_boat, COUNT() AS n FROM voyages GROUP BY type_of_boat generates the aggregate COUNT for each distinct value of type_of_boat. It's important to know that the projection of the query (the part behind the SELECT) can only contain column names of columns that are used in the GROUP BY part of the query as well. All other columns need to be wrapped in an aggregation function.

+
+ Write a query below that retrieves the chamber and the maximum tonnage for each distinct value of chamber as the column max_tonnage from the voyages table grouped by chamber. +
+
+ + +

Multi-Column Groups

+

+ We can also group by multiple columns. In this case, all combinations of values between the two columns that occur in the data are grouping values. The query will simply list all grouping columns separated by , after the GROUP BY keyword. +
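For example, a sketch that groups by two columns at once (deliberately different columns than the task below, using the COUNT() spelling from above):

    SELECT departure_harbour, chamber, COUNT() AS n
    FROM voyages
    GROUP BY departure_harbour, chamber;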

+
+ Write a query below that retrieves the chamber, the type of boat, and the number of tuples in each group as the column n from the voyages table grouped by chamber and type_of_boat. +
+
+ + +

Filtering Values and Groups

+

In one of the previous exercises, we have seen how we can use a WHERE clause to select only the part of the table that is relevant to the query. This also applies to groups. For example, before grouping, we might decide to filter out values. We can also filter out groups based on their aggregation values using the HAVING keyword.

+

+ For example, we could aggregate only the type of boat for chamber A using a query like SELECT type_of_boat, COUNT() AS n FROM voyages WHERE chamber = 'A' GROUP BY type_of_boat. The logic here tells the database to conceptually first compute the result of the WHERE clause and then run the aggregation according to the GROUP BY column. The result of this query is: +

+
+ +
+
+ Write a query below that retrieves the departure harbour and the number of tuples in each group as the column n from the voyages table grouped by departure_harbour while filtering out rows where the field departure_harbour is NULL (IS NOT NULL, see above). +
+
+

From the previous result, we have seen some groups with low values for n. Let's say we want to remove those. For this we can use the HAVING clause, which operates on the result of aggregation functions. For example, we can modify the example from above to not consider boat types for which the COUNT value in n is less than 5: SELECT type_of_boat, COUNT() AS n FROM voyages GROUP BY type_of_boat HAVING n > 5. All expressions that can be used following the SELECT in a grouped query are acceptable in the HAVING clause to filter the groups.

+
+ Write a query below that retrieves the chamber and sum of tonnage as field s from the voyages table grouped by chamber while filtering out groups where the sum of tonnage is less than or equal to 5000. +
+
+

Distinct Values

+

A special case of grouping is the DISTINCT keyword. It retrieves the set of unique values from a set of columns. An example is SELECT DISTINCT type_of_boat FROM voyages. This is set-equivalent to the query SELECT type_of_boat FROM voyages GROUP BY type_of_boat. Try it! +

+
+ Write a query below that retrieves all unique values for departure harbour from the voyages table. +
+
+ + +

All Combined

+

Now let's try to combine a lot of what we have learned so far about grouping (and before).
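As a reminder of the clause order when filters, grouping, group filters and output modifiers all appear together, here is a sketch (the values are illustrative, not the answer to the task below):

    SELECT type_of_boat, COUNT() AS n, MIN(tonnage) AS min_t
    FROM voyages
    WHERE tonnage IS NOT NULL
    GROUP BY type_of_boat
    HAVING n > 10
    ORDER BY n DESC
    LIMIT 5;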

+
+ Write a query below that retrieves the departure harbour, the departure date, the amount of voyages (rows) as column n, the minimum and maximum tonnage in each group (min_t / max_t) from the voyages table. Group by departure harbour and departure date. Filter out rows where departure harbour is NULL or equal to Batavia. Filter the groups to have at least two voyages in them. +
+
+ + + +
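+
+ A sketch that combines the pieces above, assuming the column names used so far:
+
+ ```sql
+ SELECT departure_harbour, departure_date,
+        COUNT(*) AS n,
+        MIN(tonnage) AS min_t,
+        MAX(tonnage) AS max_t
+ FROM voyages
+ WHERE departure_harbour IS NOT NULL
+   AND departure_harbour <> 'Batavia'
+ GROUP BY departure_harbour, departure_date
+ HAVING n >= 2;
+ ```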

Join

+

One of the most powerful features of relational databases is the JOIN, the horizontal combination of two tables according to the data. For example, if one table contains information about voyages and another table contains information about the invoices (value of ship's cargo) for those voyages, the JOIN operator can be used to combine the two. +

+

+ To extend the example, we use an additional table, chambers, which contains the expansion of the VOC chamber (department) code we used in the previous section: +

+
+ +
+

+ We will also use the table invoices, which contains the total amount of money charged for a voyage: +

+
+ +
+ + +

Equi-Join

+

The basic form of JOIN is the "Equi-Join". "Equi" stands for "Equality" and means that values have to be equal to create a match for the join. In SQL, there are various syntaxes for this, but generally we need to name the two tables in the FROM clause and then specify the columns to be compared, either directly in the join condition or later in the WHERE clause. +

+

+ For example, we can join the voyages table and the chambers using an Equi-Join on the respective chamber columns to add the actual chamber name to the output with SELECT boatname, chamber, name FROM voyages JOIN chambers USING (chamber) LIMIT 5;. The result of this query is: +

+
+ +
+

+ Note the list of tables in the FROM clause and the USING keyword, which specifies that values from the column chamber, which exists in both tables, should be compared to generate the join result. There are at least two equivalent ways to formulate this query. First, the join condition can also be more verbose: SELECT boatname, voyages.chamber, name FROM voyages JOIN chambers ON voyages.chamber = chambers.chamber LIMIT 5;. Here, we use the ON keyword to specify a more precise and expressive join condition. We explicitly name the tables from which the join columns are coming using the tablename.columnname syntax. We also explicitly use the = comparison operator. Both of these definitions are implicit when using USING. The USING keyword also adds an implicit projection which removes one of the chamber columns from the result set. The ON version does not, which is why we need to explicitly name one of them in the SELECT part of the query. Second, the join condition can also be placed in the WHERE clause of the query: SELECT boatname, voyages.chamber, name FROM voyages JOIN chambers WHERE voyages.chamber = chambers.chamber LIMIT 5;. This method of specifying a join condition is widespread, but confusing and thus discouraged. +

+

In the result above, also note how several rows from voyages (# 1, 3 and 5) are joined up with the same row from chambers. This is normal and expected behavior: rows are re-used if they match multiple rows in the join partner. +

+

Now try to write a JOIN query yourself!

+
+ Join the voyages and invoices table using equality on the number column. Project only the boatname and invoice column. +
+
+
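+
+ One possible formulation, following the USING syntax shown above:
+
+ ```sql
+ SELECT boatname, invoice
+ FROM voyages
+ JOIN invoices USING (number);
+ ```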

Joins can (of course) use more than one pair of columns to determine that a match is present. For example, the query SELECT boatname, invoice FROM voyages JOIN invoices USING (number, chamber) joins the two tables by checking equality for both the number and chamber columns. The equivalent form with the ON keyword would be SELECT boatname, invoice FROM voyages JOIN invoices ON voyages.number = invoices.number AND voyages.chamber = invoices.chamber.

+

+ + +

Natural Join

+

+ A special case of the Equi-join is the NATURAL JOIN. It performs a join between two tables using all columns with matching names in both tables as join criteria. The following two queries are equivalent: SELECT boatname, invoice FROM voyages JOIN invoices USING (number, chamber) and SELECT boatname, invoice FROM voyages NATURAL JOIN invoices, because the columns number and chamber are the only column names the two tables have in common. Think of natural joins as syntactic sugar, which often reduces the readability of your query due to its implicit nature. +

+

Outer Join

+

The previous section used (by default) what is called an INNER JOIN. Here, only rows for which a match in the other table is found are candidates for the result. There is also an opposite version, the LEFT OUTER JOIN (the OUTER keyword is optional). Here, all rows from the left side of the join (the table named first) are included in the result. If a matching row from the right side is found, it is joined. If none is found, the left row is still returned with the missing fields from the right side set to NULL. +

+
+ Not all rows in the voyages table have a match in the invoices table. Use a LEFT OUTER JOIN on number and a filter where invoice IS NULL to retrieve only rows where no match was found. Retrieve only the number column. +
+
+ + +
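+
+ A sketch using the merged number column produced by USING:
+
+ ```sql
+ SELECT number
+ FROM voyages
+ LEFT OUTER JOIN invoices USING (number)
+ WHERE invoice IS NULL;
+ ```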

Self-Join

+

A powerful use case of joins in SQL is the self-join. Here, we combine the table with itself, which can for example be used to express queries about relationships between rows in the same table. In our dataset, we have such a relationship with the next_voyage column in the voyages table. This column indicates the next voyage number of a particular ship. Since both tables have the same name in a self-join, it is required to rename them using the AS keyword to unique temporary table names. For example, SELECT v1.boatname FROM voyages AS v1 WHERE v1.chamber = 'A' demonstrates such a renaming. +

+ Retrieve all voyages where the next voyage of a ship was for a different chamber. Project to boatname and the two (differing) chambers as c1 and c2. +
+
+ + +
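+
+ A possible sketch, assuming next_voyage refers to the number of the follow-up voyage:
+
+ ```sql
+ SELECT v1.boatname, v1.chamber AS c1, v2.chamber AS c2
+ FROM voyages AS v1
+ JOIN voyages AS v2 ON v1.next_voyage = v2.number
+ WHERE v1.chamber <> v2.chamber;
+ ```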

More Tables

+

We can join over more than two tables by chaining JOIN keywords. We also use multiple ON or USING statements to define the join criteria. + +

+ Join all three tables voyages, invoices and chambers. Join voyages and invoices on the number and chamber columns, and the resulting table with chambers using only chamber column. Limit the result set to 5 rows. +
+
+ + +
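+
+ One possible formulation; the projected columns here are just an illustration:
+
+ ```sql
+ SELECT boatname, invoice, name
+ FROM voyages
+ JOIN invoices USING (number, chamber)
+ JOIN chambers USING (chamber)
+ LIMIT 5;
+ ```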

Subqueries

+

SQL supports nested queries, the so-called subqueries. They can be used in all parts of the query and often simplify the query expression compared to a JOIN. Other uses are the creation of more convenient join partners or the computation of projection results. These queries mostly differ in the allowed cardinality of their results. In projections, they can only return a single value, while in the FROM clause, entire tables may be produced.

+ + +

Filters

+

Subqueries in a WHERE clause typically compare an attribute, either against a single value (e.g., using =) or against a set of values (e.g., with IN or EXISTS). In both cases, it is possible to refer to tuple variables from the enclosing query but not the other way around. Subqueries are enclosed in brackets () and are otherwise complete queries on their own. For example, the rather over-complicated query SELECT boatname FROM voyages WHERE tonnage = (SELECT 884) uses a subquery to retrieve all ships with a tonnage of 884.

+ +
+ Use a subquery in the WHERE clause to retrieve boat names from the voyages table where there is no matching entry in the invoices table for the particular voyage number. Use a subquery with either NOT IN or NOT EXISTS (with a reference to the outer query). +
+
+ +
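+
+ A sketch using NOT IN; NOT EXISTS with a correlated subquery would work as well:
+
+ ```sql
+ SELECT boatname
+ FROM voyages
+ WHERE number NOT IN (SELECT number FROM invoices);
+ -- alternatively:
+ -- WHERE NOT EXISTS (SELECT 1 FROM invoices WHERE invoices.number = voyages.number)
+ ```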

Tables

+ +

The result of a subquery is also a table, hence a subquery result can be used wherever a table could be used, e.g., in the FROM clause. For example, the (again) over-complicated query SELECT number FROM (SELECT * FROM voyages) AS v uses a table-creating subquery. Note how we have to use the AS clause to give the table created by the subquery a name. We often use subqueries to avoid complex filter conditions after joining tables.

+ +
+ Use a subquery in the FROM clause to retrieve only invoices from chamber 'H' with an invoice amount larger than 10000, and join the result with the voyages table using the number column. Project to only retrieve the boatname and the invoice amount of the join result. Order by invoice amount. +
+
+ +
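+
+ One possible sketch, naming the subquery i:
+
+ ```sql
+ SELECT boatname, invoice
+ FROM voyages
+ JOIN (SELECT * FROM invoices
+       WHERE chamber = 'H' AND invoice > 10000) AS i
+   USING (number)
+ ORDER BY invoice;
+ ```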

Set Operations

+

SQL is grounded in the set semantics of the relational model. Hence, result sets can be interpreted as sets of tuples and we can use set operations to modify them. The most common set operation is the UNION, but intersections and set differences are also possible. These operators are often used to combine tuples from different relations.

+ +

Unions

+

The UNION keyword combines two result sets while eliminating duplicates, the UNION ALL keyword does the same while keeping duplicates. For example, the query SELECT name, chamber FROM chambers UNION ALL SELECT boatname, chamber FROM voyages will stack the otherwise unrelated boat and chamber names into one result set. Often, constants are also selected to denote the source of the respective tables. +

+
+ Use the UNION keyword to add a (modern-day) province column to the chambers values: 'Noord-Holland' for 'Amsterdam', 'Hoorn', and 'Enkhuizen'; 'Zeeland' for 'Zeeland'; 'Zuid-Holland' for 'Delft' and 'Rotterdam'. Select chamber code, name and province. Order the entire result set by chamber code. +
+
+ +
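+
+ A possible sketch; the province strings are constants selected alongside each chamber:
+
+ ```sql
+ SELECT chamber, name, 'Noord-Holland' AS province
+ FROM chambers
+ WHERE name IN ('Amsterdam', 'Hoorn', 'Enkhuizen')
+ UNION
+ SELECT chamber, name, 'Zeeland'
+ FROM chambers
+ WHERE name = 'Zeeland'
+ UNION
+ SELECT chamber, name, 'Zuid-Holland'
+ FROM chambers
+ WHERE name IN ('Delft', 'Rotterdam')
+ ORDER BY chamber;
+ ```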

+

You made it to the end, congratulations.

+ +
+ +
+ + + +
+ +
+ + + diff --git a/docs/archive/1.0/sql/tutorial/js/bootstrap.min.js b/docs/archive/1.0/sql/tutorial/js/bootstrap.min.js new file mode 100644 index 00000000000..9bcd2fccaed --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/js/bootstrap.min.js @@ -0,0 +1,7 @@ +/*! + * Bootstrap v3.3.7 (http://getbootstrap.com) + * Copyright 2011-2016 Twitter, Inc. + * Licensed under the MIT license + */ +if("undefined"==typeof jQuery)throw new Error("Bootstrap's JavaScript requires jQuery");+function(a){"use strict";var b=a.fn.jquery.split(" ")[0].split(".");if(b[0]<2&&b[1]<9||1==b[0]&&9==b[1]&&b[2]<1||b[0]>3)throw new Error("Bootstrap's JavaScript requires jQuery version 1.9.1 or higher, but lower than version 4")}(jQuery),+function(a){"use strict";function b(){var a=document.createElement("bootstrap"),b={WebkitTransition:"webkitTransitionEnd",MozTransition:"transitionend",OTransition:"oTransitionEnd otransitionend",transition:"transitionend"};for(var c in b)if(void 0!==a.style[c])return{end:b[c]};return!1}a.fn.emulateTransitionEnd=function(b){var c=!1,d=this;a(this).one("bsTransitionEnd",function(){c=!0});var e=function(){c||a(d).trigger(a.support.transition.end)};return setTimeout(e,b),this},a(function(){a.support.transition=b(),a.support.transition&&(a.event.special.bsTransitionEnd={bindType:a.support.transition.end,delegateType:a.support.transition.end,handle:function(b){if(a(b.target).is(this))return b.handleObj.handler.apply(this,arguments)}})})}(jQuery),+function(a){"use strict";function b(b){return this.each(function(){var c=a(this),e=c.data("bs.alert");e||c.data("bs.alert",e=new d(this)),"string"==typeof b&&e[b].call(c)})}var c='[data-dismiss="alert"]',d=function(b){a(b).on("click",c,this.close)};d.VERSION="3.3.7",d.TRANSITION_DURATION=150,d.prototype.close=function(b){function c(){g.detach().trigger("closed.bs.alert").remove()}var e=a(this),f=e.attr("data-target");f||(f=e.attr("href"),f=f&&f.replace(/.*(?=#[^\s]*$)/,""));var g=a("#"===f?[]:f);b&&b.preventDefault(),g.length||(g=e.closest(".alert")),g.trigger(b=a.Event("close.bs.alert")),b.isDefaultPrevented()||(g.removeClass("in"),a.support.transition&&g.hasClass("fade")?g.one("bsTransitionEnd",c).emulateTransitionEnd(d.TRANSITION_DURATION):c())};var e=a.fn.alert;a.fn.alert=b,a.fn.alert.Constructor=d,a.fn.alert.noConflict=function(){return a.fn.alert=e,this},a(document).on("click.bs.alert.data-api",c,d.prototype.close)}(jQuery),+function(a){"use strict";function b(b){return this.each(function(){var d=a(this),e=d.data("bs.button"),f="object"==typeof b&&b;e||d.data("bs.button",e=new c(this,f)),"toggle"==b?e.toggle():b&&e.setState(b)})}var c=function(b,d){this.$element=a(b),this.options=a.extend({},c.DEFAULTS,d),this.isLoading=!1};c.VERSION="3.3.7",c.DEFAULTS={loadingText:"loading..."},c.prototype.setState=function(b){var c="disabled",d=this.$element,e=d.is("input")?"val":"html",f=d.data();b+="Text",null==f.resetText&&d.data("resetText",d[e]()),setTimeout(a.proxy(function(){d[e](null==f[b]?this.options[b]:f[b]),"loadingText"==b?(this.isLoading=!0,d.addClass(c).attr(c,c).prop(c,!0)):this.isLoading&&(this.isLoading=!1,d.removeClass(c).removeAttr(c).prop(c,!1))},this),0)},c.prototype.toggle=function(){var a=!0,b=this.$element.closest('[data-toggle="buttons"]');if(b.length){var 
c=this.$element.find("input");"radio"==c.prop("type")?(c.prop("checked")&&(a=!1),b.find(".active").removeClass("active"),this.$element.addClass("active")):"checkbox"==c.prop("type")&&(c.prop("checked")!==this.$element.hasClass("active")&&(a=!1),this.$element.toggleClass("active")),c.prop("checked",this.$element.hasClass("active")),a&&c.trigger("change")}else this.$element.attr("aria-pressed",!this.$element.hasClass("active")),this.$element.toggleClass("active")};var d=a.fn.button;a.fn.button=b,a.fn.button.Constructor=c,a.fn.button.noConflict=function(){return a.fn.button=d,this},a(document).on("click.bs.button.data-api",'[data-toggle^="button"]',function(c){var d=a(c.target).closest(".btn");b.call(d,"toggle"),a(c.target).is('input[type="radio"], input[type="checkbox"]')||(c.preventDefault(),d.is("input,button")?d.trigger("focus"):d.find("input:visible,button:visible").first().trigger("focus"))}).on("focus.bs.button.data-api blur.bs.button.data-api",'[data-toggle^="button"]',function(b){a(b.target).closest(".btn").toggleClass("focus",/^focus(in)?$/.test(b.type))})}(jQuery),+function(a){"use strict";function b(b){return this.each(function(){var d=a(this),e=d.data("bs.carousel"),f=a.extend({},c.DEFAULTS,d.data(),"object"==typeof b&&b),g="string"==typeof b?b:f.slide;e||d.data("bs.carousel",e=new c(this,f)),"number"==typeof b?e.to(b):g?e[g]():f.interval&&e.pause().cycle()})}var c=function(b,c){this.$element=a(b),this.$indicators=this.$element.find(".carousel-indicators"),this.options=c,this.paused=null,this.sliding=null,this.interval=null,this.$active=null,this.$items=null,this.options.keyboard&&this.$element.on("keydown.bs.carousel",a.proxy(this.keydown,this)),"hover"==this.options.pause&&!("ontouchstart"in document.documentElement)&&this.$element.on("mouseenter.bs.carousel",a.proxy(this.pause,this)).on("mouseleave.bs.carousel",a.proxy(this.cycle,this))};c.VERSION="3.3.7",c.TRANSITION_DURATION=600,c.DEFAULTS={interval:5e3,pause:"hover",wrap:!0,keyboard:!0},c.prototype.keydown=function(a){if(!/input|textarea/i.test(a.target.tagName)){switch(a.which){case 37:this.prev();break;case 39:this.next();break;default:return}a.preventDefault()}},c.prototype.cycle=function(b){return b||(this.paused=!1),this.interval&&clearInterval(this.interval),this.options.interval&&!this.paused&&(this.interval=setInterval(a.proxy(this.next,this),this.options.interval)),this},c.prototype.getItemIndex=function(a){return this.$items=a.parent().children(".item"),this.$items.index(a||this.$active)},c.prototype.getItemForDirection=function(a,b){var c=this.getItemIndex(b),d="prev"==a&&0===c||"next"==a&&c==this.$items.length-1;if(d&&!this.options.wrap)return b;var e="prev"==a?-1:1,f=(c+e)%this.$items.length;return this.$items.eq(f)},c.prototype.to=function(a){var b=this,c=this.getItemIndex(this.$active=this.$element.find(".item.active"));if(!(a>this.$items.length-1||a<0))return this.sliding?this.$element.one("slid.bs.carousel",function(){b.to(a)}):c==a?this.pause().cycle():this.slide(a>c?"next":"prev",this.$items.eq(a))},c.prototype.pause=function(b){return b||(this.paused=!0),this.$element.find(".next, .prev").length&&a.support.transition&&(this.$element.trigger(a.support.transition.end),this.cycle(!0)),this.interval=clearInterval(this.interval),this},c.prototype.next=function(){if(!this.sliding)return this.slide("next")},c.prototype.prev=function(){if(!this.sliding)return this.slide("prev")},c.prototype.slide=function(b,d){var 
e=this.$element.find(".item.active"),f=d||this.getItemForDirection(b,e),g=this.interval,h="next"==b?"left":"right",i=this;if(f.hasClass("active"))return this.sliding=!1;var j=f[0],k=a.Event("slide.bs.carousel",{relatedTarget:j,direction:h});if(this.$element.trigger(k),!k.isDefaultPrevented()){if(this.sliding=!0,g&&this.pause(),this.$indicators.length){this.$indicators.find(".active").removeClass("active");var l=a(this.$indicators.children()[this.getItemIndex(f)]);l&&l.addClass("active")}var m=a.Event("slid.bs.carousel",{relatedTarget:j,direction:h});return a.support.transition&&this.$element.hasClass("slide")?(f.addClass(b),f[0].offsetWidth,e.addClass(h),f.addClass(h),e.one("bsTransitionEnd",function(){f.removeClass([b,h].join(" ")).addClass("active"),e.removeClass(["active",h].join(" ")),i.sliding=!1,setTimeout(function(){i.$element.trigger(m)},0)}).emulateTransitionEnd(c.TRANSITION_DURATION)):(e.removeClass("active"),f.addClass("active"),this.sliding=!1,this.$element.trigger(m)),g&&this.cycle(),this}};var d=a.fn.carousel;a.fn.carousel=b,a.fn.carousel.Constructor=c,a.fn.carousel.noConflict=function(){return a.fn.carousel=d,this};var e=function(c){var d,e=a(this),f=a(e.attr("data-target")||(d=e.attr("href"))&&d.replace(/.*(?=#[^\s]+$)/,""));if(f.hasClass("carousel")){var g=a.extend({},f.data(),e.data()),h=e.attr("data-slide-to");h&&(g.interval=!1),b.call(f,g),h&&f.data("bs.carousel").to(h),c.preventDefault()}};a(document).on("click.bs.carousel.data-api","[data-slide]",e).on("click.bs.carousel.data-api","[data-slide-to]",e),a(window).on("load",function(){a('[data-ride="carousel"]').each(function(){var c=a(this);b.call(c,c.data())})})}(jQuery),+function(a){"use strict";function b(b){var c,d=b.attr("data-target")||(c=b.attr("href"))&&c.replace(/.*(?=#[^\s]+$)/,"");return a(d)}function c(b){return this.each(function(){var c=a(this),e=c.data("bs.collapse"),f=a.extend({},d.DEFAULTS,c.data(),"object"==typeof b&&b);!e&&f.toggle&&/show|hide/.test(b)&&(f.toggle=!1),e||c.data("bs.collapse",e=new d(this,f)),"string"==typeof b&&e[b]()})}var d=function(b,c){this.$element=a(b),this.options=a.extend({},d.DEFAULTS,c),this.$trigger=a('[data-toggle="collapse"][href="#'+b.id+'"],[data-toggle="collapse"][data-target="#'+b.id+'"]'),this.transitioning=null,this.options.parent?this.$parent=this.getParent():this.addAriaAndCollapsedClass(this.$element,this.$trigger),this.options.toggle&&this.toggle()};d.VERSION="3.3.7",d.TRANSITION_DURATION=350,d.DEFAULTS={toggle:!0},d.prototype.dimension=function(){var a=this.$element.hasClass("width");return a?"width":"height"},d.prototype.show=function(){if(!this.transitioning&&!this.$element.hasClass("in")){var b,e=this.$parent&&this.$parent.children(".panel").children(".in, .collapsing");if(!(e&&e.length&&(b=e.data("bs.collapse"),b&&b.transitioning))){var f=a.Event("show.bs.collapse");if(this.$element.trigger(f),!f.isDefaultPrevented()){e&&e.length&&(c.call(e,"hide"),b||e.data("bs.collapse",null));var g=this.dimension();this.$element.removeClass("collapse").addClass("collapsing")[g](0).attr("aria-expanded",!0),this.$trigger.removeClass("collapsed").attr("aria-expanded",!0),this.transitioning=1;var h=function(){this.$element.removeClass("collapsing").addClass("collapse in")[g](""),this.transitioning=0,this.$element.trigger("shown.bs.collapse")};if(!a.support.transition)return h.call(this);var 
i=a.camelCase(["scroll",g].join("-"));this.$element.one("bsTransitionEnd",a.proxy(h,this)).emulateTransitionEnd(d.TRANSITION_DURATION)[g](this.$element[0][i])}}}},d.prototype.hide=function(){if(!this.transitioning&&this.$element.hasClass("in")){var b=a.Event("hide.bs.collapse");if(this.$element.trigger(b),!b.isDefaultPrevented()){var c=this.dimension();this.$element[c](this.$element[c]())[0].offsetHeight,this.$element.addClass("collapsing").removeClass("collapse in").attr("aria-expanded",!1),this.$trigger.addClass("collapsed").attr("aria-expanded",!1),this.transitioning=1;var e=function(){this.transitioning=0,this.$element.removeClass("collapsing").addClass("collapse").trigger("hidden.bs.collapse")};return a.support.transition?void this.$element[c](0).one("bsTransitionEnd",a.proxy(e,this)).emulateTransitionEnd(d.TRANSITION_DURATION):e.call(this)}}},d.prototype.toggle=function(){this[this.$element.hasClass("in")?"hide":"show"]()},d.prototype.getParent=function(){return a(this.options.parent).find('[data-toggle="collapse"][data-parent="'+this.options.parent+'"]').each(a.proxy(function(c,d){var e=a(d);this.addAriaAndCollapsedClass(b(e),e)},this)).end()},d.prototype.addAriaAndCollapsedClass=function(a,b){var c=a.hasClass("in");a.attr("aria-expanded",c),b.toggleClass("collapsed",!c).attr("aria-expanded",c)};var e=a.fn.collapse;a.fn.collapse=c,a.fn.collapse.Constructor=d,a.fn.collapse.noConflict=function(){return a.fn.collapse=e,this},a(document).on("click.bs.collapse.data-api",'[data-toggle="collapse"]',function(d){var e=a(this);e.attr("data-target")||d.preventDefault();var f=b(e),g=f.data("bs.collapse"),h=g?"toggle":e.data();c.call(f,h)})}(jQuery),+function(a){"use strict";function b(b){var c=b.attr("data-target");c||(c=b.attr("href"),c=c&&/#[A-Za-z]/.test(c)&&c.replace(/.*(?=#[^\s]*$)/,""));var d=c&&a(c);return d&&d.length?d:b.parent()}function c(c){c&&3===c.which||(a(e).remove(),a(f).each(function(){var d=a(this),e=b(d),f={relatedTarget:this};e.hasClass("open")&&(c&&"click"==c.type&&/input|textarea/i.test(c.target.tagName)&&a.contains(e[0],c.target)||(e.trigger(c=a.Event("hide.bs.dropdown",f)),c.isDefaultPrevented()||(d.attr("aria-expanded","false"),e.removeClass("open").trigger(a.Event("hidden.bs.dropdown",f)))))}))}function d(b){return this.each(function(){var c=a(this),d=c.data("bs.dropdown");d||c.data("bs.dropdown",d=new g(this)),"string"==typeof b&&d[b].call(c)})}var e=".dropdown-backdrop",f='[data-toggle="dropdown"]',g=function(b){a(b).on("click.bs.dropdown",this.toggle)};g.VERSION="3.3.7",g.prototype.toggle=function(d){var e=a(this);if(!e.is(".disabled, :disabled")){var f=b(e),g=f.hasClass("open");if(c(),!g){"ontouchstart"in document.documentElement&&!f.closest(".navbar-nav").length&&a(document.createElement("div")).addClass("dropdown-backdrop").insertAfter(a(this)).on("click",c);var h={relatedTarget:this};if(f.trigger(d=a.Event("show.bs.dropdown",h)),d.isDefaultPrevented())return;e.trigger("focus").attr("aria-expanded","true"),f.toggleClass("open").trigger(a.Event("shown.bs.dropdown",h))}return!1}},g.prototype.keydown=function(c){if(/(38|40|27|32)/.test(c.which)&&!/input|textarea/i.test(c.target.tagName)){var d=a(this);if(c.preventDefault(),c.stopPropagation(),!d.is(".disabled, :disabled")){var e=b(d),g=e.hasClass("open");if(!g&&27!=c.which||g&&27==c.which)return 27==c.which&&e.find(f).trigger("focus"),d.trigger("click");var h=" li:not(.disabled):visible a",i=e.find(".dropdown-menu"+h);if(i.length){var 
j=i.index(c.target);38==c.which&&j>0&&j--,40==c.which&&jdocument.documentElement.clientHeight;this.$element.css({paddingLeft:!this.bodyIsOverflowing&&a?this.scrollbarWidth:"",paddingRight:this.bodyIsOverflowing&&!a?this.scrollbarWidth:""})},c.prototype.resetAdjustments=function(){this.$element.css({paddingLeft:"",paddingRight:""})},c.prototype.checkScrollbar=function(){var a=window.innerWidth;if(!a){var b=document.documentElement.getBoundingClientRect();a=b.right-Math.abs(b.left)}this.bodyIsOverflowing=document.body.clientWidth
',trigger:"hover focus",title:"",delay:0,html:!1,container:!1,viewport:{selector:"body",padding:0}},c.prototype.init=function(b,c,d){if(this.enabled=!0,this.type=b,this.$element=a(c),this.options=this.getOptions(d),this.$viewport=this.options.viewport&&a(a.isFunction(this.options.viewport)?this.options.viewport.call(this,this.$element):this.options.viewport.selector||this.options.viewport),this.inState={click:!1,hover:!1,focus:!1},this.$element[0]instanceof document.constructor&&!this.options.selector)throw new Error("`selector` option must be specified when initializing "+this.type+" on the window.document object!");for(var e=this.options.trigger.split(" "),f=e.length;f--;){var g=e[f];if("click"==g)this.$element.on("click."+this.type,this.options.selector,a.proxy(this.toggle,this));else if("manual"!=g){var h="hover"==g?"mouseenter":"focusin",i="hover"==g?"mouseleave":"focusout";this.$element.on(h+"."+this.type,this.options.selector,a.proxy(this.enter,this)),this.$element.on(i+"."+this.type,this.options.selector,a.proxy(this.leave,this))}}this.options.selector?this._options=a.extend({},this.options,{trigger:"manual",selector:""}):this.fixTitle()},c.prototype.getDefaults=function(){return c.DEFAULTS},c.prototype.getOptions=function(b){return b=a.extend({},this.getDefaults(),this.$element.data(),b),b.delay&&"number"==typeof b.delay&&(b.delay={show:b.delay,hide:b.delay}),b},c.prototype.getDelegateOptions=function(){var b={},c=this.getDefaults();return this._options&&a.each(this._options,function(a,d){c[a]!=d&&(b[a]=d)}),b},c.prototype.enter=function(b){var c=b instanceof this.constructor?b:a(b.currentTarget).data("bs."+this.type);return c||(c=new this.constructor(b.currentTarget,this.getDelegateOptions()),a(b.currentTarget).data("bs."+this.type,c)),b instanceof a.Event&&(c.inState["focusin"==b.type?"focus":"hover"]=!0),c.tip().hasClass("in")||"in"==c.hoverState?void(c.hoverState="in"):(clearTimeout(c.timeout),c.hoverState="in",c.options.delay&&c.options.delay.show?void(c.timeout=setTimeout(function(){"in"==c.hoverState&&c.show()},c.options.delay.show)):c.show())},c.prototype.isInStateTrue=function(){for(var a in this.inState)if(this.inState[a])return!0;return!1},c.prototype.leave=function(b){var c=b instanceof this.constructor?b:a(b.currentTarget).data("bs."+this.type);if(c||(c=new this.constructor(b.currentTarget,this.getDelegateOptions()),a(b.currentTarget).data("bs."+this.type,c)),b instanceof a.Event&&(c.inState["focusout"==b.type?"focus":"hover"]=!1),!c.isInStateTrue())return clearTimeout(c.timeout),c.hoverState="out",c.options.delay&&c.options.delay.hide?void(c.timeout=setTimeout(function(){"out"==c.hoverState&&c.hide()},c.options.delay.hide)):c.hide()},c.prototype.show=function(){var b=a.Event("show.bs."+this.type);if(this.hasContent()&&this.enabled){this.$element.trigger(b);var d=a.contains(this.$element[0].ownerDocument.documentElement,this.$element[0]);if(b.isDefaultPrevented()||!d)return;var e=this,f=this.tip(),g=this.getUID(this.type);this.setContent(),f.attr("id",g),this.$element.attr("aria-describedby",g),this.options.animation&&f.addClass("fade");var h="function"==typeof this.options.placement?this.options.placement.call(this,f[0],this.$element[0]):this.options.placement,i=/\s?auto?\s?/i,j=i.test(h);j&&(h=h.replace(i,"")||"top"),f.detach().css({top:0,left:0,display:"block"}).addClass(h).data("bs."+this.type,this),this.options.container?f.appendTo(this.options.container):f.insertAfter(this.$element),this.$element.trigger("inserted.bs."+this.type);var 
k=this.getPosition(),l=f[0].offsetWidth,m=f[0].offsetHeight;if(j){var n=h,o=this.getPosition(this.$viewport);h="bottom"==h&&k.bottom+m>o.bottom?"top":"top"==h&&k.top-mo.width?"left":"left"==h&&k.left-lg.top+g.height&&(e.top=g.top+g.height-i)}else{var j=b.left-f,k=b.left+f+c;jg.right&&(e.left=g.left+g.width-k)}return e},c.prototype.getTitle=function(){var a,b=this.$element,c=this.options;return a=b.attr("data-original-title")||("function"==typeof c.title?c.title.call(b[0]):c.title)},c.prototype.getUID=function(a){do a+=~~(1e6*Math.random());while(document.getElementById(a));return a},c.prototype.tip=function(){if(!this.$tip&&(this.$tip=a(this.options.template),1!=this.$tip.length))throw new Error(this.type+" `template` option must consist of exactly 1 top-level element!");return this.$tip},c.prototype.arrow=function(){return this.$arrow=this.$arrow||this.tip().find(".tooltip-arrow")},c.prototype.enable=function(){this.enabled=!0},c.prototype.disable=function(){this.enabled=!1},c.prototype.toggleEnabled=function(){this.enabled=!this.enabled},c.prototype.toggle=function(b){var c=this;b&&(c=a(b.currentTarget).data("bs."+this.type),c||(c=new this.constructor(b.currentTarget,this.getDelegateOptions()),a(b.currentTarget).data("bs."+this.type,c))),b?(c.inState.click=!c.inState.click,c.isInStateTrue()?c.enter(c):c.leave(c)):c.tip().hasClass("in")?c.leave(c):c.enter(c)},c.prototype.destroy=function(){var a=this;clearTimeout(this.timeout),this.hide(function(){a.$element.off("."+a.type).removeData("bs."+a.type),a.$tip&&a.$tip.detach(),a.$tip=null,a.$arrow=null,a.$viewport=null,a.$element=null})};var d=a.fn.tooltip;a.fn.tooltip=b,a.fn.tooltip.Constructor=c,a.fn.tooltip.noConflict=function(){return a.fn.tooltip=d,this}}(jQuery),+function(a){"use strict";function b(b){return this.each(function(){var d=a(this),e=d.data("bs.popover"),f="object"==typeof b&&b;!e&&/destroy|hide/.test(b)||(e||d.data("bs.popover",e=new c(this,f)),"string"==typeof b&&e[b]())})}var c=function(a,b){this.init("popover",a,b)};if(!a.fn.tooltip)throw new Error("Popover requires tooltip.js");c.VERSION="3.3.7",c.DEFAULTS=a.extend({},a.fn.tooltip.Constructor.DEFAULTS,{placement:"right",trigger:"click",content:"",template:''}),c.prototype=a.extend({},a.fn.tooltip.Constructor.prototype),c.prototype.constructor=c,c.prototype.getDefaults=function(){return c.DEFAULTS},c.prototype.setContent=function(){var a=this.tip(),b=this.getTitle(),c=this.getContent();a.find(".popover-title")[this.options.html?"html":"text"](b),a.find(".popover-content").children().detach().end()[this.options.html?"string"==typeof c?"html":"append":"text"](c),a.removeClass("fade top bottom left right in"),a.find(".popover-title").html()||a.find(".popover-title").hide()},c.prototype.hasContent=function(){return this.getTitle()||this.getContent()},c.prototype.getContent=function(){var a=this.$element,b=this.options;return a.attr("data-content")||("function"==typeof b.content?b.content.call(a[0]):b.content)},c.prototype.arrow=function(){return this.$arrow=this.$arrow||this.tip().find(".arrow")};var d=a.fn.popover;a.fn.popover=b,a.fn.popover.Constructor=c,a.fn.popover.noConflict=function(){return a.fn.popover=d,this}}(jQuery),+function(a){"use strict";function b(c,d){this.$body=a(document.body),this.$scrollElement=a(a(c).is(document.body)?window:c),this.options=a.extend({},b.DEFAULTS,d),this.selector=(this.options.target||"")+" .nav li > 
a",this.offsets=[],this.targets=[],this.activeTarget=null,this.scrollHeight=0,this.$scrollElement.on("scroll.bs.scrollspy",a.proxy(this.process,this)),this.refresh(),this.process()}function c(c){return this.each(function(){var d=a(this),e=d.data("bs.scrollspy"),f="object"==typeof c&&c;e||d.data("bs.scrollspy",e=new b(this,f)),"string"==typeof c&&e[c]()})}b.VERSION="3.3.7",b.DEFAULTS={offset:10},b.prototype.getScrollHeight=function(){return this.$scrollElement[0].scrollHeight||Math.max(this.$body[0].scrollHeight,document.documentElement.scrollHeight)},b.prototype.refresh=function(){var b=this,c="offset",d=0;this.offsets=[],this.targets=[],this.scrollHeight=this.getScrollHeight(),a.isWindow(this.$scrollElement[0])||(c="position",d=this.$scrollElement.scrollTop()),this.$body.find(this.selector).map(function(){var b=a(this),e=b.data("target")||b.attr("href"),f=/^#./.test(e)&&a(e);return f&&f.length&&f.is(":visible")&&[[f[c]().top+d,e]]||null}).sort(function(a,b){return a[0]-b[0]}).each(function(){b.offsets.push(this[0]),b.targets.push(this[1])})},b.prototype.process=function(){var a,b=this.$scrollElement.scrollTop()+this.options.offset,c=this.getScrollHeight(),d=this.options.offset+c-this.$scrollElement.height(),e=this.offsets,f=this.targets,g=this.activeTarget;if(this.scrollHeight!=c&&this.refresh(),b>=d)return g!=(a=f[f.length-1])&&this.activate(a);if(g&&b=e[a]&&(void 0===e[a+1]||b .dropdown-menu > .active").removeClass("active").end().find('[data-toggle="tab"]').attr("aria-expanded",!1),b.addClass("active").find('[data-toggle="tab"]').attr("aria-expanded",!0),h?(b[0].offsetWidth,b.addClass("in")):b.removeClass("fade"),b.parent(".dropdown-menu").length&&b.closest("li.dropdown").addClass("active").end().find('[data-toggle="tab"]').attr("aria-expanded",!0),e&&e()}var g=d.find("> .active"),h=e&&a.support.transition&&(g.length&&g.hasClass("fade")||!!d.find("> .fade").length);g.length&&h?g.one("bsTransitionEnd",f).emulateTransitionEnd(c.TRANSITION_DURATION):f(),g.removeClass("in")};var d=a.fn.tab;a.fn.tab=b,a.fn.tab.Constructor=c,a.fn.tab.noConflict=function(){return a.fn.tab=d,this};var e=function(c){c.preventDefault(),b.call(a(this),"show")};a(document).on("click.bs.tab.data-api",'[data-toggle="tab"]',e).on("click.bs.tab.data-api",'[data-toggle="pill"]',e)}(jQuery),+function(a){"use strict";function b(b){return this.each(function(){var d=a(this),e=d.data("bs.affix"),f="object"==typeof b&&b;e||d.data("bs.affix",e=new c(this,f)),"string"==typeof b&&e[b]()})}var c=function(b,d){this.options=a.extend({},c.DEFAULTS,d),this.$target=a(this.options.target).on("scroll.bs.affix.data-api",a.proxy(this.checkPosition,this)).on("click.bs.affix.data-api",a.proxy(this.checkPositionWithEventLoop,this)),this.$element=a(b),this.affixed=null,this.unpin=null,this.pinnedOffset=null,this.checkPosition()};c.VERSION="3.3.7",c.RESET="affix affix-top affix-bottom",c.DEFAULTS={offset:0,target:window},c.prototype.getState=function(a,b,c,d){var e=this.$target.scrollTop(),f=this.$element.offset(),g=this.$target.height();if(null!=c&&"top"==this.affixed)return e=a-d&&"bottom"},c.prototype.getPinnedOffset=function(){if(this.pinnedOffset)return this.pinnedOffset;this.$element.removeClass(c.RESET).addClass("affix");var a=this.$target.scrollTop(),b=this.$element.offset();return this.pinnedOffset=b.top-a},c.prototype.checkPositionWithEventLoop=function(){setTimeout(a.proxy(this.checkPosition,this),1)},c.prototype.checkPosition=function(){if(this.$element.is(":visible")){var 
b=this.$element.height(),d=this.options.offset,e=d.top,f=d.bottom,g=Math.max(a(document).height(),a(document.body).height());"object"!=typeof d&&(f=e=d),"function"==typeof e&&(e=d.top(this.$element)),"function"==typeof f&&(f=d.bottom(this.$element));var h=this.getState(g,b,e,f);if(this.affixed!=h){null!=this.unpin&&this.$element.css("top","");var i="affix"+(h?"-"+h:""),j=a.Event(i+".bs.affix");if(this.$element.trigger(j),j.isDefaultPrevented())return;this.affixed=h,this.unpin="bottom"==h?this.getPinnedOffset():null,this.$element.removeClass(c.RESET).addClass(i).trigger(i.replace("affix","affixed")+".bs.affix")}"bottom"==h&&this.$element.offset({top:g-b-f})}};var d=a.fn.affix;a.fn.affix=b,a.fn.affix.Constructor=c,a.fn.affix.noConflict=function(){return a.fn.affix=d,this},a(window).on("load",function(){a('[data-spy="affix"]').each(function(){var c=a(this),d=c.data();d.offset=d.offset||{},null!=d.offsetBottom&&(d.offset.bottom=d.offsetBottom),null!=d.offsetTop&&(d.offset.top=d.offsetTop),b.call(c,d)})})}(jQuery); \ No newline at end of file diff --git a/docs/archive/1.0/sql/tutorial/js/codemirror-sql.js b/docs/archive/1.0/sql/tutorial/js/codemirror-sql.js new file mode 100644 index 00000000000..32ced3e9ded --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/js/codemirror-sql.js @@ -0,0 +1,413 @@ +// CodeMirror, copyright (c) by Marijn Haverbeke and others +// Distributed under an MIT license: http://codemirror.net/LICENSE + +(function(mod) { + if (typeof exports == "object" && typeof module == "object") // CommonJS + mod(require("../../lib/codemirror")); + else if (typeof define == "function" && define.amd) // AMD + define(["../../lib/codemirror"], mod); + else // Plain browser env + mod(CodeMirror); +})(function(CodeMirror) { +"use strict"; + +CodeMirror.defineMode("sql", function(config, parserConfig) { + "use strict"; + + var client = parserConfig.client || {}, + atoms = parserConfig.atoms || {"false": true, "true": true, "null": true}, + builtin = parserConfig.builtin || {}, + keywords = parserConfig.keywords || {}, + operatorChars = parserConfig.operatorChars || /^[*+\-%<>!=&|~^]/, + support = parserConfig.support || {}, + hooks = parserConfig.hooks || {}, + dateSQL = parserConfig.dateSQL || {"date" : true, "time" : true, "timestamp" : true}; + + function tokenBase(stream, state) { + var ch = stream.next(); + + // call hooks from the mime type + if (hooks[ch]) { + var result = hooks[ch](stream, state); + if (result !== false) return result; + } + + if (support.hexNumber && + ((ch == "0" && stream.match(/^[xX][0-9a-fA-F]+/)) + || (ch == "x" || ch == "X") && stream.match(/^'[0-9a-fA-F]+'/))) { + // hex + // ref: http://dev.mysql.com/doc/refman/5.5/en/hexadecimal-literals.html + return "number"; + } else if (support.binaryNumber && + (((ch == "b" || ch == "B") && stream.match(/^'[01]+'/)) + || (ch == "0" && stream.match(/^b[01]+/)))) { + // bitstring + // ref: http://dev.mysql.com/doc/refman/5.5/en/bit-field-literals.html + return "number"; + } else if (ch.charCodeAt(0) > 47 && ch.charCodeAt(0) < 58) { + // numbers + // ref: http://dev.mysql.com/doc/refman/5.5/en/number-literals.html + stream.match(/^[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?/); + support.decimallessFloat && stream.eat('.'); + return "number"; + } else if (ch == "?" 
&& (stream.eatSpace() || stream.eol() || stream.eat(";"))) { + // placeholders + return "variable-3"; + } else if (ch == "'" || (ch == '"' && support.doubleQuote)) { + // strings + // ref: http://dev.mysql.com/doc/refman/5.5/en/string-literals.html + state.tokenize = tokenLiteral(ch); + return state.tokenize(stream, state); + } else if ((((support.nCharCast && (ch == "n" || ch == "N")) + || (support.charsetCast && ch == "_" && stream.match(/[a-z][a-z0-9]*/i))) + && (stream.peek() == "'" || stream.peek() == '"'))) { + // charset casting: _utf8'str', N'str', n'str' + // ref: http://dev.mysql.com/doc/refman/5.5/en/string-literals.html + return "keyword"; + } else if (/^[\(\),\;\[\]]/.test(ch)) { + // no highlighting + return null; + } else if (support.commentSlashSlash && ch == "/" && stream.eat("/")) { + // 1-line comment + stream.skipToEnd(); + return "comment"; + } else if ((support.commentHash && ch == "#") + || (ch == "-" && stream.eat("-") && (!support.commentSpaceRequired || stream.eat(" ")))) { + // 1-line comments + // ref: https://kb.askmonty.org/en/comment-syntax/ + stream.skipToEnd(); + return "comment"; + } else if (ch == "/" && stream.eat("*")) { + // multi-line comments + // ref: https://kb.askmonty.org/en/comment-syntax/ + state.tokenize = tokenComment; + return state.tokenize(stream, state); + } else if (ch == ".") { + // .1 for 0.1 + if (support.zerolessFloat && stream.match(/^(?:\d+(?:e[+-]?\d+)?)/i)) { + return "number"; + } + // .table_name (ODBC) + // // ref: http://dev.mysql.com/doc/refman/5.6/en/identifier-qualifiers.html + if (support.ODBCdotTable && stream.match(/^[a-zA-Z_]+/)) { + return "variable-2"; + } + } else if (operatorChars.test(ch)) { + // operators + stream.eatWhile(operatorChars); + return null; + } else if (ch == '{' && + (stream.match(/^( )*(d|D|t|T|ts|TS)( )*'[^']*'( )*}/) || stream.match(/^( )*(d|D|t|T|ts|TS)( )*"[^"]*"( )*}/))) { + // dates (weird ODBC syntax) + // ref: http://dev.mysql.com/doc/refman/5.5/en/date-and-time-literals.html + return "number"; + } else { + stream.eatWhile(/^[_\w\d]/); + var word = stream.current().toLowerCase(); + // dates (standard SQL syntax) + // ref: http://dev.mysql.com/doc/refman/5.5/en/date-and-time-literals.html + if (dateSQL.hasOwnProperty(word) && (stream.match(/^( )+'[^']*'/) || stream.match(/^( )+"[^"]*"/))) + return "number"; + if (atoms.hasOwnProperty(word)) return "atom"; + if (builtin.hasOwnProperty(word)) return "builtin"; + if (keywords.hasOwnProperty(word)) return "keyword"; + if (client.hasOwnProperty(word)) return "string-2"; + return null; + } + } + + // 'string', with char specified in quote escaped by '\' + function tokenLiteral(quote) { + return function(stream, state) { + var escaped = false, ch; + while ((ch = stream.next()) != null) { + if (ch == quote && !escaped) { + state.tokenize = tokenBase; + break; + } + escaped = !escaped && ch == "\\"; + } + return "string"; + }; + } + function tokenComment(stream, state) { + while (true) { + if (stream.skipTo("*")) { + stream.next(); + if (stream.eat("/")) { + state.tokenize = tokenBase; + break; + } + } else { + stream.skipToEnd(); + break; + } + } + return "comment"; + } + + function pushContext(stream, state, type) { + state.context = { + prev: state.context, + indent: stream.indentation(), + col: stream.column(), + type: type + }; + } + + function popContext(state) { + state.indent = state.context.indent; + state.context = state.context.prev; + } + + return { + startState: function() { + return {tokenize: tokenBase, context: null}; + }, + + token: 
function(stream, state) { + if (stream.sol()) { + if (state.context && state.context.align == null) + state.context.align = false; + } + if (stream.eatSpace()) return null; + + var style = state.tokenize(stream, state); + if (style == "comment") return style; + + if (state.context && state.context.align == null) + state.context.align = true; + + var tok = stream.current(); + if (tok == "(") + pushContext(stream, state, ")"); + else if (tok == "[") + pushContext(stream, state, "]"); + else if (state.context && state.context.type == tok) + popContext(state); + return style; + }, + + indent: function(state, textAfter) { + var cx = state.context; + if (!cx) return CodeMirror.Pass; + var closing = textAfter.charAt(0) == cx.type; + if (cx.align) return cx.col + (closing ? 0 : 1); + else return cx.indent + (closing ? 0 : config.indentUnit); + }, + + blockCommentStart: "/*", + blockCommentEnd: "*/", + lineComment: support.commentSlashSlash ? "//" : support.commentHash ? "#" : null + }; +}); + +(function() { + "use strict"; + + // `identifier` + function hookIdentifier(stream) { + // MySQL/MariaDB identifiers + // ref: http://dev.mysql.com/doc/refman/5.6/en/identifier-qualifiers.html + var ch; + while ((ch = stream.next()) != null) { + if (ch == "`" && !stream.eat("`")) return "variable-2"; + } + stream.backUp(stream.current().length - 1); + return stream.eatWhile(/\w/) ? "variable-2" : null; + } + + // variable token + function hookVar(stream) { + // variables + // @@prefix.varName @varName + // varName can be quoted with ` or ' or " + // ref: http://dev.mysql.com/doc/refman/5.5/en/user-variables.html + if (stream.eat("@")) { + stream.match(/^session\./); + stream.match(/^local\./); + stream.match(/^global\./); + } + + if (stream.eat("'")) { + stream.match(/^.*'/); + return "variable-2"; + } else if (stream.eat('"')) { + stream.match(/^.*"/); + return "variable-2"; + } else if (stream.eat("`")) { + stream.match(/^.*`/); + return "variable-2"; + } else if (stream.match(/^[0-9a-zA-Z$\.\_]+/)) { + return "variable-2"; + } + return null; + }; + + // short client keyword token + function hookClient(stream) { + // \N means NULL + // ref: http://dev.mysql.com/doc/refman/5.5/en/null-values.html + if (stream.eat("N")) { + return "atom"; + } + // \g, etc + // ref: http://dev.mysql.com/doc/refman/5.5/en/mysql-commands.html + return stream.match(/^[a-zA-Z.#!?]/) ? "variable-2" : null; + } + + // these keywords are used by all SQL dialects (however, a mode can still overwrite it) + var sqlKeywords = "alter and as asc between by count create delete desc distinct drop from group having in insert into is join like not on or order select set table union update values where limit "; + + // turn a space-separated list into an array + function set(str) { + var obj = {}, words = str.split(" "); + for (var i = 0; i < words.length; ++i) obj[words[i]] = true; + return obj; + } + + // A generic SQL Mode. 
It's not a standard, it just try to support what is generally supported + CodeMirror.defineMIME("text/x-sql", { + name: "sql", + keywords: set(sqlKeywords + "begin"), + builtin: set("bool boolean bit blob enum long longblob longtext medium mediumblob mediumint mediumtext time timestamp tinyblob tinyint tinytext text bigint int int1 int2 int3 int4 int8 integer float float4 float8 double char varbinary varchar varcharacter precision real date datetime year unsigned signed decimal numeric"), + atoms: set("false true null unknown"), + operatorChars: /^[*+\-%<>!=]/, + dateSQL: set("date time timestamp"), + support: set("ODBCdotTable doubleQuote binaryNumber hexNumber") + }); + + CodeMirror.defineMIME("text/x-mssql", { + name: "sql", + client: set("charset clear connect edit ego exit go help nopager notee nowarning pager print prompt quit rehash source status system tee"), + keywords: set(sqlKeywords + "begin trigger proc view index for add constraint key primary foreign collate clustered nonclustered declare exec"), + builtin: set("bigint numeric bit smallint decimal smallmoney int tinyint money float real char varchar text nchar nvarchar ntext binary varbinary image cursor timestamp hierarchyid uniqueidentifier sql_variant xml table "), + atoms: set("false true null unknown"), + operatorChars: /^[*+\-%<>!=]/, + dateSQL: set("date datetimeoffset datetime2 smalldatetime datetime time"), + hooks: { + "@": hookVar + } + }); + + CodeMirror.defineMIME("text/x-mysql", { + name: "sql", + client: set("charset clear connect edit ego exit go help nopager notee nowarning pager print prompt quit rehash source status system tee"), + keywords: set(sqlKeywords + "accessible action add after algorithm all analyze asensitive at authors auto_increment autocommit avg avg_row_length before binary binlog both btree cache call cascade cascaded case catalog_name chain change changed character check checkpoint checksum class_origin client_statistics close coalesce code collate collation collations column columns comment commit committed completion concurrent condition connection consistent constraint contains continue contributors convert cross current current_date current_time current_timestamp current_user cursor data database databases day_hour day_microsecond day_minute day_second deallocate dec declare default delay_key_write delayed delimiter des_key_file describe deterministic dev_pop dev_samp deviance diagnostics directory disable discard distinctrow div dual dumpfile each elseif enable enclosed end ends engine engines enum errors escape escaped even event events every execute exists exit explain extended fast fetch field fields first flush for force foreign found_rows full fulltext function general get global grant grants group group_concat handler hash help high_priority hosts hour_microsecond hour_minute hour_second if ignore ignore_server_ids import index index_statistics infile inner innodb inout insensitive insert_method install interval invoker isolation iterate key keys kill language last leading leave left level limit linear lines list load local localtime localtimestamp lock logs low_priority master master_heartbeat_period master_ssl_verify_server_cert masters match max max_rows maxvalue message_text middleint migrate min min_rows minute_microsecond minute_second mod mode modifies modify mutex mysql_errno natural next no no_write_to_binlog offline offset one online open optimize option optionally out outer outfile pack_keys parser partition partitions password phase plugin plugins prepare preserve 
prev primary privileges procedure processlist profile profiles purge query quick range read read_write reads real rebuild recover references regexp relaylog release remove rename reorganize repair repeatable replace require resignal restrict resume return returns revoke right rlike rollback rollup row row_format rtree savepoint schedule schema schema_name schemas second_microsecond security sensitive separator serializable server session share show signal slave slow smallint snapshot soname spatial specific sql sql_big_result sql_buffer_result sql_cache sql_calc_found_rows sql_no_cache sql_small_result sqlexception sqlstate sqlwarning ssl start starting starts status std stddev stddev_pop stddev_samp storage straight_join subclass_origin sum suspend table_name table_statistics tables tablespace temporary terminated to trailing transaction trigger triggers truncate uncommitted undo uninstall unique unlock upgrade usage use use_frm user user_resources user_statistics using utc_date utc_time utc_timestamp value variables varying view views warnings when while with work write xa xor year_month zerofill begin do then else loop repeat"), + builtin: set("bool boolean bit blob decimal double float long longblob longtext medium mediumblob mediumint mediumtext time timestamp tinyblob tinyint tinytext text bigint int int1 int2 int3 int4 int8 integer float float4 float8 double char varbinary varchar varcharacter precision date datetime year unsigned signed numeric"), + atoms: set("false true null unknown"), + operatorChars: /^[*+\-%<>!=&|^]/, + dateSQL: set("date time timestamp"), + support: set("ODBCdotTable decimallessFloat zerolessFloat binaryNumber hexNumber doubleQuote nCharCast charsetCast commentHash commentSpaceRequired"), + hooks: { + "@": hookVar, + "`": hookIdentifier, + "\\": hookClient + } + }); + + CodeMirror.defineMIME("text/x-mariadb", { + name: "sql", + client: set("charset clear connect edit ego exit go help nopager notee nowarning pager print prompt quit rehash source status system tee"), + keywords: set(sqlKeywords + "accessible action add after algorithm all always analyze asensitive at authors auto_increment autocommit avg avg_row_length before binary binlog both btree cache call cascade cascaded case catalog_name chain change changed character check checkpoint checksum class_origin client_statistics close coalesce code collate collation collations column columns comment commit committed completion concurrent condition connection consistent constraint contains continue contributors convert cross current current_date current_time current_timestamp current_user cursor data database databases day_hour day_microsecond day_minute day_second deallocate dec declare default delay_key_write delayed delimiter des_key_file describe deterministic dev_pop dev_samp deviance diagnostics directory disable discard distinctrow div dual dumpfile each elseif enable enclosed end ends engine engines enum errors escape escaped even event events every execute exists exit explain extended fast fetch field fields first flush for force foreign found_rows full fulltext function general generated get global grant grants group groupby_concat handler hard hash help high_priority hosts hour_microsecond hour_minute hour_second if ignore ignore_server_ids import index index_statistics infile inner innodb inout insensitive insert_method install interval invoker isolation iterate key keys kill language last leading leave left level limit linear lines list load local localtime localtimestamp lock logs low_priority 
master master_heartbeat_period master_ssl_verify_server_cert masters match max max_rows maxvalue message_text middleint migrate min min_rows minute_microsecond minute_second mod mode modifies modify mutex mysql_errno natural next no no_write_to_binlog offline offset one online open optimize option optionally out outer outfile pack_keys parser partition partitions password persistent phase plugin plugins prepare preserve prev primary privileges procedure processlist profile profiles purge query quick range read read_write reads real rebuild recover references regexp relaylog release remove rename reorganize repair repeatable replace require resignal restrict resume return returns revoke right rlike rollback rollup row row_format rtree savepoint schedule schema schema_name schemas second_microsecond security sensitive separator serializable server session share show shutdown signal slave slow smallint snapshot soft soname spatial specific sql sql_big_result sql_buffer_result sql_cache sql_calc_found_rows sql_no_cache sql_small_result sqlexception sqlstate sqlwarning ssl start starting starts status std stddev stddev_pop stddev_samp storage straight_join subclass_origin sum suspend table_name table_statistics tables tablespace temporary terminated to trailing transaction trigger triggers truncate uncommitted undo uninstall unique unlock upgrade usage use use_frm user user_resources user_statistics using utc_date utc_time utc_timestamp value variables varying view views virtual warnings when while with work write xa xor year_month zerofill begin do then else loop repeat"), + builtin: set("bool boolean bit blob decimal double float long longblob longtext medium mediumblob mediumint mediumtext time timestamp tinyblob tinyint tinytext text bigint int int1 int2 int3 int4 int8 integer float float4 float8 double char varbinary varchar varcharacter precision date datetime year unsigned signed numeric"), + atoms: set("false true null unknown"), + operatorChars: /^[*+\-%<>!=&|^]/, + dateSQL: set("date time timestamp"), + support: set("ODBCdotTable decimallessFloat zerolessFloat binaryNumber hexNumber doubleQuote nCharCast charsetCast commentHash commentSpaceRequired"), + hooks: { + "@": hookVar, + "`": hookIdentifier, + "\\": hookClient + } + }); + + // the query language used by Apache Cassandra is called CQL, but this mime type + // is called Cassandra to avoid confusion with Contextual Query Language + CodeMirror.defineMIME("text/x-cassandra", { + name: "sql", + client: { }, + keywords: set("add all allow alter and any apply as asc authorize batch begin by clustering columnfamily compact consistency count create custom delete desc distinct drop each_quorum exists filtering from grant if in index insert into key keyspace keyspaces level limit local_one local_quorum modify nan norecursive nosuperuser not of on one order password permission permissions primary quorum rename revoke schema select set storage superuser table three to token truncate ttl two type unlogged update use user users using values where with writetime"), + builtin: set("ascii bigint blob boolean counter decimal double float frozen inet int list map static text timestamp timeuuid tuple uuid varchar varint"), + atoms: set("false true infinity NaN"), + operatorChars: /^[<>=]/, + dateSQL: { }, + support: set("commentSlashSlash decimallessFloat"), + hooks: { } + }); + + // this is based on Peter Raganitsch's 'plsql' mode + CodeMirror.defineMIME("text/x-plsql", { + name: "sql", + client: set("appinfo arraysize autocommit autoprint 
autorecovery autotrace blockterminator break btitle cmdsep colsep compatibility compute concat copycommit copytypecheck define describe echo editfile embedded escape exec execute feedback flagger flush heading headsep instance linesize lno loboffset logsource long longchunksize markup native newpage numformat numwidth pagesize pause pno recsep recsepchar release repfooter repheader serveroutput shiftinout show showmode size spool sqlblanklines sqlcase sqlcode sqlcontinue sqlnumber sqlpluscompatibility sqlprefix sqlprompt sqlterminator suffix tab term termout time timing trimout trimspool ttitle underline verify version wrap"), + keywords: set("abort accept access add all alter and any array arraylen as asc assert assign at attributes audit authorization avg base_table begin between binary_integer body boolean by case cast char char_base check close cluster clusters colauth column comment commit compress connect connected constant constraint crash create current currval cursor data_base database date dba deallocate debugoff debugon decimal declare default definition delay delete desc digits dispose distinct do drop else elseif elsif enable end entry escape exception exception_init exchange exclusive exists exit external fast fetch file for force form from function generic goto grant group having identified if immediate in increment index indexes indicator initial initrans insert interface intersect into is key level library like limited local lock log logging long loop master maxextents maxtrans member minextents minus mislabel mode modify multiset new next no noaudit nocompress nologging noparallel not nowait number_base object of off offline on online only open option or order out package parallel partition pctfree pctincrease pctused pls_integer positive positiven pragma primary prior private privileges procedure public raise range raw read rebuild record ref references refresh release rename replace resource restrict return returning returns reverse revoke rollback row rowid rowlabel rownum rows run savepoint schema segment select separate session set share snapshot some space split sql start statement storage subtype successful synonym tabauth table tables tablespace task terminate then to trigger truncate type union unique unlimited unrecoverable unusable update use using validate value values variable view views when whenever where while with work"), + builtin: set("abs acos add_months ascii asin atan atan2 average bfile bfilename bigserial bit blob ceil character chartorowid chr clob concat convert cos cosh count dec decode deref dual dump dup_val_on_index empty error exp false float floor found glb greatest hextoraw initcap instr instrb int integer isopen last_day least length lengthb ln lower lpad ltrim lub make_ref max min mlslabel mod months_between natural naturaln nchar nclob new_time next_day nextval nls_charset_decl_len nls_charset_id nls_charset_name nls_initcap nls_lower nls_sort nls_upper nlssort no_data_found notfound null number numeric nvarchar2 nvl others power rawtohex real reftohex round rowcount rowidtochar rowtype rpad rtrim serial sign signtype sin sinh smallint soundex sqlcode sqlerrm sqrt stddev string substr substrb sum sysdate tan tanh to_char text to_date to_label to_multi_byte to_number to_single_byte translate true trunc uid unlogged upper user userenv varchar varchar2 variance varying vsize xml"), + operatorChars: /^[*+\-%<>!=~]/, + dateSQL: set("date time timestamp"), + support: set("doubleQuote nCharCast zerolessFloat binaryNumber hexNumber") + }); + + 
// Created to support specific hive keywords + CodeMirror.defineMIME("text/x-hive", { + name: "sql", + keywords: set("select alter $elem$ $key$ $value$ add after all analyze and archive as asc before between binary both bucket buckets by cascade case cast change cluster clustered clusterstatus collection column columns comment compute concatenate continue create cross cursor data database databases dbproperties deferred delete delimited desc describe directory disable distinct distribute drop else enable end escaped exclusive exists explain export extended external false fetch fields fileformat first format formatted from full function functions grant group having hold_ddltime idxproperties if import in index indexes inpath inputdriver inputformat insert intersect into is items join keys lateral left like limit lines load local location lock locks mapjoin materialized minus msck no_drop nocompress not of offline on option or order out outer outputdriver outputformat overwrite partition partitioned partitions percent plus preserve procedure purge range rcfile read readonly reads rebuild recordreader recordwriter recover reduce regexp rename repair replace restrict revoke right rlike row schema schemas semi sequencefile serde serdeproperties set shared show show_database sort sorted ssl statistics stored streamtable table tables tablesample tblproperties temporary terminated textfile then tmp to touch transform trigger true unarchive undo union uniquejoin unlock update use using utc utc_tmestamp view when where while with"), + builtin: set("bool boolean long timestamp tinyint smallint bigint int float double date datetime unsigned string array struct map uniontype"), + atoms: set("false true null unknown"), + operatorChars: /^[*+\-%<>!=]/, + dateSQL: set("date timestamp"), + support: set("ODBCdotTable doubleQuote binaryNumber hexNumber") + }); + + CodeMirror.defineMIME("text/x-pgsql", { + name: "sql", + client: set("source"), + // http://www.postgresql.org/docs/9.5/static/sql-keywords-appendix.html + keywords: set(sqlKeywords + "a abort abs absent absolute access according action ada add admin after aggregate all allocate also always analyse analyze any are array array_agg array_max_cardinality asensitive assertion assignment asymmetric at atomic attribute attributes authorization avg backward base64 before begin begin_frame begin_partition bernoulli binary bit_length blob blocked bom both breadth c cache call called cardinality cascade cascaded case cast catalog catalog_name ceil ceiling chain characteristics characters character_length character_set_catalog character_set_name character_set_schema char_length check checkpoint class class_origin clob close cluster coalesce cobol collate collation collation_catalog collation_name collation_schema collect column columns column_name command_function command_function_code comment comments commit committed concurrently condition condition_number configuration conflict connect connection connection_name constraint constraints constraint_catalog constraint_name constraint_schema constructor contains content continue control conversion convert copy corr corresponding cost covar_pop covar_samp cross csv cube cume_dist current current_catalog current_date current_default_transform_group current_path current_role current_row current_schema current_time current_timestamp current_transform_group_for_type current_user cursor cursor_name cycle data database datalink datetime_interval_code datetime_interval_precision day db deallocate dec declare default 
defaults deferrable deferred defined definer degree delimiter delimiters dense_rank depth deref derived describe descriptor deterministic diagnostics dictionary disable discard disconnect dispatch dlnewcopy dlpreviouscopy dlurlcomplete dlurlcompleteonly dlurlcompletewrite dlurlpath dlurlpathonly dlurlpathwrite dlurlscheme dlurlserver dlvalue do document domain dynamic dynamic_function dynamic_function_code each element else empty enable encoding encrypted end end-exec end_frame end_partition enforced enum equals escape event every except exception exclude excluding exclusive exec execute exists exp explain expression extension external extract false family fetch file filter final first first_value flag float floor following for force foreign fortran forward found frame_row free freeze fs full function functions fusion g general generated get global go goto grant granted greatest grouping groups handler header hex hierarchy hold hour id identity if ignore ilike immediate immediately immutable implementation implicit import including increment indent index indexes indicator inherit inherits initially inline inner inout input insensitive instance instantiable instead integrity intersect intersection invoker isnull isolation k key key_member key_type label lag language large last last_value lateral lead leading leakproof least left length level library like_regex link listen ln load local localtime localtimestamp location locator lock locked logged lower m map mapping match matched materialized max maxvalue max_cardinality member merge message_length message_octet_length message_text method min minute minvalue mod mode modifies module month more move multiset mumps name names namespace national natural nchar nclob nesting new next nfc nfd nfkc nfkd nil no none normalize normalized nothing notify notnull nowait nth_value ntile null nullable nullif nulls number object occurrences_regex octets octet_length of off offset oids old only open operator option options ordering ordinality others out outer output over overlaps overlay overriding owned owner p pad parameter parameter_mode parameter_name parameter_ordinal_position parameter_specific_catalog parameter_specific_name parameter_specific_schema parser partial partition pascal passing passthrough password percent percentile_cont percentile_disc percent_rank period permission placing plans pli policy portion position position_regex power precedes preceding prepare prepared preserve primary prior privileges procedural procedure program public quote range rank read reads reassign recheck recovery recursive ref references referencing refresh regr_avgx regr_avgy regr_count regr_intercept regr_r2 regr_slope regr_sxx regr_sxy regr_syy reindex relative release rename repeatable replace replica requiring reset respect restart restore restrict result return returned_cardinality returned_length returned_octet_length returned_sqlstate returning returns revoke right role rollback rollup routine routine_catalog routine_name routine_schema row rows row_count row_number rule savepoint scale schema schema_name scope scope_catalog scope_name scope_schema scroll search second section security selective self sensitive sequence sequences serializable server server_name session session_user setof sets share show similar simple size skip snapshot some source space specific specifictype specific_name sql sqlcode sqlerror sqlexception sqlstate sqlwarning sqrt stable standalone start state statement static statistics stddev_pop stddev_samp stdin stdout storage strict 
strip structure style subclass_origin submultiset substring substring_regex succeeds sum symmetric sysid system system_time system_user t tables tablesample tablespace table_name temp template temporary then ties timezone_hour timezone_minute to token top_level_count trailing transaction transactions_committed transactions_rolled_back transaction_active transform transforms translate translate_regex translation treat trigger trigger_catalog trigger_name trigger_schema trim trim_array true truncate trusted type types uescape unbounded uncommitted under unencrypted unique unknown unlink unlisten unlogged unnamed unnest until untyped upper uri usage user user_defined_type_catalog user_defined_type_code user_defined_type_name user_defined_type_schema using vacuum valid validate validator value value_of varbinary variadic var_pop var_samp verbose version versioning view views volatile when whenever whitespace width_bucket window within work wrapper write xmlagg xmlattributes xmlbinary xmlcast xmlcomment xmlconcat xmldeclaration xmldocument xmlelement xmlexists xmlforest xmliterate xmlnamespaces xmlparse xmlpi xmlquery xmlroot xmlschema xmlserialize xmltable xmltext xmlvalidate year yes loop repeat"), + // http://www.postgresql.org/docs/9.5/static/datatype.html + builtin: set("bigint int8 bigserial serial8 bit varying varbit boolean bool box bytea character char varchar cidr circle date double precision float8 inet integer int int4 interval json jsonb line lseg macaddr money numeric decimal path pg_lsn point polygon real float4 smallint int2 smallserial serial2 serial serial4 text time without zone with timetz timestamp timestamptz tsquery tsvector txid_snapshot uuid xml"), + atoms: set("false true null unknown"), + operatorChars: /^[*+\-%<>!=&|^\/#@?~]/, + dateSQL: set("date time timestamp"), + support: set("ODBCdotTable decimallessFloat zerolessFloat binaryNumber hexNumber nCharCast charsetCast") + }); + + // Google's SQL-like query language, GQL + CodeMirror.defineMIME("text/x-gql", { + name: "sql", + keywords: set("ancestor and asc by contains desc descendant distinct from group has in is limit offset on order select superset where"), + atoms: set("false true"), + builtin: set("blob datetime first key __key__ string integer double boolean null"), + operatorChars: /^[*+\-%<>!=]/ + }); +}()); + +}); + +/* + How Properties of Mime Types are used by SQL Mode + ================================================= + + keywords: + A list of keywords you want to be highlighted. + builtin: + A list of builtin types you want to be highlighted (if you want types to be of class "builtin" instead of "keyword"). + operatorChars: + All characters that must be handled as operators. + client: + Commands parsed and executed by the client (not the server). + support: + A list of supported syntaxes which are not common, but are supported by more than 1 DBMS. + * ODBCdotTable: .tableName + * zerolessFloat: .1 + * doubleQuote + * nCharCast: N'string' + * charsetCast: _utf8'string' + * commentHash: use # char for comments + * commentSlashSlash: use // for comments + * commentSpaceRequired: require a space after -- for comments + atoms: + Keywords that must be highlighted as atoms,. Some DBMS's support more atoms than others: + UNKNOWN, INFINITY, UNDERFLOW, NaN... + dateSQL: + Used for date/time SQL standard syntax, because not all DBMS's support same temporal types. 
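+
+  As an illustrative sketch only (not part of this mode; the mime name
+  "text/x-mydialect" and its word lists are hypothetical), a dialect can be
+  registered the same way the built-in ones above are, using the file-local
+  set() helper to build the word lists:
+
+    CodeMirror.defineMIME("text/x-mydialect", {
+      name: "sql",
+      keywords: set("select insert update delete from where group by order"),
+      builtin: set("int bigint varchar date"),
+      atoms: set("false true null"),
+      operatorChars: /^[*+\-%<>!=]/,
+      dateSQL: set("date time timestamp"),
+      support: set("doubleQuote zerolessFloat")
+    });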
+*/ diff --git a/docs/archive/1.0/sql/tutorial/js/codemirror.js b/docs/archive/1.0/sql/tutorial/js/codemirror.js new file mode 100644 index 00000000000..1e151c1f25f --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/js/codemirror.js @@ -0,0 +1,9231 @@ +// CodeMirror, copyright (c) by Marijn Haverbeke and others +// Distributed under an MIT license: http://codemirror.net/LICENSE + +// This is CodeMirror (http://codemirror.net), a code editor +// implemented in JavaScript on top of the browser's DOM. +// +// You can find some technical background for some of the code below +// at http://marijnhaverbeke.nl/blog/#cm-internals . + +(function (global, factory) { + typeof exports === 'object' && typeof module !== 'undefined' ? module.exports = factory() : + typeof define === 'function' && define.amd ? define(factory) : + (global.CodeMirror = factory()); +}(this, (function () { 'use strict'; + +// Kludges for bugs and behavior differences that can't be feature +// detected are enabled based on userAgent etc sniffing. +var userAgent = navigator.userAgent +var platform = navigator.platform + +var gecko = /gecko\/\d/i.test(userAgent) +var ie_upto10 = /MSIE \d/.test(userAgent) +var ie_11up = /Trident\/(?:[7-9]|\d{2,})\..*rv:(\d+)/.exec(userAgent) +var edge = /Edge\/(\d+)/.exec(userAgent) +var ie = ie_upto10 || ie_11up || edge +var ie_version = ie && (ie_upto10 ? document.documentMode || 6 : +(edge || ie_11up)[1]) +var webkit = !edge && /WebKit\//.test(userAgent) +var qtwebkit = webkit && /Qt\/\d+\.\d+/.test(userAgent) +var chrome = !edge && /Chrome\//.test(userAgent) +var presto = /Opera\//.test(userAgent) +var safari = /Apple Computer/.test(navigator.vendor) +var mac_geMountainLion = /Mac OS X 1\d\D([8-9]|\d\d)\D/.test(userAgent) +var phantom = /PhantomJS/.test(userAgent) + +var ios = !edge && /AppleWebKit/.test(userAgent) && /Mobile\/\w+/.test(userAgent) +// This is woefully incomplete. Suggestions for alternative methods welcome. +var mobile = ios || /Android|webOS|BlackBerry|Opera Mini|Opera Mobi|IEMobile/i.test(userAgent) +var mac = ios || /Mac/.test(platform) +var chromeOS = /\bCrOS\b/.test(userAgent) +var windows = /win/i.test(platform) + +var presto_version = presto && userAgent.match(/Version\/(\d*\.\d*)/) +if (presto_version) { presto_version = Number(presto_version[1]) } +if (presto_version && presto_version >= 15) { presto = false; webkit = true } +// Some browsers use the wrong event properties to signal cmd/ctrl on OS X +var flipCtrlCmd = mac && (qtwebkit || presto && (presto_version == null || presto_version < 12.11)) +var captureRightClick = gecko || (ie && ie_version >= 9) + +function classTest(cls) { return new RegExp("(^|\\s)" + cls + "(?:$|\\s)\\s*") } + +var rmClass = function(node, cls) { + var current = node.className + var match = classTest(cls).exec(current) + if (match) { + var after = current.slice(match.index + match[0].length) + node.className = current.slice(0, match.index) + (after ? 
match[1] + after : "") + } +} + +function removeChildren(e) { + for (var count = e.childNodes.length; count > 0; --count) + { e.removeChild(e.firstChild) } + return e +} + +function removeChildrenAndAdd(parent, e) { + return removeChildren(parent).appendChild(e) +} + +function elt(tag, content, className, style) { + var e = document.createElement(tag) + if (className) { e.className = className } + if (style) { e.style.cssText = style } + if (typeof content == "string") { e.appendChild(document.createTextNode(content)) } + else if (content) { for (var i = 0; i < content.length; ++i) { e.appendChild(content[i]) } } + return e +} + +var range +if (document.createRange) { range = function(node, start, end, endNode) { + var r = document.createRange() + r.setEnd(endNode || node, end) + r.setStart(node, start) + return r +} } +else { range = function(node, start, end) { + var r = document.body.createTextRange() + try { r.moveToElementText(node.parentNode) } + catch(e) { return r } + r.collapse(true) + r.moveEnd("character", end) + r.moveStart("character", start) + return r +} } + +function contains(parent, child) { + if (child.nodeType == 3) // Android browser always returns false when child is a textnode + { child = child.parentNode } + if (parent.contains) + { return parent.contains(child) } + do { + if (child.nodeType == 11) { child = child.host } + if (child == parent) { return true } + } while (child = child.parentNode) +} + +function activeElt() { + // IE and Edge may throw an "Unspecified Error" when accessing document.activeElement. + // IE < 10 will throw when accessed while the page is loading or in an iframe. + // IE > 9 and Edge will throw when accessed in an iframe if document.body is unavailable. + var activeElement + try { + activeElement = document.activeElement + } catch(e) { + activeElement = document.body || null + } + while (activeElement && activeElement.root && activeElement.root.activeElement) + { activeElement = activeElement.root.activeElement } + return activeElement +} + +function addClass(node, cls) { + var current = node.className + if (!classTest(cls).test(current)) { node.className += (current ? " " : "") + cls } +} +function joinClasses(a, b) { + var as = a.split(" ") + for (var i = 0; i < as.length; i++) + { if (as[i] && !classTest(as[i]).test(b)) { b += " " + as[i] } } + return b +} + +var selectInput = function(node) { node.select() } +if (ios) // Mobile Safari apparently has a bug where select() is broken. + { selectInput = function(node) { node.selectionStart = 0; node.selectionEnd = node.value.length } } +else if (ie) // Suppress mysterious IE10 errors + { selectInput = function(node) { try { node.select() } catch(_e) {} } } + +function bind(f) { + var args = Array.prototype.slice.call(arguments, 1) + return function(){return f.apply(null, args)} +} + +function copyObj(obj, target, overwrite) { + if (!target) { target = {} } + for (var prop in obj) + { if (obj.hasOwnProperty(prop) && (overwrite !== false || !target.hasOwnProperty(prop))) + { target[prop] = obj[prop] } } + return target +} + +// Counts the column offset in a string, taking tabs into account. +// Used mostly to find indentation. 
+function countColumn(string, end, tabSize, startIndex, startValue) { + if (end == null) { + end = string.search(/[^\s\u00a0]/) + if (end == -1) { end = string.length } + } + for (var i = startIndex || 0, n = startValue || 0;;) { + var nextTab = string.indexOf("\t", i) + if (nextTab < 0 || nextTab >= end) + { return n + (end - i) } + n += nextTab - i + n += tabSize - (n % tabSize) + i = nextTab + 1 + } +} + +var Delayed = function() {this.id = null}; +Delayed.prototype.set = function (ms, f) { + clearTimeout(this.id) + this.id = setTimeout(f, ms) +}; + +function indexOf(array, elt) { + for (var i = 0; i < array.length; ++i) + { if (array[i] == elt) { return i } } + return -1 +} + +// Number of pixels added to scroller and sizer to hide scrollbar +var scrollerGap = 30 + +// Returned or thrown by various protocols to signal 'I'm not +// handling this'. +var Pass = {toString: function(){return "CodeMirror.Pass"}} + +// Reused option objects for setSelection & friends +var sel_dontScroll = {scroll: false}; +var sel_mouse = {origin: "*mouse"}; +var sel_move = {origin: "+move"}; +// The inverse of countColumn -- find the offset that corresponds to +// a particular column. +function findColumn(string, goal, tabSize) { + for (var pos = 0, col = 0;;) { + var nextTab = string.indexOf("\t", pos) + if (nextTab == -1) { nextTab = string.length } + var skipped = nextTab - pos + if (nextTab == string.length || col + skipped >= goal) + { return pos + Math.min(skipped, goal - col) } + col += nextTab - pos + col += tabSize - (col % tabSize) + pos = nextTab + 1 + if (col >= goal) { return pos } + } +} + +var spaceStrs = [""] +function spaceStr(n) { + while (spaceStrs.length <= n) + { spaceStrs.push(lst(spaceStrs) + " ") } + return spaceStrs[n] +} + +function lst(arr) { return arr[arr.length-1] } + +function map(array, f) { + var out = [] + for (var i = 0; i < array.length; i++) { out[i] = f(array[i], i) } + return out +} + +function insertSorted(array, value, score) { + var pos = 0, priority = score(value) + while (pos < array.length && score(array[pos]) <= priority) { pos++ } + array.splice(pos, 0, value) +} + +function nothing() {} + +function createObj(base, props) { + var inst + if (Object.create) { + inst = Object.create(base) + } else { + nothing.prototype = base + inst = new nothing() + } + if (props) { copyObj(props, inst) } + return inst +} + +var nonASCIISingleCaseWordChar = /[\u00df\u0587\u0590-\u05f4\u0600-\u06ff\u3040-\u309f\u30a0-\u30ff\u3400-\u4db5\u4e00-\u9fcc\uac00-\ud7af]/ +function isWordCharBasic(ch) { + return /\w/.test(ch) || ch > "\x80" && + (ch.toUpperCase() != ch.toLowerCase() || nonASCIISingleCaseWordChar.test(ch)) +} +function isWordChar(ch, helper) { + if (!helper) { return isWordCharBasic(ch) } + if (helper.source.indexOf("\\w") > -1 && isWordCharBasic(ch)) { return true } + return helper.test(ch) +} + +function isEmpty(obj) { + for (var n in obj) { if (obj.hasOwnProperty(n) && obj[n]) { return false } } + return true +} + +// Extending unicode characters. A series of a non-extending char + +// any number of extending chars is treated as a single unit as far +// as editing and measuring is concerned. This is not fully correct, +// since some scripts/fonts/browsers also treat other configurations +// of code points as a group. 
+var extendingChars = /[\u0300-\u036f\u0483-\u0489\u0591-\u05bd\u05bf\u05c1\u05c2\u05c4\u05c5\u05c7\u0610-\u061a\u064b-\u065e\u0670\u06d6-\u06dc\u06de-\u06e4\u06e7\u06e8\u06ea-\u06ed\u0711\u0730-\u074a\u07a6-\u07b0\u07eb-\u07f3\u0816-\u0819\u081b-\u0823\u0825-\u0827\u0829-\u082d\u0900-\u0902\u093c\u0941-\u0948\u094d\u0951-\u0955\u0962\u0963\u0981\u09bc\u09be\u09c1-\u09c4\u09cd\u09d7\u09e2\u09e3\u0a01\u0a02\u0a3c\u0a41\u0a42\u0a47\u0a48\u0a4b-\u0a4d\u0a51\u0a70\u0a71\u0a75\u0a81\u0a82\u0abc\u0ac1-\u0ac5\u0ac7\u0ac8\u0acd\u0ae2\u0ae3\u0b01\u0b3c\u0b3e\u0b3f\u0b41-\u0b44\u0b4d\u0b56\u0b57\u0b62\u0b63\u0b82\u0bbe\u0bc0\u0bcd\u0bd7\u0c3e-\u0c40\u0c46-\u0c48\u0c4a-\u0c4d\u0c55\u0c56\u0c62\u0c63\u0cbc\u0cbf\u0cc2\u0cc6\u0ccc\u0ccd\u0cd5\u0cd6\u0ce2\u0ce3\u0d3e\u0d41-\u0d44\u0d4d\u0d57\u0d62\u0d63\u0dca\u0dcf\u0dd2-\u0dd4\u0dd6\u0ddf\u0e31\u0e34-\u0e3a\u0e47-\u0e4e\u0eb1\u0eb4-\u0eb9\u0ebb\u0ebc\u0ec8-\u0ecd\u0f18\u0f19\u0f35\u0f37\u0f39\u0f71-\u0f7e\u0f80-\u0f84\u0f86\u0f87\u0f90-\u0f97\u0f99-\u0fbc\u0fc6\u102d-\u1030\u1032-\u1037\u1039\u103a\u103d\u103e\u1058\u1059\u105e-\u1060\u1071-\u1074\u1082\u1085\u1086\u108d\u109d\u135f\u1712-\u1714\u1732-\u1734\u1752\u1753\u1772\u1773\u17b7-\u17bd\u17c6\u17c9-\u17d3\u17dd\u180b-\u180d\u18a9\u1920-\u1922\u1927\u1928\u1932\u1939-\u193b\u1a17\u1a18\u1a56\u1a58-\u1a5e\u1a60\u1a62\u1a65-\u1a6c\u1a73-\u1a7c\u1a7f\u1b00-\u1b03\u1b34\u1b36-\u1b3a\u1b3c\u1b42\u1b6b-\u1b73\u1b80\u1b81\u1ba2-\u1ba5\u1ba8\u1ba9\u1c2c-\u1c33\u1c36\u1c37\u1cd0-\u1cd2\u1cd4-\u1ce0\u1ce2-\u1ce8\u1ced\u1dc0-\u1de6\u1dfd-\u1dff\u200c\u200d\u20d0-\u20f0\u2cef-\u2cf1\u2de0-\u2dff\u302a-\u302f\u3099\u309a\ua66f-\ua672\ua67c\ua67d\ua6f0\ua6f1\ua802\ua806\ua80b\ua825\ua826\ua8c4\ua8e0-\ua8f1\ua926-\ua92d\ua947-\ua951\ua980-\ua982\ua9b3\ua9b6-\ua9b9\ua9bc\uaa29-\uaa2e\uaa31\uaa32\uaa35\uaa36\uaa43\uaa4c\uaab0\uaab2-\uaab4\uaab7\uaab8\uaabe\uaabf\uaac1\uabe5\uabe8\uabed\udc00-\udfff\ufb1e\ufe00-\ufe0f\ufe20-\ufe26\uff9e\uff9f]/ +function isExtendingChar(ch) { return ch.charCodeAt(0) >= 768 && extendingChars.test(ch) } + +// Returns a number from the range [`0`; `str.length`] unless `pos` is outside that range. +function skipExtendingChars(str, pos, dir) { + while ((dir < 0 ? pos > 0 : pos < str.length) && isExtendingChar(str.charAt(pos))) { pos += dir } + return pos +} + +// Returns the value from the range [`from`; `to`] that satisfies +// `pred` and is closest to `from`. Assumes that at least `to` satisfies `pred`. +function findFirst(pred, from, to) { + for (;;) { + if (Math.abs(from - to) <= 1) { return pred(from) ? from : to } + var mid = Math.floor((from + to) / 2) + if (pred(mid)) { to = mid } + else { from = mid } + } +} + +// The display handles the DOM integration, both for input reading +// and content drawing. It holds references to DOM nodes and +// display-related state. + +function Display(place, doc, input) { + var d = this + this.input = input + + // Covers bottom-right square when both scrollbars are present. + d.scrollbarFiller = elt("div", null, "CodeMirror-scrollbar-filler") + d.scrollbarFiller.setAttribute("cm-not-content", "true") + // Covers bottom of gutter when coverGutterNextToScrollbar is on + // and h scrollbar is present. + d.gutterFiller = elt("div", null, "CodeMirror-gutter-filler") + d.gutterFiller.setAttribute("cm-not-content", "true") + // Will contain the actual code, positioned to cover the viewport. + d.lineDiv = elt("div", null, "CodeMirror-code") + // Elements are added to these to represent selection and cursors. 
+ d.selectionDiv = elt("div", null, null, "position: relative; z-index: 1") + d.cursorDiv = elt("div", null, "CodeMirror-cursors") + // A visibility: hidden element used to find the size of things. + d.measure = elt("div", null, "CodeMirror-measure") + // When lines outside of the viewport are measured, they are drawn in this. + d.lineMeasure = elt("div", null, "CodeMirror-measure") + // Wraps everything that needs to exist inside the vertically-padded coordinate system + d.lineSpace = elt("div", [d.measure, d.lineMeasure, d.selectionDiv, d.cursorDiv, d.lineDiv], + null, "position: relative; outline: none") + // Moved around its parent to cover visible view. + d.mover = elt("div", [elt("div", [d.lineSpace], "CodeMirror-lines")], null, "position: relative") + // Set to the height of the document, allowing scrolling. + d.sizer = elt("div", [d.mover], "CodeMirror-sizer") + d.sizerWidth = null + // Behavior of elts with overflow: auto and padding is + // inconsistent across browsers. This is used to ensure the + // scrollable area is big enough. + d.heightForcer = elt("div", null, null, "position: absolute; height: " + scrollerGap + "px; width: 1px;") + // Will contain the gutters, if any. + d.gutters = elt("div", null, "CodeMirror-gutters") + d.lineGutter = null + // Actual scrollable element. + d.scroller = elt("div", [d.sizer, d.heightForcer, d.gutters], "CodeMirror-scroll") + d.scroller.setAttribute("tabIndex", "-1") + // The element in which the editor lives. + d.wrapper = elt("div", [d.scrollbarFiller, d.gutterFiller, d.scroller], "CodeMirror") + + // Work around IE7 z-index bug (not perfect, hence IE7 not really being supported) + if (ie && ie_version < 8) { d.gutters.style.zIndex = -1; d.scroller.style.paddingRight = 0 } + if (!webkit && !(gecko && mobile)) { d.scroller.draggable = true } + + if (place) { + if (place.appendChild) { place.appendChild(d.wrapper) } + else { place(d.wrapper) } + } + + // Current rendered range (may be bigger than the view window). + d.viewFrom = d.viewTo = doc.first + d.reportedViewFrom = d.reportedViewTo = doc.first + // Information about the rendered lines. + d.view = [] + d.renderedView = null + // Holds info about a single rendered line when it was rendered + // for measurement, while not in view. + d.externalMeasured = null + // Empty space (in pixels) above the view + d.viewOffset = 0 + d.lastWrapHeight = d.lastWrapWidth = 0 + d.updateLineNumbers = null + + d.nativeBarWidth = d.barHeight = d.barWidth = 0 + d.scrollbarsClipped = false + + // Used to only resize the line number gutter when necessary (when + // the amount of lines crosses a boundary that makes its width change) + d.lineNumWidth = d.lineNumInnerWidth = d.lineNumChars = null + // Set to true when a non-horizontal-scrolling line widget is + // added. As an optimization, line widget aligning is skipped when + // this is false. + d.alignWidgets = false + + d.cachedCharWidth = d.cachedTextHeight = d.cachedPaddingH = null + + // Tracks the maximum line length so that the horizontal scrollbar + // can be kept static when scrolling. + d.maxLine = null + d.maxLineLength = 0 + d.maxLineChanged = false + + // Used for measuring wheel scrolling granularity + d.wheelDX = d.wheelDY = d.wheelStartX = d.wheelStartY = null + + // True when shift is held down. + d.shift = false + + // Used to track whether anything happened since the context menu + // was opened. + d.selForContextMenu = null + + d.activeTouch = null + + input.init(d) +} + +// Find the line object corresponding to the given line number. 
+function getLine(doc, n) { + n -= doc.first + if (n < 0 || n >= doc.size) { throw new Error("There is no line " + (n + doc.first) + " in the document.") } + var chunk = doc + while (!chunk.lines) { + for (var i = 0;; ++i) { + var child = chunk.children[i], sz = child.chunkSize() + if (n < sz) { chunk = child; break } + n -= sz + } + } + return chunk.lines[n] +} + +// Get the part of a document between two positions, as an array of +// strings. +function getBetween(doc, start, end) { + var out = [], n = start.line + doc.iter(start.line, end.line + 1, function (line) { + var text = line.text + if (n == end.line) { text = text.slice(0, end.ch) } + if (n == start.line) { text = text.slice(start.ch) } + out.push(text) + ++n + }) + return out +} +// Get the lines between from and to, as array of strings. +function getLines(doc, from, to) { + var out = [] + doc.iter(from, to, function (line) { out.push(line.text) }) // iter aborts when callback returns truthy value + return out +} + +// Update the height of a line, propagating the height change +// upwards to parent nodes. +function updateLineHeight(line, height) { + var diff = height - line.height + if (diff) { for (var n = line; n; n = n.parent) { n.height += diff } } +} + +// Given a line object, find its line number by walking up through +// its parent links. +function lineNo(line) { + if (line.parent == null) { return null } + var cur = line.parent, no = indexOf(cur.lines, line) + for (var chunk = cur.parent; chunk; cur = chunk, chunk = chunk.parent) { + for (var i = 0;; ++i) { + if (chunk.children[i] == cur) { break } + no += chunk.children[i].chunkSize() + } + } + return no + cur.first +} + +// Find the line at the given vertical position, using the height +// information in the document tree. +function lineAtHeight(chunk, h) { + var n = chunk.first + outer: do { + for (var i$1 = 0; i$1 < chunk.children.length; ++i$1) { + var child = chunk.children[i$1], ch = child.height + if (h < ch) { chunk = child; continue outer } + h -= ch + n += child.chunkSize() + } + return n + } while (!chunk.lines) + var i = 0 + for (; i < chunk.lines.length; ++i) { + var line = chunk.lines[i], lh = line.height + if (h < lh) { break } + h -= lh + } + return n + i +} + +function isLine(doc, l) {return l >= doc.first && l < doc.first + doc.size} + +function lineNumberFor(options, i) { + return String(options.lineNumberFormatter(i + options.firstLineNumber)) +} + +// A Pos instance represents a position within the text. +function Pos(line, ch, sticky) { + if ( sticky === void 0 ) sticky = null; + + if (!(this instanceof Pos)) { return new Pos(line, ch, sticky) } + this.line = line + this.ch = ch + this.sticky = sticky +} + +// Compare two positions, return 0 if they are the same, a negative +// number when a is less, and a positive number otherwise. +function cmp(a, b) { return a.line - b.line || a.ch - b.ch } + +function equalCursorPos(a, b) { return a.sticky == b.sticky && cmp(a, b) == 0 } + +function copyPos(x) {return Pos(x.line, x.ch)} +function maxPos(a, b) { return cmp(a, b) < 0 ? b : a } +function minPos(a, b) { return cmp(a, b) < 0 ? a : b } + +// Most of the external API clips given positions to make sure they +// actually exist within the document. 
+function clipLine(doc, n) {return Math.max(doc.first, Math.min(n, doc.first + doc.size - 1))} +function clipPos(doc, pos) { + if (pos.line < doc.first) { return Pos(doc.first, 0) } + var last = doc.first + doc.size - 1 + if (pos.line > last) { return Pos(last, getLine(doc, last).text.length) } + return clipToLen(pos, getLine(doc, pos.line).text.length) +} +function clipToLen(pos, linelen) { + var ch = pos.ch + if (ch == null || ch > linelen) { return Pos(pos.line, linelen) } + else if (ch < 0) { return Pos(pos.line, 0) } + else { return pos } +} +function clipPosArray(doc, array) { + var out = [] + for (var i = 0; i < array.length; i++) { out[i] = clipPos(doc, array[i]) } + return out +} + +// Optimize some code when these features are not used. +var sawReadOnlySpans = false; +var sawCollapsedSpans = false; +function seeReadOnlySpans() { + sawReadOnlySpans = true +} + +function seeCollapsedSpans() { + sawCollapsedSpans = true +} + +// TEXTMARKER SPANS + +function MarkedSpan(marker, from, to) { + this.marker = marker + this.from = from; this.to = to +} + +// Search an array of spans for a span matching the given marker. +function getMarkedSpanFor(spans, marker) { + if (spans) { for (var i = 0; i < spans.length; ++i) { + var span = spans[i] + if (span.marker == marker) { return span } + } } +} +// Remove a span from an array, returning undefined if no spans are +// left (we don't store arrays for lines without spans). +function removeMarkedSpan(spans, span) { + var r + for (var i = 0; i < spans.length; ++i) + { if (spans[i] != span) { (r || (r = [])).push(spans[i]) } } + return r +} +// Add a span to a line. +function addMarkedSpan(line, span) { + line.markedSpans = line.markedSpans ? line.markedSpans.concat([span]) : [span] + span.marker.attachLine(line) +} + +// Used for the algorithm that adjusts markers for a change in the +// document. These functions cut an array of spans at a given +// character position, returning an array of remaining chunks (or +// undefined if nothing remains). +function markedSpansBefore(old, startCh, isInsert) { + var nw + if (old) { for (var i = 0; i < old.length; ++i) { + var span = old[i], marker = span.marker + var startsBefore = span.from == null || (marker.inclusiveLeft ? span.from <= startCh : span.from < startCh) + if (startsBefore || span.from == startCh && marker.type == "bookmark" && (!isInsert || !span.marker.insertLeft)) { + var endsAfter = span.to == null || (marker.inclusiveRight ? span.to >= startCh : span.to > startCh) + ;(nw || (nw = [])).push(new MarkedSpan(marker, span.from, endsAfter ? null : span.to)) + } + } } + return nw +} +function markedSpansAfter(old, endCh, isInsert) { + var nw + if (old) { for (var i = 0; i < old.length; ++i) { + var span = old[i], marker = span.marker + var endsAfter = span.to == null || (marker.inclusiveRight ? span.to >= endCh : span.to > endCh) + if (endsAfter || span.from == endCh && marker.type == "bookmark" && (!isInsert || span.marker.insertLeft)) { + var startsBefore = span.from == null || (marker.inclusiveLeft ? span.from <= endCh : span.from < endCh) + ;(nw || (nw = [])).push(new MarkedSpan(marker, startsBefore ? null : span.from - endCh, + span.to == null ? null : span.to - endCh)) + } + } } + return nw +} + +// Given a change object, compute the new set of marker spans that +// cover the line in which the change took place. 
Removes spans +// entirely within the change, reconnects spans belonging to the +// same marker that appear on both sides of the change, and cuts off +// spans partially within the change. Returns an array of span +// arrays with one element for each line in (after) the change. +function stretchSpansOverChange(doc, change) { + if (change.full) { return null } + var oldFirst = isLine(doc, change.from.line) && getLine(doc, change.from.line).markedSpans + var oldLast = isLine(doc, change.to.line) && getLine(doc, change.to.line).markedSpans + if (!oldFirst && !oldLast) { return null } + + var startCh = change.from.ch, endCh = change.to.ch, isInsert = cmp(change.from, change.to) == 0 + // Get the spans that 'stick out' on both sides + var first = markedSpansBefore(oldFirst, startCh, isInsert) + var last = markedSpansAfter(oldLast, endCh, isInsert) + + // Next, merge those two ends + var sameLine = change.text.length == 1, offset = lst(change.text).length + (sameLine ? startCh : 0) + if (first) { + // Fix up .to properties of first + for (var i = 0; i < first.length; ++i) { + var span = first[i] + if (span.to == null) { + var found = getMarkedSpanFor(last, span.marker) + if (!found) { span.to = startCh } + else if (sameLine) { span.to = found.to == null ? null : found.to + offset } + } + } + } + if (last) { + // Fix up .from in last (or move them into first in case of sameLine) + for (var i$1 = 0; i$1 < last.length; ++i$1) { + var span$1 = last[i$1] + if (span$1.to != null) { span$1.to += offset } + if (span$1.from == null) { + var found$1 = getMarkedSpanFor(first, span$1.marker) + if (!found$1) { + span$1.from = offset + if (sameLine) { (first || (first = [])).push(span$1) } + } + } else { + span$1.from += offset + if (sameLine) { (first || (first = [])).push(span$1) } + } + } + } + // Make sure we didn't create any zero-length spans + if (first) { first = clearEmptySpans(first) } + if (last && last != first) { last = clearEmptySpans(last) } + + var newMarkers = [first] + if (!sameLine) { + // Fill gap with whole-line-spans + var gap = change.text.length - 2, gapMarkers + if (gap > 0 && first) + { for (var i$2 = 0; i$2 < first.length; ++i$2) + { if (first[i$2].to == null) + { (gapMarkers || (gapMarkers = [])).push(new MarkedSpan(first[i$2].marker, null, null)) } } } + for (var i$3 = 0; i$3 < gap; ++i$3) + { newMarkers.push(gapMarkers) } + newMarkers.push(last) + } + return newMarkers +} + +// Remove spans that are empty and don't have a clearWhenEmpty +// option of false. +function clearEmptySpans(spans) { + for (var i = 0; i < spans.length; ++i) { + var span = spans[i] + if (span.from != null && span.from == span.to && span.marker.clearWhenEmpty !== false) + { spans.splice(i--, 1) } + } + if (!spans.length) { return null } + return spans +} + +// Used to 'clip' out readOnly ranges when making a change. 
+function removeReadOnlyRanges(doc, from, to) { + var markers = null + doc.iter(from.line, to.line + 1, function (line) { + if (line.markedSpans) { for (var i = 0; i < line.markedSpans.length; ++i) { + var mark = line.markedSpans[i].marker + if (mark.readOnly && (!markers || indexOf(markers, mark) == -1)) + { (markers || (markers = [])).push(mark) } + } } + }) + if (!markers) { return null } + var parts = [{from: from, to: to}] + for (var i = 0; i < markers.length; ++i) { + var mk = markers[i], m = mk.find(0) + for (var j = 0; j < parts.length; ++j) { + var p = parts[j] + if (cmp(p.to, m.from) < 0 || cmp(p.from, m.to) > 0) { continue } + var newParts = [j, 1], dfrom = cmp(p.from, m.from), dto = cmp(p.to, m.to) + if (dfrom < 0 || !mk.inclusiveLeft && !dfrom) + { newParts.push({from: p.from, to: m.from}) } + if (dto > 0 || !mk.inclusiveRight && !dto) + { newParts.push({from: m.to, to: p.to}) } + parts.splice.apply(parts, newParts) + j += newParts.length - 3 + } + } + return parts +} + +// Connect or disconnect spans from a line. +function detachMarkedSpans(line) { + var spans = line.markedSpans + if (!spans) { return } + for (var i = 0; i < spans.length; ++i) + { spans[i].marker.detachLine(line) } + line.markedSpans = null +} +function attachMarkedSpans(line, spans) { + if (!spans) { return } + for (var i = 0; i < spans.length; ++i) + { spans[i].marker.attachLine(line) } + line.markedSpans = spans +} + +// Helpers used when computing which overlapping collapsed span +// counts as the larger one. +function extraLeft(marker) { return marker.inclusiveLeft ? -1 : 0 } +function extraRight(marker) { return marker.inclusiveRight ? 1 : 0 } + +// Returns a number indicating which of two overlapping collapsed +// spans is larger (and thus includes the other). Falls back to +// comparing ids when the spans cover exactly the same range. +function compareCollapsedMarkers(a, b) { + var lenDiff = a.lines.length - b.lines.length + if (lenDiff != 0) { return lenDiff } + var aPos = a.find(), bPos = b.find() + var fromCmp = cmp(aPos.from, bPos.from) || extraLeft(a) - extraLeft(b) + if (fromCmp) { return -fromCmp } + var toCmp = cmp(aPos.to, bPos.to) || extraRight(a) - extraRight(b) + if (toCmp) { return toCmp } + return b.id - a.id +} + +// Find out whether a line ends or starts in a collapsed span. If +// so, return the marker for that span. +function collapsedSpanAtSide(line, start) { + var sps = sawCollapsedSpans && line.markedSpans, found + if (sps) { for (var sp = (void 0), i = 0; i < sps.length; ++i) { + sp = sps[i] + if (sp.marker.collapsed && (start ? sp.from : sp.to) == null && + (!found || compareCollapsedMarkers(found, sp.marker) < 0)) + { found = sp.marker } + } } + return found +} +function collapsedSpanAtStart(line) { return collapsedSpanAtSide(line, true) } +function collapsedSpanAtEnd(line) { return collapsedSpanAtSide(line, false) } + +// Test whether there exists a collapsed span that partially +// overlaps (covers the start or end, but not both) of a new span. +// Such overlap is not allowed. 
+function conflictingCollapsedRange(doc, lineNo, from, to, marker) { + var line = getLine(doc, lineNo) + var sps = sawCollapsedSpans && line.markedSpans + if (sps) { for (var i = 0; i < sps.length; ++i) { + var sp = sps[i] + if (!sp.marker.collapsed) { continue } + var found = sp.marker.find(0) + var fromCmp = cmp(found.from, from) || extraLeft(sp.marker) - extraLeft(marker) + var toCmp = cmp(found.to, to) || extraRight(sp.marker) - extraRight(marker) + if (fromCmp >= 0 && toCmp <= 0 || fromCmp <= 0 && toCmp >= 0) { continue } + if (fromCmp <= 0 && (sp.marker.inclusiveRight && marker.inclusiveLeft ? cmp(found.to, from) >= 0 : cmp(found.to, from) > 0) || + fromCmp >= 0 && (sp.marker.inclusiveRight && marker.inclusiveLeft ? cmp(found.from, to) <= 0 : cmp(found.from, to) < 0)) + { return true } + } } +} + +// A visual line is a line as drawn on the screen. Folding, for +// example, can cause multiple logical lines to appear on the same +// visual line. This finds the start of the visual line that the +// given line is part of (usually that is the line itself). +function visualLine(line) { + var merged + while (merged = collapsedSpanAtStart(line)) + { line = merged.find(-1, true).line } + return line +} + +function visualLineEnd(line) { + var merged + while (merged = collapsedSpanAtEnd(line)) + { line = merged.find(1, true).line } + return line +} + +// Returns an array of logical lines that continue the visual line +// started by the argument, or undefined if there are no such lines. +function visualLineContinued(line) { + var merged, lines + while (merged = collapsedSpanAtEnd(line)) { + line = merged.find(1, true).line + ;(lines || (lines = [])).push(line) + } + return lines +} + +// Get the line number of the start of the visual line that the +// given line number is part of. +function visualLineNo(doc, lineN) { + var line = getLine(doc, lineN), vis = visualLine(line) + if (line == vis) { return lineN } + return lineNo(vis) +} + +// Get the line number of the start of the next visual line after +// the given line. +function visualLineEndNo(doc, lineN) { + if (lineN > doc.lastLine()) { return lineN } + var line = getLine(doc, lineN), merged + if (!lineIsHidden(doc, line)) { return lineN } + while (merged = collapsedSpanAtEnd(line)) + { line = merged.find(1, true).line } + return lineNo(line) + 1 +} + +// Compute whether a line is hidden. Lines count as hidden when they +// are part of a visual line that starts with another line, or when +// they are entirely covered by collapsed, non-widget span. 
+function lineIsHidden(doc, line) { + var sps = sawCollapsedSpans && line.markedSpans + if (sps) { for (var sp = (void 0), i = 0; i < sps.length; ++i) { + sp = sps[i] + if (!sp.marker.collapsed) { continue } + if (sp.from == null) { return true } + if (sp.marker.widgetNode) { continue } + if (sp.from == 0 && sp.marker.inclusiveLeft && lineIsHiddenInner(doc, line, sp)) + { return true } + } } +} +function lineIsHiddenInner(doc, line, span) { + if (span.to == null) { + var end = span.marker.find(1, true) + return lineIsHiddenInner(doc, end.line, getMarkedSpanFor(end.line.markedSpans, span.marker)) + } + if (span.marker.inclusiveRight && span.to == line.text.length) + { return true } + for (var sp = (void 0), i = 0; i < line.markedSpans.length; ++i) { + sp = line.markedSpans[i] + if (sp.marker.collapsed && !sp.marker.widgetNode && sp.from == span.to && + (sp.to == null || sp.to != span.from) && + (sp.marker.inclusiveLeft || span.marker.inclusiveRight) && + lineIsHiddenInner(doc, line, sp)) { return true } + } +} + +// Find the height above the given line. +function heightAtLine(lineObj) { + lineObj = visualLine(lineObj) + + var h = 0, chunk = lineObj.parent + for (var i = 0; i < chunk.lines.length; ++i) { + var line = chunk.lines[i] + if (line == lineObj) { break } + else { h += line.height } + } + for (var p = chunk.parent; p; chunk = p, p = chunk.parent) { + for (var i$1 = 0; i$1 < p.children.length; ++i$1) { + var cur = p.children[i$1] + if (cur == chunk) { break } + else { h += cur.height } + } + } + return h +} + +// Compute the character length of a line, taking into account +// collapsed ranges (see markText) that might hide parts, and join +// other lines onto it. +function lineLength(line) { + if (line.height == 0) { return 0 } + var len = line.text.length, merged, cur = line + while (merged = collapsedSpanAtStart(cur)) { + var found = merged.find(0, true) + cur = found.from.line + len += found.from.ch - found.to.ch + } + cur = line + while (merged = collapsedSpanAtEnd(cur)) { + var found$1 = merged.find(0, true) + len -= cur.text.length - found$1.from.ch + cur = found$1.to.line + len += cur.text.length - found$1.to.ch + } + return len +} + +// Find the longest line in the document. +function findMaxLine(cm) { + var d = cm.display, doc = cm.doc + d.maxLine = getLine(doc, doc.first) + d.maxLineLength = lineLength(d.maxLine) + d.maxLineChanged = true + doc.iter(function (line) { + var len = lineLength(line) + if (len > d.maxLineLength) { + d.maxLineLength = len + d.maxLine = line + } + }) +} + +// BIDI HELPERS + +function iterateBidiSections(order, from, to, f) { + if (!order) { return f(from, to, "ltr") } + var found = false + for (var i = 0; i < order.length; ++i) { + var part = order[i] + if (part.from < to && part.to > from || from == to && part.to == from) { + f(Math.max(part.from, from), Math.min(part.to, to), part.level == 1 ? "rtl" : "ltr") + found = true + } + } + if (!found) { f(from, to, "ltr") } +} + +var bidiOther = null +function getBidiPartAt(order, ch, sticky) { + var found + bidiOther = null + for (var i = 0; i < order.length; ++i) { + var cur = order[i] + if (cur.from < ch && cur.to > ch) { return i } + if (cur.to == ch) { + if (cur.from != cur.to && sticky == "before") { found = i } + else { bidiOther = i } + } + if (cur.from == ch) { + if (cur.from != cur.to && sticky != "before") { found = i } + else { bidiOther = i } + } + } + return found != null ? 
found : bidiOther +} + +// Bidirectional ordering algorithm +// See http://unicode.org/reports/tr9/tr9-13.html for the algorithm +// that this (partially) implements. + +// One-char codes used for character types: +// L (L): Left-to-Right +// R (R): Right-to-Left +// r (AL): Right-to-Left Arabic +// 1 (EN): European Number +// + (ES): European Number Separator +// % (ET): European Number Terminator +// n (AN): Arabic Number +// , (CS): Common Number Separator +// m (NSM): Non-Spacing Mark +// b (BN): Boundary Neutral +// s (B): Paragraph Separator +// t (S): Segment Separator +// w (WS): Whitespace +// N (ON): Other Neutrals + +// Returns null if characters are ordered as they appear +// (left-to-right), or an array of sections ({from, to, level} +// objects) in the order in which they occur visually. +var bidiOrdering = (function() { + // Character types for codepoints 0 to 0xff + var lowTypes = "bbbbbbbbbtstwsbbbbbbbbbbbbbbssstwNN%%%NNNNNN,N,N1111111111NNNNNNNLLLLLLLLLLLLLLLLLLLLLLLLLLNNNNNNLLLLLLLLLLLLLLLLLLLLLLLLLLNNNNbbbbbbsbbbbbbbbbbbbbbbbbbbbbbbbbb,N%%%%NNNNLNNNNN%%11NLNNN1LNNNNNLLLLLLLLLLLLLLLLLLLLLLLNLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLN" + // Character types for codepoints 0x600 to 0x6f9 + var arabicTypes = "nnnnnnNNr%%r,rNNmmmmmmmmmmmrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrmmmmmmmmmmmmmmmmmmmmmnnnnnnnnnn%nnrrrmrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrmmmmmmmnNmmmmmmrrmmNmmmmrr1111111111" + function charType(code) { + if (code <= 0xf7) { return lowTypes.charAt(code) } + else if (0x590 <= code && code <= 0x5f4) { return "R" } + else if (0x600 <= code && code <= 0x6f9) { return arabicTypes.charAt(code - 0x600) } + else if (0x6ee <= code && code <= 0x8ac) { return "r" } + else if (0x2000 <= code && code <= 0x200b) { return "w" } + else if (code == 0x200c) { return "b" } + else { return "L" } + } + + var bidiRE = /[\u0590-\u05f4\u0600-\u06ff\u0700-\u08ac]/ + var isNeutral = /[stwN]/, isStrong = /[LRr]/, countsAsLeft = /[Lb1n]/, countsAsNum = /[1n]/ + // Browsers seem to always treat the boundaries of block elements as being L. + var outerType = "L" + + function BidiSpan(level, from, to) { + this.level = level + this.from = from; this.to = to + } + + return function(str) { + if (!bidiRE.test(str)) { return false } + var len = str.length, types = [] + for (var i = 0; i < len; ++i) + { types.push(charType(str.charCodeAt(i))) } + + // W1. Examine each non-spacing mark (NSM) in the level run, and + // change the type of the NSM to the type of the previous + // character. If the NSM is at the start of the level run, it will + // get the type of sor. + for (var i$1 = 0, prev = outerType; i$1 < len; ++i$1) { + var type = types[i$1] + if (type == "m") { types[i$1] = prev } + else { prev = type } + } + + // W2. Search backwards from each instance of a European number + // until the first strong type (R, L, AL, or sor) is found. If an + // AL is found, change the type of the European number to Arabic + // number. + // W3. Change all ALs to R. + for (var i$2 = 0, cur = outerType; i$2 < len; ++i$2) { + var type$1 = types[i$2] + if (type$1 == "1" && cur == "r") { types[i$2] = "n" } + else if (isStrong.test(type$1)) { cur = type$1; if (type$1 == "r") { types[i$2] = "R" } } + } + + // W4. A single European separator between two European numbers + // changes to a European number. A single common separator between + // two numbers of the same type changes to that type. 
+ for (var i$3 = 1, prev$1 = types[0]; i$3 < len - 1; ++i$3) { + var type$2 = types[i$3] + if (type$2 == "+" && prev$1 == "1" && types[i$3+1] == "1") { types[i$3] = "1" } + else if (type$2 == "," && prev$1 == types[i$3+1] && + (prev$1 == "1" || prev$1 == "n")) { types[i$3] = prev$1 } + prev$1 = type$2 + } + + // W5. A sequence of European terminators adjacent to European + // numbers changes to all European numbers. + // W6. Otherwise, separators and terminators change to Other + // Neutral. + for (var i$4 = 0; i$4 < len; ++i$4) { + var type$3 = types[i$4] + if (type$3 == ",") { types[i$4] = "N" } + else if (type$3 == "%") { + var end = (void 0) + for (end = i$4 + 1; end < len && types[end] == "%"; ++end) {} + var replace = (i$4 && types[i$4-1] == "!") || (end < len && types[end] == "1") ? "1" : "N" + for (var j = i$4; j < end; ++j) { types[j] = replace } + i$4 = end - 1 + } + } + + // W7. Search backwards from each instance of a European number + // until the first strong type (R, L, or sor) is found. If an L is + // found, then change the type of the European number to L. + for (var i$5 = 0, cur$1 = outerType; i$5 < len; ++i$5) { + var type$4 = types[i$5] + if (cur$1 == "L" && type$4 == "1") { types[i$5] = "L" } + else if (isStrong.test(type$4)) { cur$1 = type$4 } + } + + // N1. A sequence of neutrals takes the direction of the + // surrounding strong text if the text on both sides has the same + // direction. European and Arabic numbers act as if they were R in + // terms of their influence on neutrals. Start-of-level-run (sor) + // and end-of-level-run (eor) are used at level run boundaries. + // N2. Any remaining neutrals take the embedding direction. + for (var i$6 = 0; i$6 < len; ++i$6) { + if (isNeutral.test(types[i$6])) { + var end$1 = (void 0) + for (end$1 = i$6 + 1; end$1 < len && isNeutral.test(types[end$1]); ++end$1) {} + var before = (i$6 ? types[i$6-1] : outerType) == "L" + var after = (end$1 < len ? types[end$1] : outerType) == "L" + var replace$1 = before || after ? "L" : "R" + for (var j$1 = i$6; j$1 < end$1; ++j$1) { types[j$1] = replace$1 } + i$6 = end$1 - 1 + } + } + + // Here we depart from the documented algorithm, in order to avoid + // building up an actual levels array. Since there are only three + // levels (0, 1, 2) in an implementation that doesn't take + // explicit embedding into account, we can build up the order on + // the fly, without following the level-based algorithm. + var order = [], m + for (var i$7 = 0; i$7 < len;) { + if (countsAsLeft.test(types[i$7])) { + var start = i$7 + for (++i$7; i$7 < len && countsAsLeft.test(types[i$7]); ++i$7) {} + order.push(new BidiSpan(0, start, i$7)) + } else { + var pos = i$7, at = order.length + for (++i$7; i$7 < len && types[i$7] != "L"; ++i$7) {} + for (var j$2 = pos; j$2 < i$7;) { + if (countsAsNum.test(types[j$2])) { + if (pos < j$2) { order.splice(at, 0, new BidiSpan(1, pos, j$2)) } + var nstart = j$2 + for (++j$2; j$2 < i$7 && countsAsNum.test(types[j$2]); ++j$2) {} + order.splice(at, 0, new BidiSpan(2, nstart, j$2)) + pos = j$2 + } else { ++j$2 } + } + if (pos < i$7) { order.splice(at, 0, new BidiSpan(1, pos, i$7)) } + } + } + if (order[0].level == 1 && (m = str.match(/^\s+/))) { + order[0].from = m[0].length + order.unshift(new BidiSpan(0, 0, m[0].length)) + } + if (lst(order).level == 1 && (m = str.match(/\s+$/))) { + lst(order).to -= m[0].length + order.push(new BidiSpan(0, len - m[0].length, len)) + } + + return order + } +})() + +// Get the bidi ordering for the given line (and cache it). 
Returns +// false for lines that are fully left-to-right, and an array of +// BidiSpan objects otherwise. +function getOrder(line) { + var order = line.order + if (order == null) { order = line.order = bidiOrdering(line.text) } + return order +} + +function moveCharLogically(line, ch, dir) { + var target = skipExtendingChars(line.text, ch + dir, dir) + return target < 0 || target > line.text.length ? null : target +} + +function moveLogically(line, start, dir) { + var ch = moveCharLogically(line, start.ch, dir) + return ch == null ? null : new Pos(start.line, ch, dir < 0 ? "after" : "before") +} + +function endOfLine(visually, cm, lineObj, lineNo, dir) { + if (visually) { + var order = getOrder(lineObj) + if (order) { + var part = dir < 0 ? lst(order) : order[0] + var moveInStorageOrder = (dir < 0) == (part.level == 1) + var sticky = moveInStorageOrder ? "after" : "before" + var ch + // With a wrapped rtl chunk (possibly spanning multiple bidi parts), + // it could be that the last bidi part is not on the last visual line, + // since visual lines contain content order-consecutive chunks. + // Thus, in rtl, we are looking for the first (content-order) character + // in the rtl chunk that is on the last line (that is, the same line + // as the last (content-order) character). + if (part.level > 0) { + var prep = prepareMeasureForLine(cm, lineObj) + ch = dir < 0 ? lineObj.text.length - 1 : 0 + var targetTop = measureCharPrepared(cm, prep, ch).top + ch = findFirst(function (ch) { return measureCharPrepared(cm, prep, ch).top == targetTop; }, (dir < 0) == (part.level == 1) ? part.from : part.to - 1, ch) + if (sticky == "before") { ch = moveCharLogically(lineObj, ch, 1, true) } + } else { ch = dir < 0 ? part.to : part.from } + return new Pos(lineNo, ch, sticky) + } + } + return new Pos(lineNo, dir < 0 ? lineObj.text.length : 0, dir < 0 ? "before" : "after") +} + +function moveVisually(cm, line, start, dir) { + var bidi = getOrder(line) + if (!bidi) { return moveLogically(line, start, dir) } + if (start.ch >= line.text.length) { + start.ch = line.text.length + start.sticky = "before" + } else if (start.ch <= 0) { + start.ch = 0 + start.sticky = "after" + } + var partPos = getBidiPartAt(bidi, start.ch, start.sticky), part = bidi[partPos] + if (part.level % 2 == 0 && (dir > 0 ? part.to > start.ch : part.from < start.ch)) { + // Case 1: We move within an ltr part. Even with wrapped lines, + // nothing interesting happens. + return moveLogically(line, start, dir) + } + + var mv = function (pos, dir) { return moveCharLogically(line, pos instanceof Pos ? pos.ch : pos, dir); } + var prep + var getWrappedLineExtent = function (ch) { + if (!cm.options.lineWrapping) { return {begin: 0, end: line.text.length} } + prep = prep || prepareMeasureForLine(cm, line) + return wrappedLineExtentChar(cm, line, prep, ch) + } + var wrappedLineExtent = getWrappedLineExtent(start.sticky == "before" ? mv(start, -1) : start.ch) + + if (part.level % 2 == 1) { + var ch = mv(start, -dir) + if (ch != null && (dir > 0 ? ch >= part.from && ch >= wrappedLineExtent.begin : ch <= part.to && ch <= wrappedLineExtent.end)) { + // Case 2: We move within an rtl part on the same visual line + var sticky = dir < 0 ? 
"before" : "after" + return new Pos(start.line, ch, sticky) + } + } + + // Case 3: Could not move within this bidi part in this visual line, so leave + // the current bidi part + + var searchInVisualLine = function (partPos, dir, wrappedLineExtent) { + var getRes = function (ch, moveInStorageOrder) { return moveInStorageOrder + ? new Pos(start.line, mv(ch, 1), "before") + : new Pos(start.line, ch, "after"); } + + for (; partPos >= 0 && partPos < bidi.length; partPos += dir) { + var part = bidi[partPos] + var moveInStorageOrder = (dir > 0) == (part.level != 1) + var ch = moveInStorageOrder ? wrappedLineExtent.begin : mv(wrappedLineExtent.end, -1) + if (part.from <= ch && ch < part.to) { return getRes(ch, moveInStorageOrder) } + ch = moveInStorageOrder ? part.from : mv(part.to, -1) + if (wrappedLineExtent.begin <= ch && ch < wrappedLineExtent.end) { return getRes(ch, moveInStorageOrder) } + } + } + + // Case 3a: Look for other bidi parts on the same visual line + var res = searchInVisualLine(partPos + dir, dir, wrappedLineExtent) + if (res) { return res } + + // Case 3b: Look for other bidi parts on the next visual line + var nextCh = dir > 0 ? wrappedLineExtent.end : mv(wrappedLineExtent.begin, -1) + if (nextCh != null && !(dir > 0 && nextCh == line.text.length)) { + res = searchInVisualLine(dir > 0 ? 0 : bidi.length - 1, dir, getWrappedLineExtent(nextCh)) + if (res) { return res } + } + + // Case 4: Nowhere to move + return null +} + +// EVENT HANDLING + +// Lightweight event framework. on/off also work on DOM nodes, +// registering native DOM handlers. + +var noHandlers = [] + +var on = function(emitter, type, f) { + if (emitter.addEventListener) { + emitter.addEventListener(type, f, false) + } else if (emitter.attachEvent) { + emitter.attachEvent("on" + type, f) + } else { + var map = emitter._handlers || (emitter._handlers = {}) + map[type] = (map[type] || noHandlers).concat(f) + } +} + +function getHandlers(emitter, type) { + return emitter._handlers && emitter._handlers[type] || noHandlers +} + +function off(emitter, type, f) { + if (emitter.removeEventListener) { + emitter.removeEventListener(type, f, false) + } else if (emitter.detachEvent) { + emitter.detachEvent("on" + type, f) + } else { + var map = emitter._handlers, arr = map && map[type] + if (arr) { + var index = indexOf(arr, f) + if (index > -1) + { map[type] = arr.slice(0, index).concat(arr.slice(index + 1)) } + } + } +} + +function signal(emitter, type /*, values...*/) { + var handlers = getHandlers(emitter, type) + if (!handlers.length) { return } + var args = Array.prototype.slice.call(arguments, 2) + for (var i = 0; i < handlers.length; ++i) { handlers[i].apply(null, args) } +} + +// The DOM events that CodeMirror handles can be overridden by +// registering a (non-DOM) handler on the editor for the event name, +// and preventDefault-ing the event in that handler. 
+function signalDOMEvent(cm, e, override) { + if (typeof e == "string") + { e = {type: e, preventDefault: function() { this.defaultPrevented = true }} } + signal(cm, override || e.type, cm, e) + return e_defaultPrevented(e) || e.codemirrorIgnore +} + +function signalCursorActivity(cm) { + var arr = cm._handlers && cm._handlers.cursorActivity + if (!arr) { return } + var set = cm.curOp.cursorActivityHandlers || (cm.curOp.cursorActivityHandlers = []) + for (var i = 0; i < arr.length; ++i) { if (indexOf(set, arr[i]) == -1) + { set.push(arr[i]) } } +} + +function hasHandler(emitter, type) { + return getHandlers(emitter, type).length > 0 +} + +// Add on and off methods to a constructor's prototype, to make +// registering events on such objects more convenient. +function eventMixin(ctor) { + ctor.prototype.on = function(type, f) {on(this, type, f)} + ctor.prototype.off = function(type, f) {off(this, type, f)} +} + +// Due to the fact that we still support jurassic IE versions, some +// compatibility wrappers are needed. + +function e_preventDefault(e) { + if (e.preventDefault) { e.preventDefault() } + else { e.returnValue = false } +} +function e_stopPropagation(e) { + if (e.stopPropagation) { e.stopPropagation() } + else { e.cancelBubble = true } +} +function e_defaultPrevented(e) { + return e.defaultPrevented != null ? e.defaultPrevented : e.returnValue == false +} +function e_stop(e) {e_preventDefault(e); e_stopPropagation(e)} + +function e_target(e) {return e.target || e.srcElement} +function e_button(e) { + var b = e.which + if (b == null) { + if (e.button & 1) { b = 1 } + else if (e.button & 2) { b = 3 } + else if (e.button & 4) { b = 2 } + } + if (mac && e.ctrlKey && b == 1) { b = 3 } + return b +} + +// Detect drag-and-drop +var dragAndDrop = function() { + // There is *some* kind of drag-and-drop support in IE6-8, but I + // couldn't get it to work yet. + if (ie && ie_version < 9) { return false } + var div = elt('div') + return "draggable" in div || "dragDrop" in div +}() + +var zwspSupported +function zeroWidthElement(measure) { + if (zwspSupported == null) { + var test = elt("span", "\u200b") + removeChildrenAndAdd(measure, elt("span", [test, document.createTextNode("x")])) + if (measure.firstChild.offsetHeight != 0) + { zwspSupported = test.offsetWidth <= 1 && test.offsetHeight > 2 && !(ie && ie_version < 8) } + } + var node = zwspSupported ? elt("span", "\u200b") : + elt("span", "\u00a0", null, "display: inline-block; width: 1px; margin-right: -1px") + node.setAttribute("cm-text", "") + return node +} + +// Feature-detect IE's crummy client rect reporting for bidi text +var badBidiRects +function hasBadBidiRects(measure) { + if (badBidiRects != null) { return badBidiRects } + var txt = removeChildrenAndAdd(measure, document.createTextNode("A\u062eA")) + var r0 = range(txt, 0, 1).getBoundingClientRect() + var r1 = range(txt, 1, 2).getBoundingClientRect() + removeChildren(measure) + if (!r0 || r0.left == r0.right) { return false } // Safari returns null in some cases (#2780) + return badBidiRects = (r1.right - r0.right < 3) +} + +// See if "".split is the broken IE version, if so, provide an +// alternative way to split lines. +var splitLinesAuto = "\n\nb".split(/\n/).length != 3 ? function (string) { + var pos = 0, result = [], l = string.length + while (pos <= l) { + var nl = string.indexOf("\n", pos) + if (nl == -1) { nl = string.length } + var line = string.slice(pos, string.charAt(nl - 1) == "\r" ? 
nl - 1 : nl) + var rt = line.indexOf("\r") + if (rt != -1) { + result.push(line.slice(0, rt)) + pos += rt + 1 + } else { + result.push(line) + pos = nl + 1 + } + } + return result +} : function (string) { return string.split(/\r\n?|\n/); } + +var hasSelection = window.getSelection ? function (te) { + try { return te.selectionStart != te.selectionEnd } + catch(e) { return false } +} : function (te) { + var range + try {range = te.ownerDocument.selection.createRange()} + catch(e) {} + if (!range || range.parentElement() != te) { return false } + return range.compareEndPoints("StartToEnd", range) != 0 +} + +var hasCopyEvent = (function () { + var e = elt("div") + if ("oncopy" in e) { return true } + e.setAttribute("oncopy", "return;") + return typeof e.oncopy == "function" +})() + +var badZoomedRects = null +function hasBadZoomedRects(measure) { + if (badZoomedRects != null) { return badZoomedRects } + var node = removeChildrenAndAdd(measure, elt("span", "x")) + var normal = node.getBoundingClientRect() + var fromRange = range(node, 0, 1).getBoundingClientRect() + return badZoomedRects = Math.abs(normal.left - fromRange.left) > 1 +} + +var modes = {}; +var mimeModes = {}; +// Extra arguments are stored as the mode's dependencies, which is +// used by (legacy) mechanisms like loadmode.js to automatically +// load a mode. (Preferred mechanism is the require/define calls.) +function defineMode(name, mode) { + if (arguments.length > 2) + { mode.dependencies = Array.prototype.slice.call(arguments, 2) } + modes[name] = mode +} + +function defineMIME(mime, spec) { + mimeModes[mime] = spec +} + +// Given a MIME type, a {name, ...options} config object, or a name +// string, return a mode config object. +function resolveMode(spec) { + if (typeof spec == "string" && mimeModes.hasOwnProperty(spec)) { + spec = mimeModes[spec] + } else if (spec && typeof spec.name == "string" && mimeModes.hasOwnProperty(spec.name)) { + var found = mimeModes[spec.name] + if (typeof found == "string") { found = {name: found} } + spec = createObj(found, spec) + spec.name = found.name + } else if (typeof spec == "string" && /^[\w\-]+\/[\w\-]+\+xml$/.test(spec)) { + return resolveMode("application/xml") + } else if (typeof spec == "string" && /^[\w\-]+\/[\w\-]+\+json$/.test(spec)) { + return resolveMode("application/json") + } + if (typeof spec == "string") { return {name: spec} } + else { return spec || {name: "null"} } +} + +// Given a mode spec (anything that resolveMode accepts), find and +// initialize an actual mode object. +function getMode(options, spec) { + spec = resolveMode(spec) + var mfactory = modes[spec.name] + if (!mfactory) { return getMode(options, "text/plain") } + var modeObj = mfactory(options, spec) + if (modeExtensions.hasOwnProperty(spec.name)) { + var exts = modeExtensions[spec.name] + for (var prop in exts) { + if (!exts.hasOwnProperty(prop)) { continue } + if (modeObj.hasOwnProperty(prop)) { modeObj["_" + prop] = modeObj[prop] } + modeObj[prop] = exts[prop] + } + } + modeObj.name = spec.name + if (spec.helperType) { modeObj.helperType = spec.helperType } + if (spec.modeProps) { for (var prop$1 in spec.modeProps) + { modeObj[prop$1] = spec.modeProps[prop$1] } } + + return modeObj +} + +// This can be used to attach properties to mode objects from +// outside the actual mode definition. +var modeExtensions = {} +function extendMode(mode, properties) { + var exts = modeExtensions.hasOwnProperty(mode) ? 
modeExtensions[mode] : (modeExtensions[mode] = {}) + copyObj(properties, exts) +} + +function copyState(mode, state) { + if (state === true) { return state } + if (mode.copyState) { return mode.copyState(state) } + var nstate = {} + for (var n in state) { + var val = state[n] + if (val instanceof Array) { val = val.concat([]) } + nstate[n] = val + } + return nstate +} + +// Given a mode and a state (for that mode), find the inner mode and +// state at the position that the state refers to. +function innerMode(mode, state) { + var info + while (mode.innerMode) { + info = mode.innerMode(state) + if (!info || info.mode == mode) { break } + state = info.state + mode = info.mode + } + return info || {mode: mode, state: state} +} + +function startState(mode, a1, a2) { + return mode.startState ? mode.startState(a1, a2) : true +} + +// STRING STREAM + +// Fed to the mode parsers, provides helper functions to make +// parsers more succinct. + +var StringStream = function(string, tabSize) { + this.pos = this.start = 0 + this.string = string + this.tabSize = tabSize || 8 + this.lastColumnPos = this.lastColumnValue = 0 + this.lineStart = 0 +}; + +StringStream.prototype.eol = function () {return this.pos >= this.string.length}; +StringStream.prototype.sol = function () {return this.pos == this.lineStart}; +StringStream.prototype.peek = function () {return this.string.charAt(this.pos) || undefined}; +StringStream.prototype.next = function () { + if (this.pos < this.string.length) + { return this.string.charAt(this.pos++) } +}; +StringStream.prototype.eat = function (match) { + var ch = this.string.charAt(this.pos) + var ok + if (typeof match == "string") { ok = ch == match } + else { ok = ch && (match.test ? match.test(ch) : match(ch)) } + if (ok) {++this.pos; return ch} +}; +StringStream.prototype.eatWhile = function (match) { + var start = this.pos + while (this.eat(match)){} + return this.pos > start +}; +StringStream.prototype.eatSpace = function () { + var this$1 = this; + + var start = this.pos + while (/[\s\u00a0]/.test(this.string.charAt(this.pos))) { ++this$1.pos } + return this.pos > start +}; +StringStream.prototype.skipToEnd = function () {this.pos = this.string.length}; +StringStream.prototype.skipTo = function (ch) { + var found = this.string.indexOf(ch, this.pos) + if (found > -1) {this.pos = found; return true} +}; +StringStream.prototype.backUp = function (n) {this.pos -= n}; +StringStream.prototype.column = function () { + if (this.lastColumnPos < this.start) { + this.lastColumnValue = countColumn(this.string, this.start, this.tabSize, this.lastColumnPos, this.lastColumnValue) + this.lastColumnPos = this.start + } + return this.lastColumnValue - (this.lineStart ? countColumn(this.string, this.lineStart, this.tabSize) : 0) +}; +StringStream.prototype.indentation = function () { + return countColumn(this.string, null, this.tabSize) - + (this.lineStart ? countColumn(this.string, this.lineStart, this.tabSize) : 0) +}; +StringStream.prototype.match = function (pattern, consume, caseInsensitive) { + if (typeof pattern == "string") { + var cased = function (str) { return caseInsensitive ? 
str.toLowerCase() : str; } + var substr = this.string.substr(this.pos, pattern.length) + if (cased(substr) == cased(pattern)) { + if (consume !== false) { this.pos += pattern.length } + return true + } + } else { + var match = this.string.slice(this.pos).match(pattern) + if (match && match.index > 0) { return null } + if (match && consume !== false) { this.pos += match[0].length } + return match + } +}; +StringStream.prototype.current = function (){return this.string.slice(this.start, this.pos)}; +StringStream.prototype.hideFirstChars = function (n, inner) { + this.lineStart += n + try { return inner() } + finally { this.lineStart -= n } +}; + +// Compute a style array (an array starting with a mode generation +// -- for invalidation -- followed by pairs of end positions and +// style strings), which is used to highlight the tokens on the +// line. +function highlightLine(cm, line, state, forceToEnd) { + // A styles array always starts with a number identifying the + // mode/overlays that it is based on (for easy invalidation). + var st = [cm.state.modeGen], lineClasses = {} + // Compute the base array of styles + runMode(cm, line.text, cm.doc.mode, state, function (end, style) { return st.push(end, style); }, + lineClasses, forceToEnd) + + // Run overlays, adjust style array. + var loop = function ( o ) { + var overlay = cm.state.overlays[o], i = 1, at = 0 + runMode(cm, line.text, overlay.mode, true, function (end, style) { + var start = i + // Ensure there's a token end at the current position, and that i points at it + while (at < end) { + var i_end = st[i] + if (i_end > end) + { st.splice(i, 1, end, st[i+1], i_end) } + i += 2 + at = Math.min(end, i_end) + } + if (!style) { return } + if (overlay.opaque) { + st.splice(start, i - start, end, "overlay " + style) + i = start + 2 + } else { + for (; start < i; start += 2) { + var cur = st[start+1] + st[start+1] = (cur ? cur + " " : "") + "overlay " + style + } + } + }, lineClasses) + }; + + for (var o = 0; o < cm.state.overlays.length; ++o) loop( o ); + + return {styles: st, classes: lineClasses.bgClass || lineClasses.textClass ? lineClasses : null} +} + +function getLineStyles(cm, line, updateFrontier) { + if (!line.styles || line.styles[0] != cm.state.modeGen) { + var state = getStateBefore(cm, lineNo(line)) + var result = highlightLine(cm, line, line.text.length > cm.options.maxHighlightLength ? copyState(cm.doc.mode, state) : state) + line.stateAfter = state + line.styles = result.styles + if (result.classes) { line.styleClasses = result.classes } + else if (line.styleClasses) { line.styleClasses = null } + if (updateFrontier === cm.doc.frontier) { cm.doc.frontier++ } + } + return line.styles +} + +function getStateBefore(cm, n, precise) { + var doc = cm.doc, display = cm.display + if (!doc.mode.startState) { return true } + var pos = findStartLine(cm, n, precise), state = pos > doc.first && getLine(doc, pos-1).stateAfter + if (!state) { state = startState(doc.mode) } + else { state = copyState(doc.mode, state) } + doc.iter(pos, n, function (line) { + processLine(cm, line.text, state) + var save = pos == n - 1 || pos % 5 == 0 || pos >= display.viewFrom && pos < display.viewTo + line.stateAfter = save ? copyState(doc.mode, state) : null + ++pos + }) + if (precise) { doc.frontier = pos } + return state +} + +// Lightweight form of highlight -- proceed over this line and +// update state, but don't save a style array. Used for lines that +// aren't currently visible. 
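+
+// Editorial sketch (not part of the upstream source): processLine below, like
+// highlightLine above, just drives a mode's token(stream, state) function over
+// the text. A minimal mode registered through the public defineMode call could
+// look like the following; the mode name and style strings are illustrative.
+//
+//   CodeMirror.defineMode("digits-demo", function () {
+//     return {
+//       token: function (stream) {
+//         if (stream.eatWhile(/\d/)) return "number"   // style runs of digits
+//         stream.next()                                // always advance the stream
+//         return null
+//       }
+//     }
+//   })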
+function processLine(cm, text, state, startAt) { + var mode = cm.doc.mode + var stream = new StringStream(text, cm.options.tabSize) + stream.start = stream.pos = startAt || 0 + if (text == "") { callBlankLine(mode, state) } + while (!stream.eol()) { + readToken(mode, stream, state) + stream.start = stream.pos + } +} + +function callBlankLine(mode, state) { + if (mode.blankLine) { return mode.blankLine(state) } + if (!mode.innerMode) { return } + var inner = innerMode(mode, state) + if (inner.mode.blankLine) { return inner.mode.blankLine(inner.state) } +} + +function readToken(mode, stream, state, inner) { + for (var i = 0; i < 10; i++) { + if (inner) { inner[0] = innerMode(mode, state).mode } + var style = mode.token(stream, state) + if (stream.pos > stream.start) { return style } + } + throw new Error("Mode " + mode.name + " failed to advance stream.") +} + +// Utility for getTokenAt and getLineTokens +function takeToken(cm, pos, precise, asArray) { + var getObj = function (copy) { return ({ + start: stream.start, end: stream.pos, + string: stream.current(), + type: style || null, + state: copy ? copyState(doc.mode, state) : state + }); } + + var doc = cm.doc, mode = doc.mode, style + pos = clipPos(doc, pos) + var line = getLine(doc, pos.line), state = getStateBefore(cm, pos.line, precise) + var stream = new StringStream(line.text, cm.options.tabSize), tokens + if (asArray) { tokens = [] } + while ((asArray || stream.pos < pos.ch) && !stream.eol()) { + stream.start = stream.pos + style = readToken(mode, stream, state) + if (asArray) { tokens.push(getObj(true)) } + } + return asArray ? tokens : getObj() +} + +function extractLineClasses(type, output) { + if (type) { for (;;) { + var lineClass = type.match(/(?:^|\s+)line-(background-)?(\S+)/) + if (!lineClass) { break } + type = type.slice(0, lineClass.index) + type.slice(lineClass.index + lineClass[0].length) + var prop = lineClass[1] ? "bgClass" : "textClass" + if (output[prop] == null) + { output[prop] = lineClass[2] } + else if (!(new RegExp("(?:^|\s)" + lineClass[2] + "(?:$|\s)")).test(output[prop])) + { output[prop] += " " + lineClass[2] } + } } + return type +} + +// Run the given mode's parser over a line, calling f for each token. +function runMode(cm, text, mode, state, f, lineClasses, forceToEnd) { + var flattenSpans = mode.flattenSpans + if (flattenSpans == null) { flattenSpans = cm.options.flattenSpans } + var curStart = 0, curStyle = null + var stream = new StringStream(text, cm.options.tabSize), style + var inner = cm.options.addModeClass && [null] + if (text == "") { extractLineClasses(callBlankLine(mode, state), lineClasses) } + while (!stream.eol()) { + if (stream.pos > cm.options.maxHighlightLength) { + flattenSpans = false + if (forceToEnd) { processLine(cm, text, state, stream.pos) } + stream.pos = text.length + style = null + } else { + style = extractLineClasses(readToken(mode, stream, state, inner), lineClasses) + } + if (inner) { + var mName = inner[0].name + if (mName) { style = "m-" + (style ? mName + " " + style : mName) } + } + if (!flattenSpans || curStyle != style) { + while (curStart < stream.start) { + curStart = Math.min(stream.start, curStart + 5000) + f(curStart, curStyle) + } + curStyle = style + } + stream.start = stream.pos + } + while (curStart < stream.pos) { + // Webkit seems to refuse to render text nodes longer than 57444 + // characters, and returns inaccurate measurements in nodes + // starting around 5000 chars. 
+ var pos = Math.min(stream.pos, curStart + 5000) + f(pos, curStyle) + curStart = pos + } +} + +// Finds the line to start with when starting a parse. Tries to +// find a line with a stateAfter, so that it can start with a +// valid state. If that fails, it returns the line with the +// smallest indentation, which tends to need the least context to +// parse correctly. +function findStartLine(cm, n, precise) { + var minindent, minline, doc = cm.doc + var lim = precise ? -1 : n - (cm.doc.mode.innerMode ? 1000 : 100) + for (var search = n; search > lim; --search) { + if (search <= doc.first) { return doc.first } + var line = getLine(doc, search - 1) + if (line.stateAfter && (!precise || search <= doc.frontier)) { return search } + var indented = countColumn(line.text, null, cm.options.tabSize) + if (minline == null || minindent > indented) { + minline = search - 1 + minindent = indented + } + } + return minline +} + +// LINE DATA STRUCTURE + +// Line objects. These hold state related to a line, including +// highlighting info (the styles array). +var Line = function(text, markedSpans, estimateHeight) { + this.text = text + attachMarkedSpans(this, markedSpans) + this.height = estimateHeight ? estimateHeight(this) : 1 +}; + +Line.prototype.lineNo = function () { return lineNo(this) }; +eventMixin(Line) + +// Change the content (text, markers) of a line. Automatically +// invalidates cached information and tries to re-estimate the +// line's height. +function updateLine(line, text, markedSpans, estimateHeight) { + line.text = text + if (line.stateAfter) { line.stateAfter = null } + if (line.styles) { line.styles = null } + if (line.order != null) { line.order = null } + detachMarkedSpans(line) + attachMarkedSpans(line, markedSpans) + var estHeight = estimateHeight ? estimateHeight(line) : 1 + if (estHeight != line.height) { updateLineHeight(line, estHeight) } +} + +// Detach a line from the document tree and its markers. +function cleanUpLine(line) { + line.parent = null + detachMarkedSpans(line) +} + +// Convert a style as returned by a mode (either null, or a string +// containing one or more styles) to a CSS style. This is cached, +// and also looks for line-wide styles. +var styleToClassCache = {}; +var styleToClassCacheWithMode = {}; +function interpretTokenStyle(style, options) { + if (!style || /^\s*$/.test(style)) { return null } + var cache = options.addModeClass ? styleToClassCacheWithMode : styleToClassCache + return cache[style] || + (cache[style] = style.replace(/\S+/g, "cm-$&")) +} + +// Render the DOM representation of the text of a line. Also builds +// up a 'line map', which points at the DOM nodes that represent +// specific stretches of text, and is used by the measuring code. +// The returned object contains the DOM node, this map, and +// information about line-wide styles that were set by the mode. +function buildLineContent(cm, lineView) { + // The padding-right forces the element to have a 'border', which + // is needed on Webkit to be able to get line-level bounding + // rectangles for it (in measureChar). + var content = elt("span", null, null, webkit ? 
"padding-right: .1px" : null) + var builder = {pre: elt("pre", [content], "CodeMirror-line"), content: content, + col: 0, pos: 0, cm: cm, + trailingSpace: false, + splitSpaces: (ie || webkit) && cm.getOption("lineWrapping")} + // hide from accessibility tree + content.setAttribute("role", "presentation") + builder.pre.setAttribute("role", "presentation") + lineView.measure = {} + + // Iterate over the logical lines that make up this visual line. + for (var i = 0; i <= (lineView.rest ? lineView.rest.length : 0); i++) { + var line = i ? lineView.rest[i - 1] : lineView.line, order = (void 0) + builder.pos = 0 + builder.addToken = buildToken + // Optionally wire in some hacks into the token-rendering + // algorithm, to deal with browser quirks. + if (hasBadBidiRects(cm.display.measure) && (order = getOrder(line))) + { builder.addToken = buildTokenBadBidi(builder.addToken, order) } + builder.map = [] + var allowFrontierUpdate = lineView != cm.display.externalMeasured && lineNo(line) + insertLineContent(line, builder, getLineStyles(cm, line, allowFrontierUpdate)) + if (line.styleClasses) { + if (line.styleClasses.bgClass) + { builder.bgClass = joinClasses(line.styleClasses.bgClass, builder.bgClass || "") } + if (line.styleClasses.textClass) + { builder.textClass = joinClasses(line.styleClasses.textClass, builder.textClass || "") } + } + + // Ensure at least a single node is present, for measuring. + if (builder.map.length == 0) + { builder.map.push(0, 0, builder.content.appendChild(zeroWidthElement(cm.display.measure))) } + + // Store the map and a cache object for the current logical line + if (i == 0) { + lineView.measure.map = builder.map + lineView.measure.cache = {} + } else { + ;(lineView.measure.maps || (lineView.measure.maps = [])).push(builder.map) + ;(lineView.measure.caches || (lineView.measure.caches = [])).push({}) + } + } + + // See issue #2901 + if (webkit) { + var last = builder.content.lastChild + if (/\bcm-tab\b/.test(last.className) || (last.querySelector && last.querySelector(".cm-tab"))) + { builder.content.className = "cm-tab-wrap-hack" } + } + + signal(cm, "renderLine", cm, lineView.line, builder.pre) + if (builder.pre.className) + { builder.textClass = joinClasses(builder.pre.className, builder.textClass || "") } + + return builder +} + +function defaultSpecialCharPlaceholder(ch) { + var token = elt("span", "\u2022", "cm-invalidchar") + token.title = "\\u" + ch.charCodeAt(0).toString(16) + token.setAttribute("aria-label", token.title) + return token +} + +// Build up the DOM representation for a single token, and add it to +// the line map. Takes care to render special characters separately. +function buildToken(builder, text, style, startStyle, endStyle, title, css) { + if (!text) { return } + var displayText = builder.splitSpaces ? splitSpaces(text, builder.trailingSpace) : text + var special = builder.cm.state.specialChars, mustWrap = false + var content + if (!special.test(text)) { + builder.col += text.length + content = document.createTextNode(displayText) + builder.map.push(builder.pos, builder.pos + text.length, content) + if (ie && ie_version < 9) { mustWrap = true } + builder.pos += text.length + } else { + content = document.createDocumentFragment() + var pos = 0 + while (true) { + special.lastIndex = pos + var m = special.exec(text) + var skipped = m ? 
m.index - pos : text.length - pos + if (skipped) { + var txt = document.createTextNode(displayText.slice(pos, pos + skipped)) + if (ie && ie_version < 9) { content.appendChild(elt("span", [txt])) } + else { content.appendChild(txt) } + builder.map.push(builder.pos, builder.pos + skipped, txt) + builder.col += skipped + builder.pos += skipped + } + if (!m) { break } + pos += skipped + 1 + var txt$1 = (void 0) + if (m[0] == "\t") { + var tabSize = builder.cm.options.tabSize, tabWidth = tabSize - builder.col % tabSize + txt$1 = content.appendChild(elt("span", spaceStr(tabWidth), "cm-tab")) + txt$1.setAttribute("role", "presentation") + txt$1.setAttribute("cm-text", "\t") + builder.col += tabWidth + } else if (m[0] == "\r" || m[0] == "\n") { + txt$1 = content.appendChild(elt("span", m[0] == "\r" ? "\u240d" : "\u2424", "cm-invalidchar")) + txt$1.setAttribute("cm-text", m[0]) + builder.col += 1 + } else { + txt$1 = builder.cm.options.specialCharPlaceholder(m[0]) + txt$1.setAttribute("cm-text", m[0]) + if (ie && ie_version < 9) { content.appendChild(elt("span", [txt$1])) } + else { content.appendChild(txt$1) } + builder.col += 1 + } + builder.map.push(builder.pos, builder.pos + 1, txt$1) + builder.pos++ + } + } + builder.trailingSpace = displayText.charCodeAt(text.length - 1) == 32 + if (style || startStyle || endStyle || mustWrap || css) { + var fullStyle = style || "" + if (startStyle) { fullStyle += startStyle } + if (endStyle) { fullStyle += endStyle } + var token = elt("span", [content], fullStyle, css) + if (title) { token.title = title } + return builder.content.appendChild(token) + } + builder.content.appendChild(content) +} + +function splitSpaces(text, trailingBefore) { + if (text.length > 1 && !/ /.test(text)) { return text } + var spaceBefore = trailingBefore, result = "" + for (var i = 0; i < text.length; i++) { + var ch = text.charAt(i) + if (ch == " " && spaceBefore && (i == text.length - 1 || text.charCodeAt(i + 1) == 32)) + { ch = "\u00a0" } + result += ch + spaceBefore = ch == " " + } + return result +} + +// Work around nonsense dimensions being reported for stretches of +// right-to-left text. +function buildTokenBadBidi(inner, order) { + return function (builder, text, style, startStyle, endStyle, title, css) { + style = style ? style + " cm-force-border" : "cm-force-border" + var start = builder.pos, end = start + text.length + for (;;) { + // Find the part that overlaps with the start of this text + var part = (void 0) + for (var i = 0; i < order.length; i++) { + part = order[i] + if (part.to > start && part.from <= start) { break } + } + if (part.to >= end) { return inner(builder, text, style, startStyle, endStyle, title, css) } + inner(builder, text.slice(0, part.to - start), style, startStyle, null, title, css) + startStyle = null + text = text.slice(part.to - start) + start = part.to + } + } +} + +function buildCollapsedSpan(builder, size, marker, ignoreWidget) { + var widget = !ignoreWidget && marker.widgetNode + if (widget) { builder.map.push(builder.pos, builder.pos + size, widget) } + if (!ignoreWidget && builder.cm.display.input.needsContentAttribute) { + if (!widget) + { widget = builder.content.appendChild(document.createElement("span")) } + widget.setAttribute("cm-marker", marker.id) + } + if (widget) { + builder.cm.display.input.setUneditable(widget) + builder.content.appendChild(widget) + } + builder.pos += size + builder.trailingSpace = false +} + +// Outputs a number of spans to make up a line, taking highlighting +// and marked text into account. 
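+
+// Editorial note: the `styles` argument consumed below is the flat array built
+// by highlightLine above -- one leading mode-generation number followed by
+// (end position, style) pairs. For example (illustrative values), for the text
+// "var x" it could look like
+//
+//   [modeGen, 3, "keyword", 4, null, 5, "variable"]
+//
+// i.e. characters 0-3 styled "keyword", 3-4 unstyled, 4-5 styled "variable".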
+function insertLineContent(line, builder, styles) { + var spans = line.markedSpans, allText = line.text, at = 0 + if (!spans) { + for (var i$1 = 1; i$1 < styles.length; i$1+=2) + { builder.addToken(builder, allText.slice(at, at = styles[i$1]), interpretTokenStyle(styles[i$1+1], builder.cm.options)) } + return + } + + var len = allText.length, pos = 0, i = 1, text = "", style, css + var nextChange = 0, spanStyle, spanEndStyle, spanStartStyle, title, collapsed + for (;;) { + if (nextChange == pos) { // Update current marker set + spanStyle = spanEndStyle = spanStartStyle = title = css = "" + collapsed = null; nextChange = Infinity + var foundBookmarks = [], endStyles = (void 0) + for (var j = 0; j < spans.length; ++j) { + var sp = spans[j], m = sp.marker + if (m.type == "bookmark" && sp.from == pos && m.widgetNode) { + foundBookmarks.push(m) + } else if (sp.from <= pos && (sp.to == null || sp.to > pos || m.collapsed && sp.to == pos && sp.from == pos)) { + if (sp.to != null && sp.to != pos && nextChange > sp.to) { + nextChange = sp.to + spanEndStyle = "" + } + if (m.className) { spanStyle += " " + m.className } + if (m.css) { css = (css ? css + ";" : "") + m.css } + if (m.startStyle && sp.from == pos) { spanStartStyle += " " + m.startStyle } + if (m.endStyle && sp.to == nextChange) { (endStyles || (endStyles = [])).push(m.endStyle, sp.to) } + if (m.title && !title) { title = m.title } + if (m.collapsed && (!collapsed || compareCollapsedMarkers(collapsed.marker, m) < 0)) + { collapsed = sp } + } else if (sp.from > pos && nextChange > sp.from) { + nextChange = sp.from + } + } + if (endStyles) { for (var j$1 = 0; j$1 < endStyles.length; j$1 += 2) + { if (endStyles[j$1 + 1] == nextChange) { spanEndStyle += " " + endStyles[j$1] } } } + + if (!collapsed || collapsed.from == pos) { for (var j$2 = 0; j$2 < foundBookmarks.length; ++j$2) + { buildCollapsedSpan(builder, 0, foundBookmarks[j$2]) } } + if (collapsed && (collapsed.from || 0) == pos) { + buildCollapsedSpan(builder, (collapsed.to == null ? len + 1 : collapsed.to) - pos, + collapsed.marker, collapsed.from == null) + if (collapsed.to == null) { return } + if (collapsed.to == pos) { collapsed = false } + } + } + if (pos >= len) { break } + + var upto = Math.min(len, nextChange) + while (true) { + if (text) { + var end = pos + text.length + if (!collapsed) { + var tokenText = end > upto ? text.slice(0, upto - pos) : text + builder.addToken(builder, tokenText, style ? style + spanStyle : spanStyle, + spanStartStyle, pos + tokenText.length == nextChange ? spanEndStyle : "", title, css) + } + if (end >= upto) {text = text.slice(upto - pos); pos = upto; break} + pos = end + spanStartStyle = "" + } + text = allText.slice(at, at = styles[i++]) + style = interpretTokenStyle(styles[i++], builder.cm.options) + } + } +} + + +// These objects are used to represent the visible (currently drawn) +// part of the document. A LineView may correspond to multiple +// logical lines, if those are connected by collapsed ranges. +function LineView(doc, line, lineN) { + // The starting line + this.line = line + // Continuing lines, if any + this.rest = visualLineContinued(line) + // Number of logical lines in this visual line + this.size = this.rest ? lineNo(lst(this.rest)) - lineN + 1 : 1 + this.node = this.text = null + this.hidden = lineIsHidden(doc, line) +} + +// Create a range of LineView objects for the given lines. 
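+
+// Editorial note: each LineView built below may span several logical lines. If,
+// say, a collapsed marker merges lines 10-12 (illustrative numbers), the view
+// created at pos = 10 gets rest = [line 11, line 12] and size = 3, so the loop
+// continues at nextPos = 13.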
+function buildViewArray(cm, from, to) { + var array = [], nextPos + for (var pos = from; pos < to; pos = nextPos) { + var view = new LineView(cm.doc, getLine(cm.doc, pos), pos) + nextPos = pos + view.size + array.push(view) + } + return array +} + +var operationGroup = null + +function pushOperation(op) { + if (operationGroup) { + operationGroup.ops.push(op) + } else { + op.ownsGroup = operationGroup = { + ops: [op], + delayedCallbacks: [] + } + } +} + +function fireCallbacksForOps(group) { + // Calls delayed callbacks and cursorActivity handlers until no + // new ones appear + var callbacks = group.delayedCallbacks, i = 0 + do { + for (; i < callbacks.length; i++) + { callbacks[i].call(null) } + for (var j = 0; j < group.ops.length; j++) { + var op = group.ops[j] + if (op.cursorActivityHandlers) + { while (op.cursorActivityCalled < op.cursorActivityHandlers.length) + { op.cursorActivityHandlers[op.cursorActivityCalled++].call(null, op.cm) } } + } + } while (i < callbacks.length) +} + +function finishOperation(op, endCb) { + var group = op.ownsGroup + if (!group) { return } + + try { fireCallbacksForOps(group) } + finally { + operationGroup = null + endCb(group) + } +} + +var orphanDelayedCallbacks = null + +// Often, we want to signal events at a point where we are in the +// middle of some work, but don't want the handler to start calling +// other methods on the editor, which might be in an inconsistent +// state or simply not expect any other events to happen. +// signalLater looks whether there are any handlers, and schedules +// them to be executed when the last operation ends, or, if no +// operation is active, when a timeout fires. +function signalLater(emitter, type /*, values...*/) { + var arr = getHandlers(emitter, type) + if (!arr.length) { return } + var args = Array.prototype.slice.call(arguments, 2), list + if (operationGroup) { + list = operationGroup.delayedCallbacks + } else if (orphanDelayedCallbacks) { + list = orphanDelayedCallbacks + } else { + list = orphanDelayedCallbacks = [] + setTimeout(fireOrphanDelayed, 0) + } + var loop = function ( i ) { + list.push(function () { return arr[i].apply(null, args); }) + }; + + for (var i = 0; i < arr.length; ++i) + loop( i ); +} + +function fireOrphanDelayed() { + var delayed = orphanDelayedCallbacks + orphanDelayedCallbacks = null + for (var i = 0; i < delayed.length; ++i) { delayed[i]() } +} + +// When an aspect of a line changes, a string is added to +// lineView.changes. This updates the relevant part of the line's +// DOM structure. 
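+
+// Editorial note: the change tags handled below form a small fixed vocabulary --
+// "text", "gutter", "class" and "widget". A lineView whose text and gutter both
+// changed would carry changes = ["text", "gutter"] and have both parts of its
+// DOM rebuilt on the next display update.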
+function updateLineForChanges(cm, lineView, lineN, dims) { + for (var j = 0; j < lineView.changes.length; j++) { + var type = lineView.changes[j] + if (type == "text") { updateLineText(cm, lineView) } + else if (type == "gutter") { updateLineGutter(cm, lineView, lineN, dims) } + else if (type == "class") { updateLineClasses(lineView) } + else if (type == "widget") { updateLineWidgets(cm, lineView, dims) } + } + lineView.changes = null +} + +// Lines with gutter elements, widgets or a background class need to +// be wrapped, and have the extra elements added to the wrapper div +function ensureLineWrapped(lineView) { + if (lineView.node == lineView.text) { + lineView.node = elt("div", null, null, "position: relative") + if (lineView.text.parentNode) + { lineView.text.parentNode.replaceChild(lineView.node, lineView.text) } + lineView.node.appendChild(lineView.text) + if (ie && ie_version < 8) { lineView.node.style.zIndex = 2 } + } + return lineView.node +} + +function updateLineBackground(lineView) { + var cls = lineView.bgClass ? lineView.bgClass + " " + (lineView.line.bgClass || "") : lineView.line.bgClass + if (cls) { cls += " CodeMirror-linebackground" } + if (lineView.background) { + if (cls) { lineView.background.className = cls } + else { lineView.background.parentNode.removeChild(lineView.background); lineView.background = null } + } else if (cls) { + var wrap = ensureLineWrapped(lineView) + lineView.background = wrap.insertBefore(elt("div", null, cls), wrap.firstChild) + } +} + +// Wrapper around buildLineContent which will reuse the structure +// in display.externalMeasured when possible. +function getLineContent(cm, lineView) { + var ext = cm.display.externalMeasured + if (ext && ext.line == lineView.line) { + cm.display.externalMeasured = null + lineView.measure = ext.measure + return ext.built + } + return buildLineContent(cm, lineView) +} + +// Redraw the line's text. Interacts with the background and text +// classes because the mode may output tokens that influence these +// classes. +function updateLineText(cm, lineView) { + var cls = lineView.text.className + var built = getLineContent(cm, lineView) + if (lineView.text == lineView.node) { lineView.node = built.pre } + lineView.text.parentNode.replaceChild(built.pre, lineView.text) + lineView.text = built.pre + if (built.bgClass != lineView.bgClass || built.textClass != lineView.textClass) { + lineView.bgClass = built.bgClass + lineView.textClass = built.textClass + updateLineClasses(lineView) + } else if (cls) { + lineView.text.className = cls + } +} + +function updateLineClasses(lineView) { + updateLineBackground(lineView) + if (lineView.line.wrapClass) + { ensureLineWrapped(lineView).className = lineView.line.wrapClass } + else if (lineView.node != lineView.text) + { lineView.node.className = "" } + var textClass = lineView.textClass ? lineView.textClass + " " + (lineView.line.textClass || "") : lineView.line.textClass + lineView.text.className = textClass || "" +} + +function updateLineGutter(cm, lineView, lineN, dims) { + if (lineView.gutter) { + lineView.node.removeChild(lineView.gutter) + lineView.gutter = null + } + if (lineView.gutterBackground) { + lineView.node.removeChild(lineView.gutterBackground) + lineView.gutterBackground = null + } + if (lineView.line.gutterClass) { + var wrap = ensureLineWrapped(lineView) + lineView.gutterBackground = elt("div", null, "CodeMirror-gutter-background " + lineView.line.gutterClass, + ("left: " + (cm.options.fixedGutter ? 
dims.fixedPos : -dims.gutterTotalWidth) + "px; width: " + (dims.gutterTotalWidth) + "px")) + wrap.insertBefore(lineView.gutterBackground, lineView.text) + } + var markers = lineView.line.gutterMarkers + if (cm.options.lineNumbers || markers) { + var wrap$1 = ensureLineWrapped(lineView) + var gutterWrap = lineView.gutter = elt("div", null, "CodeMirror-gutter-wrapper", ("left: " + (cm.options.fixedGutter ? dims.fixedPos : -dims.gutterTotalWidth) + "px")) + cm.display.input.setUneditable(gutterWrap) + wrap$1.insertBefore(gutterWrap, lineView.text) + if (lineView.line.gutterClass) + { gutterWrap.className += " " + lineView.line.gutterClass } + if (cm.options.lineNumbers && (!markers || !markers["CodeMirror-linenumbers"])) + { lineView.lineNumber = gutterWrap.appendChild( + elt("div", lineNumberFor(cm.options, lineN), + "CodeMirror-linenumber CodeMirror-gutter-elt", + ("left: " + (dims.gutterLeft["CodeMirror-linenumbers"]) + "px; width: " + (cm.display.lineNumInnerWidth) + "px"))) } + if (markers) { for (var k = 0; k < cm.options.gutters.length; ++k) { + var id = cm.options.gutters[k], found = markers.hasOwnProperty(id) && markers[id] + if (found) + { gutterWrap.appendChild(elt("div", [found], "CodeMirror-gutter-elt", + ("left: " + (dims.gutterLeft[id]) + "px; width: " + (dims.gutterWidth[id]) + "px"))) } + } } + } +} + +function updateLineWidgets(cm, lineView, dims) { + if (lineView.alignable) { lineView.alignable = null } + for (var node = lineView.node.firstChild, next = (void 0); node; node = next) { + next = node.nextSibling + if (node.className == "CodeMirror-linewidget") + { lineView.node.removeChild(node) } + } + insertLineWidgets(cm, lineView, dims) +} + +// Build a line's DOM representation from scratch +function buildLineElement(cm, lineView, lineN, dims) { + var built = getLineContent(cm, lineView) + lineView.text = lineView.node = built.pre + if (built.bgClass) { lineView.bgClass = built.bgClass } + if (built.textClass) { lineView.textClass = built.textClass } + + updateLineClasses(lineView) + updateLineGutter(cm, lineView, lineN, dims) + insertLineWidgets(cm, lineView, dims) + return lineView.node +} + +// A lineView may contain multiple logical lines (when merged by +// collapsed spans). The widgets for all of them need to be drawn. 
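+
+// Editorial sketch (not part of the upstream source): the widgets positioned
+// below come from the public addLineWidget API, e.g. (illustrative values):
+//
+//   var note = document.createElement("div")
+//   note.textContent = "3 problems on this line"
+//   cm.addLineWidget(12, note, {coverGutter: true, noHScroll: true})
+//
+// The coverGutter and noHScroll options map onto positionLineWidget below.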
+function insertLineWidgets(cm, lineView, dims) { + insertLineWidgetsFor(cm, lineView.line, lineView, dims, true) + if (lineView.rest) { for (var i = 0; i < lineView.rest.length; i++) + { insertLineWidgetsFor(cm, lineView.rest[i], lineView, dims, false) } } +} + +function insertLineWidgetsFor(cm, line, lineView, dims, allowAbove) { + if (!line.widgets) { return } + var wrap = ensureLineWrapped(lineView) + for (var i = 0, ws = line.widgets; i < ws.length; ++i) { + var widget = ws[i], node = elt("div", [widget.node], "CodeMirror-linewidget") + if (!widget.handleMouseEvents) { node.setAttribute("cm-ignore-events", "true") } + positionLineWidget(widget, node, lineView, dims) + cm.display.input.setUneditable(node) + if (allowAbove && widget.above) + { wrap.insertBefore(node, lineView.gutter || lineView.text) } + else + { wrap.appendChild(node) } + signalLater(widget, "redraw") + } +} + +function positionLineWidget(widget, node, lineView, dims) { + if (widget.noHScroll) { + ;(lineView.alignable || (lineView.alignable = [])).push(node) + var width = dims.wrapperWidth + node.style.left = dims.fixedPos + "px" + if (!widget.coverGutter) { + width -= dims.gutterTotalWidth + node.style.paddingLeft = dims.gutterTotalWidth + "px" + } + node.style.width = width + "px" + } + if (widget.coverGutter) { + node.style.zIndex = 5 + node.style.position = "relative" + if (!widget.noHScroll) { node.style.marginLeft = -dims.gutterTotalWidth + "px" } + } +} + +function widgetHeight(widget) { + if (widget.height != null) { return widget.height } + var cm = widget.doc.cm + if (!cm) { return 0 } + if (!contains(document.body, widget.node)) { + var parentStyle = "position: relative;" + if (widget.coverGutter) + { parentStyle += "margin-left: -" + cm.display.gutters.offsetWidth + "px;" } + if (widget.noHScroll) + { parentStyle += "width: " + cm.display.wrapper.clientWidth + "px;" } + removeChildrenAndAdd(cm.display.measure, elt("div", [widget.node], null, parentStyle)) + } + return widget.height = widget.node.parentNode.offsetHeight +} + +// Return true when the given mouse event happened in a widget +function eventInWidget(display, e) { + for (var n = e_target(e); n != display.wrapper; n = n.parentNode) { + if (!n || (n.nodeType == 1 && n.getAttribute("cm-ignore-events") == "true") || + (n.parentNode == display.sizer && n != display.mover)) + { return true } + } +} + +// POSITION MEASUREMENT + +function paddingTop(display) {return display.lineSpace.offsetTop} +function paddingVert(display) {return display.mover.offsetHeight - display.lineSpace.offsetHeight} +function paddingH(display) { + if (display.cachedPaddingH) { return display.cachedPaddingH } + var e = removeChildrenAndAdd(display.measure, elt("pre", "x")) + var style = window.getComputedStyle ? window.getComputedStyle(e) : e.currentStyle + var data = {left: parseInt(style.paddingLeft), right: parseInt(style.paddingRight)} + if (!isNaN(data.left) && !isNaN(data.right)) { display.cachedPaddingH = data } + return data +} + +function scrollGap(cm) { return scrollerGap - cm.display.nativeBarWidth } +function displayWidth(cm) { + return cm.display.scroller.clientWidth - scrollGap(cm) - cm.display.barWidth +} +function displayHeight(cm) { + return cm.display.scroller.clientHeight - scrollGap(cm) - cm.display.barHeight +} + +// Ensure the lineView.wrapping.heights array is populated. This is +// an array of bottom offsets for the lines that make up a drawn +// line. When lineWrapping is on, there might be more than one +// height. 
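+
+// Editorial example (illustrative numbers): a line wrapped onto three rows of
+// roughly 20px each ends up with heights of about [20, 40, 60] -- bottom
+// offsets measured from the top of the drawn line -- while an unwrapped line
+// gets a single entry equal to its full height.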
+function ensureLineHeights(cm, lineView, rect) { + var wrapping = cm.options.lineWrapping + var curWidth = wrapping && displayWidth(cm) + if (!lineView.measure.heights || wrapping && lineView.measure.width != curWidth) { + var heights = lineView.measure.heights = [] + if (wrapping) { + lineView.measure.width = curWidth + var rects = lineView.text.firstChild.getClientRects() + for (var i = 0; i < rects.length - 1; i++) { + var cur = rects[i], next = rects[i + 1] + if (Math.abs(cur.bottom - next.bottom) > 2) + { heights.push((cur.bottom + next.top) / 2 - rect.top) } + } + } + heights.push(rect.bottom - rect.top) + } +} + +// Find a line map (mapping character offsets to text nodes) and a +// measurement cache for the given line number. (A line view might +// contain multiple lines when collapsed ranges are present.) +function mapFromLineView(lineView, line, lineN) { + if (lineView.line == line) + { return {map: lineView.measure.map, cache: lineView.measure.cache} } + for (var i = 0; i < lineView.rest.length; i++) + { if (lineView.rest[i] == line) + { return {map: lineView.measure.maps[i], cache: lineView.measure.caches[i]} } } + for (var i$1 = 0; i$1 < lineView.rest.length; i$1++) + { if (lineNo(lineView.rest[i$1]) > lineN) + { return {map: lineView.measure.maps[i$1], cache: lineView.measure.caches[i$1], before: true} } } +} + +// Render a line into the hidden node display.externalMeasured. Used +// when measurement is needed for a line that's not in the viewport. +function updateExternalMeasurement(cm, line) { + line = visualLine(line) + var lineN = lineNo(line) + var view = cm.display.externalMeasured = new LineView(cm.doc, line, lineN) + view.lineN = lineN + var built = view.built = buildLineContent(cm, view) + view.text = built.pre + removeChildrenAndAdd(cm.display.lineMeasure, built.pre) + return view +} + +// Get a {top, bottom, left, right} box (in line-local coordinates) +// for a given character. +function measureChar(cm, line, ch, bias) { + return measureCharPrepared(cm, prepareMeasureForLine(cm, line), ch, bias) +} + +// Find a line view that corresponds to the given line number. +function findViewForLine(cm, lineN) { + if (lineN >= cm.display.viewFrom && lineN < cm.display.viewTo) + { return cm.display.view[findViewIndex(cm, lineN)] } + var ext = cm.display.externalMeasured + if (ext && lineN >= ext.lineN && lineN < ext.lineN + ext.size) + { return ext } +} + +// Measurement can be split in two steps, the set-up work that +// applies to the whole line, and the measurement of the actual +// character. Functions like coordsChar, that need to do a lot of +// measurements in a row, can thus ensure that the set-up work is +// only done once. +function prepareMeasureForLine(cm, line) { + var lineN = lineNo(line) + var view = findViewForLine(cm, lineN) + if (view && !view.text) { + view = null + } else if (view && view.changes) { + updateLineForChanges(cm, view, lineN, getDimensions(cm)) + cm.curOp.forceUpdate = true + } + if (!view) + { view = updateExternalMeasurement(cm, line) } + + var info = mapFromLineView(view, line, lineN) + return { + line: line, view: view, rect: null, + map: info.map, cache: info.cache, before: info.before, + hasHeights: false + } +} + +// Given a prepared measurement object, measures the position of an +// actual character (or fetches it from the cache). 
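+
+// Editorial sketch (not part of the upstream source) of the two-step
+// measurement pattern described above: prepare once, then measure many
+// characters against the same prepared object. `line` stands for some Line
+// object already in the document.
+//
+//   var prepared = prepareMeasureForLine(cm, line)
+//   for (var ch = 0; ch < line.text.length; ch++) {
+//     var box = measureCharPrepared(cm, prepared, ch)   // {left, right, top, bottom}
+//     // ... use box ...
+//   }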
+function measureCharPrepared(cm, prepared, ch, bias, varHeight) { + if (prepared.before) { ch = -1 } + var key = ch + (bias || ""), found + if (prepared.cache.hasOwnProperty(key)) { + found = prepared.cache[key] + } else { + if (!prepared.rect) + { prepared.rect = prepared.view.text.getBoundingClientRect() } + if (!prepared.hasHeights) { + ensureLineHeights(cm, prepared.view, prepared.rect) + prepared.hasHeights = true + } + found = measureCharInner(cm, prepared, ch, bias) + if (!found.bogus) { prepared.cache[key] = found } + } + return {left: found.left, right: found.right, + top: varHeight ? found.rtop : found.top, + bottom: varHeight ? found.rbottom : found.bottom} +} + +var nullRect = {left: 0, right: 0, top: 0, bottom: 0} + +function nodeAndOffsetInLineMap(map, ch, bias) { + var node, start, end, collapse, mStart, mEnd + // First, search the line map for the text node corresponding to, + // or closest to, the target character. + for (var i = 0; i < map.length; i += 3) { + mStart = map[i] + mEnd = map[i + 1] + if (ch < mStart) { + start = 0; end = 1 + collapse = "left" + } else if (ch < mEnd) { + start = ch - mStart + end = start + 1 + } else if (i == map.length - 3 || ch == mEnd && map[i + 3] > ch) { + end = mEnd - mStart + start = end - 1 + if (ch >= mEnd) { collapse = "right" } + } + if (start != null) { + node = map[i + 2] + if (mStart == mEnd && bias == (node.insertLeft ? "left" : "right")) + { collapse = bias } + if (bias == "left" && start == 0) + { while (i && map[i - 2] == map[i - 3] && map[i - 1].insertLeft) { + node = map[(i -= 3) + 2] + collapse = "left" + } } + if (bias == "right" && start == mEnd - mStart) + { while (i < map.length - 3 && map[i + 3] == map[i + 4] && !map[i + 5].insertLeft) { + node = map[(i += 3) + 2] + collapse = "right" + } } + break + } + } + return {node: node, start: start, end: end, collapse: collapse, coverStart: mStart, coverEnd: mEnd} +} + +function getUsefulRect(rects, bias) { + var rect = nullRect + if (bias == "left") { for (var i = 0; i < rects.length; i++) { + if ((rect = rects[i]).left != rect.right) { break } + } } else { for (var i$1 = rects.length - 1; i$1 >= 0; i$1--) { + if ((rect = rects[i$1]).left != rect.right) { break } + } } + return rect +} + +function measureCharInner(cm, prepared, ch, bias) { + var place = nodeAndOffsetInLineMap(prepared.map, ch, bias) + var node = place.node, start = place.start, end = place.end, collapse = place.collapse + + var rect + if (node.nodeType == 3) { // If it is a text node, use a range to retrieve the coordinates. + for (var i$1 = 0; i$1 < 4; i$1++) { // Retry a maximum of 4 times when nonsense rectangles are returned + while (start && isExtendingChar(prepared.line.text.charAt(place.coverStart + start))) { --start } + while (place.coverStart + end < place.coverEnd && isExtendingChar(prepared.line.text.charAt(place.coverStart + end))) { ++end } + if (ie && ie_version < 9 && start == 0 && end == place.coverEnd - place.coverStart) + { rect = node.parentNode.getBoundingClientRect() } + else + { rect = getUsefulRect(range(node, start, end).getClientRects(), bias) } + if (rect.left || rect.right || start == 0) { break } + end = start + start = start - 1 + collapse = "right" + } + if (ie && ie_version < 11) { rect = maybeUpdateRectForZooming(cm.display.measure, rect) } + } else { // If it is a widget, simply get the box for the whole widget. 
+ if (start > 0) { collapse = bias = "right" } + var rects + if (cm.options.lineWrapping && (rects = node.getClientRects()).length > 1) + { rect = rects[bias == "right" ? rects.length - 1 : 0] } + else + { rect = node.getBoundingClientRect() } + } + if (ie && ie_version < 9 && !start && (!rect || !rect.left && !rect.right)) { + var rSpan = node.parentNode.getClientRects()[0] + if (rSpan) + { rect = {left: rSpan.left, right: rSpan.left + charWidth(cm.display), top: rSpan.top, bottom: rSpan.bottom} } + else + { rect = nullRect } + } + + var rtop = rect.top - prepared.rect.top, rbot = rect.bottom - prepared.rect.top + var mid = (rtop + rbot) / 2 + var heights = prepared.view.measure.heights + var i = 0 + for (; i < heights.length - 1; i++) + { if (mid < heights[i]) { break } } + var top = i ? heights[i - 1] : 0, bot = heights[i] + var result = {left: (collapse == "right" ? rect.right : rect.left) - prepared.rect.left, + right: (collapse == "left" ? rect.left : rect.right) - prepared.rect.left, + top: top, bottom: bot} + if (!rect.left && !rect.right) { result.bogus = true } + if (!cm.options.singleCursorHeightPerLine) { result.rtop = rtop; result.rbottom = rbot } + + return result +} + +// Work around problem with bounding client rects on ranges being +// returned incorrectly when zoomed on IE10 and below. +function maybeUpdateRectForZooming(measure, rect) { + if (!window.screen || screen.logicalXDPI == null || + screen.logicalXDPI == screen.deviceXDPI || !hasBadZoomedRects(measure)) + { return rect } + var scaleX = screen.logicalXDPI / screen.deviceXDPI + var scaleY = screen.logicalYDPI / screen.deviceYDPI + return {left: rect.left * scaleX, right: rect.right * scaleX, + top: rect.top * scaleY, bottom: rect.bottom * scaleY} +} + +function clearLineMeasurementCacheFor(lineView) { + if (lineView.measure) { + lineView.measure.cache = {} + lineView.measure.heights = null + if (lineView.rest) { for (var i = 0; i < lineView.rest.length; i++) + { lineView.measure.caches[i] = {} } } + } +} + +function clearLineMeasurementCache(cm) { + cm.display.externalMeasure = null + removeChildren(cm.display.lineMeasure) + for (var i = 0; i < cm.display.view.length; i++) + { clearLineMeasurementCacheFor(cm.display.view[i]) } +} + +function clearCaches(cm) { + clearLineMeasurementCache(cm) + cm.display.cachedCharWidth = cm.display.cachedTextHeight = cm.display.cachedPaddingH = null + if (!cm.options.lineWrapping) { cm.display.maxLineChanged = true } + cm.display.lineNumChars = null +} + +function pageScrollX() { return window.pageXOffset || (document.documentElement || document.body).scrollLeft } +function pageScrollY() { return window.pageYOffset || (document.documentElement || document.body).scrollTop } + +// Converts a {top, bottom, left, right} box from line-local +// coordinates into another coordinate system. Context may be one of +// "line", "div" (display.lineDiv), "local"./null (editor), "window", +// or "page". 
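+
+// Editorial note: these coordinate systems surface in the public API as the
+// optional mode argument of methods such as cm.charCoords and cm.cursorCoords
+// (illustrative position below); both funnel into intoCoordSystem.
+//
+//   cm.charCoords({line: 0, ch: 0}, "page")    // box relative to the page
+//   cm.charCoords({line: 0, ch: 0}, "local")   // box relative to the editor content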
+function intoCoordSystem(cm, lineObj, rect, context, includeWidgets) { + if (!includeWidgets && lineObj.widgets) { for (var i = 0; i < lineObj.widgets.length; ++i) { if (lineObj.widgets[i].above) { + var size = widgetHeight(lineObj.widgets[i]) + rect.top += size; rect.bottom += size + } } } + if (context == "line") { return rect } + if (!context) { context = "local" } + var yOff = heightAtLine(lineObj) + if (context == "local") { yOff += paddingTop(cm.display) } + else { yOff -= cm.display.viewOffset } + if (context == "page" || context == "window") { + var lOff = cm.display.lineSpace.getBoundingClientRect() + yOff += lOff.top + (context == "window" ? 0 : pageScrollY()) + var xOff = lOff.left + (context == "window" ? 0 : pageScrollX()) + rect.left += xOff; rect.right += xOff + } + rect.top += yOff; rect.bottom += yOff + return rect +} + +// Coverts a box from "div" coords to another coordinate system. +// Context may be "window", "page", "div", or "local"./null. +function fromCoordSystem(cm, coords, context) { + if (context == "div") { return coords } + var left = coords.left, top = coords.top + // First move into "page" coordinate system + if (context == "page") { + left -= pageScrollX() + top -= pageScrollY() + } else if (context == "local" || !context) { + var localBox = cm.display.sizer.getBoundingClientRect() + left += localBox.left + top += localBox.top + } + + var lineSpaceBox = cm.display.lineSpace.getBoundingClientRect() + return {left: left - lineSpaceBox.left, top: top - lineSpaceBox.top} +} + +function charCoords(cm, pos, context, lineObj, bias) { + if (!lineObj) { lineObj = getLine(cm.doc, pos.line) } + return intoCoordSystem(cm, lineObj, measureChar(cm, lineObj, pos.ch, bias), context) +} + +// Returns a box for a given cursor position, which may have an +// 'other' property containing the position of the secondary cursor +// on a bidi boundary. +// A cursor Pos(line, char, "before") is on the same visual line as `char - 1` +// and after `char - 1` in writing order of `char - 1` +// A cursor Pos(line, char, "after") is on the same visual line as `char` +// and before `char` in writing order of `char` +// Examples (upper-case letters are RTL, lower-case are LTR): +// Pos(0, 1, ...) +// before after +// ab a|b a|b +// aB a|B aB| +// Ab |Ab A|b +// AB B|A B|A +// Every position after the last character on a line is considered to stick +// to the last character on the line. +function cursorCoords(cm, pos, context, lineObj, preparedMeasure, varHeight) { + lineObj = lineObj || getLine(cm.doc, pos.line) + if (!preparedMeasure) { preparedMeasure = prepareMeasureForLine(cm, lineObj) } + function get(ch, right) { + var m = measureCharPrepared(cm, preparedMeasure, ch, right ? "right" : "left", varHeight) + if (right) { m.left = m.right; } else { m.right = m.left } + return intoCoordSystem(cm, lineObj, m, context) + } + var order = getOrder(lineObj), ch = pos.ch, sticky = pos.sticky + if (ch >= lineObj.text.length) { + ch = lineObj.text.length + sticky = "before" + } else if (ch <= 0) { + ch = 0 + sticky = "after" + } + if (!order) { return get(sticky == "before" ? ch - 1 : ch, sticky == "before") } + + function getBidi(ch, partPos, invert) { + var part = order[partPos], right = (part.level % 2) != 0 + return get(invert ? 
ch - 1 : ch, right != invert) + } + var partPos = getBidiPartAt(order, ch, sticky) + var other = bidiOther + var val = getBidi(ch, partPos, sticky == "before") + if (other != null) { val.other = getBidi(ch, other, sticky != "before") } + return val +} + +// Used to cheaply estimate the coordinates for a position. Used for +// intermediate scroll updates. +function estimateCoords(cm, pos) { + var left = 0 + pos = clipPos(cm.doc, pos) + if (!cm.options.lineWrapping) { left = charWidth(cm.display) * pos.ch } + var lineObj = getLine(cm.doc, pos.line) + var top = heightAtLine(lineObj) + paddingTop(cm.display) + return {left: left, right: left, top: top, bottom: top + lineObj.height} +} + +// Positions returned by coordsChar contain some extra information. +// xRel is the relative x position of the input coordinates compared +// to the found position (so xRel > 0 means the coordinates are to +// the right of the character position, for example). When outside +// is true, that means the coordinates lie outside the line's +// vertical range. +function PosWithInfo(line, ch, sticky, outside, xRel) { + var pos = Pos(line, ch, sticky) + pos.xRel = xRel + if (outside) { pos.outside = true } + return pos +} + +// Compute the character position closest to the given coordinates. +// Input must be lineSpace-local ("div" coordinate system). +function coordsChar(cm, x, y) { + var doc = cm.doc + y += cm.display.viewOffset + if (y < 0) { return PosWithInfo(doc.first, 0, null, true, -1) } + var lineN = lineAtHeight(doc, y), last = doc.first + doc.size - 1 + if (lineN > last) + { return PosWithInfo(doc.first + doc.size - 1, getLine(doc, last).text.length, null, true, 1) } + if (x < 0) { x = 0 } + + var lineObj = getLine(doc, lineN) + for (;;) { + var found = coordsCharInner(cm, lineObj, lineN, x, y) + var merged = collapsedSpanAtEnd(lineObj) + var mergedPos = merged && merged.find(0, true) + if (merged && (found.ch > mergedPos.from.ch || found.ch == mergedPos.from.ch && found.xRel > 0)) + { lineN = lineNo(lineObj = mergedPos.to.line) } + else + { return found } + } +} + +function wrappedLineExtent(cm, lineObj, preparedMeasure, y) { + var measure = function (ch) { return intoCoordSystem(cm, lineObj, measureCharPrepared(cm, preparedMeasure, ch), "line"); } + var end = lineObj.text.length + var begin = findFirst(function (ch) { return measure(ch - 1).bottom <= y; }, end, 0) + end = findFirst(function (ch) { return measure(ch).top > y; }, begin, end) + return {begin: begin, end: end} +} + +function wrappedLineExtentChar(cm, lineObj, preparedMeasure, target) { + var targetTop = intoCoordSystem(cm, lineObj, measureCharPrepared(cm, preparedMeasure, target), "line").top + return wrappedLineExtent(cm, lineObj, preparedMeasure, targetTop) +} + +function coordsCharInner(cm, lineObj, lineNo, x, y) { + y -= heightAtLine(lineObj) + var begin = 0, end = lineObj.text.length + var preparedMeasure = prepareMeasureForLine(cm, lineObj) + var pos + var order = getOrder(lineObj) + if (order) { + if (cm.options.lineWrapping) { + ;var assign; + ((assign = wrappedLineExtent(cm, lineObj, preparedMeasure, y), begin = assign.begin, end = assign.end, assign)) + } + pos = new Pos(lineNo, begin) + var beginLeft = cursorCoords(cm, pos, "line", lineObj, preparedMeasure).left + var dir = beginLeft < x ? 1 : -1 + var prevDiff, diff = beginLeft - x, prevPos + do { + prevDiff = diff + prevPos = pos + pos = moveVisually(cm, lineObj, pos, dir) + if (pos == null || pos.ch < begin || end <= (pos.sticky == "before" ? 
pos.ch - 1 : pos.ch)) { + pos = prevPos + break + } + diff = cursorCoords(cm, pos, "line", lineObj, preparedMeasure).left - x + } while ((dir < 0) != (diff < 0) && (Math.abs(diff) <= Math.abs(prevDiff))) + if (Math.abs(diff) > Math.abs(prevDiff)) { + if ((diff < 0) == (prevDiff < 0)) { throw new Error("Broke out of infinite loop in coordsCharInner") } + pos = prevPos + } + } else { + var ch = findFirst(function (ch) { + var box = intoCoordSystem(cm, lineObj, measureCharPrepared(cm, preparedMeasure, ch), "line") + if (box.top > y) { + // For the cursor stickiness + end = Math.min(ch, end) + return true + } + else if (box.bottom <= y) { return false } + else if (box.left > x) { return true } + else if (box.right < x) { return false } + else { return (x - box.left < box.right - x) } + }, begin, end) + ch = skipExtendingChars(lineObj.text, ch, 1) + pos = new Pos(lineNo, ch, ch == end ? "before" : "after") + } + var coords = cursorCoords(cm, pos, "line", lineObj, preparedMeasure) + if (y < coords.top || coords.bottom < y) { pos.outside = true } + pos.xRel = x < coords.left ? -1 : (x > coords.right ? 1 : 0) + return pos +} + +var measureText +// Compute the default text height. +function textHeight(display) { + if (display.cachedTextHeight != null) { return display.cachedTextHeight } + if (measureText == null) { + measureText = elt("pre") + // Measure a bunch of lines, for browsers that compute + // fractional heights. + for (var i = 0; i < 49; ++i) { + measureText.appendChild(document.createTextNode("x")) + measureText.appendChild(elt("br")) + } + measureText.appendChild(document.createTextNode("x")) + } + removeChildrenAndAdd(display.measure, measureText) + var height = measureText.offsetHeight / 50 + if (height > 3) { display.cachedTextHeight = height } + removeChildren(display.measure) + return height || 1 +} + +// Compute the default character width. +function charWidth(display) { + if (display.cachedCharWidth != null) { return display.cachedCharWidth } + var anchor = elt("span", "xxxxxxxxxx") + var pre = elt("pre", [anchor]) + removeChildrenAndAdd(display.measure, pre) + var rect = anchor.getBoundingClientRect(), width = (rect.right - rect.left) / 10 + if (width > 2) { display.cachedCharWidth = width } + return width || 10 +} + +// Do a bulk-read of the DOM positions and sizes needed to draw the +// view, so that we don't interleave reading and writing to the DOM. +function getDimensions(cm) { + var d = cm.display, left = {}, width = {} + var gutterLeft = d.gutters.clientLeft + for (var n = d.gutters.firstChild, i = 0; n; n = n.nextSibling, ++i) { + left[cm.options.gutters[i]] = n.offsetLeft + n.clientLeft + gutterLeft + width[cm.options.gutters[i]] = n.clientWidth + } + return {fixedPos: compensateForHScroll(d), + gutterTotalWidth: d.gutters.offsetWidth, + gutterLeft: left, + gutterWidth: width, + wrapperWidth: d.wrapper.clientWidth} +} + +// Computes display.scroller.scrollLeft + display.gutters.offsetWidth, +// but using getBoundingClientRect to get a sub-pixel-accurate +// result. +function compensateForHScroll(display) { + return display.scroller.getBoundingClientRect().left - display.sizer.getBoundingClientRect().left +} + +// Returns a function that estimates the height of a line, to use as +// first approximation until the line becomes visible (and is thus +// properly measurable). 
+function estimateHeight(cm) { + var th = textHeight(cm.display), wrapping = cm.options.lineWrapping + var perLine = wrapping && Math.max(5, cm.display.scroller.clientWidth / charWidth(cm.display) - 3) + return function (line) { + if (lineIsHidden(cm.doc, line)) { return 0 } + + var widgetsHeight = 0 + if (line.widgets) { for (var i = 0; i < line.widgets.length; i++) { + if (line.widgets[i].height) { widgetsHeight += line.widgets[i].height } + } } + + if (wrapping) + { return widgetsHeight + (Math.ceil(line.text.length / perLine) || 1) * th } + else + { return widgetsHeight + th } + } +} + +function estimateLineHeights(cm) { + var doc = cm.doc, est = estimateHeight(cm) + doc.iter(function (line) { + var estHeight = est(line) + if (estHeight != line.height) { updateLineHeight(line, estHeight) } + }) +} + +// Given a mouse event, find the corresponding position. If liberal +// is false, it checks whether a gutter or scrollbar was clicked, +// and returns null if it was. forRect is used by rectangular +// selections, and tries to estimate a character position even for +// coordinates beyond the right of the text. +function posFromMouse(cm, e, liberal, forRect) { + var display = cm.display + if (!liberal && e_target(e).getAttribute("cm-not-content") == "true") { return null } + + var x, y, space = display.lineSpace.getBoundingClientRect() + // Fails unpredictably on IE[67] when mouse is dragged around quickly. + try { x = e.clientX - space.left; y = e.clientY - space.top } + catch (e) { return null } + var coords = coordsChar(cm, x, y), line + if (forRect && coords.xRel == 1 && (line = getLine(cm.doc, coords.line).text).length == coords.ch) { + var colDiff = countColumn(line, line.length, cm.options.tabSize) - line.length + coords = Pos(coords.line, Math.max(0, Math.round((x - paddingH(cm.display).left) / charWidth(cm.display)) - colDiff)) + } + return coords +} + +// Find the view element corresponding to a given line. Return null +// when the line isn't visible. 
+function findViewIndex(cm, n) { + if (n >= cm.display.viewTo) { return null } + n -= cm.display.viewFrom + if (n < 0) { return null } + var view = cm.display.view + for (var i = 0; i < view.length; i++) { + n -= view[i].size + if (n < 0) { return i } + } +} + +function updateSelection(cm) { + cm.display.input.showSelection(cm.display.input.prepareSelection()) +} + +function prepareSelection(cm, primary) { + var doc = cm.doc, result = {} + var curFragment = result.cursors = document.createDocumentFragment() + var selFragment = result.selection = document.createDocumentFragment() + + for (var i = 0; i < doc.sel.ranges.length; i++) { + if (primary === false && i == doc.sel.primIndex) { continue } + var range = doc.sel.ranges[i] + if (range.from().line >= cm.display.viewTo || range.to().line < cm.display.viewFrom) { continue } + var collapsed = range.empty() + if (collapsed || cm.options.showCursorWhenSelecting) + { drawSelectionCursor(cm, range.head, curFragment) } + if (!collapsed) + { drawSelectionRange(cm, range, selFragment) } + } + return result +} + +// Draws a cursor for the given range +function drawSelectionCursor(cm, head, output) { + var pos = cursorCoords(cm, head, "div", null, null, !cm.options.singleCursorHeightPerLine) + + var cursor = output.appendChild(elt("div", "\u00a0", "CodeMirror-cursor")) + cursor.style.left = pos.left + "px" + cursor.style.top = pos.top + "px" + cursor.style.height = Math.max(0, pos.bottom - pos.top) * cm.options.cursorHeight + "px" + + if (pos.other) { + // Secondary cursor, shown when on a 'jump' in bi-directional text + var otherCursor = output.appendChild(elt("div", "\u00a0", "CodeMirror-cursor CodeMirror-secondarycursor")) + otherCursor.style.display = "" + otherCursor.style.left = pos.other.left + "px" + otherCursor.style.top = pos.other.top + "px" + otherCursor.style.height = (pos.other.bottom - pos.other.top) * .85 + "px" + } +} + +// Draws the given range as a highlighted selection +function drawSelectionRange(cm, range, output) { + var display = cm.display, doc = cm.doc + var fragment = document.createDocumentFragment() + var padding = paddingH(cm.display), leftSide = padding.left + var rightSide = Math.max(display.sizerWidth, displayWidth(cm) - display.sizer.offsetLeft) - padding.right + + function add(left, top, width, bottom) { + if (top < 0) { top = 0 } + top = Math.round(top) + bottom = Math.round(bottom) + fragment.appendChild(elt("div", null, "CodeMirror-selected", ("position: absolute; left: " + left + "px;\n top: " + top + "px; width: " + (width == null ? rightSide - left : width) + "px;\n height: " + (bottom - top) + "px"))) + } + + function drawForLine(line, fromArg, toArg) { + var lineObj = getLine(doc, line) + var lineLen = lineObj.text.length + var start, end + function coords(ch, bias) { + return charCoords(cm, Pos(line, ch), "div", lineObj, bias) + } + + iterateBidiSections(getOrder(lineObj), fromArg || 0, toArg == null ? 
lineLen : toArg, function (from, to, dir) { + var leftPos = coords(from, "left"), rightPos, left, right + if (from == to) { + rightPos = leftPos + left = right = leftPos.left + } else { + rightPos = coords(to - 1, "right") + if (dir == "rtl") { var tmp = leftPos; leftPos = rightPos; rightPos = tmp } + left = leftPos.left + right = rightPos.right + } + if (fromArg == null && from == 0) { left = leftSide } + if (rightPos.top - leftPos.top > 3) { // Different lines, draw top part + add(left, leftPos.top, null, leftPos.bottom) + left = leftSide + if (leftPos.bottom < rightPos.top) { add(left, leftPos.bottom, null, rightPos.top) } + } + if (toArg == null && to == lineLen) { right = rightSide } + if (!start || leftPos.top < start.top || leftPos.top == start.top && leftPos.left < start.left) + { start = leftPos } + if (!end || rightPos.bottom > end.bottom || rightPos.bottom == end.bottom && rightPos.right > end.right) + { end = rightPos } + if (left < leftSide + 1) { left = leftSide } + add(left, rightPos.top, right - left, rightPos.bottom) + }) + return {start: start, end: end} + } + + var sFrom = range.from(), sTo = range.to() + if (sFrom.line == sTo.line) { + drawForLine(sFrom.line, sFrom.ch, sTo.ch) + } else { + var fromLine = getLine(doc, sFrom.line), toLine = getLine(doc, sTo.line) + var singleVLine = visualLine(fromLine) == visualLine(toLine) + var leftEnd = drawForLine(sFrom.line, sFrom.ch, singleVLine ? fromLine.text.length + 1 : null).end + var rightStart = drawForLine(sTo.line, singleVLine ? 0 : null, sTo.ch).start + if (singleVLine) { + if (leftEnd.top < rightStart.top - 2) { + add(leftEnd.right, leftEnd.top, null, leftEnd.bottom) + add(leftSide, rightStart.top, rightStart.left, rightStart.bottom) + } else { + add(leftEnd.right, leftEnd.top, rightStart.left - leftEnd.right, leftEnd.bottom) + } + } + if (leftEnd.bottom < rightStart.top) + { add(leftSide, leftEnd.bottom, null, rightStart.top) } + } + + output.appendChild(fragment) +} + +// Cursor-blinking +function restartBlink(cm) { + if (!cm.state.focused) { return } + var display = cm.display + clearInterval(display.blinker) + var on = true + display.cursorDiv.style.visibility = "" + if (cm.options.cursorBlinkRate > 0) + { display.blinker = setInterval(function () { return display.cursorDiv.style.visibility = (on = !on) ? 
"" : "hidden"; }, + cm.options.cursorBlinkRate) } + else if (cm.options.cursorBlinkRate < 0) + { display.cursorDiv.style.visibility = "hidden" } +} + +function ensureFocus(cm) { + if (!cm.state.focused) { cm.display.input.focus(); onFocus(cm) } +} + +function delayBlurEvent(cm) { + cm.state.delayingBlurEvent = true + setTimeout(function () { if (cm.state.delayingBlurEvent) { + cm.state.delayingBlurEvent = false + onBlur(cm) + } }, 100) +} + +function onFocus(cm, e) { + if (cm.state.delayingBlurEvent) { cm.state.delayingBlurEvent = false } + + if (cm.options.readOnly == "nocursor") { return } + if (!cm.state.focused) { + signal(cm, "focus", cm, e) + cm.state.focused = true + addClass(cm.display.wrapper, "CodeMirror-focused") + // This test prevents this from firing when a context + // menu is closed (since the input reset would kill the + // select-all detection hack) + if (!cm.curOp && cm.display.selForContextMenu != cm.doc.sel) { + cm.display.input.reset() + if (webkit) { setTimeout(function () { return cm.display.input.reset(true); }, 20) } // Issue #1730 + } + cm.display.input.receivedFocus() + } + restartBlink(cm) +} +function onBlur(cm, e) { + if (cm.state.delayingBlurEvent) { return } + + if (cm.state.focused) { + signal(cm, "blur", cm, e) + cm.state.focused = false + rmClass(cm.display.wrapper, "CodeMirror-focused") + } + clearInterval(cm.display.blinker) + setTimeout(function () { if (!cm.state.focused) { cm.display.shift = false } }, 150) +} + +// Re-align line numbers and gutter marks to compensate for +// horizontal scrolling. +function alignHorizontally(cm) { + var display = cm.display, view = display.view + if (!display.alignWidgets && (!display.gutters.firstChild || !cm.options.fixedGutter)) { return } + var comp = compensateForHScroll(display) - display.scroller.scrollLeft + cm.doc.scrollLeft + var gutterW = display.gutters.offsetWidth, left = comp + "px" + for (var i = 0; i < view.length; i++) { if (!view[i].hidden) { + if (cm.options.fixedGutter) { + if (view[i].gutter) + { view[i].gutter.style.left = left } + if (view[i].gutterBackground) + { view[i].gutterBackground.style.left = left } + } + var align = view[i].alignable + if (align) { for (var j = 0; j < align.length; j++) + { align[j].style.left = left } } + } } + if (cm.options.fixedGutter) + { display.gutters.style.left = (comp + gutterW) + "px" } +} + +// Used to ensure that the line number gutter is still the right +// size for the current document size. Returns true when an update +// is needed. +function maybeUpdateLineNumberWidth(cm) { + if (!cm.options.lineNumbers) { return false } + var doc = cm.doc, last = lineNumberFor(cm.options, doc.first + doc.size - 1), display = cm.display + if (last.length != display.lineNumChars) { + var test = display.measure.appendChild(elt("div", [elt("div", last)], + "CodeMirror-linenumber CodeMirror-gutter-elt")) + var innerW = test.firstChild.offsetWidth, padding = test.offsetWidth - innerW + display.lineGutter.style.width = "" + display.lineNumInnerWidth = Math.max(innerW, display.lineGutter.offsetWidth - padding) + 1 + display.lineNumWidth = display.lineNumInnerWidth + padding + display.lineNumChars = display.lineNumInnerWidth ? last.length : -1 + display.lineGutter.style.width = display.lineNumWidth + "px" + updateGutterSpace(cm) + return true + } + return false +} + +// Read the actual heights of the rendered lines, and update their +// stored heights to match. 
+function updateHeightsInViewport(cm) { + var display = cm.display + var prevBottom = display.lineDiv.offsetTop + for (var i = 0; i < display.view.length; i++) { + var cur = display.view[i], height = (void 0) + if (cur.hidden) { continue } + if (ie && ie_version < 8) { + var bot = cur.node.offsetTop + cur.node.offsetHeight + height = bot - prevBottom + prevBottom = bot + } else { + var box = cur.node.getBoundingClientRect() + height = box.bottom - box.top + } + var diff = cur.line.height - height + if (height < 2) { height = textHeight(display) } + if (diff > .001 || diff < -.001) { + updateLineHeight(cur.line, height) + updateWidgetHeight(cur.line) + if (cur.rest) { for (var j = 0; j < cur.rest.length; j++) + { updateWidgetHeight(cur.rest[j]) } } + } + } +} + +// Read and store the height of line widgets associated with the +// given line. +function updateWidgetHeight(line) { + if (line.widgets) { for (var i = 0; i < line.widgets.length; ++i) + { line.widgets[i].height = line.widgets[i].node.parentNode.offsetHeight } } +} + +// Compute the lines that are visible in a given viewport (defaults +// the the current scroll position). viewport may contain top, +// height, and ensure (see op.scrollToPos) properties. +function visibleLines(display, doc, viewport) { + var top = viewport && viewport.top != null ? Math.max(0, viewport.top) : display.scroller.scrollTop + top = Math.floor(top - paddingTop(display)) + var bottom = viewport && viewport.bottom != null ? viewport.bottom : top + display.wrapper.clientHeight + + var from = lineAtHeight(doc, top), to = lineAtHeight(doc, bottom) + // Ensure is a {from: {line, ch}, to: {line, ch}} object, and + // forces those lines into the viewport (if possible). + if (viewport && viewport.ensure) { + var ensureFrom = viewport.ensure.from.line, ensureTo = viewport.ensure.to.line + if (ensureFrom < from) { + from = ensureFrom + to = lineAtHeight(doc, heightAtLine(getLine(doc, ensureFrom)) + display.wrapper.clientHeight) + } else if (Math.min(ensureTo, doc.lastLine()) >= to) { + from = lineAtHeight(doc, heightAtLine(getLine(doc, ensureTo)) - display.wrapper.clientHeight) + to = ensureTo + } + } + return {from: from, to: Math.max(to, from + 1)} +} + +// Sync the scrollable area and scrollbars, ensure the viewport +// covers the visible area. +function setScrollTop(cm, val) { + if (Math.abs(cm.doc.scrollTop - val) < 2) { return } + cm.doc.scrollTop = val + if (!gecko) { updateDisplaySimple(cm, {top: val}) } + if (cm.display.scroller.scrollTop != val) { cm.display.scroller.scrollTop = val } + cm.display.scrollbars.setScrollTop(val) + if (gecko) { updateDisplaySimple(cm) } + startWorker(cm, 100) +} +// Sync scroller and scrollbar, ensure the gutter elements are +// aligned. +function setScrollLeft(cm, val, isScroller) { + if (isScroller ? val == cm.doc.scrollLeft : Math.abs(cm.doc.scrollLeft - val) < 2) { return } + val = Math.min(val, cm.display.scroller.scrollWidth - cm.display.scroller.clientWidth) + cm.doc.scrollLeft = val + alignHorizontally(cm) + if (cm.display.scroller.scrollLeft != val) { cm.display.scroller.scrollLeft = val } + cm.display.scrollbars.setScrollLeft(val) +} + +// Since the delta values reported on mouse wheel events are +// unstandardized between browsers and even browser versions, and +// generally horribly unpredictable, this code starts by measuring +// the scroll effect that the first few mouse wheel events have, +// and, from that, detects the way it can convert deltas to pixel +// offsets afterwards. 
+// +// The reason we want to know the amount a wheel event will scroll +// is that it gives us a chance to update the display before the +// actual scrolling happens, reducing flickering. + +var wheelSamples = 0; +var wheelPixelsPerUnit = null; +// Fill in a browser-detected starting value on browsers where we +// know one. These don't have to be accurate -- the result of them +// being wrong would just be a slight flicker on the first wheel +// scroll (if it is large enough). +if (ie) { wheelPixelsPerUnit = -.53 } +else if (gecko) { wheelPixelsPerUnit = 15 } +else if (chrome) { wheelPixelsPerUnit = -.7 } +else if (safari) { wheelPixelsPerUnit = -1/3 } + +function wheelEventDelta(e) { + var dx = e.wheelDeltaX, dy = e.wheelDeltaY + if (dx == null && e.detail && e.axis == e.HORIZONTAL_AXIS) { dx = e.detail } + if (dy == null && e.detail && e.axis == e.VERTICAL_AXIS) { dy = e.detail } + else if (dy == null) { dy = e.wheelDelta } + return {x: dx, y: dy} +} +function wheelEventPixels(e) { + var delta = wheelEventDelta(e) + delta.x *= wheelPixelsPerUnit + delta.y *= wheelPixelsPerUnit + return delta +} + +function onScrollWheel(cm, e) { + var delta = wheelEventDelta(e), dx = delta.x, dy = delta.y + + var display = cm.display, scroll = display.scroller + // Quit if there's nothing to scroll here + var canScrollX = scroll.scrollWidth > scroll.clientWidth + var canScrollY = scroll.scrollHeight > scroll.clientHeight + if (!(dx && canScrollX || dy && canScrollY)) { return } + + // Webkit browsers on OS X abort momentum scrolls when the target + // of the scroll event is removed from the scrollable element. + // This hack (see related code in patchDisplay) makes sure the + // element is kept around. + if (dy && mac && webkit) { + outer: for (var cur = e.target, view = display.view; cur != scroll; cur = cur.parentNode) { + for (var i = 0; i < view.length; i++) { + if (view[i].node == cur) { + cm.display.currentWheelTarget = cur + break outer + } + } + } + } + + // On some browsers, horizontal scrolling will cause redraws to + // happen before the gutter has been realigned, causing it to + // wriggle around in a most unseemly way. When we have an + // estimated pixels/delta value, we just handle horizontal + // scrolling entirely here. It'll be slightly off from native, but + // better than glitching out. + if (dx && !gecko && !presto && wheelPixelsPerUnit != null) { + if (dy && canScrollY) + { setScrollTop(cm, Math.max(0, Math.min(scroll.scrollTop + dy * wheelPixelsPerUnit, scroll.scrollHeight - scroll.clientHeight))) } + setScrollLeft(cm, Math.max(0, Math.min(scroll.scrollLeft + dx * wheelPixelsPerUnit, scroll.scrollWidth - scroll.clientWidth))) + // Only prevent default scrolling if vertical scrolling is + // actually possible. Otherwise, it causes vertical scroll + // jitter on OSX trackpads when deltaX is small and deltaY + // is large (issue #3579) + if (!dy || (dy && canScrollY)) + { e_preventDefault(e) } + display.wheelStartX = null // Abort measurement, if in progress + return + } + + // 'Project' the visible viewport to cover the area that is being + // scrolled into view (if we know enough to estimate it). 
+ if (dy && wheelPixelsPerUnit != null) { + var pixels = dy * wheelPixelsPerUnit + var top = cm.doc.scrollTop, bot = top + display.wrapper.clientHeight + if (pixels < 0) { top = Math.max(0, top + pixels - 50) } + else { bot = Math.min(cm.doc.height, bot + pixels + 50) } + updateDisplaySimple(cm, {top: top, bottom: bot}) + } + + if (wheelSamples < 20) { + if (display.wheelStartX == null) { + display.wheelStartX = scroll.scrollLeft; display.wheelStartY = scroll.scrollTop + display.wheelDX = dx; display.wheelDY = dy + setTimeout(function () { + if (display.wheelStartX == null) { return } + var movedX = scroll.scrollLeft - display.wheelStartX + var movedY = scroll.scrollTop - display.wheelStartY + var sample = (movedY && display.wheelDY && movedY / display.wheelDY) || + (movedX && display.wheelDX && movedX / display.wheelDX) + display.wheelStartX = display.wheelStartY = null + if (!sample) { return } + wheelPixelsPerUnit = (wheelPixelsPerUnit * wheelSamples + sample) / (wheelSamples + 1) + ++wheelSamples + }, 200) + } else { + display.wheelDX += dx; display.wheelDY += dy + } + } +} + +// SCROLLBARS + +// Prepare DOM reads needed to update the scrollbars. Done in one +// shot to minimize update/measure roundtrips. +function measureForScrollbars(cm) { + var d = cm.display, gutterW = d.gutters.offsetWidth + var docH = Math.round(cm.doc.height + paddingVert(cm.display)) + return { + clientHeight: d.scroller.clientHeight, + viewHeight: d.wrapper.clientHeight, + scrollWidth: d.scroller.scrollWidth, clientWidth: d.scroller.clientWidth, + viewWidth: d.wrapper.clientWidth, + barLeft: cm.options.fixedGutter ? gutterW : 0, + docHeight: docH, + scrollHeight: docH + scrollGap(cm) + d.barHeight, + nativeBarWidth: d.nativeBarWidth, + gutterWidth: gutterW + } +} + +var NativeScrollbars = function(place, scroll, cm) { + this.cm = cm + var vert = this.vert = elt("div", [elt("div", null, null, "min-width: 1px")], "CodeMirror-vscrollbar") + var horiz = this.horiz = elt("div", [elt("div", null, null, "height: 100%; min-height: 1px")], "CodeMirror-hscrollbar") + place(vert); place(horiz) + + on(vert, "scroll", function () { + if (vert.clientHeight) { scroll(vert.scrollTop, "vertical") } + }) + on(horiz, "scroll", function () { + if (horiz.clientWidth) { scroll(horiz.scrollLeft, "horizontal") } + }) + + this.checkedZeroWidth = false + // Need to set a minimum width to see the scrollbar on IE7 (but must not set it on IE8). + if (ie && ie_version < 8) { this.horiz.style.minHeight = this.vert.style.minWidth = "18px" } +}; + +NativeScrollbars.prototype.update = function (measure) { + var needsH = measure.scrollWidth > measure.clientWidth + 1 + var needsV = measure.scrollHeight > measure.clientHeight + 1 + var sWidth = measure.nativeBarWidth + + if (needsV) { + this.vert.style.display = "block" + this.vert.style.bottom = needsH ? sWidth + "px" : "0" + var totalHeight = measure.viewHeight - (needsH ? sWidth : 0) + // A bug in IE8 can cause this value to be negative, so guard it. + this.vert.firstChild.style.height = + Math.max(0, measure.scrollHeight - measure.clientHeight + totalHeight) + "px" + } else { + this.vert.style.display = "" + this.vert.firstChild.style.height = "0" + } + + if (needsH) { + this.horiz.style.display = "block" + this.horiz.style.right = needsV ? sWidth + "px" : "0" + this.horiz.style.left = measure.barLeft + "px" + var totalWidth = measure.viewWidth - measure.barLeft - (needsV ? 
sWidth : 0) + this.horiz.firstChild.style.width = + Math.max(0, measure.scrollWidth - measure.clientWidth + totalWidth) + "px" + } else { + this.horiz.style.display = "" + this.horiz.firstChild.style.width = "0" + } + + if (!this.checkedZeroWidth && measure.clientHeight > 0) { + if (sWidth == 0) { this.zeroWidthHack() } + this.checkedZeroWidth = true + } + + return {right: needsV ? sWidth : 0, bottom: needsH ? sWidth : 0} +}; + +NativeScrollbars.prototype.setScrollLeft = function (pos) { + if (this.horiz.scrollLeft != pos) { this.horiz.scrollLeft = pos } + if (this.disableHoriz) { this.enableZeroWidthBar(this.horiz, this.disableHoriz) } +}; + +NativeScrollbars.prototype.setScrollTop = function (pos) { + if (this.vert.scrollTop != pos) { this.vert.scrollTop = pos } + if (this.disableVert) { this.enableZeroWidthBar(this.vert, this.disableVert) } +}; + +NativeScrollbars.prototype.zeroWidthHack = function () { + var w = mac && !mac_geMountainLion ? "12px" : "18px" + this.horiz.style.height = this.vert.style.width = w + this.horiz.style.pointerEvents = this.vert.style.pointerEvents = "none" + this.disableHoriz = new Delayed + this.disableVert = new Delayed +}; + +NativeScrollbars.prototype.enableZeroWidthBar = function (bar, delay) { + bar.style.pointerEvents = "auto" + function maybeDisable() { + // To find out whether the scrollbar is still visible, we + // check whether the element under the pixel in the bottom + // left corner of the scrollbar box is the scrollbar box + // itself (when the bar is still visible) or its filler child + // (when the bar is hidden). If it is still visible, we keep + // it enabled, if it's hidden, we disable pointer events. + var box = bar.getBoundingClientRect() + var elt = document.elementFromPoint(box.left + 1, box.bottom - 1) + if (elt != bar) { bar.style.pointerEvents = "none" } + else { delay.set(1000, maybeDisable) } + } + delay.set(1000, maybeDisable) +}; + +NativeScrollbars.prototype.clear = function () { + var parent = this.horiz.parentNode + parent.removeChild(this.horiz) + parent.removeChild(this.vert) +}; + +var NullScrollbars = function () {}; + +NullScrollbars.prototype.update = function () { return {bottom: 0, right: 0} }; +NullScrollbars.prototype.setScrollLeft = function () {}; +NullScrollbars.prototype.setScrollTop = function () {}; +NullScrollbars.prototype.clear = function () {}; + +function updateScrollbars(cm, measure) { + if (!measure) { measure = measureForScrollbars(cm) } + var startWidth = cm.display.barWidth, startHeight = cm.display.barHeight + updateScrollbarsInner(cm, measure) + for (var i = 0; i < 4 && startWidth != cm.display.barWidth || startHeight != cm.display.barHeight; i++) { + if (startWidth != cm.display.barWidth && cm.options.lineWrapping) + { updateHeightsInViewport(cm) } + updateScrollbarsInner(cm, measureForScrollbars(cm)) + startWidth = cm.display.barWidth; startHeight = cm.display.barHeight + } +} + +// Re-synchronize the fake scrollbars with the actual size of the +// content. 
+function updateScrollbarsInner(cm, measure) { + var d = cm.display + var sizes = d.scrollbars.update(measure) + + d.sizer.style.paddingRight = (d.barWidth = sizes.right) + "px" + d.sizer.style.paddingBottom = (d.barHeight = sizes.bottom) + "px" + d.heightForcer.style.borderBottom = sizes.bottom + "px solid transparent" + + if (sizes.right && sizes.bottom) { + d.scrollbarFiller.style.display = "block" + d.scrollbarFiller.style.height = sizes.bottom + "px" + d.scrollbarFiller.style.width = sizes.right + "px" + } else { d.scrollbarFiller.style.display = "" } + if (sizes.bottom && cm.options.coverGutterNextToScrollbar && cm.options.fixedGutter) { + d.gutterFiller.style.display = "block" + d.gutterFiller.style.height = sizes.bottom + "px" + d.gutterFiller.style.width = measure.gutterWidth + "px" + } else { d.gutterFiller.style.display = "" } +} + +var scrollbarModel = {"native": NativeScrollbars, "null": NullScrollbars} + +function initScrollbars(cm) { + if (cm.display.scrollbars) { + cm.display.scrollbars.clear() + if (cm.display.scrollbars.addClass) + { rmClass(cm.display.wrapper, cm.display.scrollbars.addClass) } + } + + cm.display.scrollbars = new scrollbarModel[cm.options.scrollbarStyle](function (node) { + cm.display.wrapper.insertBefore(node, cm.display.scrollbarFiller) + // Prevent clicks in the scrollbars from killing focus + on(node, "mousedown", function () { + if (cm.state.focused) { setTimeout(function () { return cm.display.input.focus(); }, 0) } + }) + node.setAttribute("cm-not-content", "true") + }, function (pos, axis) { + if (axis == "horizontal") { setScrollLeft(cm, pos) } + else { setScrollTop(cm, pos) } + }, cm) + if (cm.display.scrollbars.addClass) + { addClass(cm.display.wrapper, cm.display.scrollbars.addClass) } +} + +// SCROLLING THINGS INTO VIEW + +// If an editor sits on the top or bottom of the window, partially +// scrolled out of view, this ensures that the cursor is visible. +function maybeScrollWindow(cm, coords) { + if (signalDOMEvent(cm, "scrollCursorIntoView")) { return } + + var display = cm.display, box = display.sizer.getBoundingClientRect(), doScroll = null + if (coords.top + box.top < 0) { doScroll = true } + else if (coords.bottom + box.top > (window.innerHeight || document.documentElement.clientHeight)) { doScroll = false } + if (doScroll != null && !phantom) { + var scrollNode = elt("div", "\u200b", null, ("position: absolute;\n top: " + (coords.top - display.viewOffset - paddingTop(cm.display)) + "px;\n height: " + (coords.bottom - coords.top + scrollGap(cm) + display.barHeight) + "px;\n left: " + (coords.left) + "px; width: 2px;")) + cm.display.lineSpace.appendChild(scrollNode) + scrollNode.scrollIntoView(doScroll) + cm.display.lineSpace.removeChild(scrollNode) + } +} + +// Scroll a given position into view (immediately), verifying that +// it actually became visible (as line heights are accurately +// measured, the position of something may 'drift' during drawing). +function scrollPosIntoView(cm, pos, end, margin) { + if (margin == null) { margin = 0 } + var coords + for (var limit = 0; limit < 5; limit++) { + var changed = false + coords = cursorCoords(cm, pos) + var endCoords = !end || end == pos ? 
coords : cursorCoords(cm, end) + var scrollPos = calculateScrollPos(cm, Math.min(coords.left, endCoords.left), + Math.min(coords.top, endCoords.top) - margin, + Math.max(coords.left, endCoords.left), + Math.max(coords.bottom, endCoords.bottom) + margin) + var startTop = cm.doc.scrollTop, startLeft = cm.doc.scrollLeft + if (scrollPos.scrollTop != null) { + setScrollTop(cm, scrollPos.scrollTop) + if (Math.abs(cm.doc.scrollTop - startTop) > 1) { changed = true } + } + if (scrollPos.scrollLeft != null) { + setScrollLeft(cm, scrollPos.scrollLeft) + if (Math.abs(cm.doc.scrollLeft - startLeft) > 1) { changed = true } + } + if (!changed) { break } + } + return coords +} + +// Scroll a given set of coordinates into view (immediately). +function scrollIntoView(cm, x1, y1, x2, y2) { + var scrollPos = calculateScrollPos(cm, x1, y1, x2, y2) + if (scrollPos.scrollTop != null) { setScrollTop(cm, scrollPos.scrollTop) } + if (scrollPos.scrollLeft != null) { setScrollLeft(cm, scrollPos.scrollLeft) } +} + +// Calculate a new scroll position needed to scroll the given +// rectangle into view. Returns an object with scrollTop and +// scrollLeft properties. When these are undefined, the +// vertical/horizontal position does not need to be adjusted. +function calculateScrollPos(cm, x1, y1, x2, y2) { + var display = cm.display, snapMargin = textHeight(cm.display) + if (y1 < 0) { y1 = 0 } + var screentop = cm.curOp && cm.curOp.scrollTop != null ? cm.curOp.scrollTop : display.scroller.scrollTop + var screen = displayHeight(cm), result = {} + if (y2 - y1 > screen) { y2 = y1 + screen } + var docBottom = cm.doc.height + paddingVert(display) + var atTop = y1 < snapMargin, atBottom = y2 > docBottom - snapMargin + if (y1 < screentop) { + result.scrollTop = atTop ? 0 : y1 + } else if (y2 > screentop + screen) { + var newTop = Math.min(y1, (atBottom ? docBottom : y2) - screen) + if (newTop != screentop) { result.scrollTop = newTop } + } + + var screenleft = cm.curOp && cm.curOp.scrollLeft != null ? cm.curOp.scrollLeft : display.scroller.scrollLeft + var screenw = displayWidth(cm) - (cm.options.fixedGutter ? display.gutters.offsetWidth : 0) + var tooWide = x2 - x1 > screenw + if (tooWide) { x2 = x1 + screenw } + if (x1 < 10) + { result.scrollLeft = 0 } + else if (x1 < screenleft) + { result.scrollLeft = Math.max(0, x1 - (tooWide ? 0 : 10)) } + else if (x2 > screenw + screenleft - 3) + { result.scrollLeft = x2 + (tooWide ? 0 : 10) - screenw } + return result +} + +// Store a relative adjustment to the scroll position in the current +// operation (to be applied when the operation finishes). +function addToScrollPos(cm, left, top) { + if (left != null || top != null) { resolveScrollToPos(cm) } + if (left != null) + { cm.curOp.scrollLeft = (cm.curOp.scrollLeft == null ? cm.doc.scrollLeft : cm.curOp.scrollLeft) + left } + if (top != null) + { cm.curOp.scrollTop = (cm.curOp.scrollTop == null ? cm.doc.scrollTop : cm.curOp.scrollTop) + top } +} + +// Make sure that at the end of the operation the current cursor is +// shown. +function ensureCursorVisible(cm) { + resolveScrollToPos(cm) + var cur = cm.getCursor(), from = cur, to = cur + if (!cm.options.lineWrapping) { + from = cur.ch ? 
Pos(cur.line, cur.ch - 1) : cur + to = Pos(cur.line, cur.ch + 1) + } + cm.curOp.scrollToPos = {from: from, to: to, margin: cm.options.cursorScrollMargin, isCursor: true} +} + +// When an operation has its scrollToPos property set, and another +// scroll action is applied before the end of the operation, this +// 'simulates' scrolling that position into view in a cheap way, so +// that the effect of intermediate scroll commands is not ignored. +function resolveScrollToPos(cm) { + var range = cm.curOp.scrollToPos + if (range) { + cm.curOp.scrollToPos = null + var from = estimateCoords(cm, range.from), to = estimateCoords(cm, range.to) + var sPos = calculateScrollPos(cm, Math.min(from.left, to.left), + Math.min(from.top, to.top) - range.margin, + Math.max(from.right, to.right), + Math.max(from.bottom, to.bottom) + range.margin) + cm.scrollTo(sPos.scrollLeft, sPos.scrollTop) + } +} + +// Operations are used to wrap a series of changes to the editor +// state in such a way that each change won't have to update the +// cursor and display (which would be awkward, slow, and +// error-prone). Instead, display updates are batched and then all +// combined and executed at once. + +var nextOpId = 0 +// Start a new operation. +function startOperation(cm) { + cm.curOp = { + cm: cm, + viewChanged: false, // Flag that indicates that lines might need to be redrawn + startHeight: cm.doc.height, // Used to detect need to update scrollbar + forceUpdate: false, // Used to force a redraw + updateInput: null, // Whether to reset the input textarea + typing: false, // Whether this reset should be careful to leave existing text (for compositing) + changeObjs: null, // Accumulated changes, for firing change events + cursorActivityHandlers: null, // Set of handlers to fire cursorActivity on + cursorActivityCalled: 0, // Tracks which cursorActivity handlers have been called already + selectionChanged: false, // Whether the selection needs to be redrawn + updateMaxLine: false, // Set when the widest line needs to be determined anew + scrollLeft: null, scrollTop: null, // Intermediate scroll position, not pushed to DOM yet + scrollToPos: null, // Used to scroll to a specific position + focus: false, + id: ++nextOpId // Unique ID + } + pushOperation(cm.curOp) +} + +// Finish an operation, updating the display and signalling delayed events +function endOperation(cm) { + var op = cm.curOp + finishOperation(op, function (group) { + for (var i = 0; i < group.ops.length; i++) + { group.ops[i].cm.curOp = null } + endOperations(group) + }) +} + +// The DOM updates done when an operation finishes are batched so +// that the minimum number of relayouts are required. 
+function endOperations(group) { + var ops = group.ops + for (var i = 0; i < ops.length; i++) // Read DOM + { endOperation_R1(ops[i]) } + for (var i$1 = 0; i$1 < ops.length; i$1++) // Write DOM (maybe) + { endOperation_W1(ops[i$1]) } + for (var i$2 = 0; i$2 < ops.length; i$2++) // Read DOM + { endOperation_R2(ops[i$2]) } + for (var i$3 = 0; i$3 < ops.length; i$3++) // Write DOM (maybe) + { endOperation_W2(ops[i$3]) } + for (var i$4 = 0; i$4 < ops.length; i$4++) // Read DOM + { endOperation_finish(ops[i$4]) } +} + +function endOperation_R1(op) { + var cm = op.cm, display = cm.display + maybeClipScrollbars(cm) + if (op.updateMaxLine) { findMaxLine(cm) } + + op.mustUpdate = op.viewChanged || op.forceUpdate || op.scrollTop != null || + op.scrollToPos && (op.scrollToPos.from.line < display.viewFrom || + op.scrollToPos.to.line >= display.viewTo) || + display.maxLineChanged && cm.options.lineWrapping + op.update = op.mustUpdate && + new DisplayUpdate(cm, op.mustUpdate && {top: op.scrollTop, ensure: op.scrollToPos}, op.forceUpdate) +} + +function endOperation_W1(op) { + op.updatedDisplay = op.mustUpdate && updateDisplayIfNeeded(op.cm, op.update) +} + +function endOperation_R2(op) { + var cm = op.cm, display = cm.display + if (op.updatedDisplay) { updateHeightsInViewport(cm) } + + op.barMeasure = measureForScrollbars(cm) + + // If the max line changed since it was last measured, measure it, + // and ensure the document's width matches it. + // updateDisplay_W2 will use these properties to do the actual resizing + if (display.maxLineChanged && !cm.options.lineWrapping) { + op.adjustWidthTo = measureChar(cm, display.maxLine, display.maxLine.text.length).left + 3 + cm.display.sizerWidth = op.adjustWidthTo + op.barMeasure.scrollWidth = + Math.max(display.scroller.clientWidth, display.sizer.offsetLeft + op.adjustWidthTo + scrollGap(cm) + cm.display.barWidth) + op.maxScrollLeft = Math.max(0, display.sizer.offsetLeft + op.adjustWidthTo - displayWidth(cm)) + } + + if (op.updatedDisplay || op.selectionChanged) + { op.preparedSelection = display.input.prepareSelection(op.focus) } +} + +function endOperation_W2(op) { + var cm = op.cm + + if (op.adjustWidthTo != null) { + cm.display.sizer.style.minWidth = op.adjustWidthTo + "px" + if (op.maxScrollLeft < cm.doc.scrollLeft) + { setScrollLeft(cm, Math.min(cm.display.scroller.scrollLeft, op.maxScrollLeft), true) } + cm.display.maxLineChanged = false + } + + var takeFocus = op.focus && op.focus == activeElt() && (!document.hasFocus || document.hasFocus()) + if (op.preparedSelection) + { cm.display.input.showSelection(op.preparedSelection, takeFocus) } + if (op.updatedDisplay || op.startHeight != cm.doc.height) + { updateScrollbars(cm, op.barMeasure) } + if (op.updatedDisplay) + { setDocumentHeight(cm, op.barMeasure) } + + if (op.selectionChanged) { restartBlink(cm) } + + if (cm.state.focused && op.updateInput) + { cm.display.input.reset(op.typing) } + if (takeFocus) { ensureFocus(op.cm) } +} + +function endOperation_finish(op) { + var cm = op.cm, display = cm.display, doc = cm.doc + + if (op.updatedDisplay) { postUpdateDisplay(cm, op.update) } + + // Abort mouse wheel delta measurement, when scrolling explicitly + if (display.wheelStartX != null && (op.scrollTop != null || op.scrollLeft != null || op.scrollToPos)) + { display.wheelStartX = display.wheelStartY = null } + + // Propagate the scroll position to the actual DOM scroller + if (op.scrollTop != null && (display.scroller.scrollTop != op.scrollTop || op.forceScroll)) { + doc.scrollTop = Math.max(0, 
Math.min(display.scroller.scrollHeight - display.scroller.clientHeight, op.scrollTop)) + display.scrollbars.setScrollTop(doc.scrollTop) + display.scroller.scrollTop = doc.scrollTop + } + if (op.scrollLeft != null && (display.scroller.scrollLeft != op.scrollLeft || op.forceScroll)) { + doc.scrollLeft = Math.max(0, Math.min(display.scroller.scrollWidth - display.scroller.clientWidth, op.scrollLeft)) + display.scrollbars.setScrollLeft(doc.scrollLeft) + display.scroller.scrollLeft = doc.scrollLeft + alignHorizontally(cm) + } + // If we need to scroll a specific position into view, do so. + if (op.scrollToPos) { + var coords = scrollPosIntoView(cm, clipPos(doc, op.scrollToPos.from), + clipPos(doc, op.scrollToPos.to), op.scrollToPos.margin) + if (op.scrollToPos.isCursor && cm.state.focused) { maybeScrollWindow(cm, coords) } + } + + // Fire events for markers that are hidden/unidden by editing or + // undoing + var hidden = op.maybeHiddenMarkers, unhidden = op.maybeUnhiddenMarkers + if (hidden) { for (var i = 0; i < hidden.length; ++i) + { if (!hidden[i].lines.length) { signal(hidden[i], "hide") } } } + if (unhidden) { for (var i$1 = 0; i$1 < unhidden.length; ++i$1) + { if (unhidden[i$1].lines.length) { signal(unhidden[i$1], "unhide") } } } + + if (display.wrapper.offsetHeight) + { doc.scrollTop = cm.display.scroller.scrollTop } + + // Fire change events, and delayed event handlers + if (op.changeObjs) + { signal(cm, "changes", cm, op.changeObjs) } + if (op.update) + { op.update.finish() } +} + +// Run the given function in an operation +function runInOp(cm, f) { + if (cm.curOp) { return f() } + startOperation(cm) + try { return f() } + finally { endOperation(cm) } +} +// Wraps a function in an operation. Returns the wrapped function. +function operation(cm, f) { + return function() { + if (cm.curOp) { return f.apply(cm, arguments) } + startOperation(cm) + try { return f.apply(cm, arguments) } + finally { endOperation(cm) } + } +} +// Used to add methods to editor and doc instances, wrapping them in +// operations. +function methodOp(f) { + return function() { + if (this.curOp) { return f.apply(this, arguments) } + startOperation(this) + try { return f.apply(this, arguments) } + finally { endOperation(this) } + } +} +function docMethodOp(f) { + return function() { + var cm = this.cm + if (!cm || cm.curOp) { return f.apply(this, arguments) } + startOperation(cm) + try { return f.apply(this, arguments) } + finally { endOperation(cm) } + } +} + +// Updates the display.view data structure for a given change to the +// document. From and to are in pre-change coordinates. Lendiff is +// the amount of lines added or subtracted by the change. This is +// used for changes that span multiple lines, or change the way +// lines are divided into visual lines. regLineChange (below) +// registers single-line changes. 
+function regChange(cm, from, to, lendiff) { + if (from == null) { from = cm.doc.first } + if (to == null) { to = cm.doc.first + cm.doc.size } + if (!lendiff) { lendiff = 0 } + + var display = cm.display + if (lendiff && to < display.viewTo && + (display.updateLineNumbers == null || display.updateLineNumbers > from)) + { display.updateLineNumbers = from } + + cm.curOp.viewChanged = true + + if (from >= display.viewTo) { // Change after + if (sawCollapsedSpans && visualLineNo(cm.doc, from) < display.viewTo) + { resetView(cm) } + } else if (to <= display.viewFrom) { // Change before + if (sawCollapsedSpans && visualLineEndNo(cm.doc, to + lendiff) > display.viewFrom) { + resetView(cm) + } else { + display.viewFrom += lendiff + display.viewTo += lendiff + } + } else if (from <= display.viewFrom && to >= display.viewTo) { // Full overlap + resetView(cm) + } else if (from <= display.viewFrom) { // Top overlap + var cut = viewCuttingPoint(cm, to, to + lendiff, 1) + if (cut) { + display.view = display.view.slice(cut.index) + display.viewFrom = cut.lineN + display.viewTo += lendiff + } else { + resetView(cm) + } + } else if (to >= display.viewTo) { // Bottom overlap + var cut$1 = viewCuttingPoint(cm, from, from, -1) + if (cut$1) { + display.view = display.view.slice(0, cut$1.index) + display.viewTo = cut$1.lineN + } else { + resetView(cm) + } + } else { // Gap in the middle + var cutTop = viewCuttingPoint(cm, from, from, -1) + var cutBot = viewCuttingPoint(cm, to, to + lendiff, 1) + if (cutTop && cutBot) { + display.view = display.view.slice(0, cutTop.index) + .concat(buildViewArray(cm, cutTop.lineN, cutBot.lineN)) + .concat(display.view.slice(cutBot.index)) + display.viewTo += lendiff + } else { + resetView(cm) + } + } + + var ext = display.externalMeasured + if (ext) { + if (to < ext.lineN) + { ext.lineN += lendiff } + else if (from < ext.lineN + ext.size) + { display.externalMeasured = null } + } +} + +// Register a change to a single line. Type must be one of "text", +// "gutter", "class", "widget" +function regLineChange(cm, line, type) { + cm.curOp.viewChanged = true + var display = cm.display, ext = cm.display.externalMeasured + if (ext && line >= ext.lineN && line < ext.lineN + ext.size) + { display.externalMeasured = null } + + if (line < display.viewFrom || line >= display.viewTo) { return } + var lineView = display.view[findViewIndex(cm, line)] + if (lineView.node == null) { return } + var arr = lineView.changes || (lineView.changes = []) + if (indexOf(arr, type) == -1) { arr.push(type) } +} + +// Clear the view. +function resetView(cm) { + cm.display.viewFrom = cm.display.viewTo = cm.doc.first + cm.display.view = [] + cm.display.viewOffset = 0 +} + +function viewCuttingPoint(cm, oldN, newN, dir) { + var index = findViewIndex(cm, oldN), diff, view = cm.display.view + if (!sawCollapsedSpans || newN == cm.doc.first + cm.doc.size) + { return {index: index, lineN: newN} } + var n = cm.display.viewFrom + for (var i = 0; i < index; i++) + { n += view[i].size } + if (n != oldN) { + if (dir > 0) { + if (index == view.length - 1) { return null } + diff = (n + view[index].size) - oldN + index++ + } else { + diff = n - oldN + } + oldN += diff; newN += diff + } + while (visualLineNo(cm.doc, newN) != newN) { + if (index == (dir < 0 ? 0 : view.length - 1)) { return null } + newN += dir * view[index - (dir < 0 ? 1 : 0)].size + index += dir + } + return {index: index, lineN: newN} +} + +// Force the view to cover a given range, adding empty view element +// or clipping off existing ones as needed. 
+function adjustView(cm, from, to) { + var display = cm.display, view = display.view + if (view.length == 0 || from >= display.viewTo || to <= display.viewFrom) { + display.view = buildViewArray(cm, from, to) + display.viewFrom = from + } else { + if (display.viewFrom > from) + { display.view = buildViewArray(cm, from, display.viewFrom).concat(display.view) } + else if (display.viewFrom < from) + { display.view = display.view.slice(findViewIndex(cm, from)) } + display.viewFrom = from + if (display.viewTo < to) + { display.view = display.view.concat(buildViewArray(cm, display.viewTo, to)) } + else if (display.viewTo > to) + { display.view = display.view.slice(0, findViewIndex(cm, to)) } + } + display.viewTo = to +} + +// Count the number of lines in the view whose DOM representation is +// out of date (or nonexistent). +function countDirtyView(cm) { + var view = cm.display.view, dirty = 0 + for (var i = 0; i < view.length; i++) { + var lineView = view[i] + if (!lineView.hidden && (!lineView.node || lineView.changes)) { ++dirty } + } + return dirty +} + +// HIGHLIGHT WORKER + +function startWorker(cm, time) { + if (cm.doc.mode.startState && cm.doc.frontier < cm.display.viewTo) + { cm.state.highlight.set(time, bind(highlightWorker, cm)) } +} + +function highlightWorker(cm) { + var doc = cm.doc + if (doc.frontier < doc.first) { doc.frontier = doc.first } + if (doc.frontier >= cm.display.viewTo) { return } + var end = +new Date + cm.options.workTime + var state = copyState(doc.mode, getStateBefore(cm, doc.frontier)) + var changedLines = [] + + doc.iter(doc.frontier, Math.min(doc.first + doc.size, cm.display.viewTo + 500), function (line) { + if (doc.frontier >= cm.display.viewFrom) { // Visible + var oldStyles = line.styles, tooLong = line.text.length > cm.options.maxHighlightLength + var highlighted = highlightLine(cm, line, tooLong ? copyState(doc.mode, state) : state, true) + line.styles = highlighted.styles + var oldCls = line.styleClasses, newCls = highlighted.classes + if (newCls) { line.styleClasses = newCls } + else if (oldCls) { line.styleClasses = null } + var ischange = !oldStyles || oldStyles.length != line.styles.length || + oldCls != newCls && (!oldCls || !newCls || oldCls.bgClass != newCls.bgClass || oldCls.textClass != newCls.textClass) + for (var i = 0; !ischange && i < oldStyles.length; ++i) { ischange = oldStyles[i] != line.styles[i] } + if (ischange) { changedLines.push(doc.frontier) } + line.stateAfter = tooLong ? state : copyState(doc.mode, state) + } else { + if (line.text.length <= cm.options.maxHighlightLength) + { processLine(cm, line.text, state) } + line.stateAfter = doc.frontier % 5 == 0 ? 
copyState(doc.mode, state) : null + } + ++doc.frontier + if (+new Date > end) { + startWorker(cm, cm.options.workDelay) + return true + } + }) + if (changedLines.length) { runInOp(cm, function () { + for (var i = 0; i < changedLines.length; i++) + { regLineChange(cm, changedLines[i], "text") } + }) } +} + +// DISPLAY DRAWING + +var DisplayUpdate = function(cm, viewport, force) { + var display = cm.display + + this.viewport = viewport + // Store some values that we'll need later (but don't want to force a relayout for) + this.visible = visibleLines(display, cm.doc, viewport) + this.editorIsHidden = !display.wrapper.offsetWidth + this.wrapperHeight = display.wrapper.clientHeight + this.wrapperWidth = display.wrapper.clientWidth + this.oldDisplayWidth = displayWidth(cm) + this.force = force + this.dims = getDimensions(cm) + this.events = [] +}; + +DisplayUpdate.prototype.signal = function (emitter, type) { + if (hasHandler(emitter, type)) + { this.events.push(arguments) } +}; +DisplayUpdate.prototype.finish = function () { + var this$1 = this; + + for (var i = 0; i < this.events.length; i++) + { signal.apply(null, this$1.events[i]) } +}; + +function maybeClipScrollbars(cm) { + var display = cm.display + if (!display.scrollbarsClipped && display.scroller.offsetWidth) { + display.nativeBarWidth = display.scroller.offsetWidth - display.scroller.clientWidth + display.heightForcer.style.height = scrollGap(cm) + "px" + display.sizer.style.marginBottom = -display.nativeBarWidth + "px" + display.sizer.style.borderRightWidth = scrollGap(cm) + "px" + display.scrollbarsClipped = true + } +} + +// Does the actual updating of the line display. Bails out +// (returning false) when there is nothing to be done and forced is +// false. +function updateDisplayIfNeeded(cm, update) { + var display = cm.display, doc = cm.doc + + if (update.editorIsHidden) { + resetView(cm) + return false + } + + // Bail out if the visible area is already rendered and nothing changed. 
+ if (!update.force && + update.visible.from >= display.viewFrom && update.visible.to <= display.viewTo && + (display.updateLineNumbers == null || display.updateLineNumbers >= display.viewTo) && + display.renderedView == display.view && countDirtyView(cm) == 0) + { return false } + + if (maybeUpdateLineNumberWidth(cm)) { + resetView(cm) + update.dims = getDimensions(cm) + } + + // Compute a suitable new viewport (from & to) + var end = doc.first + doc.size + var from = Math.max(update.visible.from - cm.options.viewportMargin, doc.first) + var to = Math.min(end, update.visible.to + cm.options.viewportMargin) + if (display.viewFrom < from && from - display.viewFrom < 20) { from = Math.max(doc.first, display.viewFrom) } + if (display.viewTo > to && display.viewTo - to < 20) { to = Math.min(end, display.viewTo) } + if (sawCollapsedSpans) { + from = visualLineNo(cm.doc, from) + to = visualLineEndNo(cm.doc, to) + } + + var different = from != display.viewFrom || to != display.viewTo || + display.lastWrapHeight != update.wrapperHeight || display.lastWrapWidth != update.wrapperWidth + adjustView(cm, from, to) + + display.viewOffset = heightAtLine(getLine(cm.doc, display.viewFrom)) + // Position the mover div to align with the current scroll position + cm.display.mover.style.top = display.viewOffset + "px" + + var toUpdate = countDirtyView(cm) + if (!different && toUpdate == 0 && !update.force && display.renderedView == display.view && + (display.updateLineNumbers == null || display.updateLineNumbers >= display.viewTo)) + { return false } + + // For big changes, we hide the enclosing element during the + // update, since that speeds up the operations on most browsers. + var focused = activeElt() + if (toUpdate > 4) { display.lineDiv.style.display = "none" } + patchDisplay(cm, display.updateLineNumbers, update.dims) + if (toUpdate > 4) { display.lineDiv.style.display = "" } + display.renderedView = display.view + // There might have been a widget with a focused element that got + // hidden or updated, if so re-focus it. + if (focused && activeElt() != focused && focused.offsetHeight) { focused.focus() } + + // Prevent selection and cursors from interfering with the scroll + // width and height. + removeChildren(display.cursorDiv) + removeChildren(display.selectionDiv) + display.gutters.style.height = display.sizer.style.minHeight = 0 + + if (different) { + display.lastWrapHeight = update.wrapperHeight + display.lastWrapWidth = update.wrapperWidth + startWorker(cm, 400) + } + + display.updateLineNumbers = null + + return true +} + +function postUpdateDisplay(cm, update) { + var viewport = update.viewport + + for (var first = true;; first = false) { + if (!first || !cm.options.lineWrapping || update.oldDisplayWidth == displayWidth(cm)) { + // Clip forced viewport to actual scrollable area. + if (viewport && viewport.top != null) + { viewport = {top: Math.min(cm.doc.height + paddingVert(cm.display) - displayHeight(cm), viewport.top)} } + // Updated line heights might result in the drawn area not + // actually covering the viewport. Keep looping until it does. 
+ update.visible = visibleLines(cm.display, cm.doc, viewport) + if (update.visible.from >= cm.display.viewFrom && update.visible.to <= cm.display.viewTo) + { break } + } + if (!updateDisplayIfNeeded(cm, update)) { break } + updateHeightsInViewport(cm) + var barMeasure = measureForScrollbars(cm) + updateSelection(cm) + updateScrollbars(cm, barMeasure) + setDocumentHeight(cm, barMeasure) + } + + update.signal(cm, "update", cm) + if (cm.display.viewFrom != cm.display.reportedViewFrom || cm.display.viewTo != cm.display.reportedViewTo) { + update.signal(cm, "viewportChange", cm, cm.display.viewFrom, cm.display.viewTo) + cm.display.reportedViewFrom = cm.display.viewFrom; cm.display.reportedViewTo = cm.display.viewTo + } +} + +function updateDisplaySimple(cm, viewport) { + var update = new DisplayUpdate(cm, viewport) + if (updateDisplayIfNeeded(cm, update)) { + updateHeightsInViewport(cm) + postUpdateDisplay(cm, update) + var barMeasure = measureForScrollbars(cm) + updateSelection(cm) + updateScrollbars(cm, barMeasure) + setDocumentHeight(cm, barMeasure) + update.finish() + } +} + +// Sync the actual display DOM structure with display.view, removing +// nodes for lines that are no longer in view, and creating the ones +// that are not there yet, and updating the ones that are out of +// date. +function patchDisplay(cm, updateNumbersFrom, dims) { + var display = cm.display, lineNumbers = cm.options.lineNumbers + var container = display.lineDiv, cur = container.firstChild + + function rm(node) { + var next = node.nextSibling + // Works around a throw-scroll bug in OS X Webkit + if (webkit && mac && cm.display.currentWheelTarget == node) + { node.style.display = "none" } + else + { node.parentNode.removeChild(node) } + return next + } + + var view = display.view, lineN = display.viewFrom + // Loop over the elements in the view, syncing cur (the DOM nodes + // in display.lineDiv) with the view as we go. + for (var i = 0; i < view.length; i++) { + var lineView = view[i] + if (lineView.hidden) { + } else if (!lineView.node || lineView.node.parentNode != container) { // Not drawn yet + var node = buildLineElement(cm, lineView, lineN, dims) + container.insertBefore(node, cur) + } else { // Already drawn + while (cur != lineView.node) { cur = rm(cur) } + var updateNumber = lineNumbers && updateNumbersFrom != null && + updateNumbersFrom <= lineN && lineView.lineNumber + if (lineView.changes) { + if (indexOf(lineView.changes, "gutter") > -1) { updateNumber = false } + updateLineForChanges(cm, lineView, lineN, dims) + } + if (updateNumber) { + removeChildren(lineView.lineNumber) + lineView.lineNumber.appendChild(document.createTextNode(lineNumberFor(cm.options, lineN))) + } + cur = lineView.node.nextSibling + } + lineN += lineView.size + } + while (cur) { cur = rm(cur) } +} + +function updateGutterSpace(cm) { + var width = cm.display.gutters.offsetWidth + cm.display.sizer.style.marginLeft = width + "px" +} + +function setDocumentHeight(cm, measure) { + cm.display.sizer.style.minHeight = measure.docHeight + "px" + cm.display.heightForcer.style.top = measure.docHeight + "px" + cm.display.gutters.style.height = (measure.docHeight + cm.display.barHeight + scrollGap(cm)) + "px" +} + +// Rebuild the gutter elements, ensure the margin to the left of the +// code matches their width. 
+function updateGutters(cm) { + var gutters = cm.display.gutters, specs = cm.options.gutters + removeChildren(gutters) + var i = 0 + for (; i < specs.length; ++i) { + var gutterClass = specs[i] + var gElt = gutters.appendChild(elt("div", null, "CodeMirror-gutter " + gutterClass)) + if (gutterClass == "CodeMirror-linenumbers") { + cm.display.lineGutter = gElt + gElt.style.width = (cm.display.lineNumWidth || 1) + "px" + } + } + gutters.style.display = i ? "" : "none" + updateGutterSpace(cm) +} + +// Make sure the gutters options contains the element +// "CodeMirror-linenumbers" when the lineNumbers option is true. +function setGuttersForLineNumbers(options) { + var found = indexOf(options.gutters, "CodeMirror-linenumbers") + if (found == -1 && options.lineNumbers) { + options.gutters = options.gutters.concat(["CodeMirror-linenumbers"]) + } else if (found > -1 && !options.lineNumbers) { + options.gutters = options.gutters.slice(0) + options.gutters.splice(found, 1) + } +} + +// Selection objects are immutable. A new one is created every time +// the selection changes. A selection is one or more non-overlapping +// (and non-touching) ranges, sorted, and an integer that indicates +// which one is the primary selection (the one that's scrolled into +// view, that getCursor returns, etc). +var Selection = function(ranges, primIndex) { + this.ranges = ranges + this.primIndex = primIndex +}; + +Selection.prototype.primary = function () { return this.ranges[this.primIndex] }; + +Selection.prototype.equals = function (other) { + var this$1 = this; + + if (other == this) { return true } + if (other.primIndex != this.primIndex || other.ranges.length != this.ranges.length) { return false } + for (var i = 0; i < this.ranges.length; i++) { + var here = this$1.ranges[i], there = other.ranges[i] + if (!equalCursorPos(here.anchor, there.anchor) || !equalCursorPos(here.head, there.head)) { return false } + } + return true +}; + +Selection.prototype.deepCopy = function () { + var this$1 = this; + + var out = [] + for (var i = 0; i < this.ranges.length; i++) + { out[i] = new Range(copyPos(this$1.ranges[i].anchor), copyPos(this$1.ranges[i].head)) } + return new Selection(out, this.primIndex) +}; + +Selection.prototype.somethingSelected = function () { + var this$1 = this; + + for (var i = 0; i < this.ranges.length; i++) + { if (!this$1.ranges[i].empty()) { return true } } + return false +}; + +Selection.prototype.contains = function (pos, end) { + var this$1 = this; + + if (!end) { end = pos } + for (var i = 0; i < this.ranges.length; i++) { + var range = this$1.ranges[i] + if (cmp(end, range.from()) >= 0 && cmp(pos, range.to()) <= 0) + { return i } + } + return -1 +}; + +var Range = function(anchor, head) { + this.anchor = anchor; this.head = head +}; + +Range.prototype.from = function () { return minPos(this.anchor, this.head) }; +Range.prototype.to = function () { return maxPos(this.anchor, this.head) }; +Range.prototype.empty = function () { return this.head.line == this.anchor.line && this.head.ch == this.anchor.ch }; + +// Take an unsorted, potentially overlapping set of ranges, and +// build a selection out of it. 'Consumes' ranges array (modifying +// it). 
+function normalizeSelection(ranges, primIndex) { + var prim = ranges[primIndex] + ranges.sort(function (a, b) { return cmp(a.from(), b.from()); }) + primIndex = indexOf(ranges, prim) + for (var i = 1; i < ranges.length; i++) { + var cur = ranges[i], prev = ranges[i - 1] + if (cmp(prev.to(), cur.from()) >= 0) { + var from = minPos(prev.from(), cur.from()), to = maxPos(prev.to(), cur.to()) + var inv = prev.empty() ? cur.from() == cur.head : prev.from() == prev.head + if (i <= primIndex) { --primIndex } + ranges.splice(--i, 2, new Range(inv ? to : from, inv ? from : to)) + } + } + return new Selection(ranges, primIndex) +} + +function simpleSelection(anchor, head) { + return new Selection([new Range(anchor, head || anchor)], 0) +} + +// Compute the position of the end of a change (its 'to' property +// refers to the pre-change end). +function changeEnd(change) { + if (!change.text) { return change.to } + return Pos(change.from.line + change.text.length - 1, + lst(change.text).length + (change.text.length == 1 ? change.from.ch : 0)) +} + +// Adjust a position to refer to the post-change position of the +// same text, or the end of the change if the change covers it. +function adjustForChange(pos, change) { + if (cmp(pos, change.from) < 0) { return pos } + if (cmp(pos, change.to) <= 0) { return changeEnd(change) } + + var line = pos.line + change.text.length - (change.to.line - change.from.line) - 1, ch = pos.ch + if (pos.line == change.to.line) { ch += changeEnd(change).ch - change.to.ch } + return Pos(line, ch) +} + +function computeSelAfterChange(doc, change) { + var out = [] + for (var i = 0; i < doc.sel.ranges.length; i++) { + var range = doc.sel.ranges[i] + out.push(new Range(adjustForChange(range.anchor, change), + adjustForChange(range.head, change))) + } + return normalizeSelection(out, doc.sel.primIndex) +} + +function offsetPos(pos, old, nw) { + if (pos.line == old.line) + { return Pos(nw.line, pos.ch - old.ch + nw.ch) } + else + { return Pos(nw.line + (pos.line - old.line), pos.ch) } +} + +// Used by replaceSelections to allow moving the selection to the +// start or around the replaced test. Hint may be "start" or "around". +function computeReplacedSel(doc, changes, hint) { + var out = [] + var oldPrev = Pos(doc.first, 0), newPrev = oldPrev + for (var i = 0; i < changes.length; i++) { + var change = changes[i] + var from = offsetPos(change.from, oldPrev, newPrev) + var to = offsetPos(changeEnd(change), oldPrev, newPrev) + oldPrev = change.to + newPrev = to + if (hint == "around") { + var range = doc.sel.ranges[i], inv = cmp(range.head, range.anchor) < 0 + out[i] = new Range(inv ? to : from, inv ? from : to) + } else { + out[i] = new Range(from, from) + } + } + return new Selection(out, doc.sel.primIndex) +} + +// Used to get the editor into a consistent state again when options change. + +function loadMode(cm) { + cm.doc.mode = getMode(cm.options, cm.doc.modeOption) + resetModeState(cm) +} + +function resetModeState(cm) { + cm.doc.iter(function (line) { + if (line.stateAfter) { line.stateAfter = null } + if (line.styles) { line.styles = null } + }) + cm.doc.frontier = cm.doc.first + startWorker(cm, 100) + cm.state.modeGen++ + if (cm.curOp) { regChange(cm) } +} + +// DOCUMENT DATA STRUCTURE + +// By default, updates that start and end at the beginning of a line +// are treated specially, in order to make the association of line +// widgets and marker elements with the text behave more intuitive. 
+function isWholeLineUpdate(doc, change) { + return change.from.ch == 0 && change.to.ch == 0 && lst(change.text) == "" && + (!doc.cm || doc.cm.options.wholeLineUpdateBefore) +} + +// Perform a change on the document data structure. +function updateDoc(doc, change, markedSpans, estimateHeight) { + function spansFor(n) {return markedSpans ? markedSpans[n] : null} + function update(line, text, spans) { + updateLine(line, text, spans, estimateHeight) + signalLater(line, "change", line, change) + } + function linesFor(start, end) { + var result = [] + for (var i = start; i < end; ++i) + { result.push(new Line(text[i], spansFor(i), estimateHeight)) } + return result + } + + var from = change.from, to = change.to, text = change.text + var firstLine = getLine(doc, from.line), lastLine = getLine(doc, to.line) + var lastText = lst(text), lastSpans = spansFor(text.length - 1), nlines = to.line - from.line + + // Adjust the line structure + if (change.full) { + doc.insert(0, linesFor(0, text.length)) + doc.remove(text.length, doc.size - text.length) + } else if (isWholeLineUpdate(doc, change)) { + // This is a whole-line replace. Treated specially to make + // sure line objects move the way they are supposed to. + var added = linesFor(0, text.length - 1) + update(lastLine, lastLine.text, lastSpans) + if (nlines) { doc.remove(from.line, nlines) } + if (added.length) { doc.insert(from.line, added) } + } else if (firstLine == lastLine) { + if (text.length == 1) { + update(firstLine, firstLine.text.slice(0, from.ch) + lastText + firstLine.text.slice(to.ch), lastSpans) + } else { + var added$1 = linesFor(1, text.length - 1) + added$1.push(new Line(lastText + firstLine.text.slice(to.ch), lastSpans, estimateHeight)) + update(firstLine, firstLine.text.slice(0, from.ch) + text[0], spansFor(0)) + doc.insert(from.line + 1, added$1) + } + } else if (text.length == 1) { + update(firstLine, firstLine.text.slice(0, from.ch) + text[0] + lastLine.text.slice(to.ch), spansFor(0)) + doc.remove(from.line + 1, nlines) + } else { + update(firstLine, firstLine.text.slice(0, from.ch) + text[0], spansFor(0)) + update(lastLine, lastText + lastLine.text.slice(to.ch), lastSpans) + var added$2 = linesFor(1, text.length - 1) + if (nlines > 1) { doc.remove(from.line + 1, nlines - 1) } + doc.insert(from.line + 1, added$2) + } + + signalLater(doc, "change", doc, change) +} + +// Call f for all linked documents. +function linkedDocs(doc, f, sharedHistOnly) { + function propagate(doc, skip, sharedHist) { + if (doc.linked) { for (var i = 0; i < doc.linked.length; ++i) { + var rel = doc.linked[i] + if (rel.doc == skip) { continue } + var shared = sharedHist && rel.sharedHist + if (sharedHistOnly && !shared) { continue } + f(rel.doc, shared) + propagate(rel.doc, doc, shared) + } } + } + propagate(doc, null, true) +} + +// Attach a document to an editor. +function attachDoc(cm, doc) { + if (doc.cm) { throw new Error("This document is already in use.") } + cm.doc = doc + doc.cm = cm + estimateLineHeights(cm) + loadMode(cm) + if (!cm.options.lineWrapping) { findMaxLine(cm) } + cm.options.mode = doc.modeOption + regChange(cm) +} + +function History(startGen) { + // Arrays of change events and selections. Doing something adds an + // event to done and clears undo. Undoing moves events from done + // to undone, redoing moves them in the other direction. 
+ this.done = []; this.undone = [] + this.undoDepth = Infinity + // Used to track when changes can be merged into a single undo + // event + this.lastModTime = this.lastSelTime = 0 + this.lastOp = this.lastSelOp = null + this.lastOrigin = this.lastSelOrigin = null + // Used by the isClean() method + this.generation = this.maxGeneration = startGen || 1 +} + +// Create a history change event from an updateDoc-style change +// object. +function historyChangeFromChange(doc, change) { + var histChange = {from: copyPos(change.from), to: changeEnd(change), text: getBetween(doc, change.from, change.to)} + attachLocalSpans(doc, histChange, change.from.line, change.to.line + 1) + linkedDocs(doc, function (doc) { return attachLocalSpans(doc, histChange, change.from.line, change.to.line + 1); }, true) + return histChange +} + +// Pop all selection events off the end of a history array. Stop at +// a change event. +function clearSelectionEvents(array) { + while (array.length) { + var last = lst(array) + if (last.ranges) { array.pop() } + else { break } + } +} + +// Find the top change event in the history. Pop off selection +// events that are in the way. +function lastChangeEvent(hist, force) { + if (force) { + clearSelectionEvents(hist.done) + return lst(hist.done) + } else if (hist.done.length && !lst(hist.done).ranges) { + return lst(hist.done) + } else if (hist.done.length > 1 && !hist.done[hist.done.length - 2].ranges) { + hist.done.pop() + return lst(hist.done) + } +} + +// Register a change in the history. Merges changes that are within +// a single operation, or are close together with an origin that +// allows merging (starting with "+") into a single event. +function addChangeToHistory(doc, change, selAfter, opId) { + var hist = doc.history + hist.undone.length = 0 + var time = +new Date, cur + var last + + if ((hist.lastOp == opId || + hist.lastOrigin == change.origin && change.origin && + ((change.origin.charAt(0) == "+" && doc.cm && hist.lastModTime > time - doc.cm.options.historyEventDelay) || + change.origin.charAt(0) == "*")) && + (cur = lastChangeEvent(hist, hist.lastOp == opId))) { + // Merge this change into the last event + last = lst(cur.changes) + if (cmp(change.from, change.to) == 0 && cmp(change.from, last.to) == 0) { + // Optimized case for simple insertion -- don't want to add + // new changesets for every character typed + last.to = changeEnd(change) + } else { + // Add new sub-event + cur.changes.push(historyChangeFromChange(doc, change)) + } + } else { + // Can not be merged, start a new event. + var before = lst(hist.done) + if (!before || !before.ranges) + { pushSelectionToHistory(doc.sel, hist.done) } + cur = {changes: [historyChangeFromChange(doc, change)], + generation: hist.generation} + hist.done.push(cur) + while (hist.done.length > hist.undoDepth) { + hist.done.shift() + if (!hist.done[0].ranges) { hist.done.shift() } + } + } + hist.done.push(selAfter) + hist.generation = ++hist.maxGeneration + hist.lastModTime = hist.lastSelTime = time + hist.lastOp = hist.lastSelOp = opId + hist.lastOrigin = hist.lastSelOrigin = change.origin + + if (!last) { signal(doc, "historyAdded") } +} + +function selectionEventCanBeMerged(doc, origin, prev, sel) { + var ch = origin.charAt(0) + return ch == "*" || + ch == "+" && + prev.ranges.length == sel.ranges.length && + prev.somethingSelected() == sel.somethingSelected() && + new Date - doc.history.lastSelTime <= (doc.cm ? 
doc.cm.options.historyEventDelay : 500) +} + +// Called whenever the selection changes, sets the new selection as +// the pending selection in the history, and pushes the old pending +// selection into the 'done' array when it was significantly +// different (in number of selected ranges, emptiness, or time). +function addSelectionToHistory(doc, sel, opId, options) { + var hist = doc.history, origin = options && options.origin + + // A new event is started when the previous origin does not match + // the current, or the origins don't allow matching. Origins + // starting with * are always merged, those starting with + are + // merged when similar and close together in time. + if (opId == hist.lastSelOp || + (origin && hist.lastSelOrigin == origin && + (hist.lastModTime == hist.lastSelTime && hist.lastOrigin == origin || + selectionEventCanBeMerged(doc, origin, lst(hist.done), sel)))) + { hist.done[hist.done.length - 1] = sel } + else + { pushSelectionToHistory(sel, hist.done) } + + hist.lastSelTime = +new Date + hist.lastSelOrigin = origin + hist.lastSelOp = opId + if (options && options.clearRedo !== false) + { clearSelectionEvents(hist.undone) } +} + +function pushSelectionToHistory(sel, dest) { + var top = lst(dest) + if (!(top && top.ranges && top.equals(sel))) + { dest.push(sel) } +} + +// Used to store marked span information in the history. +function attachLocalSpans(doc, change, from, to) { + var existing = change["spans_" + doc.id], n = 0 + doc.iter(Math.max(doc.first, from), Math.min(doc.first + doc.size, to), function (line) { + if (line.markedSpans) + { (existing || (existing = change["spans_" + doc.id] = {}))[n] = line.markedSpans } + ++n + }) +} + +// When un/re-doing restores text containing marked spans, those +// that have been explicitly cleared should not be restored. +function removeClearedSpans(spans) { + if (!spans) { return null } + var out + for (var i = 0; i < spans.length; ++i) { + if (spans[i].marker.explicitlyCleared) { if (!out) { out = spans.slice(0, i) } } + else if (out) { out.push(spans[i]) } + } + return !out ? spans : out.length ? out : null +} + +// Retrieve and filter the old marked spans stored in a change event. +function getOldSpans(doc, change) { + var found = change["spans_" + doc.id] + if (!found) { return null } + var nw = [] + for (var i = 0; i < change.text.length; ++i) + { nw.push(removeClearedSpans(found[i])) } + return nw +} + +// Used for un/re-doing changes from the history. Combines the +// result of computing the existing spans with the set of spans that +// existed in the history (so that deleting around a span and then +// undoing brings back the span). 
+function mergeOldSpans(doc, change) { + var old = getOldSpans(doc, change) + var stretched = stretchSpansOverChange(doc, change) + if (!old) { return stretched } + if (!stretched) { return old } + + for (var i = 0; i < old.length; ++i) { + var oldCur = old[i], stretchCur = stretched[i] + if (oldCur && stretchCur) { + spans: for (var j = 0; j < stretchCur.length; ++j) { + var span = stretchCur[j] + for (var k = 0; k < oldCur.length; ++k) + { if (oldCur[k].marker == span.marker) { continue spans } } + oldCur.push(span) + } + } else if (stretchCur) { + old[i] = stretchCur + } + } + return old +} + +// Used both to provide a JSON-safe object in .getHistory, and, when +// detaching a document, to split the history in two +function copyHistoryArray(events, newGroup, instantiateSel) { + var copy = [] + for (var i = 0; i < events.length; ++i) { + var event = events[i] + if (event.ranges) { + copy.push(instantiateSel ? Selection.prototype.deepCopy.call(event) : event) + continue + } + var changes = event.changes, newChanges = [] + copy.push({changes: newChanges}) + for (var j = 0; j < changes.length; ++j) { + var change = changes[j], m = (void 0) + newChanges.push({from: change.from, to: change.to, text: change.text}) + if (newGroup) { for (var prop in change) { if (m = prop.match(/^spans_(\d+)$/)) { + if (indexOf(newGroup, Number(m[1])) > -1) { + lst(newChanges)[prop] = change[prop] + delete change[prop] + } + } } } + } + } + return copy +} + +// The 'scroll' parameter given to many of these indicated whether +// the new cursor position should be scrolled into view after +// modifying the selection. + +// If shift is held or the extend flag is set, extends a range to +// include a given position (and optionally a second position). +// Otherwise, simply returns the range between the given positions. +// Used for cursor motion and such. +function extendRange(doc, range, head, other) { + if (doc.cm && doc.cm.display.shift || doc.extend) { + var anchor = range.anchor + if (other) { + var posBefore = cmp(head, anchor) < 0 + if (posBefore != (cmp(other, anchor) < 0)) { + anchor = head + head = other + } else if (posBefore != (cmp(head, other) < 0)) { + head = other + } + } + return new Range(anchor, head) + } else { + return new Range(other || head, head) + } +} + +// Extend the primary selection range, discard the rest. +function extendSelection(doc, head, other, options) { + setSelection(doc, new Selection([extendRange(doc, doc.sel.primary(), head, other)], 0), options) +} + +// Extend all selections (pos is an array of selections with length +// equal the number of selections) +function extendSelections(doc, heads, options) { + var out = [] + for (var i = 0; i < doc.sel.ranges.length; i++) + { out[i] = extendRange(doc, doc.sel.ranges[i], heads[i], null) } + var newSel = normalizeSelection(out, doc.sel.primIndex) + setSelection(doc, newSel, options) +} + +// Updates a single range in the selection. +function replaceOneSelection(doc, i, range, options) { + var ranges = doc.sel.ranges.slice(0) + ranges[i] = range + setSelection(doc, normalizeSelection(ranges, doc.sel.primIndex), options) +} + +// Reset the selection to a single range. +function setSimpleSelection(doc, anchor, head, options) { + setSelection(doc, simpleSelection(anchor, head), options) +} + +// Give beforeSelectionChange handlers a change to influence a +// selection update. 
+function filterSelectionChange(doc, sel, options) { + var obj = { + ranges: sel.ranges, + update: function(ranges) { + var this$1 = this; + + this.ranges = [] + for (var i = 0; i < ranges.length; i++) + { this$1.ranges[i] = new Range(clipPos(doc, ranges[i].anchor), + clipPos(doc, ranges[i].head)) } + }, + origin: options && options.origin + } + signal(doc, "beforeSelectionChange", doc, obj) + if (doc.cm) { signal(doc.cm, "beforeSelectionChange", doc.cm, obj) } + if (obj.ranges != sel.ranges) { return normalizeSelection(obj.ranges, obj.ranges.length - 1) } + else { return sel } +} + +function setSelectionReplaceHistory(doc, sel, options) { + var done = doc.history.done, last = lst(done) + if (last && last.ranges) { + done[done.length - 1] = sel + setSelectionNoUndo(doc, sel, options) + } else { + setSelection(doc, sel, options) + } +} + +// Set a new selection. +function setSelection(doc, sel, options) { + setSelectionNoUndo(doc, sel, options) + addSelectionToHistory(doc, doc.sel, doc.cm ? doc.cm.curOp.id : NaN, options) +} + +function setSelectionNoUndo(doc, sel, options) { + if (hasHandler(doc, "beforeSelectionChange") || doc.cm && hasHandler(doc.cm, "beforeSelectionChange")) + { sel = filterSelectionChange(doc, sel, options) } + + var bias = options && options.bias || + (cmp(sel.primary().head, doc.sel.primary().head) < 0 ? -1 : 1) + setSelectionInner(doc, skipAtomicInSelection(doc, sel, bias, true)) + + if (!(options && options.scroll === false) && doc.cm) + { ensureCursorVisible(doc.cm) } +} + +function setSelectionInner(doc, sel) { + if (sel.equals(doc.sel)) { return } + + doc.sel = sel + + if (doc.cm) { + doc.cm.curOp.updateInput = doc.cm.curOp.selectionChanged = true + signalCursorActivity(doc.cm) + } + signalLater(doc, "cursorActivity", doc) +} + +// Verify that the selection does not partially select any atomic +// marked ranges. +function reCheckSelection(doc) { + setSelectionInner(doc, skipAtomicInSelection(doc, doc.sel, null, false), sel_dontScroll) +} + +// Return a selection that does not partially select any atomic +// ranges. +function skipAtomicInSelection(doc, sel, bias, mayClear) { + var out + for (var i = 0; i < sel.ranges.length; i++) { + var range = sel.ranges[i] + var old = sel.ranges.length == doc.sel.ranges.length && doc.sel.ranges[i] + var newAnchor = skipAtomic(doc, range.anchor, old && old.anchor, bias, mayClear) + var newHead = skipAtomic(doc, range.head, old && old.head, bias, mayClear) + if (out || newAnchor != range.anchor || newHead != range.head) { + if (!out) { out = sel.ranges.slice(0, i) } + out[i] = new Range(newAnchor, newHead) + } + } + return out ? normalizeSelection(out, sel.primIndex) : sel +} + +function skipAtomicInner(doc, pos, oldPos, dir, mayClear) { + var line = getLine(doc, pos.line) + if (line.markedSpans) { for (var i = 0; i < line.markedSpans.length; ++i) { + var sp = line.markedSpans[i], m = sp.marker + if ((sp.from == null || (m.inclusiveLeft ? sp.from <= pos.ch : sp.from < pos.ch)) && + (sp.to == null || (m.inclusiveRight ? sp.to >= pos.ch : sp.to > pos.ch))) { + if (mayClear) { + signal(m, "beforeCursorEnter") + if (m.explicitlyCleared) { + if (!line.markedSpans) { break } + else {--i; continue} + } + } + if (!m.atomic) { continue } + + if (oldPos) { + var near = m.find(dir < 0 ? 1 : -1), diff = (void 0) + if (dir < 0 ? m.inclusiveRight : m.inclusiveLeft) + { near = movePos(doc, near, -dir, near && near.line == pos.line ? line : null) } + if (near && near.line == pos.line && (diff = cmp(near, oldPos)) && (dir < 0 ? 
diff < 0 : diff > 0)) + { return skipAtomicInner(doc, near, pos, dir, mayClear) } + } + + var far = m.find(dir < 0 ? -1 : 1) + if (dir < 0 ? m.inclusiveLeft : m.inclusiveRight) + { far = movePos(doc, far, dir, far.line == pos.line ? line : null) } + return far ? skipAtomicInner(doc, far, pos, dir, mayClear) : null + } + } } + return pos +} + +// Ensure a given position is not inside an atomic range. +function skipAtomic(doc, pos, oldPos, bias, mayClear) { + var dir = bias || 1 + var found = skipAtomicInner(doc, pos, oldPos, dir, mayClear) || + (!mayClear && skipAtomicInner(doc, pos, oldPos, dir, true)) || + skipAtomicInner(doc, pos, oldPos, -dir, mayClear) || + (!mayClear && skipAtomicInner(doc, pos, oldPos, -dir, true)) + if (!found) { + doc.cantEdit = true + return Pos(doc.first, 0) + } + return found +} + +function movePos(doc, pos, dir, line) { + if (dir < 0 && pos.ch == 0) { + if (pos.line > doc.first) { return clipPos(doc, Pos(pos.line - 1)) } + else { return null } + } else if (dir > 0 && pos.ch == (line || getLine(doc, pos.line)).text.length) { + if (pos.line < doc.first + doc.size - 1) { return Pos(pos.line + 1, 0) } + else { return null } + } else { + return new Pos(pos.line, pos.ch + dir) + } +} + +function selectAll(cm) { + cm.setSelection(Pos(cm.firstLine(), 0), Pos(cm.lastLine()), sel_dontScroll) +} + +// UPDATING + +// Allow "beforeChange" event handlers to influence a change +function filterChange(doc, change, update) { + var obj = { + canceled: false, + from: change.from, + to: change.to, + text: change.text, + origin: change.origin, + cancel: function () { return obj.canceled = true; } + } + if (update) { obj.update = function (from, to, text, origin) { + if (from) { obj.from = clipPos(doc, from) } + if (to) { obj.to = clipPos(doc, to) } + if (text) { obj.text = text } + if (origin !== undefined) { obj.origin = origin } + } } + signal(doc, "beforeChange", doc, obj) + if (doc.cm) { signal(doc.cm, "beforeChange", doc.cm, obj) } + + if (obj.canceled) { return null } + return {from: obj.from, to: obj.to, text: obj.text, origin: obj.origin} +} + +// Apply a change to a document, and add it to the document's +// history, and propagating it to all linked documents. +function makeChange(doc, change, ignoreReadOnly) { + if (doc.cm) { + if (!doc.cm.curOp) { return operation(doc.cm, makeChange)(doc, change, ignoreReadOnly) } + if (doc.cm.state.suppressEdits) { return } + } + + if (hasHandler(doc, "beforeChange") || doc.cm && hasHandler(doc.cm, "beforeChange")) { + change = filterChange(doc, change, true) + if (!change) { return } + } + + // Possibly split or suppress the update based on the presence + // of read-only spans in its range. + var split = sawReadOnlySpans && !ignoreReadOnly && removeReadOnlyRanges(doc, change.from, change.to) + if (split) { + for (var i = split.length - 1; i >= 0; --i) + { makeChangeInner(doc, {from: split[i].from, to: split[i].to, text: i ? [""] : change.text}) } + } else { + makeChangeInner(doc, change) + } +} + +function makeChangeInner(doc, change) { + if (change.text.length == 1 && change.text[0] == "" && cmp(change.from, change.to) == 0) { return } + var selAfter = computeSelAfterChange(doc, change) + addChangeToHistory(doc, change, selAfter, doc.cm ? 
doc.cm.curOp.id : NaN) + + makeChangeSingleDoc(doc, change, selAfter, stretchSpansOverChange(doc, change)) + var rebased = [] + + linkedDocs(doc, function (doc, sharedHist) { + if (!sharedHist && indexOf(rebased, doc.history) == -1) { + rebaseHist(doc.history, change) + rebased.push(doc.history) + } + makeChangeSingleDoc(doc, change, null, stretchSpansOverChange(doc, change)) + }) +} + +// Revert a change stored in a document's history. +function makeChangeFromHistory(doc, type, allowSelectionOnly) { + if (doc.cm && doc.cm.state.suppressEdits && !allowSelectionOnly) { return } + + var hist = doc.history, event, selAfter = doc.sel + var source = type == "undo" ? hist.done : hist.undone, dest = type == "undo" ? hist.undone : hist.done + + // Verify that there is a useable event (so that ctrl-z won't + // needlessly clear selection events) + var i = 0 + for (; i < source.length; i++) { + event = source[i] + if (allowSelectionOnly ? event.ranges && !event.equals(doc.sel) : !event.ranges) + { break } + } + if (i == source.length) { return } + hist.lastOrigin = hist.lastSelOrigin = null + + for (;;) { + event = source.pop() + if (event.ranges) { + pushSelectionToHistory(event, dest) + if (allowSelectionOnly && !event.equals(doc.sel)) { + setSelection(doc, event, {clearRedo: false}) + return + } + selAfter = event + } + else { break } + } + + // Build up a reverse change object to add to the opposite history + // stack (redo when undoing, and vice versa). + var antiChanges = [] + pushSelectionToHistory(selAfter, dest) + dest.push({changes: antiChanges, generation: hist.generation}) + hist.generation = event.generation || ++hist.maxGeneration + + var filter = hasHandler(doc, "beforeChange") || doc.cm && hasHandler(doc.cm, "beforeChange") + + var loop = function ( i ) { + var change = event.changes[i] + change.origin = type + if (filter && !filterChange(doc, change, false)) { + source.length = 0 + return {} + } + + antiChanges.push(historyChangeFromChange(doc, change)) + + var after = i ? computeSelAfterChange(doc, change) : lst(source) + makeChangeSingleDoc(doc, change, after, mergeOldSpans(doc, change)) + if (!i && doc.cm) { doc.cm.scrollIntoView({from: change.from, to: changeEnd(change)}) } + var rebased = [] + + // Propagate to the linked documents + linkedDocs(doc, function (doc, sharedHist) { + if (!sharedHist && indexOf(rebased, doc.history) == -1) { + rebaseHist(doc.history, change) + rebased.push(doc.history) + } + makeChangeSingleDoc(doc, change, null, mergeOldSpans(doc, change)) + }) + }; + + for (var i$1 = event.changes.length - 1; i$1 >= 0; --i$1) { + var returned = loop( i$1 ); + + if ( returned ) return returned.v; + } +} + +// Sub-views need their line numbers shifted when text is added +// above or below them in the parent document. +function shiftDoc(doc, distance) { + if (distance == 0) { return } + doc.first += distance + doc.sel = new Selection(map(doc.sel.ranges, function (range) { return new Range( + Pos(range.anchor.line + distance, range.anchor.ch), + Pos(range.head.line + distance, range.head.ch) + ); }), doc.sel.primIndex) + if (doc.cm) { + regChange(doc.cm, doc.first, doc.first - distance, distance) + for (var d = doc.cm.display, l = d.viewFrom; l < d.viewTo; l++) + { regLineChange(doc.cm, l, "gutter") } + } +} + +// More lower-level change function, handling only a single document +// (not linked ones). 
+function makeChangeSingleDoc(doc, change, selAfter, spans) { + if (doc.cm && !doc.cm.curOp) + { return operation(doc.cm, makeChangeSingleDoc)(doc, change, selAfter, spans) } + + if (change.to.line < doc.first) { + shiftDoc(doc, change.text.length - 1 - (change.to.line - change.from.line)) + return + } + if (change.from.line > doc.lastLine()) { return } + + // Clip the change to the size of this doc + if (change.from.line < doc.first) { + var shift = change.text.length - 1 - (doc.first - change.from.line) + shiftDoc(doc, shift) + change = {from: Pos(doc.first, 0), to: Pos(change.to.line + shift, change.to.ch), + text: [lst(change.text)], origin: change.origin} + } + var last = doc.lastLine() + if (change.to.line > last) { + change = {from: change.from, to: Pos(last, getLine(doc, last).text.length), + text: [change.text[0]], origin: change.origin} + } + + change.removed = getBetween(doc, change.from, change.to) + + if (!selAfter) { selAfter = computeSelAfterChange(doc, change) } + if (doc.cm) { makeChangeSingleDocInEditor(doc.cm, change, spans) } + else { updateDoc(doc, change, spans) } + setSelectionNoUndo(doc, selAfter, sel_dontScroll) +} + +// Handle the interaction of a change to a document with the editor +// that this document is part of. +function makeChangeSingleDocInEditor(cm, change, spans) { + var doc = cm.doc, display = cm.display, from = change.from, to = change.to + + var recomputeMaxLength = false, checkWidthStart = from.line + if (!cm.options.lineWrapping) { + checkWidthStart = lineNo(visualLine(getLine(doc, from.line))) + doc.iter(checkWidthStart, to.line + 1, function (line) { + if (line == display.maxLine) { + recomputeMaxLength = true + return true + } + }) + } + + if (doc.sel.contains(change.from, change.to) > -1) + { signalCursorActivity(cm) } + + updateDoc(doc, change, spans, estimateHeight(cm)) + + if (!cm.options.lineWrapping) { + doc.iter(checkWidthStart, from.line + change.text.length, function (line) { + var len = lineLength(line) + if (len > display.maxLineLength) { + display.maxLine = line + display.maxLineLength = len + display.maxLineChanged = true + recomputeMaxLength = false + } + }) + if (recomputeMaxLength) { cm.curOp.updateMaxLine = true } + } + + // Adjust frontier, schedule worker + doc.frontier = Math.min(doc.frontier, from.line) + startWorker(cm, 400) + + var lendiff = change.text.length - (to.line - from.line) - 1 + // Remember that these lines changed, for updating the display + if (change.full) + { regChange(cm) } + else if (from.line == to.line && change.text.length == 1 && !isWholeLineUpdate(cm.doc, change)) + { regLineChange(cm, from.line, "text") } + else + { regChange(cm, from.line, to.line + 1, lendiff) } + + var changesHandler = hasHandler(cm, "changes"), changeHandler = hasHandler(cm, "change") + if (changeHandler || changesHandler) { + var obj = { + from: from, to: to, + text: change.text, + removed: change.removed, + origin: change.origin + } + if (changeHandler) { signalLater(cm, "change", cm, obj) } + if (changesHandler) { (cm.curOp.changeObjs || (cm.curOp.changeObjs = [])).push(obj) } + } + cm.display.selForContextMenu = null +} + +function replaceRange(doc, code, from, to, origin) { + if (!to) { to = from } + if (cmp(to, from) < 0) { var tmp = to; to = from; from = tmp } + if (typeof code == "string") { code = doc.splitLines(code) } + makeChange(doc, {from: from, to: to, text: code, origin: origin}) +} + +// Rebasing/resetting history to deal with externally-sourced changes + +function rebaseHistSelSingle(pos, from, to, diff) { + if 
(to < pos.line) { + pos.line += diff + } else if (from < pos.line) { + pos.line = from + pos.ch = 0 + } +} + +// Tries to rebase an array of history events given a change in the +// document. If the change touches the same lines as the event, the +// event, and everything 'behind' it, is discarded. If the change is +// before the event, the event's positions are updated. Uses a +// copy-on-write scheme for the positions, to avoid having to +// reallocate them all on every rebase, but also avoid problems with +// shared position objects being unsafely updated. +function rebaseHistArray(array, from, to, diff) { + for (var i = 0; i < array.length; ++i) { + var sub = array[i], ok = true + if (sub.ranges) { + if (!sub.copied) { sub = array[i] = sub.deepCopy(); sub.copied = true } + for (var j = 0; j < sub.ranges.length; j++) { + rebaseHistSelSingle(sub.ranges[j].anchor, from, to, diff) + rebaseHistSelSingle(sub.ranges[j].head, from, to, diff) + } + continue + } + for (var j$1 = 0; j$1 < sub.changes.length; ++j$1) { + var cur = sub.changes[j$1] + if (to < cur.from.line) { + cur.from = Pos(cur.from.line + diff, cur.from.ch) + cur.to = Pos(cur.to.line + diff, cur.to.ch) + } else if (from <= cur.to.line) { + ok = false + break + } + } + if (!ok) { + array.splice(0, i + 1) + i = 0 + } + } +} + +function rebaseHist(hist, change) { + var from = change.from.line, to = change.to.line, diff = change.text.length - (to - from) - 1 + rebaseHistArray(hist.done, from, to, diff) + rebaseHistArray(hist.undone, from, to, diff) +} + +// Utility for applying a change to a line by handle or number, +// returning the number and optionally registering the line as +// changed. +function changeLine(doc, handle, changeType, op) { + var no = handle, line = handle + if (typeof handle == "number") { line = getLine(doc, clipLine(doc, handle)) } + else { no = lineNo(handle) } + if (no == null) { return null } + if (op(line, no) && doc.cm) { regLineChange(doc.cm, no, changeType) } + return line +} + +// The document is represented as a BTree consisting of leaves, with +// chunk of lines in them, and branches, with up to ten leaves or +// other branch nodes below them. The top node is always a branch +// node, and is the document object itself (meaning it has +// additional methods and properties). +// +// All nodes have parent links. The tree is used both to go from +// line numbers to line objects, and to go from objects to numbers. +// It also indexes by height, and is used to convert between height +// and line object, and to find the total height of the document. +// +// See also http://marijnhaverbeke.nl/blog/codemirror-line-tree.html + +var LeafChunk = function(lines) { + var this$1 = this; + + this.lines = lines + this.parent = null + var height = 0 + for (var i = 0; i < lines.length; ++i) { + lines[i].parent = this$1 + height += lines[i].height + } + this.height = height +}; + +LeafChunk.prototype.chunkSize = function () { return this.lines.length }; + +// Remove the n lines at offset 'at'. +LeafChunk.prototype.removeInner = function (at, n) { + var this$1 = this; + + for (var i = at, e = at + n; i < e; ++i) { + var line = this$1.lines[i] + this$1.height -= line.height + cleanUpLine(line) + signalLater(line, "delete") + } + this.lines.splice(at, n) +}; + +// Helper used to collapse a small branch into a single leaf. +LeafChunk.prototype.collapse = function (lines) { + lines.push.apply(lines, this.lines) +}; + +// Insert the given array of lines at offset 'at', count them as +// having the given height. 
+LeafChunk.prototype.insertInner = function (at, lines, height) { + var this$1 = this; + + this.height += height + this.lines = this.lines.slice(0, at).concat(lines).concat(this.lines.slice(at)) + for (var i = 0; i < lines.length; ++i) { lines[i].parent = this$1 } +}; + +// Used to iterate over a part of the tree. +LeafChunk.prototype.iterN = function (at, n, op) { + var this$1 = this; + + for (var e = at + n; at < e; ++at) + { if (op(this$1.lines[at])) { return true } } +}; + +var BranchChunk = function(children) { + var this$1 = this; + + this.children = children + var size = 0, height = 0 + for (var i = 0; i < children.length; ++i) { + var ch = children[i] + size += ch.chunkSize(); height += ch.height + ch.parent = this$1 + } + this.size = size + this.height = height + this.parent = null +}; + +BranchChunk.prototype.chunkSize = function () { return this.size }; + +BranchChunk.prototype.removeInner = function (at, n) { + var this$1 = this; + + this.size -= n + for (var i = 0; i < this.children.length; ++i) { + var child = this$1.children[i], sz = child.chunkSize() + if (at < sz) { + var rm = Math.min(n, sz - at), oldHeight = child.height + child.removeInner(at, rm) + this$1.height -= oldHeight - child.height + if (sz == rm) { this$1.children.splice(i--, 1); child.parent = null } + if ((n -= rm) == 0) { break } + at = 0 + } else { at -= sz } + } + // If the result is smaller than 25 lines, ensure that it is a + // single leaf node. + if (this.size - n < 25 && + (this.children.length > 1 || !(this.children[0] instanceof LeafChunk))) { + var lines = [] + this.collapse(lines) + this.children = [new LeafChunk(lines)] + this.children[0].parent = this + } +}; + +BranchChunk.prototype.collapse = function (lines) { + var this$1 = this; + + for (var i = 0; i < this.children.length; ++i) { this$1.children[i].collapse(lines) } +}; + +BranchChunk.prototype.insertInner = function (at, lines, height) { + var this$1 = this; + + this.size += lines.length + this.height += height + for (var i = 0; i < this.children.length; ++i) { + var child = this$1.children[i], sz = child.chunkSize() + if (at <= sz) { + child.insertInner(at, lines, height) + if (child.lines && child.lines.length > 50) { + // To avoid memory thrashing when child.lines is huge (e.g., first view of a large file), it's never spliced. + // Instead, small slices are taken. They're taken in order because sequential memory accesses are fastest. + var remaining = child.lines.length % 25 + 25 + for (var pos = remaining; pos < child.lines.length;) { + var leaf = new LeafChunk(child.lines.slice(pos, pos += 25)) + child.height -= leaf.height + this$1.children.splice(++i, 0, leaf) + leaf.parent = this$1 + } + child.lines = child.lines.slice(0, remaining) + this$1.maybeSpill() + } + break + } + at -= sz + } +}; + +// When a node has grown, check whether it should be split. 
+BranchChunk.prototype.maybeSpill = function () { + if (this.children.length <= 10) { return } + var me = this + do { + var spilled = me.children.splice(me.children.length - 5, 5) + var sibling = new BranchChunk(spilled) + if (!me.parent) { // Become the parent node + var copy = new BranchChunk(me.children) + copy.parent = me + me.children = [copy, sibling] + me = copy + } else { + me.size -= sibling.size + me.height -= sibling.height + var myIndex = indexOf(me.parent.children, me) + me.parent.children.splice(myIndex + 1, 0, sibling) + } + sibling.parent = me.parent + } while (me.children.length > 10) + me.parent.maybeSpill() +}; + +BranchChunk.prototype.iterN = function (at, n, op) { + var this$1 = this; + + for (var i = 0; i < this.children.length; ++i) { + var child = this$1.children[i], sz = child.chunkSize() + if (at < sz) { + var used = Math.min(n, sz - at) + if (child.iterN(at, used, op)) { return true } + if ((n -= used) == 0) { break } + at = 0 + } else { at -= sz } + } +}; + +// Line widgets are block elements displayed above or below a line. + +var LineWidget = function(doc, node, options) { + var this$1 = this; + + if (options) { for (var opt in options) { if (options.hasOwnProperty(opt)) + { this$1[opt] = options[opt] } } } + this.doc = doc + this.node = node +}; + +LineWidget.prototype.clear = function () { + var this$1 = this; + + var cm = this.doc.cm, ws = this.line.widgets, line = this.line, no = lineNo(line) + if (no == null || !ws) { return } + for (var i = 0; i < ws.length; ++i) { if (ws[i] == this$1) { ws.splice(i--, 1) } } + if (!ws.length) { line.widgets = null } + var height = widgetHeight(this) + updateLineHeight(line, Math.max(0, line.height - height)) + if (cm) { + runInOp(cm, function () { + adjustScrollWhenAboveVisible(cm, line, -height) + regLineChange(cm, no, "widget") + }) + signalLater(cm, "lineWidgetCleared", cm, this, no) + } +}; + +LineWidget.prototype.changed = function () { + var this$1 = this; + + var oldH = this.height, cm = this.doc.cm, line = this.line + this.height = null + var diff = widgetHeight(this) - oldH + if (!diff) { return } + updateLineHeight(line, line.height + diff) + if (cm) { + runInOp(cm, function () { + cm.curOp.forceUpdate = true + adjustScrollWhenAboveVisible(cm, line, diff) + signalLater(cm, "lineWidgetChanged", cm, this$1, lineNo(line)) + }) + } +}; +eventMixin(LineWidget) + +function adjustScrollWhenAboveVisible(cm, line, diff) { + if (heightAtLine(line) < ((cm.curOp && cm.curOp.scrollTop) || cm.doc.scrollTop)) + { addToScrollPos(cm, null, diff) } +} + +function addLineWidget(doc, handle, node, options) { + var widget = new LineWidget(doc, node, options) + var cm = doc.cm + if (cm && widget.noHScroll) { cm.display.alignWidgets = true } + changeLine(doc, handle, "widget", function (line) { + var widgets = line.widgets || (line.widgets = []) + if (widget.insertAt == null) { widgets.push(widget) } + else { widgets.splice(Math.min(widgets.length - 1, Math.max(0, widget.insertAt)), 0, widget) } + widget.line = line + if (cm && !lineIsHidden(doc, line)) { + var aboveVisible = heightAtLine(line) < doc.scrollTop + updateLineHeight(line, line.height + widgetHeight(widget)) + if (aboveVisible) { addToScrollPos(cm, null, widget.height) } + cm.curOp.forceUpdate = true + } + return true + }) + signalLater(cm, "lineWidgetAdded", cm, widget, typeof handle == "number" ? handle : lineNo(handle)) + return widget +} + +// TEXTMARKERS + +// Created with markText and setBookmark methods. 
A TextMarker is a +// handle that can be used to clear or find a marked position in the +// document. Line objects hold arrays (markedSpans) containing +// {from, to, marker} object pointing to such marker objects, and +// indicating that such a marker is present on that line. Multiple +// lines may point to the same marker when it spans across lines. +// The spans will have null for their from/to properties when the +// marker continues beyond the start/end of the line. Markers have +// links back to the lines they currently touch. + +// Collapsed markers have unique ids, in order to be able to order +// them, which is needed for uniquely determining an outer marker +// when they overlap (they may nest, but not partially overlap). +var nextMarkerId = 0 + +var TextMarker = function(doc, type) { + this.lines = [] + this.type = type + this.doc = doc + this.id = ++nextMarkerId +}; + +// Clear the marker. +TextMarker.prototype.clear = function () { + var this$1 = this; + + if (this.explicitlyCleared) { return } + var cm = this.doc.cm, withOp = cm && !cm.curOp + if (withOp) { startOperation(cm) } + if (hasHandler(this, "clear")) { + var found = this.find() + if (found) { signalLater(this, "clear", found.from, found.to) } + } + var min = null, max = null + for (var i = 0; i < this.lines.length; ++i) { + var line = this$1.lines[i] + var span = getMarkedSpanFor(line.markedSpans, this$1) + if (cm && !this$1.collapsed) { regLineChange(cm, lineNo(line), "text") } + else if (cm) { + if (span.to != null) { max = lineNo(line) } + if (span.from != null) { min = lineNo(line) } + } + line.markedSpans = removeMarkedSpan(line.markedSpans, span) + if (span.from == null && this$1.collapsed && !lineIsHidden(this$1.doc, line) && cm) + { updateLineHeight(line, textHeight(cm.display)) } + } + if (cm && this.collapsed && !cm.options.lineWrapping) { for (var i$1 = 0; i$1 < this.lines.length; ++i$1) { + var visual = visualLine(this$1.lines[i$1]), len = lineLength(visual) + if (len > cm.display.maxLineLength) { + cm.display.maxLine = visual + cm.display.maxLineLength = len + cm.display.maxLineChanged = true + } + } } + + if (min != null && cm && this.collapsed) { regChange(cm, min, max + 1) } + this.lines.length = 0 + this.explicitlyCleared = true + if (this.atomic && this.doc.cantEdit) { + this.doc.cantEdit = false + if (cm) { reCheckSelection(cm.doc) } + } + if (cm) { signalLater(cm, "markerCleared", cm, this, min, max) } + if (withOp) { endOperation(cm) } + if (this.parent) { this.parent.clear() } +}; + +// Find the position of the marker in the document. Returns a {from, +// to} object by default. Side can be passed to get a specific side +// -- 0 (both), -1 (left), or 1 (right). When lineObj is true, the +// Pos objects returned contain a line object, rather than a line +// number (used to prevent looking up the same line twice). +TextMarker.prototype.find = function (side, lineObj) { + var this$1 = this; + + if (side == null && this.type == "bookmark") { side = 1 } + var from, to + for (var i = 0; i < this.lines.length; ++i) { + var line = this$1.lines[i] + var span = getMarkedSpanFor(line.markedSpans, this$1) + if (span.from != null) { + from = Pos(lineObj ? line : lineNo(line), span.from) + if (side == -1) { return from } + } + if (span.to != null) { + to = Pos(lineObj ? line : lineNo(line), span.to) + if (side == 1) { return to } + } + } + return from && {from: from, to: to} +}; + +// Signals that the marker's widget changed, and surrounding layout +// should be recomputed. 
+TextMarker.prototype.changed = function () { + var this$1 = this; + + var pos = this.find(-1, true), widget = this, cm = this.doc.cm + if (!pos || !cm) { return } + runInOp(cm, function () { + var line = pos.line, lineN = lineNo(pos.line) + var view = findViewForLine(cm, lineN) + if (view) { + clearLineMeasurementCacheFor(view) + cm.curOp.selectionChanged = cm.curOp.forceUpdate = true + } + cm.curOp.updateMaxLine = true + if (!lineIsHidden(widget.doc, line) && widget.height != null) { + var oldHeight = widget.height + widget.height = null + var dHeight = widgetHeight(widget) - oldHeight + if (dHeight) + { updateLineHeight(line, line.height + dHeight) } + } + signalLater(cm, "markerChanged", cm, this$1) + }) +}; + +TextMarker.prototype.attachLine = function (line) { + if (!this.lines.length && this.doc.cm) { + var op = this.doc.cm.curOp + if (!op.maybeHiddenMarkers || indexOf(op.maybeHiddenMarkers, this) == -1) + { (op.maybeUnhiddenMarkers || (op.maybeUnhiddenMarkers = [])).push(this) } + } + this.lines.push(line) +}; + +TextMarker.prototype.detachLine = function (line) { + this.lines.splice(indexOf(this.lines, line), 1) + if (!this.lines.length && this.doc.cm) { + var op = this.doc.cm.curOp + ;(op.maybeHiddenMarkers || (op.maybeHiddenMarkers = [])).push(this) + } +}; +eventMixin(TextMarker) + +// Create a marker, wire it up to the right lines, and +function markText(doc, from, to, options, type) { + // Shared markers (across linked documents) are handled separately + // (markTextShared will call out to this again, once per + // document). + if (options && options.shared) { return markTextShared(doc, from, to, options, type) } + // Ensure we are in an operation. + if (doc.cm && !doc.cm.curOp) { return operation(doc.cm, markText)(doc, from, to, options, type) } + + var marker = new TextMarker(doc, type), diff = cmp(from, to) + if (options) { copyObj(options, marker, false) } + // Don't connect empty markers unless clearWhenEmpty is false + if (diff > 0 || diff == 0 && marker.clearWhenEmpty !== false) + { return marker } + if (marker.replacedWith) { + // Showing up as a widget implies collapsed (widget replaces text) + marker.collapsed = true + marker.widgetNode = elt("span", [marker.replacedWith], "CodeMirror-widget") + marker.widgetNode.setAttribute("role", "presentation") // hide from accessibility tree + if (!options.handleMouseEvents) { marker.widgetNode.setAttribute("cm-ignore-events", "true") } + if (options.insertLeft) { marker.widgetNode.insertLeft = true } + } + if (marker.collapsed) { + if (conflictingCollapsedRange(doc, from.line, from, to, marker) || + from.line != to.line && conflictingCollapsedRange(doc, to.line, from, to, marker)) + { throw new Error("Inserting collapsed marker partially overlapping an existing one") } + seeCollapsedSpans() + } + + if (marker.addToHistory) + { addChangeToHistory(doc, {from: from, to: to, origin: "markText"}, doc.sel, NaN) } + + var curLine = from.line, cm = doc.cm, updateMaxLine + doc.iter(curLine, to.line + 1, function (line) { + if (cm && marker.collapsed && !cm.options.lineWrapping && visualLine(line) == cm.display.maxLine) + { updateMaxLine = true } + if (marker.collapsed && curLine != from.line) { updateLineHeight(line, 0) } + addMarkedSpan(line, new MarkedSpan(marker, + curLine == from.line ? from.ch : null, + curLine == to.line ? 
to.ch : null)) + ++curLine + }) + // lineIsHidden depends on the presence of the spans, so needs a second pass + if (marker.collapsed) { doc.iter(from.line, to.line + 1, function (line) { + if (lineIsHidden(doc, line)) { updateLineHeight(line, 0) } + }) } + + if (marker.clearOnEnter) { on(marker, "beforeCursorEnter", function () { return marker.clear(); }) } + + if (marker.readOnly) { + seeReadOnlySpans() + if (doc.history.done.length || doc.history.undone.length) + { doc.clearHistory() } + } + if (marker.collapsed) { + marker.id = ++nextMarkerId + marker.atomic = true + } + if (cm) { + // Sync editor state + if (updateMaxLine) { cm.curOp.updateMaxLine = true } + if (marker.collapsed) + { regChange(cm, from.line, to.line + 1) } + else if (marker.className || marker.title || marker.startStyle || marker.endStyle || marker.css) + { for (var i = from.line; i <= to.line; i++) { regLineChange(cm, i, "text") } } + if (marker.atomic) { reCheckSelection(cm.doc) } + signalLater(cm, "markerAdded", cm, marker) + } + return marker +} + +// SHARED TEXTMARKERS + +// A shared marker spans multiple linked documents. It is +// implemented as a meta-marker-object controlling multiple normal +// markers. +var SharedTextMarker = function(markers, primary) { + var this$1 = this; + + this.markers = markers + this.primary = primary + for (var i = 0; i < markers.length; ++i) + { markers[i].parent = this$1 } +}; + +SharedTextMarker.prototype.clear = function () { + var this$1 = this; + + if (this.explicitlyCleared) { return } + this.explicitlyCleared = true + for (var i = 0; i < this.markers.length; ++i) + { this$1.markers[i].clear() } + signalLater(this, "clear") +}; + +SharedTextMarker.prototype.find = function (side, lineObj) { + return this.primary.find(side, lineObj) +}; +eventMixin(SharedTextMarker) + +function markTextShared(doc, from, to, options, type) { + options = copyObj(options) + options.shared = false + var markers = [markText(doc, from, to, options, type)], primary = markers[0] + var widget = options.widgetNode + linkedDocs(doc, function (doc) { + if (widget) { options.widgetNode = widget.cloneNode(true) } + markers.push(markText(doc, clipPos(doc, from), clipPos(doc, to), options, type)) + for (var i = 0; i < doc.linked.length; ++i) + { if (doc.linked[i].isParent) { return } } + primary = lst(markers) + }) + return new SharedTextMarker(markers, primary) +} + +function findSharedMarkers(doc) { + return doc.findMarks(Pos(doc.first, 0), doc.clipPos(Pos(doc.lastLine())), function (m) { return m.parent; }) +} + +function copySharedMarkers(doc, markers) { + for (var i = 0; i < markers.length; i++) { + var marker = markers[i], pos = marker.find() + var mFrom = doc.clipPos(pos.from), mTo = doc.clipPos(pos.to) + if (cmp(mFrom, mTo)) { + var subMark = markText(doc, mFrom, mTo, marker.primary, marker.primary.type) + marker.markers.push(subMark) + subMark.parent = marker + } + } +} + +function detachSharedMarkers(markers) { + var loop = function ( i ) { + var marker = markers[i], linked = [marker.primary.doc] + linkedDocs(marker.primary.doc, function (d) { return linked.push(d); }) + for (var j = 0; j < marker.markers.length; j++) { + var subMarker = marker.markers[j] + if (indexOf(linked, subMarker.doc) == -1) { + subMarker.parent = null + marker.markers.splice(j--, 1) + } + } + }; + + for (var i = 0; i < markers.length; i++) loop( i ); +} + +var nextDocId = 0 +var Doc = function(text, mode, firstLine, lineSep) { + if (!(this instanceof Doc)) { return new Doc(text, mode, firstLine, lineSep) } + if (firstLine 
== null) { firstLine = 0 } + + BranchChunk.call(this, [new LeafChunk([new Line("", null)])]) + this.first = firstLine + this.scrollTop = this.scrollLeft = 0 + this.cantEdit = false + this.cleanGeneration = 1 + this.frontier = firstLine + var start = Pos(firstLine, 0) + this.sel = simpleSelection(start) + this.history = new History(null) + this.id = ++nextDocId + this.modeOption = mode + this.lineSep = lineSep + this.extend = false + + if (typeof text == "string") { text = this.splitLines(text) } + updateDoc(this, {from: start, to: start, text: text}) + setSelection(this, simpleSelection(start), sel_dontScroll) +} + +Doc.prototype = createObj(BranchChunk.prototype, { + constructor: Doc, + // Iterate over the document. Supports two forms -- with only one + // argument, it calls that for each line in the document. With + // three, it iterates over the range given by the first two (with + // the second being non-inclusive). + iter: function(from, to, op) { + if (op) { this.iterN(from - this.first, to - from, op) } + else { this.iterN(this.first, this.first + this.size, from) } + }, + + // Non-public interface for adding and removing lines. + insert: function(at, lines) { + var height = 0 + for (var i = 0; i < lines.length; ++i) { height += lines[i].height } + this.insertInner(at - this.first, lines, height) + }, + remove: function(at, n) { this.removeInner(at - this.first, n) }, + + // From here, the methods are part of the public interface. Most + // are also available from CodeMirror (editor) instances. + + getValue: function(lineSep) { + var lines = getLines(this, this.first, this.first + this.size) + if (lineSep === false) { return lines } + return lines.join(lineSep || this.lineSeparator()) + }, + setValue: docMethodOp(function(code) { + var top = Pos(this.first, 0), last = this.first + this.size - 1 + makeChange(this, {from: top, to: Pos(last, getLine(this, last).text.length), + text: this.splitLines(code), origin: "setValue", full: true}, true) + setSelection(this, simpleSelection(top)) + }), + replaceRange: function(code, from, to, origin) { + from = clipPos(this, from) + to = to ? 
clipPos(this, to) : from + replaceRange(this, code, from, to, origin) + }, + getRange: function(from, to, lineSep) { + var lines = getBetween(this, clipPos(this, from), clipPos(this, to)) + if (lineSep === false) { return lines } + return lines.join(lineSep || this.lineSeparator()) + }, + + getLine: function(line) {var l = this.getLineHandle(line); return l && l.text}, + + getLineHandle: function(line) {if (isLine(this, line)) { return getLine(this, line) }}, + getLineNumber: function(line) {return lineNo(line)}, + + getLineHandleVisualStart: function(line) { + if (typeof line == "number") { line = getLine(this, line) } + return visualLine(line) + }, + + lineCount: function() {return this.size}, + firstLine: function() {return this.first}, + lastLine: function() {return this.first + this.size - 1}, + + clipPos: function(pos) {return clipPos(this, pos)}, + + getCursor: function(start) { + var range = this.sel.primary(), pos + if (start == null || start == "head") { pos = range.head } + else if (start == "anchor") { pos = range.anchor } + else if (start == "end" || start == "to" || start === false) { pos = range.to() } + else { pos = range.from() } + return pos + }, + listSelections: function() { return this.sel.ranges }, + somethingSelected: function() {return this.sel.somethingSelected()}, + + setCursor: docMethodOp(function(line, ch, options) { + setSimpleSelection(this, clipPos(this, typeof line == "number" ? Pos(line, ch || 0) : line), null, options) + }), + setSelection: docMethodOp(function(anchor, head, options) { + setSimpleSelection(this, clipPos(this, anchor), clipPos(this, head || anchor), options) + }), + extendSelection: docMethodOp(function(head, other, options) { + extendSelection(this, clipPos(this, head), other && clipPos(this, other), options) + }), + extendSelections: docMethodOp(function(heads, options) { + extendSelections(this, clipPosArray(this, heads), options) + }), + extendSelectionsBy: docMethodOp(function(f, options) { + var heads = map(this.sel.ranges, f) + extendSelections(this, clipPosArray(this, heads), options) + }), + setSelections: docMethodOp(function(ranges, primary, options) { + var this$1 = this; + + if (!ranges.length) { return } + var out = [] + for (var i = 0; i < ranges.length; i++) + { out[i] = new Range(clipPos(this$1, ranges[i].anchor), + clipPos(this$1, ranges[i].head)) } + if (primary == null) { primary = Math.min(ranges.length - 1, this.sel.primIndex) } + setSelection(this, normalizeSelection(out, primary), options) + }), + addSelection: docMethodOp(function(anchor, head, options) { + var ranges = this.sel.ranges.slice(0) + ranges.push(new Range(clipPos(this, anchor), clipPos(this, head || anchor))) + setSelection(this, normalizeSelection(ranges, ranges.length - 1), options) + }), + + getSelection: function(lineSep) { + var this$1 = this; + + var ranges = this.sel.ranges, lines + for (var i = 0; i < ranges.length; i++) { + var sel = getBetween(this$1, ranges[i].from(), ranges[i].to()) + lines = lines ? 
lines.concat(sel) : sel + } + if (lineSep === false) { return lines } + else { return lines.join(lineSep || this.lineSeparator()) } + }, + getSelections: function(lineSep) { + var this$1 = this; + + var parts = [], ranges = this.sel.ranges + for (var i = 0; i < ranges.length; i++) { + var sel = getBetween(this$1, ranges[i].from(), ranges[i].to()) + if (lineSep !== false) { sel = sel.join(lineSep || this$1.lineSeparator()) } + parts[i] = sel + } + return parts + }, + replaceSelection: function(code, collapse, origin) { + var dup = [] + for (var i = 0; i < this.sel.ranges.length; i++) + { dup[i] = code } + this.replaceSelections(dup, collapse, origin || "+input") + }, + replaceSelections: docMethodOp(function(code, collapse, origin) { + var this$1 = this; + + var changes = [], sel = this.sel + for (var i = 0; i < sel.ranges.length; i++) { + var range = sel.ranges[i] + changes[i] = {from: range.from(), to: range.to(), text: this$1.splitLines(code[i]), origin: origin} + } + var newSel = collapse && collapse != "end" && computeReplacedSel(this, changes, collapse) + for (var i$1 = changes.length - 1; i$1 >= 0; i$1--) + { makeChange(this$1, changes[i$1]) } + if (newSel) { setSelectionReplaceHistory(this, newSel) } + else if (this.cm) { ensureCursorVisible(this.cm) } + }), + undo: docMethodOp(function() {makeChangeFromHistory(this, "undo")}), + redo: docMethodOp(function() {makeChangeFromHistory(this, "redo")}), + undoSelection: docMethodOp(function() {makeChangeFromHistory(this, "undo", true)}), + redoSelection: docMethodOp(function() {makeChangeFromHistory(this, "redo", true)}), + + setExtending: function(val) {this.extend = val}, + getExtending: function() {return this.extend}, + + historySize: function() { + var hist = this.history, done = 0, undone = 0 + for (var i = 0; i < hist.done.length; i++) { if (!hist.done[i].ranges) { ++done } } + for (var i$1 = 0; i$1 < hist.undone.length; i$1++) { if (!hist.undone[i$1].ranges) { ++undone } } + return {undo: done, redo: undone} + }, + clearHistory: function() {this.history = new History(this.history.maxGeneration)}, + + markClean: function() { + this.cleanGeneration = this.changeGeneration(true) + }, + changeGeneration: function(forceSplit) { + if (forceSplit) + { this.history.lastOp = this.history.lastSelOp = this.history.lastOrigin = null } + return this.history.generation + }, + isClean: function (gen) { + return this.history.generation == (gen || this.cleanGeneration) + }, + + getHistory: function() { + return {done: copyHistoryArray(this.history.done), + undone: copyHistoryArray(this.history.undone)} + }, + setHistory: function(histData) { + var hist = this.history = new History(this.history.maxGeneration) + hist.done = copyHistoryArray(histData.done.slice(0), null, true) + hist.undone = copyHistoryArray(histData.undone.slice(0), null, true) + }, + + setGutterMarker: docMethodOp(function(line, gutterID, value) { + return changeLine(this, line, "gutter", function (line) { + var markers = line.gutterMarkers || (line.gutterMarkers = {}) + markers[gutterID] = value + if (!value && isEmpty(markers)) { line.gutterMarkers = null } + return true + }) + }), + + clearGutter: docMethodOp(function(gutterID) { + var this$1 = this; + + this.iter(function (line) { + if (line.gutterMarkers && line.gutterMarkers[gutterID]) { + changeLine(this$1, line, "gutter", function () { + line.gutterMarkers[gutterID] = null + if (isEmpty(line.gutterMarkers)) { line.gutterMarkers = null } + return true + }) + } + }) + }), + + lineInfo: function(line) { + var n + if (typeof 
line == "number") { + if (!isLine(this, line)) { return null } + n = line + line = getLine(this, line) + if (!line) { return null } + } else { + n = lineNo(line) + if (n == null) { return null } + } + return {line: n, handle: line, text: line.text, gutterMarkers: line.gutterMarkers, + textClass: line.textClass, bgClass: line.bgClass, wrapClass: line.wrapClass, + widgets: line.widgets} + }, + + addLineClass: docMethodOp(function(handle, where, cls) { + return changeLine(this, handle, where == "gutter" ? "gutter" : "class", function (line) { + var prop = where == "text" ? "textClass" + : where == "background" ? "bgClass" + : where == "gutter" ? "gutterClass" : "wrapClass" + if (!line[prop]) { line[prop] = cls } + else if (classTest(cls).test(line[prop])) { return false } + else { line[prop] += " " + cls } + return true + }) + }), + removeLineClass: docMethodOp(function(handle, where, cls) { + return changeLine(this, handle, where == "gutter" ? "gutter" : "class", function (line) { + var prop = where == "text" ? "textClass" + : where == "background" ? "bgClass" + : where == "gutter" ? "gutterClass" : "wrapClass" + var cur = line[prop] + if (!cur) { return false } + else if (cls == null) { line[prop] = null } + else { + var found = cur.match(classTest(cls)) + if (!found) { return false } + var end = found.index + found[0].length + line[prop] = cur.slice(0, found.index) + (!found.index || end == cur.length ? "" : " ") + cur.slice(end) || null + } + return true + }) + }), + + addLineWidget: docMethodOp(function(handle, node, options) { + return addLineWidget(this, handle, node, options) + }), + removeLineWidget: function(widget) { widget.clear() }, + + markText: function(from, to, options) { + return markText(this, clipPos(this, from), clipPos(this, to), options, options && options.type || "range") + }, + setBookmark: function(pos, options) { + var realOpts = {replacedWith: options && (options.nodeType == null ? 
options.widget : options), + insertLeft: options && options.insertLeft, + clearWhenEmpty: false, shared: options && options.shared, + handleMouseEvents: options && options.handleMouseEvents} + pos = clipPos(this, pos) + return markText(this, pos, pos, realOpts, "bookmark") + }, + findMarksAt: function(pos) { + pos = clipPos(this, pos) + var markers = [], spans = getLine(this, pos.line).markedSpans + if (spans) { for (var i = 0; i < spans.length; ++i) { + var span = spans[i] + if ((span.from == null || span.from <= pos.ch) && + (span.to == null || span.to >= pos.ch)) + { markers.push(span.marker.parent || span.marker) } + } } + return markers + }, + findMarks: function(from, to, filter) { + from = clipPos(this, from); to = clipPos(this, to) + var found = [], lineNo = from.line + this.iter(from.line, to.line + 1, function (line) { + var spans = line.markedSpans + if (spans) { for (var i = 0; i < spans.length; i++) { + var span = spans[i] + if (!(span.to != null && lineNo == from.line && from.ch >= span.to || + span.from == null && lineNo != from.line || + span.from != null && lineNo == to.line && span.from >= to.ch) && + (!filter || filter(span.marker))) + { found.push(span.marker.parent || span.marker) } + } } + ++lineNo + }) + return found + }, + getAllMarks: function() { + var markers = [] + this.iter(function (line) { + var sps = line.markedSpans + if (sps) { for (var i = 0; i < sps.length; ++i) + { if (sps[i].from != null) { markers.push(sps[i].marker) } } } + }) + return markers + }, + + posFromIndex: function(off) { + var ch, lineNo = this.first, sepSize = this.lineSeparator().length + this.iter(function (line) { + var sz = line.text.length + sepSize + if (sz > off) { ch = off; return true } + off -= sz + ++lineNo + }) + return clipPos(this, Pos(lineNo, ch)) + }, + indexFromPos: function (coords) { + coords = clipPos(this, coords) + var index = coords.ch + if (coords.line < this.first || coords.ch < 0) { return 0 } + var sepSize = this.lineSeparator().length + this.iter(this.first, coords.line, function (line) { // iter aborts when callback returns a truthy value + index += line.text.length + sepSize + }) + return index + }, + + copy: function(copyHistory) { + var doc = new Doc(getLines(this, this.first, this.first + this.size), + this.modeOption, this.first, this.lineSep) + doc.scrollTop = this.scrollTop; doc.scrollLeft = this.scrollLeft + doc.sel = this.sel + doc.extend = false + if (copyHistory) { + doc.history.undoDepth = this.history.undoDepth + doc.setHistory(this.getHistory()) + } + return doc + }, + + linkedDoc: function(options) { + if (!options) { options = {} } + var from = this.first, to = this.first + this.size + if (options.from != null && options.from > from) { from = options.from } + if (options.to != null && options.to < to) { to = options.to } + var copy = new Doc(getLines(this, from, to), options.mode || this.modeOption, from, this.lineSep) + if (options.sharedHist) { copy.history = this.history + ; }(this.linked || (this.linked = [])).push({doc: copy, sharedHist: options.sharedHist}) + copy.linked = [{doc: this, isParent: true, sharedHist: options.sharedHist}] + copySharedMarkers(copy, findSharedMarkers(this)) + return copy + }, + unlinkDoc: function(other) { + var this$1 = this; + + if (other instanceof CodeMirror) { other = other.doc } + if (this.linked) { for (var i = 0; i < this.linked.length; ++i) { + var link = this$1.linked[i] + if (link.doc != other) { continue } + this$1.linked.splice(i, 1) + other.unlinkDoc(this$1) + 
detachSharedMarkers(findSharedMarkers(this$1)) + break + } } + // If the histories were shared, split them again + if (other.history == this.history) { + var splitIds = [other.id] + linkedDocs(other, function (doc) { return splitIds.push(doc.id); }, true) + other.history = new History(null) + other.history.done = copyHistoryArray(this.history.done, splitIds) + other.history.undone = copyHistoryArray(this.history.undone, splitIds) + } + }, + iterLinkedDocs: function(f) {linkedDocs(this, f)}, + + getMode: function() {return this.mode}, + getEditor: function() {return this.cm}, + + splitLines: function(str) { + if (this.lineSep) { return str.split(this.lineSep) } + return splitLinesAuto(str) + }, + lineSeparator: function() { return this.lineSep || "\n" } +}) + +// Public alias. +Doc.prototype.eachLine = Doc.prototype.iter + +// Kludge to work around strange IE behavior where it'll sometimes +// re-fire a series of drag-related events right after the drop (#1551) +var lastDrop = 0 + +function onDrop(e) { + var cm = this + clearDragCursor(cm) + if (signalDOMEvent(cm, e) || eventInWidget(cm.display, e)) + { return } + e_preventDefault(e) + if (ie) { lastDrop = +new Date } + var pos = posFromMouse(cm, e, true), files = e.dataTransfer.files + if (!pos || cm.isReadOnly()) { return } + // Might be a file drop, in which case we simply extract the text + // and insert it. + if (files && files.length && window.FileReader && window.File) { + var n = files.length, text = Array(n), read = 0 + var loadFile = function (file, i) { + if (cm.options.allowDropFileTypes && + indexOf(cm.options.allowDropFileTypes, file.type) == -1) + { return } + + var reader = new FileReader + reader.onload = operation(cm, function () { + var content = reader.result + if (/[\x00-\x08\x0e-\x1f]{2}/.test(content)) { content = "" } + text[i] = content + if (++read == n) { + pos = clipPos(cm.doc, pos) + var change = {from: pos, to: pos, + text: cm.doc.splitLines(text.join(cm.doc.lineSeparator())), + origin: "paste"} + makeChange(cm.doc, change) + setSelectionReplaceHistory(cm.doc, simpleSelection(pos, changeEnd(change))) + } + }) + reader.readAsText(file) + } + for (var i = 0; i < n; ++i) { loadFile(files[i], i) } + } else { // Normal drop + // Don't do a replace if the drop happened inside of the selected text. + if (cm.state.draggingText && cm.doc.sel.contains(pos) > -1) { + cm.state.draggingText(e) + // Ensure the editor is re-focused + setTimeout(function () { return cm.display.input.focus(); }, 20) + return + } + try { + var text$1 = e.dataTransfer.getData("Text") + if (text$1) { + var selected + if (cm.state.draggingText && !cm.state.draggingText.copy) + { selected = cm.listSelections() } + setSelectionNoUndo(cm.doc, simpleSelection(pos, pos)) + if (selected) { for (var i$1 = 0; i$1 < selected.length; ++i$1) + { replaceRange(cm.doc, "", selected[i$1].anchor, selected[i$1].head, "drag") } } + cm.replaceSelection(text$1, "around", "paste") + cm.display.input.focus() + } + } + catch(e){} + } +} + +function onDragStart(cm, e) { + if (ie && (!cm.state.draggingText || +new Date - lastDrop < 100)) { e_stop(e); return } + if (signalDOMEvent(cm, e) || eventInWidget(cm.display, e)) { return } + + e.dataTransfer.setData("Text", cm.getSelection()) + e.dataTransfer.effectAllowed = "copyMove" + + // Use dummy image instead of default browsers image. + // Recent Safari (~6.0.2) have a tendency to segfault when this happens, so we don't do it there. 
+ if (e.dataTransfer.setDragImage && !safari) { + var img = elt("img", null, null, "position: fixed; left: 0; top: 0;") + img.src = "data:image/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==" + if (presto) { + img.width = img.height = 1 + cm.display.wrapper.appendChild(img) + // Force a relayout, or Opera won't use our image for some obscure reason + img._top = img.offsetTop + } + e.dataTransfer.setDragImage(img, 0, 0) + if (presto) { img.parentNode.removeChild(img) } + } +} + +function onDragOver(cm, e) { + var pos = posFromMouse(cm, e) + if (!pos) { return } + var frag = document.createDocumentFragment() + drawSelectionCursor(cm, pos, frag) + if (!cm.display.dragCursor) { + cm.display.dragCursor = elt("div", null, "CodeMirror-cursors CodeMirror-dragcursors") + cm.display.lineSpace.insertBefore(cm.display.dragCursor, cm.display.cursorDiv) + } + removeChildrenAndAdd(cm.display.dragCursor, frag) +} + +function clearDragCursor(cm) { + if (cm.display.dragCursor) { + cm.display.lineSpace.removeChild(cm.display.dragCursor) + cm.display.dragCursor = null + } +} + +// These must be handled carefully, because naively registering a +// handler for each editor will cause the editors to never be +// garbage collected. + +function forEachCodeMirror(f) { + if (!document.body.getElementsByClassName) { return } + var byClass = document.body.getElementsByClassName("CodeMirror") + for (var i = 0; i < byClass.length; i++) { + var cm = byClass[i].CodeMirror + if (cm) { f(cm) } + } +} + +var globalsRegistered = false +function ensureGlobalHandlers() { + if (globalsRegistered) { return } + registerGlobalHandlers() + globalsRegistered = true +} +function registerGlobalHandlers() { + // When the window resizes, we need to refresh active editors. + var resizeTimer + on(window, "resize", function () { + if (resizeTimer == null) { resizeTimer = setTimeout(function () { + resizeTimer = null + forEachCodeMirror(onResize) + }, 100) } + }) + // When the window loses focus, we want to show the editor as blurred + on(window, "blur", function () { return forEachCodeMirror(onBlur); }) +} +// Called when the window resizes +function onResize(cm) { + var d = cm.display + if (d.lastWrapHeight == d.wrapper.clientHeight && d.lastWrapWidth == d.wrapper.clientWidth) + { return } + // Might be a text scaling operation, clear size caches. 
+ d.cachedCharWidth = d.cachedTextHeight = d.cachedPaddingH = null + d.scrollbarsClipped = false + cm.setSize() +} + +var keyNames = { + 3: "Enter", 8: "Backspace", 9: "Tab", 13: "Enter", 16: "Shift", 17: "Ctrl", 18: "Alt", + 19: "Pause", 20: "CapsLock", 27: "Esc", 32: "Space", 33: "PageUp", 34: "PageDown", 35: "End", + 36: "Home", 37: "Left", 38: "Up", 39: "Right", 40: "Down", 44: "PrintScrn", 45: "Insert", + 46: "Delete", 59: ";", 61: "=", 91: "Mod", 92: "Mod", 93: "Mod", + 106: "*", 107: "=", 109: "-", 110: ".", 111: "/", 127: "Delete", + 173: "-", 186: ";", 187: "=", 188: ",", 189: "-", 190: ".", 191: "/", 192: "`", 219: "[", 220: "\\", + 221: "]", 222: "'", 63232: "Up", 63233: "Down", 63234: "Left", 63235: "Right", 63272: "Delete", + 63273: "Home", 63275: "End", 63276: "PageUp", 63277: "PageDown", 63302: "Insert" +} + +// Number keys +for (var i = 0; i < 10; i++) { keyNames[i + 48] = keyNames[i + 96] = String(i) } +// Alphabetic keys +for (var i$1 = 65; i$1 <= 90; i$1++) { keyNames[i$1] = String.fromCharCode(i$1) } +// Function keys +for (var i$2 = 1; i$2 <= 12; i$2++) { keyNames[i$2 + 111] = keyNames[i$2 + 63235] = "F" + i$2 } + +var keyMap = {} + +keyMap.basic = { + "Left": "goCharLeft", "Right": "goCharRight", "Up": "goLineUp", "Down": "goLineDown", + "End": "goLineEnd", "Home": "goLineStartSmart", "PageUp": "goPageUp", "PageDown": "goPageDown", + "Delete": "delCharAfter", "Backspace": "delCharBefore", "Shift-Backspace": "delCharBefore", + "Tab": "defaultTab", "Shift-Tab": "indentAuto", + "Enter": "newlineAndIndent", "Insert": "toggleOverwrite", + "Esc": "singleSelection" +} +// Note that the save and find-related commands aren't defined by +// default. User code or addons can define them. Unknown commands +// are simply ignored. +keyMap.pcDefault = { + "Ctrl-A": "selectAll", "Ctrl-D": "deleteLine", "Ctrl-Z": "undo", "Shift-Ctrl-Z": "redo", "Ctrl-Y": "redo", + "Ctrl-Home": "goDocStart", "Ctrl-End": "goDocEnd", "Ctrl-Up": "goLineUp", "Ctrl-Down": "goLineDown", + "Ctrl-Left": "goGroupLeft", "Ctrl-Right": "goGroupRight", "Alt-Left": "goLineStart", "Alt-Right": "goLineEnd", + "Ctrl-Backspace": "delGroupBefore", "Ctrl-Delete": "delGroupAfter", "Ctrl-S": "save", "Ctrl-F": "find", + "Ctrl-G": "findNext", "Shift-Ctrl-G": "findPrev", "Shift-Ctrl-F": "replace", "Shift-Ctrl-R": "replaceAll", + "Ctrl-[": "indentLess", "Ctrl-]": "indentMore", + "Ctrl-U": "undoSelection", "Shift-Ctrl-U": "redoSelection", "Alt-U": "redoSelection", + fallthrough: "basic" +} +// Very basic readline/emacs-style bindings, which are standard on Mac. 
+keyMap.emacsy = { + "Ctrl-F": "goCharRight", "Ctrl-B": "goCharLeft", "Ctrl-P": "goLineUp", "Ctrl-N": "goLineDown", + "Alt-F": "goWordRight", "Alt-B": "goWordLeft", "Ctrl-A": "goLineStart", "Ctrl-E": "goLineEnd", + "Ctrl-V": "goPageDown", "Shift-Ctrl-V": "goPageUp", "Ctrl-D": "delCharAfter", "Ctrl-H": "delCharBefore", + "Alt-D": "delWordAfter", "Alt-Backspace": "delWordBefore", "Ctrl-K": "killLine", "Ctrl-T": "transposeChars", + "Ctrl-O": "openLine" +} +keyMap.macDefault = { + "Cmd-A": "selectAll", "Cmd-D": "deleteLine", "Cmd-Z": "undo", "Shift-Cmd-Z": "redo", "Cmd-Y": "redo", + "Cmd-Home": "goDocStart", "Cmd-Up": "goDocStart", "Cmd-End": "goDocEnd", "Cmd-Down": "goDocEnd", "Alt-Left": "goGroupLeft", + "Alt-Right": "goGroupRight", "Cmd-Left": "goLineLeft", "Cmd-Right": "goLineRight", "Alt-Backspace": "delGroupBefore", + "Ctrl-Alt-Backspace": "delGroupAfter", "Alt-Delete": "delGroupAfter", "Cmd-S": "save", "Cmd-F": "find", + "Cmd-G": "findNext", "Shift-Cmd-G": "findPrev", "Cmd-Alt-F": "replace", "Shift-Cmd-Alt-F": "replaceAll", + "Cmd-[": "indentLess", "Cmd-]": "indentMore", "Cmd-Backspace": "delWrappedLineLeft", "Cmd-Delete": "delWrappedLineRight", + "Cmd-U": "undoSelection", "Shift-Cmd-U": "redoSelection", "Ctrl-Up": "goDocStart", "Ctrl-Down": "goDocEnd", + fallthrough: ["basic", "emacsy"] +} +keyMap["default"] = mac ? keyMap.macDefault : keyMap.pcDefault + +// KEYMAP DISPATCH + +function normalizeKeyName(name) { + var parts = name.split(/-(?!$)/) + name = parts[parts.length - 1] + var alt, ctrl, shift, cmd + for (var i = 0; i < parts.length - 1; i++) { + var mod = parts[i] + if (/^(cmd|meta|m)$/i.test(mod)) { cmd = true } + else if (/^a(lt)?$/i.test(mod)) { alt = true } + else if (/^(c|ctrl|control)$/i.test(mod)) { ctrl = true } + else if (/^s(hift)?$/i.test(mod)) { shift = true } + else { throw new Error("Unrecognized modifier name: " + mod) } + } + if (alt) { name = "Alt-" + name } + if (ctrl) { name = "Ctrl-" + name } + if (cmd) { name = "Cmd-" + name } + if (shift) { name = "Shift-" + name } + return name +} + +// This is a kludge to keep keymaps mostly working as raw objects +// (backwards compatibility) while at the same time support features +// like normalization and multi-stroke key bindings. It compiles a +// new normalized keymap, and then updates the old object to reflect +// this. +function normalizeKeyMap(keymap) { + var copy = {} + for (var keyname in keymap) { if (keymap.hasOwnProperty(keyname)) { + var value = keymap[keyname] + if (/^(name|fallthrough|(de|at)tach)$/.test(keyname)) { continue } + if (value == "...") { delete keymap[keyname]; continue } + + var keys = map(keyname.split(" "), normalizeKeyName) + for (var i = 0; i < keys.length; i++) { + var val = (void 0), name = (void 0) + if (i == keys.length - 1) { + name = keys.join(" ") + val = value + } else { + name = keys.slice(0, i + 1).join(" ") + val = "..." + } + var prev = copy[name] + if (!prev) { copy[name] = val } + else if (prev != val) { throw new Error("Inconsistent bindings for " + name) } + } + delete keymap[keyname] + } } + for (var prop in copy) { keymap[prop] = copy[prop] } + return keymap +} + +function lookupKey(key, map, handle, context) { + map = getKeyMap(map) + var found = map.call ? 
map.call(key, context) : map[key] + if (found === false) { return "nothing" } + if (found === "...") { return "multi" } + if (found != null && handle(found)) { return "handled" } + + if (map.fallthrough) { + if (Object.prototype.toString.call(map.fallthrough) != "[object Array]") + { return lookupKey(key, map.fallthrough, handle, context) } + for (var i = 0; i < map.fallthrough.length; i++) { + var result = lookupKey(key, map.fallthrough[i], handle, context) + if (result) { return result } + } + } +} + +// Modifier key presses don't count as 'real' key presses for the +// purpose of keymap fallthrough. +function isModifierKey(value) { + var name = typeof value == "string" ? value : keyNames[value.keyCode] + return name == "Ctrl" || name == "Alt" || name == "Shift" || name == "Mod" +} + +// Look up the name of a key as indicated by an event object. +function keyName(event, noShift) { + if (presto && event.keyCode == 34 && event["char"]) { return false } + var base = keyNames[event.keyCode], name = base + if (name == null || event.altGraphKey) { return false } + if (event.altKey && base != "Alt") { name = "Alt-" + name } + if ((flipCtrlCmd ? event.metaKey : event.ctrlKey) && base != "Ctrl") { name = "Ctrl-" + name } + if ((flipCtrlCmd ? event.ctrlKey : event.metaKey) && base != "Cmd") { name = "Cmd-" + name } + if (!noShift && event.shiftKey && base != "Shift") { name = "Shift-" + name } + return name +} + +function getKeyMap(val) { + return typeof val == "string" ? keyMap[val] : val +} + +// Helper for deleting text near the selection(s), used to implement +// backspace, delete, and similar functionality. +function deleteNearSelection(cm, compute) { + var ranges = cm.doc.sel.ranges, kill = [] + // Build up a set of ranges to kill first, merging overlapping + // ranges. + for (var i = 0; i < ranges.length; i++) { + var toKill = compute(ranges[i]) + while (kill.length && cmp(toKill.from, lst(kill).to) <= 0) { + var replaced = kill.pop() + if (cmp(replaced.from, toKill.from) < 0) { + toKill.from = replaced.from + break + } + } + kill.push(toKill) + } + // Next, remove those actual ranges. + runInOp(cm, function () { + for (var i = kill.length - 1; i >= 0; i--) + { replaceRange(cm.doc, "", kill[i].from, kill[i].to, "+delete") } + ensureCursorVisible(cm) + }) +} + +// Commands are parameter-less actions that can be performed on an +// editor, mostly used for keybindings. 
+var commands = { + selectAll: selectAll, + singleSelection: function (cm) { return cm.setSelection(cm.getCursor("anchor"), cm.getCursor("head"), sel_dontScroll); }, + killLine: function (cm) { return deleteNearSelection(cm, function (range) { + if (range.empty()) { + var len = getLine(cm.doc, range.head.line).text.length + if (range.head.ch == len && range.head.line < cm.lastLine()) + { return {from: range.head, to: Pos(range.head.line + 1, 0)} } + else + { return {from: range.head, to: Pos(range.head.line, len)} } + } else { + return {from: range.from(), to: range.to()} + } + }); }, + deleteLine: function (cm) { return deleteNearSelection(cm, function (range) { return ({ + from: Pos(range.from().line, 0), + to: clipPos(cm.doc, Pos(range.to().line + 1, 0)) + }); }); }, + delLineLeft: function (cm) { return deleteNearSelection(cm, function (range) { return ({ + from: Pos(range.from().line, 0), to: range.from() + }); }); }, + delWrappedLineLeft: function (cm) { return deleteNearSelection(cm, function (range) { + var top = cm.charCoords(range.head, "div").top + 5 + var leftPos = cm.coordsChar({left: 0, top: top}, "div") + return {from: leftPos, to: range.from()} + }); }, + delWrappedLineRight: function (cm) { return deleteNearSelection(cm, function (range) { + var top = cm.charCoords(range.head, "div").top + 5 + var rightPos = cm.coordsChar({left: cm.display.lineDiv.offsetWidth + 100, top: top}, "div") + return {from: range.from(), to: rightPos } + }); }, + undo: function (cm) { return cm.undo(); }, + redo: function (cm) { return cm.redo(); }, + undoSelection: function (cm) { return cm.undoSelection(); }, + redoSelection: function (cm) { return cm.redoSelection(); }, + goDocStart: function (cm) { return cm.extendSelection(Pos(cm.firstLine(), 0)); }, + goDocEnd: function (cm) { return cm.extendSelection(Pos(cm.lastLine())); }, + goLineStart: function (cm) { return cm.extendSelectionsBy(function (range) { return lineStart(cm, range.head.line); }, + {origin: "+move", bias: 1} + ); }, + goLineStartSmart: function (cm) { return cm.extendSelectionsBy(function (range) { return lineStartSmart(cm, range.head); }, + {origin: "+move", bias: 1} + ); }, + goLineEnd: function (cm) { return cm.extendSelectionsBy(function (range) { return lineEnd(cm, range.head.line); }, + {origin: "+move", bias: -1} + ); }, + goLineRight: function (cm) { return cm.extendSelectionsBy(function (range) { + var top = cm.charCoords(range.head, "div").top + 5 + return cm.coordsChar({left: cm.display.lineDiv.offsetWidth + 100, top: top}, "div") + }, sel_move); }, + goLineLeft: function (cm) { return cm.extendSelectionsBy(function (range) { + var top = cm.charCoords(range.head, "div").top + 5 + return cm.coordsChar({left: 0, top: top}, "div") + }, sel_move); }, + goLineLeftSmart: function (cm) { return cm.extendSelectionsBy(function (range) { + var top = cm.charCoords(range.head, "div").top + 5 + var pos = cm.coordsChar({left: 0, top: top}, "div") + if (pos.ch < cm.getLine(pos.line).search(/\S/)) { return lineStartSmart(cm, range.head) } + return pos + }, sel_move); }, + goLineUp: function (cm) { return cm.moveV(-1, "line"); }, + goLineDown: function (cm) { return cm.moveV(1, "line"); }, + goPageUp: function (cm) { return cm.moveV(-1, "page"); }, + goPageDown: function (cm) { return cm.moveV(1, "page"); }, + goCharLeft: function (cm) { return cm.moveH(-1, "char"); }, + goCharRight: function (cm) { return cm.moveH(1, "char"); }, + goColumnLeft: function (cm) { return cm.moveH(-1, "column"); }, + goColumnRight: function (cm) { 
return cm.moveH(1, "column"); }, + goWordLeft: function (cm) { return cm.moveH(-1, "word"); }, + goGroupRight: function (cm) { return cm.moveH(1, "group"); }, + goGroupLeft: function (cm) { return cm.moveH(-1, "group"); }, + goWordRight: function (cm) { return cm.moveH(1, "word"); }, + delCharBefore: function (cm) { return cm.deleteH(-1, "char"); }, + delCharAfter: function (cm) { return cm.deleteH(1, "char"); }, + delWordBefore: function (cm) { return cm.deleteH(-1, "word"); }, + delWordAfter: function (cm) { return cm.deleteH(1, "word"); }, + delGroupBefore: function (cm) { return cm.deleteH(-1, "group"); }, + delGroupAfter: function (cm) { return cm.deleteH(1, "group"); }, + indentAuto: function (cm) { return cm.indentSelection("smart"); }, + indentMore: function (cm) { return cm.indentSelection("add"); }, + indentLess: function (cm) { return cm.indentSelection("subtract"); }, + insertTab: function (cm) { return cm.replaceSelection("\t"); }, + insertSoftTab: function (cm) { + var spaces = [], ranges = cm.listSelections(), tabSize = cm.options.tabSize + for (var i = 0; i < ranges.length; i++) { + var pos = ranges[i].from() + var col = countColumn(cm.getLine(pos.line), pos.ch, tabSize) + spaces.push(spaceStr(tabSize - col % tabSize)) + } + cm.replaceSelections(spaces) + }, + defaultTab: function (cm) { + if (cm.somethingSelected()) { cm.indentSelection("add") } + else { cm.execCommand("insertTab") } + }, + // Swap the two chars left and right of each selection's head. + // Move cursor behind the two swapped characters afterwards. + // + // Doesn't consider line feeds a character. + // Doesn't scan more than one line above to find a character. + // Doesn't do anything on an empty line. + // Doesn't do anything with non-empty selections. + transposeChars: function (cm) { return runInOp(cm, function () { + var ranges = cm.listSelections(), newSel = [] + for (var i = 0; i < ranges.length; i++) { + if (!ranges[i].empty()) { continue } + var cur = ranges[i].head, line = getLine(cm.doc, cur.line).text + if (line) { + if (cur.ch == line.length) { cur = new Pos(cur.line, cur.ch - 1) } + if (cur.ch > 0) { + cur = new Pos(cur.line, cur.ch + 1) + cm.replaceRange(line.charAt(cur.ch - 1) + line.charAt(cur.ch - 2), + Pos(cur.line, cur.ch - 2), cur, "+transpose") + } else if (cur.line > cm.doc.first) { + var prev = getLine(cm.doc, cur.line - 1).text + if (prev) { + cur = new Pos(cur.line, 1) + cm.replaceRange(line.charAt(0) + cm.doc.lineSeparator() + + prev.charAt(prev.length - 1), + Pos(cur.line - 1, prev.length - 1), cur, "+transpose") + } + } + } + newSel.push(new Range(cur, cur)) + } + cm.setSelections(newSel) + }); }, + newlineAndIndent: function (cm) { return runInOp(cm, function () { + var sels = cm.listSelections() + for (var i = sels.length - 1; i >= 0; i--) + { cm.replaceRange(cm.doc.lineSeparator(), sels[i].anchor, sels[i].head, "+input") } + sels = cm.listSelections() + for (var i$1 = 0; i$1 < sels.length; i$1++) + { cm.indentLine(sels[i$1].from().line, null, true) } + ensureCursorVisible(cm) + }); }, + openLine: function (cm) { return cm.replaceSelection("\n", "start"); }, + toggleOverwrite: function (cm) { return cm.toggleOverwrite(); } +} + + +function lineStart(cm, lineN) { + var line = getLine(cm.doc, lineN) + var visual = visualLine(line) + if (visual != line) { lineN = lineNo(visual) } + return endOfLine(true, cm, visual, lineN, 1) +} +function lineEnd(cm, lineN) { + var line = getLine(cm.doc, lineN) + var visual = visualLineEnd(line) + if (visual != line) { lineN = lineNo(visual) } + 
return endOfLine(true, cm, line, lineN, -1) +} +function lineStartSmart(cm, pos) { + var start = lineStart(cm, pos.line) + var line = getLine(cm.doc, start.line) + var order = getOrder(line) + if (!order || order[0].level == 0) { + var firstNonWS = Math.max(0, line.text.search(/\S/)) + var inWS = pos.line == start.line && pos.ch <= firstNonWS && pos.ch + return Pos(start.line, inWS ? 0 : firstNonWS, start.sticky) + } + return start +} + +// Run a handler that was bound to a key. +function doHandleBinding(cm, bound, dropShift) { + if (typeof bound == "string") { + bound = commands[bound] + if (!bound) { return false } + } + // Ensure previous input has been read, so that the handler sees a + // consistent view of the document + cm.display.input.ensurePolled() + var prevShift = cm.display.shift, done = false + try { + if (cm.isReadOnly()) { cm.state.suppressEdits = true } + if (dropShift) { cm.display.shift = false } + done = bound(cm) != Pass + } finally { + cm.display.shift = prevShift + cm.state.suppressEdits = false + } + return done +} + +function lookupKeyForEditor(cm, name, handle) { + for (var i = 0; i < cm.state.keyMaps.length; i++) { + var result = lookupKey(name, cm.state.keyMaps[i], handle, cm) + if (result) { return result } + } + return (cm.options.extraKeys && lookupKey(name, cm.options.extraKeys, handle, cm)) + || lookupKey(name, cm.options.keyMap, handle, cm) +} + +var stopSeq = new Delayed +function dispatchKey(cm, name, e, handle) { + var seq = cm.state.keySeq + if (seq) { + if (isModifierKey(name)) { return "handled" } + stopSeq.set(50, function () { + if (cm.state.keySeq == seq) { + cm.state.keySeq = null + cm.display.input.reset() + } + }) + name = seq + " " + name + } + var result = lookupKeyForEditor(cm, name, handle) + + if (result == "multi") + { cm.state.keySeq = name } + if (result == "handled") + { signalLater(cm, "keyHandled", cm, name, e) } + + if (result == "handled" || result == "multi") { + e_preventDefault(e) + restartBlink(cm) + } + + if (seq && !result && /\'$/.test(name)) { + e_preventDefault(e) + return true + } + return !!result +} + +// Handle a key from the keydown event. +function handleKeyBinding(cm, e) { + var name = keyName(e, true) + if (!name) { return false } + + if (e.shiftKey && !cm.state.keySeq) { + // First try to resolve full name (including 'Shift-'). Failing + // that, see if there is a cursor-motion command (starting with + // 'go') bound to the keyname without 'Shift-'. + return dispatchKey(cm, "Shift-" + name, e, function (b) { return doHandleBinding(cm, b, true); }) + || dispatchKey(cm, name, e, function (b) { + if (typeof b == "string" ? /^go[A-Z]/.test(b) : b.motion) + { return doHandleBinding(cm, b) } + }) + } else { + return dispatchKey(cm, name, e, function (b) { return doHandleBinding(cm, b); }) + } +} + +// Handle a key from the keypress event +function handleCharBinding(cm, e, ch) { + return dispatchKey(cm, "'" + ch + "'", e, function (b) { return doHandleBinding(cm, b, true); }) +} + +var lastStoppedKey = null +function onKeyDown(e) { + var cm = this + cm.curOp.focus = activeElt() + if (signalDOMEvent(cm, e)) { return } + // IE does strange things with escape. + if (ie && ie_version < 11 && e.keyCode == 27) { e.returnValue = false } + var code = e.keyCode + cm.display.shift = code == 16 || e.shiftKey + var handled = handleKeyBinding(cm, e) + if (presto) { + lastStoppedKey = handled ? code : null + // Opera has no cut event... we try to at least catch the key combo + if (!handled && code == 88 && !hasCopyEvent && (mac ? 
e.metaKey : e.ctrlKey)) + { cm.replaceSelection("", null, "cut") } + } + + // Turn mouse into crosshair when Alt is held on Mac. + if (code == 18 && !/\bCodeMirror-crosshair\b/.test(cm.display.lineDiv.className)) + { showCrossHair(cm) } +} + +function showCrossHair(cm) { + var lineDiv = cm.display.lineDiv + addClass(lineDiv, "CodeMirror-crosshair") + + function up(e) { + if (e.keyCode == 18 || !e.altKey) { + rmClass(lineDiv, "CodeMirror-crosshair") + off(document, "keyup", up) + off(document, "mouseover", up) + } + } + on(document, "keyup", up) + on(document, "mouseover", up) +} + +function onKeyUp(e) { + if (e.keyCode == 16) { this.doc.sel.shift = false } + signalDOMEvent(this, e) +} + +function onKeyPress(e) { + var cm = this + if (eventInWidget(cm.display, e) || signalDOMEvent(cm, e) || e.ctrlKey && !e.altKey || mac && e.metaKey) { return } + var keyCode = e.keyCode, charCode = e.charCode + if (presto && keyCode == lastStoppedKey) {lastStoppedKey = null; e_preventDefault(e); return} + if ((presto && (!e.which || e.which < 10)) && handleKeyBinding(cm, e)) { return } + var ch = String.fromCharCode(charCode == null ? keyCode : charCode) + // Some browsers fire keypress events for backspace + if (ch == "\x08") { return } + if (handleCharBinding(cm, e, ch)) { return } + cm.display.input.onKeyPress(e) +} + +// A mouse down can be a single click, double click, triple click, +// start of selection drag, start of text drag, new cursor +// (ctrl-click), rectangle drag (alt-drag), or xwin +// middle-click-paste. Or it might be a click on something we should +// not interfere with, such as a scrollbar or widget. +function onMouseDown(e) { + var cm = this, display = cm.display + if (signalDOMEvent(cm, e) || display.activeTouch && display.input.supportsTouch()) { return } + display.input.ensurePolled() + display.shift = e.shiftKey + + if (eventInWidget(display, e)) { + if (!webkit) { + // Briefly turn off draggability, to allow widgets to do + // normal dragging things. + display.scroller.draggable = false + setTimeout(function () { return display.scroller.draggable = true; }, 100) + } + return + } + if (clickInGutter(cm, e)) { return } + var start = posFromMouse(cm, e) + window.focus() + + switch (e_button(e)) { + case 1: + // #3261: make sure, that we're not starting a second selection + if (cm.state.selectingText) + { cm.state.selectingText(e) } + else if (start) + { leftButtonDown(cm, e, start) } + else if (e_target(e) == display.scroller) + { e_preventDefault(e) } + break + case 2: + if (webkit) { cm.state.lastMiddleDown = +new Date } + if (start) { extendSelection(cm.doc, start) } + setTimeout(function () { return display.input.focus(); }, 20) + e_preventDefault(e) + break + case 3: + if (captureRightClick) { onContextMenu(cm, e) } + else { delayBlurEvent(cm) } + break + } +} + +var lastClick; +var lastDoubleClick; +function leftButtonDown(cm, e, start) { + if (ie) { setTimeout(bind(ensureFocus, cm), 0) } + else { cm.curOp.focus = activeElt() } + + var now = +new Date, type + if (lastDoubleClick && lastDoubleClick.time > now - 400 && cmp(lastDoubleClick.pos, start) == 0) { + type = "triple" + } else if (lastClick && lastClick.time > now - 400 && cmp(lastClick.pos, start) == 0) { + type = "double" + lastDoubleClick = {time: now, pos: start} + } else { + type = "single" + lastClick = {time: now, pos: start} + } + + var sel = cm.doc.sel, modifier = mac ? 
e.metaKey : e.ctrlKey, contained + if (cm.options.dragDrop && dragAndDrop && !cm.isReadOnly() && + type == "single" && (contained = sel.contains(start)) > -1 && + (cmp((contained = sel.ranges[contained]).from(), start) < 0 || start.xRel > 0) && + (cmp(contained.to(), start) > 0 || start.xRel < 0)) + { leftButtonStartDrag(cm, e, start, modifier) } + else + { leftButtonSelect(cm, e, start, type, modifier) } +} + +// Start a text drag. When it ends, see if any dragging actually +// happen, and treat as a click if it didn't. +function leftButtonStartDrag(cm, e, start, modifier) { + var display = cm.display, startTime = +new Date + var dragEnd = operation(cm, function (e2) { + if (webkit) { display.scroller.draggable = false } + cm.state.draggingText = false + off(document, "mouseup", dragEnd) + off(display.scroller, "drop", dragEnd) + if (Math.abs(e.clientX - e2.clientX) + Math.abs(e.clientY - e2.clientY) < 10) { + e_preventDefault(e2) + if (!modifier && +new Date - 200 < startTime) + { extendSelection(cm.doc, start) } + // Work around unexplainable focus problem in IE9 (#2127) and Chrome (#3081) + if (webkit || ie && ie_version == 9) + { setTimeout(function () {document.body.focus(); display.input.focus()}, 20) } + else + { display.input.focus() } + } + }) + // Let the drag handler handle this. + if (webkit) { display.scroller.draggable = true } + cm.state.draggingText = dragEnd + dragEnd.copy = mac ? e.altKey : e.ctrlKey + // IE's approach to draggable + if (display.scroller.dragDrop) { display.scroller.dragDrop() } + on(document, "mouseup", dragEnd) + on(display.scroller, "drop", dragEnd) +} + +// Normal selection, as opposed to text dragging. +function leftButtonSelect(cm, e, start, type, addNew) { + var display = cm.display, doc = cm.doc + e_preventDefault(e) + + var ourRange, ourIndex, startSel = doc.sel, ranges = startSel.ranges + if (addNew && !e.shiftKey) { + ourIndex = doc.sel.contains(start) + if (ourIndex > -1) + { ourRange = ranges[ourIndex] } + else + { ourRange = new Range(start, start) } + } else { + ourRange = doc.sel.primary() + ourIndex = doc.sel.primIndex + } + + if (chromeOS ? 
e.shiftKey && e.metaKey : e.altKey) { + type = "rect" + if (!addNew) { ourRange = new Range(start, start) } + start = posFromMouse(cm, e, true, true) + ourIndex = -1 + } else if (type == "double") { + var word = cm.findWordAt(start) + if (cm.display.shift || doc.extend) + { ourRange = extendRange(doc, ourRange, word.anchor, word.head) } + else + { ourRange = word } + } else if (type == "triple") { + var line = new Range(Pos(start.line, 0), clipPos(doc, Pos(start.line + 1, 0))) + if (cm.display.shift || doc.extend) + { ourRange = extendRange(doc, ourRange, line.anchor, line.head) } + else + { ourRange = line } + } else { + ourRange = extendRange(doc, ourRange, start) + } + + if (!addNew) { + ourIndex = 0 + setSelection(doc, new Selection([ourRange], 0), sel_mouse) + startSel = doc.sel + } else if (ourIndex == -1) { + ourIndex = ranges.length + setSelection(doc, normalizeSelection(ranges.concat([ourRange]), ourIndex), + {scroll: false, origin: "*mouse"}) + } else if (ranges.length > 1 && ranges[ourIndex].empty() && type == "single" && !e.shiftKey) { + setSelection(doc, normalizeSelection(ranges.slice(0, ourIndex).concat(ranges.slice(ourIndex + 1)), 0), + {scroll: false, origin: "*mouse"}) + startSel = doc.sel + } else { + replaceOneSelection(doc, ourIndex, ourRange, sel_mouse) + } + + var lastPos = start + function extendTo(pos) { + if (cmp(lastPos, pos) == 0) { return } + lastPos = pos + + if (type == "rect") { + var ranges = [], tabSize = cm.options.tabSize + var startCol = countColumn(getLine(doc, start.line).text, start.ch, tabSize) + var posCol = countColumn(getLine(doc, pos.line).text, pos.ch, tabSize) + var left = Math.min(startCol, posCol), right = Math.max(startCol, posCol) + for (var line = Math.min(start.line, pos.line), end = Math.min(cm.lastLine(), Math.max(start.line, pos.line)); + line <= end; line++) { + var text = getLine(doc, line).text, leftPos = findColumn(text, left, tabSize) + if (left == right) + { ranges.push(new Range(Pos(line, leftPos), Pos(line, leftPos))) } + else if (text.length > leftPos) + { ranges.push(new Range(Pos(line, leftPos), Pos(line, findColumn(text, right, tabSize)))) } + } + if (!ranges.length) { ranges.push(new Range(start, start)) } + setSelection(doc, normalizeSelection(startSel.ranges.slice(0, ourIndex).concat(ranges), ourIndex), + {origin: "*mouse", scroll: false}) + cm.scrollIntoView(pos) + } else { + var oldRange = ourRange + var anchor = oldRange.anchor, head = pos + if (type != "single") { + var range + if (type == "double") + { range = cm.findWordAt(pos) } + else + { range = new Range(Pos(pos.line, 0), clipPos(doc, Pos(pos.line + 1, 0))) } + if (cmp(range.anchor, anchor) > 0) { + head = range.head + anchor = minPos(oldRange.from(), range.anchor) + } else { + head = range.anchor + anchor = maxPos(oldRange.to(), range.head) + } + } + var ranges$1 = startSel.ranges.slice(0) + ranges$1[ourIndex] = new Range(clipPos(doc, anchor), head) + setSelection(doc, normalizeSelection(ranges$1, ourIndex), sel_mouse) + } + } + + var editorSize = display.wrapper.getBoundingClientRect() + // Used to ensure timeout re-tries don't fire when another extend + // happened in the meantime (clearTimeout isn't reliable -- at + // least on Chrome, the timeouts still happen even when cleared, + // if the clear happens after their scheduled firing time). 
+ var counter = 0 + + function extend(e) { + var curCount = ++counter + var cur = posFromMouse(cm, e, true, type == "rect") + if (!cur) { return } + if (cmp(cur, lastPos) != 0) { + cm.curOp.focus = activeElt() + extendTo(cur) + var visible = visibleLines(display, doc) + if (cur.line >= visible.to || cur.line < visible.from) + { setTimeout(operation(cm, function () {if (counter == curCount) { extend(e) }}), 150) } + } else { + var outside = e.clientY < editorSize.top ? -20 : e.clientY > editorSize.bottom ? 20 : 0 + if (outside) { setTimeout(operation(cm, function () { + if (counter != curCount) { return } + display.scroller.scrollTop += outside + extend(e) + }), 50) } + } + } + + function done(e) { + cm.state.selectingText = false + counter = Infinity + e_preventDefault(e) + display.input.focus() + off(document, "mousemove", move) + off(document, "mouseup", up) + doc.history.lastSelOrigin = null + } + + var move = operation(cm, function (e) { + if (!e_button(e)) { done(e) } + else { extend(e) } + }) + var up = operation(cm, done) + cm.state.selectingText = up + on(document, "mousemove", move) + on(document, "mouseup", up) +} + + +// Determines whether an event happened in the gutter, and fires the +// handlers for the corresponding event. +function gutterEvent(cm, e, type, prevent) { + var mX, mY + try { mX = e.clientX; mY = e.clientY } + catch(e) { return false } + if (mX >= Math.floor(cm.display.gutters.getBoundingClientRect().right)) { return false } + if (prevent) { e_preventDefault(e) } + + var display = cm.display + var lineBox = display.lineDiv.getBoundingClientRect() + + if (mY > lineBox.bottom || !hasHandler(cm, type)) { return e_defaultPrevented(e) } + mY -= lineBox.top - display.viewOffset + + for (var i = 0; i < cm.options.gutters.length; ++i) { + var g = display.gutters.childNodes[i] + if (g && g.getBoundingClientRect().right >= mX) { + var line = lineAtHeight(cm.doc, mY) + var gutter = cm.options.gutters[i] + signal(cm, type, cm, line, gutter, e) + return e_defaultPrevented(e) + } + } +} + +function clickInGutter(cm, e) { + return gutterEvent(cm, e, "gutterClick", true) +} + +// CONTEXT MENU HANDLING + +// To make the context menu work, we need to briefly unhide the +// textarea (making it as unobtrusive as possible) to let the +// right-click take effect on it. +function onContextMenu(cm, e) { + if (eventInWidget(cm.display, e) || contextMenuInGutter(cm, e)) { return } + if (signalDOMEvent(cm, e, "contextmenu")) { return } + cm.display.input.onContextMenu(e) +} + +function contextMenuInGutter(cm, e) { + if (!hasHandler(cm, "gutterContextMenu")) { return false } + return gutterEvent(cm, e, "gutterContextMenu", false) +} + +function themeChanged(cm) { + cm.display.wrapper.className = cm.display.wrapper.className.replace(/\s*cm-s-\S+/g, "") + + cm.options.theme.replace(/(^|\s)\s*/g, " cm-s-") + clearCaches(cm) +} + +var Init = {toString: function(){return "CodeMirror.Init"}} + +var defaults = {} +var optionHandlers = {} + +function defineOptions(CodeMirror) { + var optionHandlers = CodeMirror.optionHandlers + + function option(name, deflt, handle, notOnInit) { + CodeMirror.defaults[name] = deflt + if (handle) { optionHandlers[name] = + notOnInit ? function (cm, val, old) {if (old != Init) { handle(cm, val, old) }} : handle } + } + + CodeMirror.defineOption = option + + // Passed to option handlers when there is no old value. 
+ CodeMirror.Init = Init + + // These two are, on init, called from the constructor because they + // have to be initialized before the editor can start at all. + option("value", "", function (cm, val) { return cm.setValue(val); }, true) + option("mode", null, function (cm, val) { + cm.doc.modeOption = val + loadMode(cm) + }, true) + + option("indentUnit", 2, loadMode, true) + option("indentWithTabs", false) + option("smartIndent", true) + option("tabSize", 4, function (cm) { + resetModeState(cm) + clearCaches(cm) + regChange(cm) + }, true) + option("lineSeparator", null, function (cm, val) { + cm.doc.lineSep = val + if (!val) { return } + var newBreaks = [], lineNo = cm.doc.first + cm.doc.iter(function (line) { + for (var pos = 0;;) { + var found = line.text.indexOf(val, pos) + if (found == -1) { break } + pos = found + val.length + newBreaks.push(Pos(lineNo, found)) + } + lineNo++ + }) + for (var i = newBreaks.length - 1; i >= 0; i--) + { replaceRange(cm.doc, val, newBreaks[i], Pos(newBreaks[i].line, newBreaks[i].ch + val.length)) } + }) + option("specialChars", /[\u0000-\u001f\u007f\u00ad\u061c\u200b-\u200f\u2028\u2029\ufeff]/g, function (cm, val, old) { + cm.state.specialChars = new RegExp(val.source + (val.test("\t") ? "" : "|\t"), "g") + if (old != Init) { cm.refresh() } + }) + option("specialCharPlaceholder", defaultSpecialCharPlaceholder, function (cm) { return cm.refresh(); }, true) + option("electricChars", true) + option("inputStyle", mobile ? "contenteditable" : "textarea", function () { + throw new Error("inputStyle can not (yet) be changed in a running editor") // FIXME + }, true) + option("spellcheck", false, function (cm, val) { return cm.getInputField().spellcheck = val; }, true) + option("rtlMoveVisually", !windows) + option("wholeLineUpdateBefore", true) + + option("theme", "default", function (cm) { + themeChanged(cm) + guttersChanged(cm) + }, true) + option("keyMap", "default", function (cm, val, old) { + var next = getKeyMap(val) + var prev = old != Init && getKeyMap(old) + if (prev && prev.detach) { prev.detach(cm, next) } + if (next.attach) { next.attach(cm, prev || null) } + }) + option("extraKeys", null) + + option("lineWrapping", false, wrappingChanged, true) + option("gutters", [], function (cm) { + setGuttersForLineNumbers(cm.options) + guttersChanged(cm) + }, true) + option("fixedGutter", true, function (cm, val) { + cm.display.gutters.style.left = val ? 
compensateForHScroll(cm.display) + "px" : "0" + cm.refresh() + }, true) + option("coverGutterNextToScrollbar", false, function (cm) { return updateScrollbars(cm); }, true) + option("scrollbarStyle", "native", function (cm) { + initScrollbars(cm) + updateScrollbars(cm) + cm.display.scrollbars.setScrollTop(cm.doc.scrollTop) + cm.display.scrollbars.setScrollLeft(cm.doc.scrollLeft) + }, true) + option("lineNumbers", false, function (cm) { + setGuttersForLineNumbers(cm.options) + guttersChanged(cm) + }, true) + option("firstLineNumber", 1, guttersChanged, true) + option("lineNumberFormatter", function (integer) { return integer; }, guttersChanged, true) + option("showCursorWhenSelecting", false, updateSelection, true) + + option("resetSelectionOnContextMenu", true) + option("lineWiseCopyCut", true) + + option("readOnly", false, function (cm, val) { + if (val == "nocursor") { + onBlur(cm) + cm.display.input.blur() + cm.display.disabled = true + } else { + cm.display.disabled = false + } + cm.display.input.readOnlyChanged(val) + }) + option("disableInput", false, function (cm, val) {if (!val) { cm.display.input.reset() }}, true) + option("dragDrop", true, dragDropChanged) + option("allowDropFileTypes", null) + + option("cursorBlinkRate", 530) + option("cursorScrollMargin", 0) + option("cursorHeight", 1, updateSelection, true) + option("singleCursorHeightPerLine", true, updateSelection, true) + option("workTime", 100) + option("workDelay", 100) + option("flattenSpans", true, resetModeState, true) + option("addModeClass", false, resetModeState, true) + option("pollInterval", 100) + option("undoDepth", 200, function (cm, val) { return cm.doc.history.undoDepth = val; }) + option("historyEventDelay", 1250) + option("viewportMargin", 10, function (cm) { return cm.refresh(); }, true) + option("maxHighlightLength", 10000, resetModeState, true) + option("moveInputWithCursor", true, function (cm, val) { + if (!val) { cm.display.input.resetPosition() } + }) + + option("tabindex", null, function (cm, val) { return cm.display.input.getField().tabIndex = val || ""; }) + option("autofocus", null) +} + +function guttersChanged(cm) { + updateGutters(cm) + regChange(cm) + alignHorizontally(cm) +} + +function dragDropChanged(cm, value, old) { + var wasOn = old && old != Init + if (!value != !wasOn) { + var funcs = cm.display.dragFunctions + var toggle = value ? on : off + toggle(cm.display.scroller, "dragstart", funcs.start) + toggle(cm.display.scroller, "dragenter", funcs.enter) + toggle(cm.display.scroller, "dragover", funcs.over) + toggle(cm.display.scroller, "dragleave", funcs.leave) + toggle(cm.display.scroller, "drop", funcs.drop) + } +} + +function wrappingChanged(cm) { + if (cm.options.lineWrapping) { + addClass(cm.display.wrapper, "CodeMirror-wrap") + cm.display.sizer.style.minWidth = "" + cm.display.sizerWidth = null + } else { + rmClass(cm.display.wrapper, "CodeMirror-wrap") + findMaxLine(cm) + } + estimateLineHeights(cm) + regChange(cm) + clearCaches(cm) + setTimeout(function () { return updateScrollbars(cm); }, 100) +} + +// A CodeMirror instance represents an editor. This is the object +// that user code is usually dealing with. + +function CodeMirror(place, options) { + var this$1 = this; + + if (!(this instanceof CodeMirror)) { return new CodeMirror(place, options) } + + this.options = options = options ? copyObj(options) : {} + // Determine effective options based on given values and defaults. 
+ copyObj(defaults, options, false) + setGuttersForLineNumbers(options) + + var doc = options.value + if (typeof doc == "string") { doc = new Doc(doc, options.mode, null, options.lineSeparator) } + this.doc = doc + + var input = new CodeMirror.inputStyles[options.inputStyle](this) + var display = this.display = new Display(place, doc, input) + display.wrapper.CodeMirror = this + updateGutters(this) + themeChanged(this) + if (options.lineWrapping) + { this.display.wrapper.className += " CodeMirror-wrap" } + initScrollbars(this) + + this.state = { + keyMaps: [], // stores maps added by addKeyMap + overlays: [], // highlighting overlays, as added by addOverlay + modeGen: 0, // bumped when mode/overlay changes, used to invalidate highlighting info + overwrite: false, + delayingBlurEvent: false, + focused: false, + suppressEdits: false, // used to disable editing during key handlers when in readOnly mode + pasteIncoming: false, cutIncoming: false, // help recognize paste/cut edits in input.poll + selectingText: false, + draggingText: false, + highlight: new Delayed(), // stores highlight worker timeout + keySeq: null, // Unfinished key sequence + specialChars: null + } + + if (options.autofocus && !mobile) { display.input.focus() } + + // Override magic textarea content restore that IE sometimes does + // on our hidden textarea on reload + if (ie && ie_version < 11) { setTimeout(function () { return this$1.display.input.reset(true); }, 20) } + + registerEventHandlers(this) + ensureGlobalHandlers() + + startOperation(this) + this.curOp.forceUpdate = true + attachDoc(this, doc) + + if ((options.autofocus && !mobile) || this.hasFocus()) + { setTimeout(bind(onFocus, this), 20) } + else + { onBlur(this) } + + for (var opt in optionHandlers) { if (optionHandlers.hasOwnProperty(opt)) + { optionHandlers[opt](this$1, options[opt], Init) } } + maybeUpdateLineNumberWidth(this) + if (options.finishInit) { options.finishInit(this) } + for (var i = 0; i < initHooks.length; ++i) { initHooks[i](this$1) } + endOperation(this) + // Suppress optimizelegibility in Webkit, since it breaks text + // measuring on line wrapping boundaries. + if (webkit && options.lineWrapping && + getComputedStyle(display.lineDiv).textRendering == "optimizelegibility") + { display.lineDiv.style.textRendering = "auto" } +} + +// The default configuration options. +CodeMirror.defaults = defaults +// Functions to run when options are changed. +CodeMirror.optionHandlers = optionHandlers + +// Attach the necessary event handlers when initializing the editor +function registerEventHandlers(cm) { + var d = cm.display + on(d.scroller, "mousedown", operation(cm, onMouseDown)) + // Older IE's will not fire a second mousedown for a double click + if (ie && ie_version < 11) + { on(d.scroller, "dblclick", operation(cm, function (e) { + if (signalDOMEvent(cm, e)) { return } + var pos = posFromMouse(cm, e) + if (!pos || clickInGutter(cm, e) || eventInWidget(cm.display, e)) { return } + e_preventDefault(e) + var word = cm.findWordAt(pos) + extendSelection(cm.doc, word.anchor, word.head) + })) } + else + { on(d.scroller, "dblclick", function (e) { return signalDOMEvent(cm, e) || e_preventDefault(e); }) } + // Some browsers fire contextmenu *after* opening the menu, at + // which point we can't mess with it anymore. Context menu is + // handled in onMouseDown for these browsers. 
+ if (!captureRightClick) { on(d.scroller, "contextmenu", function (e) { return onContextMenu(cm, e); }) } + + // Used to suppress mouse event handling when a touch happens + var touchFinished, prevTouch = {end: 0} + function finishTouch() { + if (d.activeTouch) { + touchFinished = setTimeout(function () { return d.activeTouch = null; }, 1000) + prevTouch = d.activeTouch + prevTouch.end = +new Date + } + } + function isMouseLikeTouchEvent(e) { + if (e.touches.length != 1) { return false } + var touch = e.touches[0] + return touch.radiusX <= 1 && touch.radiusY <= 1 + } + function farAway(touch, other) { + if (other.left == null) { return true } + var dx = other.left - touch.left, dy = other.top - touch.top + return dx * dx + dy * dy > 20 * 20 + } + on(d.scroller, "touchstart", function (e) { + if (!signalDOMEvent(cm, e) && !isMouseLikeTouchEvent(e)) { + d.input.ensurePolled() + clearTimeout(touchFinished) + var now = +new Date + d.activeTouch = {start: now, moved: false, + prev: now - prevTouch.end <= 300 ? prevTouch : null} + if (e.touches.length == 1) { + d.activeTouch.left = e.touches[0].pageX + d.activeTouch.top = e.touches[0].pageY + } + } + }) + on(d.scroller, "touchmove", function () { + if (d.activeTouch) { d.activeTouch.moved = true } + }) + on(d.scroller, "touchend", function (e) { + var touch = d.activeTouch + if (touch && !eventInWidget(d, e) && touch.left != null && + !touch.moved && new Date - touch.start < 300) { + var pos = cm.coordsChar(d.activeTouch, "page"), range + if (!touch.prev || farAway(touch, touch.prev)) // Single tap + { range = new Range(pos, pos) } + else if (!touch.prev.prev || farAway(touch, touch.prev.prev)) // Double tap + { range = cm.findWordAt(pos) } + else // Triple tap + { range = new Range(Pos(pos.line, 0), clipPos(cm.doc, Pos(pos.line + 1, 0))) } + cm.setSelection(range.anchor, range.head) + cm.focus() + e_preventDefault(e) + } + finishTouch() + }) + on(d.scroller, "touchcancel", finishTouch) + + // Sync scrolling between fake scrollbars and real scrollable + // area, ensure viewport is updated when scrolling. + on(d.scroller, "scroll", function () { + if (d.scroller.clientHeight) { + setScrollTop(cm, d.scroller.scrollTop) + setScrollLeft(cm, d.scroller.scrollLeft, true) + signal(cm, "scroll", cm) + } + }) + + // Listen to wheel events in order to try and update the viewport on time. + on(d.scroller, "mousewheel", function (e) { return onScrollWheel(cm, e); }) + on(d.scroller, "DOMMouseScroll", function (e) { return onScrollWheel(cm, e); }) + + // Prevent wrapper from ever scrolling + on(d.wrapper, "scroll", function () { return d.wrapper.scrollTop = d.wrapper.scrollLeft = 0; }) + + d.dragFunctions = { + enter: function (e) {if (!signalDOMEvent(cm, e)) { e_stop(e) }}, + over: function (e) {if (!signalDOMEvent(cm, e)) { onDragOver(cm, e); e_stop(e) }}, + start: function (e) { return onDragStart(cm, e); }, + drop: operation(cm, onDrop), + leave: function (e) {if (!signalDOMEvent(cm, e)) { clearDragCursor(cm) }} + } + + var inp = d.input.getField() + on(inp, "keyup", function (e) { return onKeyUp.call(cm, e); }) + on(inp, "keydown", operation(cm, onKeyDown)) + on(inp, "keypress", operation(cm, onKeyPress)) + on(inp, "focus", function (e) { return onFocus(cm, e); }) + on(inp, "blur", function (e) { return onBlur(cm, e); }) +} + +var initHooks = [] +CodeMirror.defineInitHook = function (f) { return initHooks.push(f); } + +// Indent the given line. The how parameter can be "smart", +// "add"/null, "subtract", or "prev". 
When aggressive is false +// (typically set to true for forced single-line indents), empty +// lines are not indented, and places where the mode returns Pass +// are left alone. +function indentLine(cm, n, how, aggressive) { + var doc = cm.doc, state + if (how == null) { how = "add" } + if (how == "smart") { + // Fall back to "prev" when the mode doesn't have an indentation + // method. + if (!doc.mode.indent) { how = "prev" } + else { state = getStateBefore(cm, n) } + } + + var tabSize = cm.options.tabSize + var line = getLine(doc, n), curSpace = countColumn(line.text, null, tabSize) + if (line.stateAfter) { line.stateAfter = null } + var curSpaceString = line.text.match(/^\s*/)[0], indentation + if (!aggressive && !/\S/.test(line.text)) { + indentation = 0 + how = "not" + } else if (how == "smart") { + indentation = doc.mode.indent(state, line.text.slice(curSpaceString.length), line.text) + if (indentation == Pass || indentation > 150) { + if (!aggressive) { return } + how = "prev" + } + } + if (how == "prev") { + if (n > doc.first) { indentation = countColumn(getLine(doc, n-1).text, null, tabSize) } + else { indentation = 0 } + } else if (how == "add") { + indentation = curSpace + cm.options.indentUnit + } else if (how == "subtract") { + indentation = curSpace - cm.options.indentUnit + } else if (typeof how == "number") { + indentation = curSpace + how + } + indentation = Math.max(0, indentation) + + var indentString = "", pos = 0 + if (cm.options.indentWithTabs) + { for (var i = Math.floor(indentation / tabSize); i; --i) {pos += tabSize; indentString += "\t"} } + if (pos < indentation) { indentString += spaceStr(indentation - pos) } + + if (indentString != curSpaceString) { + replaceRange(doc, indentString, Pos(n, 0), Pos(n, curSpaceString.length), "+input") + line.stateAfter = null + return true + } else { + // Ensure that, if the cursor was in the whitespace at the start + // of the line, it is moved to the end of that space. + for (var i$1 = 0; i$1 < doc.sel.ranges.length; i$1++) { + var range = doc.sel.ranges[i$1] + if (range.head.line == n && range.head.ch < curSpaceString.length) { + var pos$1 = Pos(n, curSpaceString.length) + replaceOneSelection(doc, i$1, new Range(pos$1, pos$1)) + break + } + } + } +} + +// This will be set to a {lineWise: bool, text: [string]} object, so +// that, when pasting, we know what kind of selections the copied +// text was made out of. 
+var lastCopied = null + +function setLastCopied(newLastCopied) { + lastCopied = newLastCopied +} + +function applyTextInput(cm, inserted, deleted, sel, origin) { + var doc = cm.doc + cm.display.shift = false + if (!sel) { sel = doc.sel } + + var paste = cm.state.pasteIncoming || origin == "paste" + var textLines = splitLinesAuto(inserted), multiPaste = null + // When pasing N lines into N selections, insert one line per selection + if (paste && sel.ranges.length > 1) { + if (lastCopied && lastCopied.text.join("\n") == inserted) { + if (sel.ranges.length % lastCopied.text.length == 0) { + multiPaste = [] + for (var i = 0; i < lastCopied.text.length; i++) + { multiPaste.push(doc.splitLines(lastCopied.text[i])) } + } + } else if (textLines.length == sel.ranges.length) { + multiPaste = map(textLines, function (l) { return [l]; }) + } + } + + var updateInput + // Normal behavior is to insert the new text into every selection + for (var i$1 = sel.ranges.length - 1; i$1 >= 0; i$1--) { + var range = sel.ranges[i$1] + var from = range.from(), to = range.to() + if (range.empty()) { + if (deleted && deleted > 0) // Handle deletion + { from = Pos(from.line, from.ch - deleted) } + else if (cm.state.overwrite && !paste) // Handle overwrite + { to = Pos(to.line, Math.min(getLine(doc, to.line).text.length, to.ch + lst(textLines).length)) } + else if (lastCopied && lastCopied.lineWise && lastCopied.text.join("\n") == inserted) + { from = to = Pos(from.line, 0) } + } + updateInput = cm.curOp.updateInput + var changeEvent = {from: from, to: to, text: multiPaste ? multiPaste[i$1 % multiPaste.length] : textLines, + origin: origin || (paste ? "paste" : cm.state.cutIncoming ? "cut" : "+input")} + makeChange(cm.doc, changeEvent) + signalLater(cm, "inputRead", cm, changeEvent) + } + if (inserted && !paste) + { triggerElectric(cm, inserted) } + + ensureCursorVisible(cm) + cm.curOp.updateInput = updateInput + cm.curOp.typing = true + cm.state.pasteIncoming = cm.state.cutIncoming = false +} + +function handlePaste(e, cm) { + var pasted = e.clipboardData && e.clipboardData.getData("Text") + if (pasted) { + e.preventDefault() + if (!cm.isReadOnly() && !cm.options.disableInput) + { runInOp(cm, function () { return applyTextInput(cm, pasted, 0, null, "paste"); }) } + return true + } +} + +function triggerElectric(cm, inserted) { + // When an 'electric' character is inserted, immediately trigger a reindent + if (!cm.options.electricChars || !cm.options.smartIndent) { return } + var sel = cm.doc.sel + + for (var i = sel.ranges.length - 1; i >= 0; i--) { + var range = sel.ranges[i] + if (range.head.ch > 100 || (i && sel.ranges[i - 1].head.line == range.head.line)) { continue } + var mode = cm.getModeAt(range.head) + var indented = false + if (mode.electricChars) { + for (var j = 0; j < mode.electricChars.length; j++) + { if (inserted.indexOf(mode.electricChars.charAt(j)) > -1) { + indented = indentLine(cm, range.head.line, "smart") + break + } } + } else if (mode.electricInput) { + if (mode.electricInput.test(getLine(cm.doc, range.head.line).text.slice(0, range.head.ch))) + { indented = indentLine(cm, range.head.line, "smart") } + } + if (indented) { signalLater(cm, "electricInput", cm, range.head.line) } + } +} + +function copyableRanges(cm) { + var text = [], ranges = [] + for (var i = 0; i < cm.doc.sel.ranges.length; i++) { + var line = cm.doc.sel.ranges[i].head.line + var lineRange = {anchor: Pos(line, 0), head: Pos(line + 1, 0)} + ranges.push(lineRange) + text.push(cm.getRange(lineRange.anchor, lineRange.head)) + } + 
return {text: text, ranges: ranges} +} + +function disableBrowserMagic(field, spellcheck) { + field.setAttribute("autocorrect", "off") + field.setAttribute("autocapitalize", "off") + field.setAttribute("spellcheck", !!spellcheck) +} + +function hiddenTextarea() { + var te = elt("textarea", null, null, "position: absolute; bottom: -1em; padding: 0; width: 1px; height: 1em; outline: none") + var div = elt("div", [te], null, "overflow: hidden; position: relative; width: 3px; height: 0px;") + // The textarea is kept positioned near the cursor to prevent the + // fact that it'll be scrolled into view on input from scrolling + // our fake cursor out of view. On webkit, when wrap=off, paste is + // very slow. So make the area wide instead. + if (webkit) { te.style.width = "1000px" } + else { te.setAttribute("wrap", "off") } + // If border: 0; -- iOS fails to open keyboard (issue #1287) + if (ios) { te.style.border = "1px solid black" } + disableBrowserMagic(te) + return div +} + +// The publicly visible API. Note that methodOp(f) means +// 'wrap f in an operation, performed on its `this` parameter'. + +// This is not the complete set of editor methods. Most of the +// methods defined on the Doc type are also injected into +// CodeMirror.prototype, for backwards compatibility and +// convenience. + +function addEditorMethods(CodeMirror) { + var optionHandlers = CodeMirror.optionHandlers + + var helpers = CodeMirror.helpers = {} + + CodeMirror.prototype = { + constructor: CodeMirror, + focus: function(){window.focus(); this.display.input.focus()}, + + setOption: function(option, value) { + var options = this.options, old = options[option] + if (options[option] == value && option != "mode") { return } + options[option] = value + if (optionHandlers.hasOwnProperty(option)) + { operation(this, optionHandlers[option])(this, value, old) } + signal(this, "optionChange", this, option) + }, + + getOption: function(option) {return this.options[option]}, + getDoc: function() {return this.doc}, + + addKeyMap: function(map, bottom) { + this.state.keyMaps[bottom ? "push" : "unshift"](getKeyMap(map)) + }, + removeKeyMap: function(map) { + var maps = this.state.keyMaps + for (var i = 0; i < maps.length; ++i) + { if (maps[i] == map || maps[i].name == map) { + maps.splice(i, 1) + return true + } } + }, + + addOverlay: methodOp(function(spec, options) { + var mode = spec.token ? spec : CodeMirror.getMode(this.options, spec) + if (mode.startState) { throw new Error("Overlays may not be stateful.") } + insertSorted(this.state.overlays, + {mode: mode, modeSpec: spec, opaque: options && options.opaque, + priority: (options && options.priority) || 0}, + function (overlay) { return overlay.priority; }) + this.state.modeGen++ + regChange(this) + }), + removeOverlay: methodOp(function(spec) { + var this$1 = this; + + var overlays = this.state.overlays + for (var i = 0; i < overlays.length; ++i) { + var cur = overlays[i].modeSpec + if (cur == spec || typeof spec == "string" && cur.name == spec) { + overlays.splice(i, 1) + this$1.state.modeGen++ + regChange(this$1) + return + } + } + }), + + indentLine: methodOp(function(n, dir, aggressive) { + if (typeof dir != "string" && typeof dir != "number") { + if (dir == null) { dir = this.options.smartIndent ? "smart" : "prev" } + else { dir = dir ? 
"add" : "subtract" } + } + if (isLine(this.doc, n)) { indentLine(this, n, dir, aggressive) } + }), + indentSelection: methodOp(function(how) { + var this$1 = this; + + var ranges = this.doc.sel.ranges, end = -1 + for (var i = 0; i < ranges.length; i++) { + var range = ranges[i] + if (!range.empty()) { + var from = range.from(), to = range.to() + var start = Math.max(end, from.line) + end = Math.min(this$1.lastLine(), to.line - (to.ch ? 0 : 1)) + 1 + for (var j = start; j < end; ++j) + { indentLine(this$1, j, how) } + var newRanges = this$1.doc.sel.ranges + if (from.ch == 0 && ranges.length == newRanges.length && newRanges[i].from().ch > 0) + { replaceOneSelection(this$1.doc, i, new Range(from, newRanges[i].to()), sel_dontScroll) } + } else if (range.head.line > end) { + indentLine(this$1, range.head.line, how, true) + end = range.head.line + if (i == this$1.doc.sel.primIndex) { ensureCursorVisible(this$1) } + } + } + }), + + // Fetch the parser token for a given character. Useful for hacks + // that want to inspect the mode state (say, for completion). + getTokenAt: function(pos, precise) { + return takeToken(this, pos, precise) + }, + + getLineTokens: function(line, precise) { + return takeToken(this, Pos(line), precise, true) + }, + + getTokenTypeAt: function(pos) { + pos = clipPos(this.doc, pos) + var styles = getLineStyles(this, getLine(this.doc, pos.line)) + var before = 0, after = (styles.length - 1) / 2, ch = pos.ch + var type + if (ch == 0) { type = styles[2] } + else { for (;;) { + var mid = (before + after) >> 1 + if ((mid ? styles[mid * 2 - 1] : 0) >= ch) { after = mid } + else if (styles[mid * 2 + 1] < ch) { before = mid + 1 } + else { type = styles[mid * 2 + 2]; break } + } } + var cut = type ? type.indexOf("overlay ") : -1 + return cut < 0 ? type : cut == 0 ? null : type.slice(0, cut - 1) + }, + + getModeAt: function(pos) { + var mode = this.doc.mode + if (!mode.innerMode) { return mode } + return CodeMirror.innerMode(mode, this.getTokenAt(pos).state).mode + }, + + getHelper: function(pos, type) { + return this.getHelpers(pos, type)[0] + }, + + getHelpers: function(pos, type) { + var this$1 = this; + + var found = [] + if (!helpers.hasOwnProperty(type)) { return found } + var help = helpers[type], mode = this.getModeAt(pos) + if (typeof mode[type] == "string") { + if (help[mode[type]]) { found.push(help[mode[type]]) } + } else if (mode[type]) { + for (var i = 0; i < mode[type].length; i++) { + var val = help[mode[type][i]] + if (val) { found.push(val) } + } + } else if (mode.helperType && help[mode.helperType]) { + found.push(help[mode.helperType]) + } else if (help[mode.name]) { + found.push(help[mode.name]) + } + for (var i$1 = 0; i$1 < help._global.length; i$1++) { + var cur = help._global[i$1] + if (cur.pred(mode, this$1) && indexOf(found, cur.val) == -1) + { found.push(cur.val) } + } + return found + }, + + getStateAfter: function(line, precise) { + var doc = this.doc + line = clipLine(doc, line == null ? doc.first + doc.size - 1: line) + return getStateBefore(this, line + 1, precise) + }, + + cursorCoords: function(start, mode) { + var pos, range = this.doc.sel.primary() + if (start == null) { pos = range.head } + else if (typeof start == "object") { pos = clipPos(this.doc, start) } + else { pos = start ? 
range.from() : range.to() } + return cursorCoords(this, pos, mode || "page") + }, + + charCoords: function(pos, mode) { + return charCoords(this, clipPos(this.doc, pos), mode || "page") + }, + + coordsChar: function(coords, mode) { + coords = fromCoordSystem(this, coords, mode || "page") + return coordsChar(this, coords.left, coords.top) + }, + + lineAtHeight: function(height, mode) { + height = fromCoordSystem(this, {top: height, left: 0}, mode || "page").top + return lineAtHeight(this.doc, height + this.display.viewOffset) + }, + heightAtLine: function(line, mode, includeWidgets) { + var end = false, lineObj + if (typeof line == "number") { + var last = this.doc.first + this.doc.size - 1 + if (line < this.doc.first) { line = this.doc.first } + else if (line > last) { line = last; end = true } + lineObj = getLine(this.doc, line) + } else { + lineObj = line + } + return intoCoordSystem(this, lineObj, {top: 0, left: 0}, mode || "page", includeWidgets || end).top + + (end ? this.doc.height - heightAtLine(lineObj) : 0) + }, + + defaultTextHeight: function() { return textHeight(this.display) }, + defaultCharWidth: function() { return charWidth(this.display) }, + + getViewport: function() { return {from: this.display.viewFrom, to: this.display.viewTo}}, + + addWidget: function(pos, node, scroll, vert, horiz) { + var display = this.display + pos = cursorCoords(this, clipPos(this.doc, pos)) + var top = pos.bottom, left = pos.left + node.style.position = "absolute" + node.setAttribute("cm-ignore-events", "true") + this.display.input.setUneditable(node) + display.sizer.appendChild(node) + if (vert == "over") { + top = pos.top + } else if (vert == "above" || vert == "near") { + var vspace = Math.max(display.wrapper.clientHeight, this.doc.height), + hspace = Math.max(display.sizer.clientWidth, display.lineSpace.clientWidth) + // Default to positioning above (if specified and possible); otherwise default to positioning below + if ((vert == 'above' || pos.bottom + node.offsetHeight > vspace) && pos.top > node.offsetHeight) + { top = pos.top - node.offsetHeight } + else if (pos.bottom + node.offsetHeight <= vspace) + { top = pos.bottom } + if (left + node.offsetWidth > hspace) + { left = hspace - node.offsetWidth } + } + node.style.top = top + "px" + node.style.left = node.style.right = "" + if (horiz == "right") { + left = display.sizer.clientWidth - node.offsetWidth + node.style.right = "0px" + } else { + if (horiz == "left") { left = 0 } + else if (horiz == "middle") { left = (display.sizer.clientWidth - node.offsetWidth) / 2 } + node.style.left = left + "px" + } + if (scroll) + { scrollIntoView(this, left, top, left + node.offsetWidth, top + node.offsetHeight) } + }, + + triggerOnKeyDown: methodOp(onKeyDown), + triggerOnKeyPress: methodOp(onKeyPress), + triggerOnKeyUp: onKeyUp, + + execCommand: function(cmd) { + if (commands.hasOwnProperty(cmd)) + { return commands[cmd].call(null, this) } + }, + + triggerElectric: methodOp(function(text) { triggerElectric(this, text) }), + + findPosH: function(from, amount, unit, visually) { + var this$1 = this; + + var dir = 1 + if (amount < 0) { dir = -1; amount = -amount } + var cur = clipPos(this.doc, from) + for (var i = 0; i < amount; ++i) { + cur = findPosH(this$1.doc, cur, dir, unit, visually) + if (cur.hitSide) { break } + } + return cur + }, + + moveH: methodOp(function(dir, unit) { + var this$1 = this; + + this.extendSelectionsBy(function (range) { + if (this$1.display.shift || this$1.doc.extend || range.empty()) + { return findPosH(this$1.doc, 
range.head, dir, unit, this$1.options.rtlMoveVisually) } + else + { return dir < 0 ? range.from() : range.to() } + }, sel_move) + }), + + deleteH: methodOp(function(dir, unit) { + var sel = this.doc.sel, doc = this.doc + if (sel.somethingSelected()) + { doc.replaceSelection("", null, "+delete") } + else + { deleteNearSelection(this, function (range) { + var other = findPosH(doc, range.head, dir, unit, false) + return dir < 0 ? {from: other, to: range.head} : {from: range.head, to: other} + }) } + }), + + findPosV: function(from, amount, unit, goalColumn) { + var this$1 = this; + + var dir = 1, x = goalColumn + if (amount < 0) { dir = -1; amount = -amount } + var cur = clipPos(this.doc, from) + for (var i = 0; i < amount; ++i) { + var coords = cursorCoords(this$1, cur, "div") + if (x == null) { x = coords.left } + else { coords.left = x } + cur = findPosV(this$1, coords, dir, unit) + if (cur.hitSide) { break } + } + return cur + }, + + moveV: methodOp(function(dir, unit) { + var this$1 = this; + + var doc = this.doc, goals = [] + var collapse = !this.display.shift && !doc.extend && doc.sel.somethingSelected() + doc.extendSelectionsBy(function (range) { + if (collapse) + { return dir < 0 ? range.from() : range.to() } + var headPos = cursorCoords(this$1, range.head, "div") + if (range.goalColumn != null) { headPos.left = range.goalColumn } + goals.push(headPos.left) + var pos = findPosV(this$1, headPos, dir, unit) + if (unit == "page" && range == doc.sel.primary()) + { addToScrollPos(this$1, null, charCoords(this$1, pos, "div").top - headPos.top) } + return pos + }, sel_move) + if (goals.length) { for (var i = 0; i < doc.sel.ranges.length; i++) + { doc.sel.ranges[i].goalColumn = goals[i] } } + }), + + // Find the word at the given position (as returned by coordsChar). + findWordAt: function(pos) { + var doc = this.doc, line = getLine(doc, pos.line).text + var start = pos.ch, end = pos.ch + if (line) { + var helper = this.getHelper(pos, "wordChars") + if ((pos.sticky == "before" || end == line.length) && start) { --start; } else { ++end } + var startChar = line.charAt(start) + var check = isWordChar(startChar, helper) + ? function (ch) { return isWordChar(ch, helper); } + : /\s/.test(startChar) ? 
function (ch) { return /\s/.test(ch); } + : function (ch) { return (!/\s/.test(ch) && !isWordChar(ch)); } + while (start > 0 && check(line.charAt(start - 1))) { --start } + while (end < line.length && check(line.charAt(end))) { ++end } + } + return new Range(Pos(pos.line, start), Pos(pos.line, end)) + }, + + toggleOverwrite: function(value) { + if (value != null && value == this.state.overwrite) { return } + if (this.state.overwrite = !this.state.overwrite) + { addClass(this.display.cursorDiv, "CodeMirror-overwrite") } + else + { rmClass(this.display.cursorDiv, "CodeMirror-overwrite") } + + signal(this, "overwriteToggle", this, this.state.overwrite) + }, + hasFocus: function() { return this.display.input.getField() == activeElt() }, + isReadOnly: function() { return !!(this.options.readOnly || this.doc.cantEdit) }, + + scrollTo: methodOp(function(x, y) { + if (x != null || y != null) { resolveScrollToPos(this) } + if (x != null) { this.curOp.scrollLeft = x } + if (y != null) { this.curOp.scrollTop = y } + }), + getScrollInfo: function() { + var scroller = this.display.scroller + return {left: scroller.scrollLeft, top: scroller.scrollTop, + height: scroller.scrollHeight - scrollGap(this) - this.display.barHeight, + width: scroller.scrollWidth - scrollGap(this) - this.display.barWidth, + clientHeight: displayHeight(this), clientWidth: displayWidth(this)} + }, + + scrollIntoView: methodOp(function(range, margin) { + if (range == null) { + range = {from: this.doc.sel.primary().head, to: null} + if (margin == null) { margin = this.options.cursorScrollMargin } + } else if (typeof range == "number") { + range = {from: Pos(range, 0), to: null} + } else if (range.from == null) { + range = {from: range, to: null} + } + if (!range.to) { range.to = range.from } + range.margin = margin || 0 + + if (range.from.line != null) { + resolveScrollToPos(this) + this.curOp.scrollToPos = range + } else { + var sPos = calculateScrollPos(this, Math.min(range.from.left, range.to.left), + Math.min(range.from.top, range.to.top) - range.margin, + Math.max(range.from.right, range.to.right), + Math.max(range.from.bottom, range.to.bottom) + range.margin) + this.scrollTo(sPos.scrollLeft, sPos.scrollTop) + } + }), + + setSize: methodOp(function(width, height) { + var this$1 = this; + + var interpret = function (val) { return typeof val == "number" || /^\d+$/.test(String(val)) ? 
val + "px" : val; } + if (width != null) { this.display.wrapper.style.width = interpret(width) } + if (height != null) { this.display.wrapper.style.height = interpret(height) } + if (this.options.lineWrapping) { clearLineMeasurementCache(this) } + var lineNo = this.display.viewFrom + this.doc.iter(lineNo, this.display.viewTo, function (line) { + if (line.widgets) { for (var i = 0; i < line.widgets.length; i++) + { if (line.widgets[i].noHScroll) { regLineChange(this$1, lineNo, "widget"); break } } } + ++lineNo + }) + this.curOp.forceUpdate = true + signal(this, "refresh", this) + }), + + operation: function(f){return runInOp(this, f)}, + + refresh: methodOp(function() { + var oldHeight = this.display.cachedTextHeight + regChange(this) + this.curOp.forceUpdate = true + clearCaches(this) + this.scrollTo(this.doc.scrollLeft, this.doc.scrollTop) + updateGutterSpace(this) + if (oldHeight == null || Math.abs(oldHeight - textHeight(this.display)) > .5) + { estimateLineHeights(this) } + signal(this, "refresh", this) + }), + + swapDoc: methodOp(function(doc) { + var old = this.doc + old.cm = null + attachDoc(this, doc) + clearCaches(this) + this.display.input.reset() + this.scrollTo(doc.scrollLeft, doc.scrollTop) + this.curOp.forceScroll = true + signalLater(this, "swapDoc", this, old) + return old + }), + + getInputField: function(){return this.display.input.getField()}, + getWrapperElement: function(){return this.display.wrapper}, + getScrollerElement: function(){return this.display.scroller}, + getGutterElement: function(){return this.display.gutters} + } + eventMixin(CodeMirror) + + CodeMirror.registerHelper = function(type, name, value) { + if (!helpers.hasOwnProperty(type)) { helpers[type] = CodeMirror[type] = {_global: []} } + helpers[type][name] = value + } + CodeMirror.registerGlobalHelper = function(type, name, predicate, value) { + CodeMirror.registerHelper(type, name, value) + helpers[type]._global.push({pred: predicate, val: value}) + } +} + +// Used for horizontal relative motion. Dir is -1 or 1 (left or +// right), unit can be "char", "column" (like char, but doesn't +// cross line boundaries), "word" (across next word), or "group" (to +// the start of next group of word or non-word-non-whitespace +// chars). The visually param controls whether, in right-to-left +// text, direction 1 means to move towards the next index in the +// string, or towards the character to the right of the current +// position. The resulting position will have a hitSide=true +// property if it reached the end of the document. 
+function findPosH(doc, pos, dir, unit, visually) { + var oldPos = pos + var origDir = dir + var lineObj = getLine(doc, pos.line) + function findNextLine() { + var l = pos.line + dir + if (l < doc.first || l >= doc.first + doc.size) { return false } + pos = new Pos(l, pos.ch, pos.sticky) + return lineObj = getLine(doc, l) + } + function moveOnce(boundToLine) { + var next + if (visually) { + next = moveVisually(doc.cm, lineObj, pos, dir) + } else { + next = moveLogically(lineObj, pos, dir) + } + if (next == null) { + if (!boundToLine && findNextLine()) + { pos = endOfLine(visually, doc.cm, lineObj, pos.line, dir) } + else + { return false } + } else { + pos = next + } + return true + } + + if (unit == "char") { + moveOnce() + } else if (unit == "column") { + moveOnce(true) + } else if (unit == "word" || unit == "group") { + var sawType = null, group = unit == "group" + var helper = doc.cm && doc.cm.getHelper(pos, "wordChars") + for (var first = true;; first = false) { + if (dir < 0 && !moveOnce(!first)) { break } + var cur = lineObj.text.charAt(pos.ch) || "\n" + var type = isWordChar(cur, helper) ? "w" + : group && cur == "\n" ? "n" + : !group || /\s/.test(cur) ? null + : "p" + if (group && !first && !type) { type = "s" } + if (sawType && sawType != type) { + if (dir < 0) {dir = 1; moveOnce(); pos.sticky = "after"} + break + } + + if (type) { sawType = type } + if (dir > 0 && !moveOnce(!first)) { break } + } + } + var result = skipAtomic(doc, pos, oldPos, origDir, true) + if (equalCursorPos(oldPos, result)) { result.hitSide = true } + return result +} + +// For relative vertical movement. Dir may be -1 or 1. Unit can be +// "page" or "line". The resulting position will have a hitSide=true +// property if it reached the end of the document. +function findPosV(cm, pos, dir, unit) { + var doc = cm.doc, x = pos.left, y + if (unit == "page") { + var pageSize = Math.min(cm.display.wrapper.clientHeight, window.innerHeight || document.documentElement.clientHeight) + var moveAmount = Math.max(pageSize - .5 * textHeight(cm.display), 3) + y = (dir > 0 ? pos.bottom : pos.top) + dir * moveAmount + + } else if (unit == "line") { + y = dir > 0 ? pos.bottom + 3 : pos.top - 3 + } + var target + for (;;) { + target = coordsChar(cm, x, y) + if (!target.outside) { break } + if (dir < 0 ? 
y <= 0 : y >= doc.height) { target.hitSide = true; break } + y += dir * 5 + } + return target +} + +// CONTENTEDITABLE INPUT STYLE + +var ContentEditableInput = function(cm) { + this.cm = cm + this.lastAnchorNode = this.lastAnchorOffset = this.lastFocusNode = this.lastFocusOffset = null + this.polling = new Delayed() + this.composing = null + this.gracePeriod = false + this.readDOMTimeout = null +}; + +ContentEditableInput.prototype.init = function (display) { + var this$1 = this; + + var input = this, cm = input.cm + var div = input.div = display.lineDiv + disableBrowserMagic(div, cm.options.spellcheck) + + on(div, "paste", function (e) { + if (signalDOMEvent(cm, e) || handlePaste(e, cm)) { return } + // IE doesn't fire input events, so we schedule a read for the pasted content in this way + if (ie_version <= 11) { setTimeout(operation(cm, function () { + if (!input.pollContent()) { regChange(cm) } + }), 20) } + }) + + on(div, "compositionstart", function (e) { + this$1.composing = {data: e.data, done: false} + }) + on(div, "compositionupdate", function (e) { + if (!this$1.composing) { this$1.composing = {data: e.data, done: false} } + }) + on(div, "compositionend", function (e) { + if (this$1.composing) { + if (e.data != this$1.composing.data) { this$1.readFromDOMSoon() } + this$1.composing.done = true + } + }) + + on(div, "touchstart", function () { return input.forceCompositionEnd(); }) + + on(div, "input", function () { + if (!this$1.composing) { this$1.readFromDOMSoon() } + }) + + function onCopyCut(e) { + if (signalDOMEvent(cm, e)) { return } + if (cm.somethingSelected()) { + setLastCopied({lineWise: false, text: cm.getSelections()}) + if (e.type == "cut") { cm.replaceSelection("", null, "cut") } + } else if (!cm.options.lineWiseCopyCut) { + return + } else { + var ranges = copyableRanges(cm) + setLastCopied({lineWise: true, text: ranges.text}) + if (e.type == "cut") { + cm.operation(function () { + cm.setSelections(ranges.ranges, 0, sel_dontScroll) + cm.replaceSelection("", null, "cut") + }) + } + } + if (e.clipboardData) { + e.clipboardData.clearData() + var content = lastCopied.text.join("\n") + // iOS exposes the clipboard API, but seems to discard content inserted into it + e.clipboardData.setData("Text", content) + if (e.clipboardData.getData("Text") == content) { + e.preventDefault() + return + } + } + // Old-fashioned briefly-focus-a-textarea hack + var kludge = hiddenTextarea(), te = kludge.firstChild + cm.display.lineSpace.insertBefore(kludge, cm.display.lineSpace.firstChild) + te.value = lastCopied.text.join("\n") + var hadFocus = document.activeElement + selectInput(te) + setTimeout(function () { + cm.display.lineSpace.removeChild(kludge) + hadFocus.focus() + if (hadFocus == div) { input.showPrimarySelection() } + }, 50) + } + on(div, "copy", onCopyCut) + on(div, "cut", onCopyCut) +}; + +ContentEditableInput.prototype.prepareSelection = function () { + var result = prepareSelection(this.cm, false) + result.focus = this.cm.state.focused + return result +}; + +ContentEditableInput.prototype.showSelection = function (info, takeFocus) { + if (!info || !this.cm.display.view.length) { return } + if (info.focus || takeFocus) { this.showPrimarySelection() } + this.showMultipleSelections(info) +}; + +ContentEditableInput.prototype.showPrimarySelection = function () { + var sel = window.getSelection(), prim = this.cm.doc.sel.primary() + var curAnchor = domToPos(this.cm, sel.anchorNode, sel.anchorOffset) + var curFocus = domToPos(this.cm, sel.focusNode, sel.focusOffset) + if 
(curAnchor && !curAnchor.bad && curFocus && !curFocus.bad && + cmp(minPos(curAnchor, curFocus), prim.from()) == 0 && + cmp(maxPos(curAnchor, curFocus), prim.to()) == 0) + { return } + + var start = posToDOM(this.cm, prim.from()) + var end = posToDOM(this.cm, prim.to()) + if (!start && !end) { return } + + var view = this.cm.display.view + var old = sel.rangeCount && sel.getRangeAt(0) + if (!start) { + start = {node: view[0].measure.map[2], offset: 0} + } else if (!end) { // FIXME dangerously hacky + var measure = view[view.length - 1].measure + var map = measure.maps ? measure.maps[measure.maps.length - 1] : measure.map + end = {node: map[map.length - 1], offset: map[map.length - 2] - map[map.length - 3]} + } + + var rng + try { rng = range(start.node, start.offset, end.offset, end.node) } + catch(e) {} // Our model of the DOM might be outdated, in which case the range we try to set can be impossible + if (rng) { + if (!gecko && this.cm.state.focused) { + sel.collapse(start.node, start.offset) + if (!rng.collapsed) { + sel.removeAllRanges() + sel.addRange(rng) + } + } else { + sel.removeAllRanges() + sel.addRange(rng) + } + if (old && sel.anchorNode == null) { sel.addRange(old) } + else if (gecko) { this.startGracePeriod() } + } + this.rememberSelection() +}; + +ContentEditableInput.prototype.startGracePeriod = function () { + var this$1 = this; + + clearTimeout(this.gracePeriod) + this.gracePeriod = setTimeout(function () { + this$1.gracePeriod = false + if (this$1.selectionChanged()) + { this$1.cm.operation(function () { return this$1.cm.curOp.selectionChanged = true; }) } + }, 20) +}; + +ContentEditableInput.prototype.showMultipleSelections = function (info) { + removeChildrenAndAdd(this.cm.display.cursorDiv, info.cursors) + removeChildrenAndAdd(this.cm.display.selectionDiv, info.selection) +}; + +ContentEditableInput.prototype.rememberSelection = function () { + var sel = window.getSelection() + this.lastAnchorNode = sel.anchorNode; this.lastAnchorOffset = sel.anchorOffset + this.lastFocusNode = sel.focusNode; this.lastFocusOffset = sel.focusOffset +}; + +ContentEditableInput.prototype.selectionInEditor = function () { + var sel = window.getSelection() + if (!sel.rangeCount) { return false } + var node = sel.getRangeAt(0).commonAncestorContainer + return contains(this.div, node) +}; + +ContentEditableInput.prototype.focus = function () { + if (this.cm.options.readOnly != "nocursor") { + if (!this.selectionInEditor()) + { this.showSelection(this.prepareSelection(), true) } + this.div.focus() + } +}; +ContentEditableInput.prototype.blur = function () { this.div.blur() }; +ContentEditableInput.prototype.getField = function () { return this.div }; + +ContentEditableInput.prototype.supportsTouch = function () { return true }; + +ContentEditableInput.prototype.receivedFocus = function () { + var input = this + if (this.selectionInEditor()) + { this.pollSelection() } + else + { runInOp(this.cm, function () { return input.cm.curOp.selectionChanged = true; }) } + + function poll() { + if (input.cm.state.focused) { + input.pollSelection() + input.polling.set(input.cm.options.pollInterval, poll) + } + } + this.polling.set(this.cm.options.pollInterval, poll) +}; + +ContentEditableInput.prototype.selectionChanged = function () { + var sel = window.getSelection() + return sel.anchorNode != this.lastAnchorNode || sel.anchorOffset != this.lastAnchorOffset || + sel.focusNode != this.lastFocusNode || sel.focusOffset != this.lastFocusOffset +}; + +ContentEditableInput.prototype.pollSelection = function 
() { + if (!this.composing && this.readDOMTimeout == null && !this.gracePeriod && this.selectionChanged()) { + var sel = window.getSelection(), cm = this.cm + this.rememberSelection() + var anchor = domToPos(cm, sel.anchorNode, sel.anchorOffset) + var head = domToPos(cm, sel.focusNode, sel.focusOffset) + if (anchor && head) { runInOp(cm, function () { + setSelection(cm.doc, simpleSelection(anchor, head), sel_dontScroll) + if (anchor.bad || head.bad) { cm.curOp.selectionChanged = true } + }) } + } +}; + +ContentEditableInput.prototype.pollContent = function () { + if (this.readDOMTimeout != null) { + clearTimeout(this.readDOMTimeout) + this.readDOMTimeout = null + } + + var cm = this.cm, display = cm.display, sel = cm.doc.sel.primary() + var from = sel.from(), to = sel.to() + if (from.ch == 0 && from.line > cm.firstLine()) + { from = Pos(from.line - 1, getLine(cm.doc, from.line - 1).length) } + if (to.ch == getLine(cm.doc, to.line).text.length && to.line < cm.lastLine()) + { to = Pos(to.line + 1, 0) } + if (from.line < display.viewFrom || to.line > display.viewTo - 1) { return false } + + var fromIndex, fromLine, fromNode + if (from.line == display.viewFrom || (fromIndex = findViewIndex(cm, from.line)) == 0) { + fromLine = lineNo(display.view[0].line) + fromNode = display.view[0].node + } else { + fromLine = lineNo(display.view[fromIndex].line) + fromNode = display.view[fromIndex - 1].node.nextSibling + } + var toIndex = findViewIndex(cm, to.line) + var toLine, toNode + if (toIndex == display.view.length - 1) { + toLine = display.viewTo - 1 + toNode = display.lineDiv.lastChild + } else { + toLine = lineNo(display.view[toIndex + 1].line) - 1 + toNode = display.view[toIndex + 1].node.previousSibling + } + + if (!fromNode) { return false } + var newText = cm.doc.splitLines(domTextBetween(cm, fromNode, toNode, fromLine, toLine)) + var oldText = getBetween(cm.doc, Pos(fromLine, 0), Pos(toLine, getLine(cm.doc, toLine).text.length)) + while (newText.length > 1 && oldText.length > 1) { + if (lst(newText) == lst(oldText)) { newText.pop(); oldText.pop(); toLine-- } + else if (newText[0] == oldText[0]) { newText.shift(); oldText.shift(); fromLine++ } + else { break } + } + + var cutFront = 0, cutEnd = 0 + var newTop = newText[0], oldTop = oldText[0], maxCutFront = Math.min(newTop.length, oldTop.length) + while (cutFront < maxCutFront && newTop.charCodeAt(cutFront) == oldTop.charCodeAt(cutFront)) + { ++cutFront } + var newBot = lst(newText), oldBot = lst(oldText) + var maxCutEnd = Math.min(newBot.length - (newText.length == 1 ? cutFront : 0), + oldBot.length - (oldText.length == 1 ? cutFront : 0)) + while (cutEnd < maxCutEnd && + newBot.charCodeAt(newBot.length - cutEnd - 1) == oldBot.charCodeAt(oldBot.length - cutEnd - 1)) + { ++cutEnd } + + newText[newText.length - 1] = newBot.slice(0, newBot.length - cutEnd).replace(/^\u200b+/, "") + newText[0] = newText[0].slice(cutFront).replace(/\u200b+$/, "") + + var chFrom = Pos(fromLine, cutFront) + var chTo = Pos(toLine, oldText.length ? 
lst(oldText).length - cutEnd : 0) + if (newText.length > 1 || newText[0] || cmp(chFrom, chTo)) { + replaceRange(cm.doc, newText, chFrom, chTo, "+input") + return true + } +}; + +ContentEditableInput.prototype.ensurePolled = function () { + this.forceCompositionEnd() +}; +ContentEditableInput.prototype.reset = function () { + this.forceCompositionEnd() +}; +ContentEditableInput.prototype.forceCompositionEnd = function () { + if (!this.composing) { return } + clearTimeout(this.readDOMTimeout) + this.composing = null + if (!this.pollContent()) { regChange(this.cm) } + this.div.blur() + this.div.focus() +}; +ContentEditableInput.prototype.readFromDOMSoon = function () { + var this$1 = this; + + if (this.readDOMTimeout != null) { return } + this.readDOMTimeout = setTimeout(function () { + this$1.readDOMTimeout = null + if (this$1.composing) { + if (this$1.composing.done) { this$1.composing = null } + else { return } + } + if (this$1.cm.isReadOnly() || !this$1.pollContent()) + { runInOp(this$1.cm, function () { return regChange(this$1.cm); }) } + }, 80) +}; + +ContentEditableInput.prototype.setUneditable = function (node) { + node.contentEditable = "false" +}; + +ContentEditableInput.prototype.onKeyPress = function (e) { + if (e.charCode == 0) { return } + e.preventDefault() + if (!this.cm.isReadOnly()) + { operation(this.cm, applyTextInput)(this.cm, String.fromCharCode(e.charCode == null ? e.keyCode : e.charCode), 0) } +}; + +ContentEditableInput.prototype.readOnlyChanged = function (val) { + this.div.contentEditable = String(val != "nocursor") +}; + +ContentEditableInput.prototype.onContextMenu = function () {}; +ContentEditableInput.prototype.resetPosition = function () {}; + +ContentEditableInput.prototype.needsContentAttribute = true + +function posToDOM(cm, pos) { + var view = findViewForLine(cm, pos.line) + if (!view || view.hidden) { return null } + var line = getLine(cm.doc, pos.line) + var info = mapFromLineView(view, line, pos.line) + + var order = getOrder(line), side = "left" + if (order) { + var partPos = getBidiPartAt(order, pos.ch) + side = partPos % 2 ? "right" : "left" + } + var result = nodeAndOffsetInLineMap(info.map, pos.ch, side) + result.offset = result.collapse == "right" ? 
result.end : result.start + return result +} + +function badPos(pos, bad) { if (bad) { pos.bad = true; } return pos } + +function domTextBetween(cm, from, to, fromLine, toLine) { + var text = "", closing = false, lineSep = cm.doc.lineSeparator() + function recognizeMarker(id) { return function (marker) { return marker.id == id; } } + function walk(node) { + if (node.nodeType == 1) { + var cmText = node.getAttribute("cm-text") + if (cmText != null) { + if (cmText == "") { text += node.textContent.replace(/\u200b/g, "") } + else { text += cmText } + return + } + var markerID = node.getAttribute("cm-marker"), range + if (markerID) { + var found = cm.findMarks(Pos(fromLine, 0), Pos(toLine + 1, 0), recognizeMarker(+markerID)) + if (found.length && (range = found[0].find())) + { text += getBetween(cm.doc, range.from, range.to).join(lineSep) } + return + } + if (node.getAttribute("contenteditable") == "false") { return } + for (var i = 0; i < node.childNodes.length; i++) + { walk(node.childNodes[i]) } + if (/^(pre|div|p)$/i.test(node.nodeName)) + { closing = true } + } else if (node.nodeType == 3) { + var val = node.nodeValue + if (!val) { return } + if (closing) { + text += lineSep + closing = false + } + text += val + } + } + for (;;) { + walk(from) + if (from == to) { break } + from = from.nextSibling + } + return text +} + +function domToPos(cm, node, offset) { + var lineNode + if (node == cm.display.lineDiv) { + lineNode = cm.display.lineDiv.childNodes[offset] + if (!lineNode) { return badPos(cm.clipPos(Pos(cm.display.viewTo - 1)), true) } + node = null; offset = 0 + } else { + for (lineNode = node;; lineNode = lineNode.parentNode) { + if (!lineNode || lineNode == cm.display.lineDiv) { return null } + if (lineNode.parentNode && lineNode.parentNode == cm.display.lineDiv) { break } + } + } + for (var i = 0; i < cm.display.view.length; i++) { + var lineView = cm.display.view[i] + if (lineView.node == lineNode) + { return locateNodeInLineView(lineView, node, offset) } + } +} + +function locateNodeInLineView(lineView, node, offset) { + var wrapper = lineView.text.firstChild, bad = false + if (!node || !contains(wrapper, node)) { return badPos(Pos(lineNo(lineView.line), 0), true) } + if (node == wrapper) { + bad = true + node = wrapper.childNodes[offset] + offset = 0 + if (!node) { + var line = lineView.rest ? lst(lineView.rest) : lineView.line + return badPos(Pos(lineNo(line), line.text.length), bad) + } + } + + var textNode = node.nodeType == 3 ? node : null, topNode = node + if (!textNode && node.childNodes.length == 1 && node.firstChild.nodeType == 3) { + textNode = node.firstChild + if (offset) { offset = textNode.nodeValue.length } + } + while (topNode.parentNode != wrapper) { topNode = topNode.parentNode } + var measure = lineView.measure, maps = measure.maps + + function find(textNode, topNode, offset) { + for (var i = -1; i < (maps ? maps.length : 0); i++) { + var map = i < 0 ? measure.map : maps[i] + for (var j = 0; j < map.length; j += 3) { + var curNode = map[j + 2] + if (curNode == textNode || curNode == topNode) { + var line = lineNo(i < 0 ? lineView.line : lineView.rest[i]) + var ch = map[j] + offset + if (offset < 0 || curNode != textNode) { ch = map[j + (offset ? 1 : 0)] } + return Pos(line, ch) + } + } + } + } + var found = find(textNode, topNode, offset) + if (found) { return badPos(found, bad) } + + // FIXME this is all really shaky. might handle the few cases it needs to handle, but likely to cause problems + for (var after = topNode.nextSibling, dist = textNode ? 
textNode.nodeValue.length - offset : 0; after; after = after.nextSibling) { + found = find(after, after.firstChild, 0) + if (found) + { return badPos(Pos(found.line, found.ch - dist), bad) } + else + { dist += after.textContent.length } + } + for (var before = topNode.previousSibling, dist$1 = offset; before; before = before.previousSibling) { + found = find(before, before.firstChild, -1) + if (found) + { return badPos(Pos(found.line, found.ch + dist$1), bad) } + else + { dist$1 += before.textContent.length } + } +} + +// TEXTAREA INPUT STYLE + +var TextareaInput = function(cm) { + this.cm = cm + // See input.poll and input.reset + this.prevInput = "" + + // Flag that indicates whether we expect input to appear real soon + // now (after some event like 'keypress' or 'input') and are + // polling intensively. + this.pollingFast = false + // Self-resetting timeout for the poller + this.polling = new Delayed() + // Tracks when input.reset has punted to just putting a short + // string into the textarea instead of the full selection. + this.inaccurateSelection = false + // Used to work around IE issue with selection being forgotten when focus moves away from textarea + this.hasSelection = false + this.composing = null +}; + +TextareaInput.prototype.init = function (display) { + var this$1 = this; + + var input = this, cm = this.cm + + // Wraps and hides input textarea + var div = this.wrapper = hiddenTextarea() + // The semihidden textarea that is focused when the editor is + // focused, and receives input. + var te = this.textarea = div.firstChild + display.wrapper.insertBefore(div, display.wrapper.firstChild) + + // Needed to hide big blue blinking cursor on Mobile Safari (doesn't seem to work in iOS 8 anymore) + if (ios) { te.style.width = "0px" } + + on(te, "input", function () { + if (ie && ie_version >= 9 && this$1.hasSelection) { this$1.hasSelection = null } + input.poll() + }) + + on(te, "paste", function (e) { + if (signalDOMEvent(cm, e) || handlePaste(e, cm)) { return } + + cm.state.pasteIncoming = true + input.fastPoll() + }) + + function prepareCopyCut(e) { + if (signalDOMEvent(cm, e)) { return } + if (cm.somethingSelected()) { + setLastCopied({lineWise: false, text: cm.getSelections()}) + if (input.inaccurateSelection) { + input.prevInput = "" + input.inaccurateSelection = false + te.value = lastCopied.text.join("\n") + selectInput(te) + } + } else if (!cm.options.lineWiseCopyCut) { + return + } else { + var ranges = copyableRanges(cm) + setLastCopied({lineWise: true, text: ranges.text}) + if (e.type == "cut") { + cm.setSelections(ranges.ranges, null, sel_dontScroll) + } else { + input.prevInput = "" + te.value = ranges.text.join("\n") + selectInput(te) + } + } + if (e.type == "cut") { cm.state.cutIncoming = true } + } + on(te, "cut", prepareCopyCut) + on(te, "copy", prepareCopyCut) + + on(display.scroller, "paste", function (e) { + if (eventInWidget(display, e) || signalDOMEvent(cm, e)) { return } + cm.state.pasteIncoming = true + input.focus() + }) + + // Prevent normal selection in the editor (we handle our own) + on(display.lineSpace, "selectstart", function (e) { + if (!eventInWidget(display, e)) { e_preventDefault(e) } + }) + + on(te, "compositionstart", function () { + var start = cm.getCursor("from") + if (input.composing) { input.composing.range.clear() } + input.composing = { + start: start, + range: cm.markText(start, cm.getCursor("to"), {className: "CodeMirror-composing"}) + } + }) + on(te, "compositionend", function () { + if (input.composing) { + input.poll() + 
input.composing.range.clear() + input.composing = null + } + }) +}; + +TextareaInput.prototype.prepareSelection = function () { + // Redraw the selection and/or cursor + var cm = this.cm, display = cm.display, doc = cm.doc + var result = prepareSelection(cm) + + // Move the hidden textarea near the cursor to prevent scrolling artifacts + if (cm.options.moveInputWithCursor) { + var headPos = cursorCoords(cm, doc.sel.primary().head, "div") + var wrapOff = display.wrapper.getBoundingClientRect(), lineOff = display.lineDiv.getBoundingClientRect() + result.teTop = Math.max(0, Math.min(display.wrapper.clientHeight - 10, + headPos.top + lineOff.top - wrapOff.top)) + result.teLeft = Math.max(0, Math.min(display.wrapper.clientWidth - 10, + headPos.left + lineOff.left - wrapOff.left)) + } + + return result +}; + +TextareaInput.prototype.showSelection = function (drawn) { + var cm = this.cm, display = cm.display + removeChildrenAndAdd(display.cursorDiv, drawn.cursors) + removeChildrenAndAdd(display.selectionDiv, drawn.selection) + if (drawn.teTop != null) { + this.wrapper.style.top = drawn.teTop + "px" + this.wrapper.style.left = drawn.teLeft + "px" + } +}; + +// Reset the input to correspond to the selection (or to be empty, +// when not typing and nothing is selected) +TextareaInput.prototype.reset = function (typing) { + if (this.contextMenuPending) { return } + var minimal, selected, cm = this.cm, doc = cm.doc + if (cm.somethingSelected()) { + this.prevInput = "" + var range = doc.sel.primary() + minimal = hasCopyEvent && + (range.to().line - range.from().line > 100 || (selected = cm.getSelection()).length > 1000) + var content = minimal ? "-" : selected || cm.getSelection() + this.textarea.value = content + if (cm.state.focused) { selectInput(this.textarea) } + if (ie && ie_version >= 9) { this.hasSelection = content } + } else if (!typing) { + this.prevInput = this.textarea.value = "" + if (ie && ie_version >= 9) { this.hasSelection = null } + } + this.inaccurateSelection = minimal +}; + +TextareaInput.prototype.getField = function () { return this.textarea }; + +TextareaInput.prototype.supportsTouch = function () { return false }; + +TextareaInput.prototype.focus = function () { + if (this.cm.options.readOnly != "nocursor" && (!mobile || activeElt() != this.textarea)) { + try { this.textarea.focus() } + catch (e) {} // IE8 will throw if the textarea is display: none or not in DOM + } +}; + +TextareaInput.prototype.blur = function () { this.textarea.blur() }; + +TextareaInput.prototype.resetPosition = function () { + this.wrapper.style.top = this.wrapper.style.left = 0 +}; + +TextareaInput.prototype.receivedFocus = function () { this.slowPoll() }; + +// Poll for input changes, using the normal rate of polling. This +// runs as long as the editor is focused. +TextareaInput.prototype.slowPoll = function () { + var this$1 = this; + + if (this.pollingFast) { return } + this.polling.set(this.cm.options.pollInterval, function () { + this$1.poll() + if (this$1.cm.state.focused) { this$1.slowPoll() } + }) +}; + +// When an event has just come in that is likely to add or change +// something in the input textarea, we poll faster, to ensure that +// the change appears on the screen quickly. 
+TextareaInput.prototype.fastPoll = function () { + var missed = false, input = this + input.pollingFast = true + function p() { + var changed = input.poll() + if (!changed && !missed) {missed = true; input.polling.set(60, p)} + else {input.pollingFast = false; input.slowPoll()} + } + input.polling.set(20, p) +}; + +// Read input from the textarea, and update the document to match. +// When something is selected, it is present in the textarea, and +// selected (unless it is huge, in which case a placeholder is +// used). When nothing is selected, the cursor sits after previously +// seen text (can be empty), which is stored in prevInput (we must +// not reset the textarea when typing, because that breaks IME). +TextareaInput.prototype.poll = function () { + var this$1 = this; + + var cm = this.cm, input = this.textarea, prevInput = this.prevInput + // Since this is called a *lot*, try to bail out as cheaply as + // possible when it is clear that nothing happened. hasSelection + // will be the case when there is a lot of text in the textarea, + // in which case reading its value would be expensive. + if (this.contextMenuPending || !cm.state.focused || + (hasSelection(input) && !prevInput && !this.composing) || + cm.isReadOnly() || cm.options.disableInput || cm.state.keySeq) + { return false } + + var text = input.value + // If nothing changed, bail. + if (text == prevInput && !cm.somethingSelected()) { return false } + // Work around nonsensical selection resetting in IE9/10, and + // inexplicable appearance of private area unicode characters on + // some key combos in Mac (#2689). + if (ie && ie_version >= 9 && this.hasSelection === text || + mac && /[\uf700-\uf7ff]/.test(text)) { + cm.display.input.reset() + return false + } + + if (cm.doc.sel == cm.display.selForContextMenu) { + var first = text.charCodeAt(0) + if (first == 0x200b && !prevInput) { prevInput = "\u200b" } + if (first == 0x21da) { this.reset(); return this.cm.execCommand("undo") } + } + // Find the part of the input that is actually new + var same = 0, l = Math.min(prevInput.length, text.length) + while (same < l && prevInput.charCodeAt(same) == text.charCodeAt(same)) { ++same } + + runInOp(cm, function () { + applyTextInput(cm, text.slice(same), prevInput.length - same, + null, this$1.composing ? "*compose" : null) + + // Don't leave long text in the textarea, since it makes further polling slow + if (text.length > 1000 || text.indexOf("\n") > -1) { input.value = this$1.prevInput = "" } + else { this$1.prevInput = text } + + if (this$1.composing) { + this$1.composing.range.clear() + this$1.composing.range = cm.markText(this$1.composing.start, cm.getCursor("to"), + {className: "CodeMirror-composing"}) + } + }) + return true +}; + +TextareaInput.prototype.ensurePolled = function () { + if (this.pollingFast && this.poll()) { this.pollingFast = false } +}; + +TextareaInput.prototype.onKeyPress = function () { + if (ie && ie_version >= 9) { this.hasSelection = null } + this.fastPoll() +}; + +TextareaInput.prototype.onContextMenu = function (e) { + var input = this, cm = input.cm, display = cm.display, te = input.textarea + var pos = posFromMouse(cm, e), scrollPos = display.scroller.scrollTop + if (!pos || presto) { return } // Opera is difficult. + + // Reset the current text selection only if the click is done outside of the selection + // and 'resetSelectionOnContextMenu' option is true. 
+ var reset = cm.options.resetSelectionOnContextMenu + if (reset && cm.doc.sel.contains(pos) == -1) + { operation(cm, setSelection)(cm.doc, simpleSelection(pos), sel_dontScroll) } + + var oldCSS = te.style.cssText, oldWrapperCSS = input.wrapper.style.cssText + input.wrapper.style.cssText = "position: absolute" + var wrapperBox = input.wrapper.getBoundingClientRect() + te.style.cssText = "position: absolute; width: 30px; height: 30px;\n top: " + (e.clientY - wrapperBox.top - 5) + "px; left: " + (e.clientX - wrapperBox.left - 5) + "px;\n z-index: 1000; background: " + (ie ? "rgba(255, 255, 255, .05)" : "transparent") + ";\n outline: none; border-width: 0; outline: none; overflow: hidden; opacity: .05; filter: alpha(opacity=5);" + var oldScrollY + if (webkit) { oldScrollY = window.scrollY } // Work around Chrome issue (#2712) + display.input.focus() + if (webkit) { window.scrollTo(null, oldScrollY) } + display.input.reset() + // Adds "Select all" to context menu in FF + if (!cm.somethingSelected()) { te.value = input.prevInput = " " } + input.contextMenuPending = true + display.selForContextMenu = cm.doc.sel + clearTimeout(display.detectingSelectAll) + + // Select-all will be greyed out if there's nothing to select, so + // this adds a zero-width space so that we can later check whether + // it got selected. + function prepareSelectAllHack() { + if (te.selectionStart != null) { + var selected = cm.somethingSelected() + var extval = "\u200b" + (selected ? te.value : "") + te.value = "\u21da" // Used to catch context-menu undo + te.value = extval + input.prevInput = selected ? "" : "\u200b" + te.selectionStart = 1; te.selectionEnd = extval.length + // Re-set this, in case some other handler touched the + // selection in the meantime. + display.selForContextMenu = cm.doc.sel + } + } + function rehide() { + input.contextMenuPending = false + input.wrapper.style.cssText = oldWrapperCSS + te.style.cssText = oldCSS + if (ie && ie_version < 9) { display.scrollbars.setScrollTop(display.scroller.scrollTop = scrollPos) } + + // Try to detect the user choosing select-all + if (te.selectionStart != null) { + if (!ie || (ie && ie_version < 9)) { prepareSelectAllHack() } + var i = 0, poll = function () { + if (display.selForContextMenu == cm.doc.sel && te.selectionStart == 0 && + te.selectionEnd > 0 && input.prevInput == "\u200b") { + operation(cm, selectAll)(cm) + } else if (i++ < 10) { + display.detectingSelectAll = setTimeout(poll, 500) + } else { + display.selForContextMenu = null + display.input.reset() + } + } + display.detectingSelectAll = setTimeout(poll, 200) + } + } + + if (ie && ie_version >= 9) { prepareSelectAllHack() } + if (captureRightClick) { + e_stop(e) + var mouseup = function () { + off(window, "mouseup", mouseup) + setTimeout(rehide, 20) + } + on(window, "mouseup", mouseup) + } else { + setTimeout(rehide, 50) + } +}; + +TextareaInput.prototype.readOnlyChanged = function (val) { + if (!val) { this.reset() } +}; + +TextareaInput.prototype.setUneditable = function () {}; + +TextareaInput.prototype.needsContentAttribute = false + +function fromTextArea(textarea, options) { + options = options ? copyObj(options) : {} + options.value = textarea.value + if (!options.tabindex && textarea.tabIndex) + { options.tabindex = textarea.tabIndex } + if (!options.placeholder && textarea.placeholder) + { options.placeholder = textarea.placeholder } + // Set autofocus to true if this textarea is focused, or if it has + // autofocus and no other element is focused. 
+ if (options.autofocus == null) { + var hasFocus = activeElt() + options.autofocus = hasFocus == textarea || + textarea.getAttribute("autofocus") != null && hasFocus == document.body + } + + function save() {textarea.value = cm.getValue()} + + var realSubmit + if (textarea.form) { + on(textarea.form, "submit", save) + // Deplorable hack to make the submit method do the right thing. + if (!options.leaveSubmitMethodAlone) { + var form = textarea.form + realSubmit = form.submit + try { + var wrappedSubmit = form.submit = function () { + save() + form.submit = realSubmit + form.submit() + form.submit = wrappedSubmit + } + } catch(e) {} + } + } + + options.finishInit = function (cm) { + cm.save = save + cm.getTextArea = function () { return textarea; } + cm.toTextArea = function () { + cm.toTextArea = isNaN // Prevent this from being ran twice + save() + textarea.parentNode.removeChild(cm.getWrapperElement()) + textarea.style.display = "" + if (textarea.form) { + off(textarea.form, "submit", save) + if (typeof textarea.form.submit == "function") + { textarea.form.submit = realSubmit } + } + } + } + + textarea.style.display = "none" + var cm = CodeMirror(function (node) { return textarea.parentNode.insertBefore(node, textarea.nextSibling); }, + options) + return cm +} + +function addLegacyProps(CodeMirror) { + CodeMirror.off = off + CodeMirror.on = on + CodeMirror.wheelEventPixels = wheelEventPixels + CodeMirror.Doc = Doc + CodeMirror.splitLines = splitLinesAuto + CodeMirror.countColumn = countColumn + CodeMirror.findColumn = findColumn + CodeMirror.isWordChar = isWordCharBasic + CodeMirror.Pass = Pass + CodeMirror.signal = signal + CodeMirror.Line = Line + CodeMirror.changeEnd = changeEnd + CodeMirror.scrollbarModel = scrollbarModel + CodeMirror.Pos = Pos + CodeMirror.cmpPos = cmp + CodeMirror.modes = modes + CodeMirror.mimeModes = mimeModes + CodeMirror.resolveMode = resolveMode + CodeMirror.getMode = getMode + CodeMirror.modeExtensions = modeExtensions + CodeMirror.extendMode = extendMode + CodeMirror.copyState = copyState + CodeMirror.startState = startState + CodeMirror.innerMode = innerMode + CodeMirror.commands = commands + CodeMirror.keyMap = keyMap + CodeMirror.keyName = keyName + CodeMirror.isModifierKey = isModifierKey + CodeMirror.lookupKey = lookupKey + CodeMirror.normalizeKeyMap = normalizeKeyMap + CodeMirror.StringStream = StringStream + CodeMirror.SharedTextMarker = SharedTextMarker + CodeMirror.TextMarker = TextMarker + CodeMirror.LineWidget = LineWidget + CodeMirror.e_preventDefault = e_preventDefault + CodeMirror.e_stopPropagation = e_stopPropagation + CodeMirror.e_stop = e_stop + CodeMirror.addClass = addClass + CodeMirror.contains = contains + CodeMirror.rmClass = rmClass + CodeMirror.keyNames = keyNames +} + +// EDITOR CONSTRUCTOR + +defineOptions(CodeMirror) + +addEditorMethods(CodeMirror) + +// Set up methods on CodeMirror's prototype to redirect to the editor's document. 
+var dontDelegate = "iter insert remove copy getEditor constructor".split(" ") +for (var prop in Doc.prototype) { if (Doc.prototype.hasOwnProperty(prop) && indexOf(dontDelegate, prop) < 0) + { CodeMirror.prototype[prop] = (function(method) { + return function() {return method.apply(this.doc, arguments)} + })(Doc.prototype[prop]) } } + +eventMixin(Doc) + +// INPUT HANDLING + +CodeMirror.inputStyles = {"textarea": TextareaInput, "contenteditable": ContentEditableInput} + +// MODE DEFINITION AND QUERYING + +// Extra arguments are stored as the mode's dependencies, which is +// used by (legacy) mechanisms like loadmode.js to automatically +// load a mode. (Preferred mechanism is the require/define calls.) +CodeMirror.defineMode = function(name/*, mode, …*/) { + if (!CodeMirror.defaults.mode && name != "null") { CodeMirror.defaults.mode = name } + defineMode.apply(this, arguments) +} + +CodeMirror.defineMIME = defineMIME + +// Minimal default mode. +CodeMirror.defineMode("null", function () { return ({token: function (stream) { return stream.skipToEnd(); }}); }) +CodeMirror.defineMIME("text/plain", "null") + +// EXTENSIONS + +CodeMirror.defineExtension = function (name, func) { + CodeMirror.prototype[name] = func +} +CodeMirror.defineDocExtension = function (name, func) { + Doc.prototype[name] = func +} + +CodeMirror.fromTextArea = fromTextArea + +addLegacyProps(CodeMirror) + +CodeMirror.version = "5.24.2" + +return CodeMirror; + +}))); \ No newline at end of file diff --git a/docs/archive/1.0/sql/tutorial/js/docs.min.js b/docs/archive/1.0/sql/tutorial/js/docs.min.js new file mode 100644 index 00000000000..5a494818504 --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/js/docs.min.js @@ -0,0 +1,26 @@ +/*! + +Holder - client side image placeholders +Version 2.6.0+51ebp +© 2015 Ivan Malopinsky - http://imsky.co + +Site: http://holderjs.com +Issues: https://github.com/imsky/holder/issues +License: http://opensource.org/licenses/MIT + +*/ +function AnchorJS(a){"use strict";function b(a){a.icon=a.hasOwnProperty("icon")?a.icon:"",a.visible=a.hasOwnProperty("visible")?a.visible:"hover",a.placement=a.hasOwnProperty("placement")?a.placement:"right",a["class"]=a.hasOwnProperty("class")?a["class"]:"",a.truncate=a.hasOwnProperty("truncate")?Math.floor(a.truncate):64}function c(a){var b;if("string"==typeof a||a instanceof String)b=[].slice.call(document.querySelectorAll(a));else{if(!(Array.isArray(a)||a instanceof NodeList))throw new Error("The selector provided to AnchorJS was invalid.");b=[].slice.call(a)}return b}function d(){if(null===document.head.querySelector("style.anchorjs")){var a,b=document.createElement("style"),c=" .anchorjs-link { opacity: 0; text-decoration: none; -webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; }",d=" *:hover > .anchorjs-link, .anchorjs-link:focus { opacity: 1; }",e=' @font-face { font-family: "anchorjs-icons"; font-style: normal; font-weight: normal; src: 
url(data:application/x-font-ttf;charset=utf-8;base64,AAEAAAALAIAAAwAwT1MvMg8SBTUAAAC8AAAAYGNtYXAWi9QdAAABHAAAAFRnYXNwAAAAEAAAAXAAAAAIZ2x5Zgq29TcAAAF4AAABNGhlYWQEZM3pAAACrAAAADZoaGVhBhUDxgAAAuQAAAAkaG10eASAADEAAAMIAAAAFGxvY2EAKACuAAADHAAAAAxtYXhwAAgAVwAAAygAAAAgbmFtZQ5yJ3cAAANIAAAB2nBvc3QAAwAAAAAFJAAAACAAAwJAAZAABQAAApkCzAAAAI8CmQLMAAAB6wAzAQkAAAAAAAAAAAAAAAAAAAABEAAAAAAAAAAAAAAAAAAAAABAAADpywPA/8AAQAPAAEAAAAABAAAAAAAAAAAAAAAgAAAAAAADAAAAAwAAABwAAQADAAAAHAADAAEAAAAcAAQAOAAAAAoACAACAAIAAQAg6cv//f//AAAAAAAg6cv//f//AAH/4xY5AAMAAQAAAAAAAAAAAAAAAQAB//8ADwABAAAAAAAAAAAAAgAANzkBAAAAAAEAAAAAAAAAAAACAAA3OQEAAAAAAQAAAAAAAAAAAAIAADc5AQAAAAACADEARAJTAsAAKwBUAAABIiYnJjQ/AT4BMzIWFxYUDwEGIicmND8BNjQnLgEjIgYPAQYUFxYUBw4BIwciJicmND8BNjIXFhQPAQYUFx4BMzI2PwE2NCcmNDc2MhcWFA8BDgEjARQGDAUtLXoWOR8fORYtLTgKGwoKCjgaGg0gEhIgDXoaGgkJBQwHdR85Fi0tOAobCgoKOBoaDSASEiANehoaCQkKGwotLXoWOR8BMwUFLYEuehYXFxYugC44CQkKGwo4GkoaDQ0NDXoaShoKGwoFBe8XFi6ALjgJCQobCjgaShoNDQ0NehpKGgobCgoKLYEuehYXAAEAAAABAACiToc1Xw889QALBAAAAAAA0XnFFgAAAADRecUWAAAAAAJTAsAAAAAIAAIAAAAAAAAAAQAAA8D/wAAABAAAAAAAAlMAAQAAAAAAAAAAAAAAAAAAAAUAAAAAAAAAAAAAAAACAAAAAoAAMQAAAAAACgAUAB4AmgABAAAABQBVAAIAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAAADgCuAAEAAAAAAAEADgAAAAEAAAAAAAIABwCfAAEAAAAAAAMADgBLAAEAAAAAAAQADgC0AAEAAAAAAAUACwAqAAEAAAAAAAYADgB1AAEAAAAAAAoAGgDeAAMAAQQJAAEAHAAOAAMAAQQJAAIADgCmAAMAAQQJAAMAHABZAAMAAQQJAAQAHADCAAMAAQQJAAUAFgA1AAMAAQQJAAYAHACDAAMAAQQJAAoANAD4YW5jaG9yanMtaWNvbnMAYQBuAGMAaABvAHIAagBzAC0AaQBjAG8AbgBzVmVyc2lvbiAxLjAAVgBlAHIAcwBpAG8AbgAgADEALgAwYW5jaG9yanMtaWNvbnMAYQBuAGMAaABvAHIAagBzAC0AaQBjAG8AbgBzYW5jaG9yanMtaWNvbnMAYQBuAGMAaABvAHIAagBzAC0AaQBjAG8AbgBzUmVndWxhcgBSAGUAZwB1AGwAYQByYW5jaG9yanMtaWNvbnMAYQBuAGMAaABvAHIAagBzAC0AaQBjAG8AbgBzRm9udCBnZW5lcmF0ZWQgYnkgSWNvTW9vbi4ARgBvAG4AdAAgAGcAZQBuAGUAcgBhAHQAZQBkACAAYgB5ACAASQBjAG8ATQBvAG8AbgAuAAAAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==) format("truetype"); }',f=" [data-anchorjs-icon]::after { content: attr(data-anchorjs-icon); }";b.className="anchorjs",b.appendChild(document.createTextNode("")),a=document.head.querySelector('[rel="stylesheet"], style'),void 0===a?document.head.appendChild(b):document.head.insertBefore(b,a),b.sheet.insertRule(c,b.sheet.cssRules.length),b.sheet.insertRule(d,b.sheet.cssRules.length),b.sheet.insertRule(f,b.sheet.cssRules.length),b.sheet.insertRule(e,b.sheet.cssRules.length)}}this.options=a||{},this.elements=[],b(this.options),this.isTouchDevice=function(){return!!("ontouchstart"in window||window.DocumentTouch&&document instanceof DocumentTouch)},this.add=function(a){var e,f,g,h,i,j,k,l,m,n,o,p,q=[];if(b(this.options),p=this.options.visible,"touch"===p&&(p=this.isTouchDevice()?"always":"hover"),a||(a="h1, h2, h3, h4, h5, h6"),e=c(a),0===e.length)return!1;for(d(),f=document.querySelectorAll("[id]"),g=[].map.call(f,function(a){return a.id}),i=0;i-1,c=(" "+a.lastChild.className+" ").indexOf(" anchorjs-link ")>-1;return b||c}}!function(a,b){"object"==typeof exports&&"object"==typeof module?module.exports=b():"function"==typeof define&&define.amd?define(b):"object"==typeof exports?exports.Holder=b():a.Holder=b()}(this,function(){return function(a){function b(d){if(c[d])return c[d].exports;var e=c[d]={exports:{},id:d,loaded:!1};return a[d].call(e.exports,e,e.exports,b),e.loaded=!0,e.exports}var c={};return b.m=a,b.c=c,b.p="",b(0)}([function(a,b,c){(function(b){function d(a,b,c,d){var g=e(c.substr(c.lastIndexOf(a.domain)),a);g&&f({mode:null,el:d,flags:g,engineSettings:b})}function e(a,b){for(var 
c={theme:y(K.settings.themes.gray,null),stylesheets:b.stylesheets,holderURL:[]},d=!1,e=String.fromCharCode(11),f=a.replace(/([^\\])\//g,"$1"+e).split(e),g=/%[0-9a-f]{2}/gi,h=f.length,i=0;h>i;i++){var j=f[i];if(j.match(g))try{j=decodeURIComponent(j)}catch(k){j=f[i]}var l=!1;if(K.flags.dimensions.match(j))d=!0,c.dimensions=K.flags.dimensions.output(j),l=!0;else if(K.flags.fluid.match(j))d=!0,c.dimensions=K.flags.fluid.output(j),c.fluid=!0,l=!0;else if(K.flags.textmode.match(j))c.textmode=K.flags.textmode.output(j),l=!0;else if(K.flags.colors.match(j)){var m=K.flags.colors.output(j);c.theme=y(c.theme,m),l=!0}else if(b.themes[j])b.themes.hasOwnProperty(j)&&(c.theme=y(b.themes[j],null)),l=!0;else if(K.flags.font.match(j))c.font=K.flags.font.output(j),l=!0;else if(K.flags.auto.match(j))c.auto=!0,l=!0;else if(K.flags.text.match(j))c.text=K.flags.text.output(j),l=!0;else if(K.flags.size.match(j))c.size=K.flags.size.output(j),l=!0;else if(K.flags.random.match(j)){null==K.vars.cache.themeKeys&&(K.vars.cache.themeKeys=Object.keys(b.themes));var n=K.vars.cache.themeKeys[0|Math.random()*K.vars.cache.themeKeys.length];c.theme=y(b.themes[n],null),l=!0}l&&c.holderURL.push(j)}return c.holderURL.unshift(b.domain),c.holderURL=c.holderURL.join("/"),!!d&&c}function f(a){var b=a.mode,c=a.el,d=a.flags,e=a.engineSettings,f=d.dimensions,h=d.theme,i=f.width+"x"+f.height;if(b=null==b?d.fluid?"fluid":"image":b,null!=d.text&&(h.text=d.text,"object"===c.nodeName.toLowerCase())){for(var l=h.text.split("\\n"),m=0;m1){var l=0,m=0,n=a.width*K.setup.lineWrapRatio,o=0;k=new e.Group("line"+o);for(var p=0;p=n||r===!0)&&(b(g,k,l,g.properties.leading),l=0,m+=g.properties.leading,o+=1,k=new e.Group("line"+o),k.y=m),r!==!0&&(j.moveTo(l,0),l+=h.spaceWidth+q.width,k.add(j))}b(g,k,l,g.properties.leading);for(var s in g.children)k=g.children[s],k.moveTo((g.width-k.width)/2,null,null);g.moveTo((a.width-g.width)/2,(a.height-g.height)/2,null),(a.height-g.height)/2<0&&g.moveTo(null,0,null)}else j=new e.Text(a.text),k=new e.Group("line0"),k.add(j),g.add(k),g.moveTo((a.width-h.boundingBox.width)/2,(a.height-h.boundingBox.height)/2,null);return d}function i(a,b,c){var d=parseInt(a,10),e=parseInt(b,10),f=Math.max(d,e),g=Math.min(d,e),h=.8*Math.min(g,f*K.defaults.scale);return Math.round(Math.max(c,h))}function j(a){var b;b=null==a||null==a.nodeType?K.vars.resizableImages:[a];for(var c=0,d=b.length;d>c;c++){var e=b[c];if(e.holderData){var f=e.holderData.flags,h=E(e);if(h){if(!e.holderData.resizeUpdate)continue;if(f.fluid&&f.auto){var i=e.holderData.fluidConfig;switch(i.mode){case"width":h.height=h.width/i.ratio;break;case"height":h.width=h.height*i.ratio}}var j={mode:"image",holderSettings:{dimensions:h,theme:f.theme,flags:f},el:e,engineSettings:e.holderData.engineSettings};"exact"==f.textmode&&(f.exactDimensions=h,j.holderSettings.dimensions=f.dimensions),g(j)}else n(e)}}}function k(a){if(a.holderData){var b=E(a);if(b){var c=a.holderData.flags,d={fluidHeight:"%"==c.dimensions.height.slice(-1),fluidWidth:"%"==c.dimensions.width.slice(-1),mode:null,initialDimensions:b};d.fluidWidth&&!d.fluidHeight?(d.mode="width",d.ratio=d.initialDimensions.width/parseFloat(c.dimensions.height)):!d.fluidWidth&&d.fluidHeight&&(d.mode="height",d.ratio=parseFloat(c.dimensions.width)/d.initialDimensions.height),a.holderData.fluidConfig=d}else n(a)}}function l(){for(var a,c=[],d=Object.keys(K.vars.invisibleImages),e=0,f=d.length;f>e;e++)a=K.vars.invisibleImages[d[e]],E(a)&&"img"==a.nodeName.toLowerCase()&&(c.push(a),delete 
K.vars.invisibleImages[d[e]]);c.length&&J.run({images:c}),b.requestAnimationFrame(l)}function m(){K.vars.visibilityCheckStarted||(b.requestAnimationFrame(l),K.vars.visibilityCheckStarted=!0)}function n(a){a.holderData.invisibleId||(K.vars.invisibleId+=1,K.vars.invisibleImages["i"+K.vars.invisibleId]=a,a.holderData.invisibleId=K.vars.invisibleId)}function o(a,b){return null==b?document.createElement(a):document.createElementNS(b,a)}function p(a,b){for(var c in b)a.setAttribute(c,b[c])}function q(a,b,c){var d,e;null==a?(a=o("svg",F),d=o("defs",F),e=o("style",F),p(e,{type:"text/css"}),d.appendChild(e),a.appendChild(d)):e=a.querySelector("style"),a.webkitMatchesSelector&&a.setAttribute("xmlns",F);for(var f=0;f=0;h--){var i=g.createProcessingInstruction("xml-stylesheet",'href="'+f[h]+'" rel="stylesheet"');g.insertBefore(i,g.firstChild)}var j=g.createProcessingInstruction("xml",'version="1.0" encoding="UTF-8" standalone="yes"');g.insertBefore(j,g.firstChild),g.removeChild(g.documentElement),e=d.serializeToString(g)}var k=d.serializeToString(a);return k=k.replace(/\&(\#[0-9]{2,}\;)/g,"&$1"),e+k}}function s(){return b.DOMParser?(new DOMParser).parseFromString("","application/xml"):void 0}function t(a){K.vars.debounceTimer||a.call(this),K.vars.debounceTimer&&b.clearTimeout(K.vars.debounceTimer),K.vars.debounceTimer=b.setTimeout(function(){K.vars.debounceTimer=null,a.call(this)},K.setup.debounce)}function u(){t(function(){j(null)})}var v=c(1),w=c(2),x=c(3),y=x.extend,z=x.cssProps,A=x.encodeHtmlEntity,B=x.decodeHtmlEntity,C=x.imageExists,D=x.getNodeArray,E=x.dimensionCheck,F="http://www.w3.org/2000/svg",G=8,H="2.6.0",I="\nCreated with Holder.js "+H+".\nLearn more at http://holderjs.com\n(c) 2012-2015 Ivan Malopinsky - http://imsky.co\n",J={version:H,addTheme:function(a,b){return null!=a&&null!=b&&(K.settings.themes[a]=b),delete K.vars.cache.themeKeys,this},addImage:function(a,b){var c=document.querySelectorAll(b);if(c.length)for(var d=0,e=c.length;e>d;d++){var f=o("img"),g={};g[K.vars.dataAttr]=a,p(f,g),c[d].appendChild(f)}return this},setResizeUpdate:function(a,b){a.holderData&&(a.holderData.resizeUpdate=!!b,a.holderData.resizeUpdate&&j(a))},run:function(a){a=a||{};var c={},g=y(K.settings,a);K.vars.preempted=!0,K.vars.dataAttr=g.dataAttr||K.vars.dataAttr,c.renderer=g.renderer?g.renderer:K.setup.renderer,-1===K.setup.renderers.join(",").indexOf(c.renderer)&&(c.renderer=K.setup.supportsSVG?"svg":K.setup.supportsCanvas?"canvas":"html");var h=D(g.images),i=D(g.bgnodes),j=D(g.stylenodes),k=D(g.objects);c.stylesheets=[],c.svgXMLStylesheet=!0,c.noFontFallback=!!g.noFontFallback&&g.noFontFallback;for(var l=0;l1){c.nodeValue="";for(var u=0;u=0?b:1)}function f(a){v?e(a):w.push(a)}null==document.readyState&&document.addEventListener&&(document.addEventListener("DOMContentLoaded",function y(){document.removeEventListener("DOMContentLoaded",y,!1),document.readyState="complete"},!1),document.readyState="loading");var g=a.document,h=g.documentElement,i="load",j=!1,k="on"+i,l="complete",m="readyState",n="attachEvent",o="detachEvent",p="addEventListener",q="DOMContentLoaded",r="onreadystatechange",s="removeEventListener",t=p in g,u=j,v=j,w=[];if(g[m]===l)e(b);else if(t)g[p](q,c,j),a[p](i,c,j);else{g[n](r,c),a[n](k,c);try{u=null==a.frameElement&&h}catch(x){}u&&u.doScroll&&!function z(){if(!v){try{u.doScroll("left")}catch(a){return e(z,50)}d(),b()}}()}return f.version="1.4.0",f.isReady=function(){return v},f}a.exports="undefined"!=typeof window&&b(window)},function(a,b,c){var d=c(4),e=function(a){function 
b(a,b){for(var c in b)a[c]=b[c];return a}var c=1,e=d.defclass({constructor:function(a){c++,this.parent=null,this.children={},this.id=c,this.name="n"+c,null!=a&&(this.name=a),this.x=0,this.y=0,this.z=0,this.width=0,this.height=0},resize:function(a,b){null!=a&&(this.width=a),null!=b&&(this.height=b)},moveTo:function(a,b,c){this.x=null!=a?a:this.x,this.y=null!=b?b:this.y,this.z=null!=c?c:this.z},add:function(a){var b=a.name;if(null!=this.children[b])throw"SceneGraph: child with that name already exists: "+b;this.children[b]=a,a.parent=this}}),f=d(e,function(b){this.constructor=function(){b.constructor.call(this,"root"),this.properties=a}}),g=d(e,function(a){function c(c,d){if(a.constructor.call(this,c),this.properties={fill:"#000"},null!=d)b(this.properties,d);else if(null!=c&&"string"!=typeof c)throw"SceneGraph: invalid node name"}this.Group=d.extend(this,{constructor:c,type:"group"}),this.Rect=d.extend(this,{constructor:c,type:"rect"}),this.Text=d.extend(this,{constructor:function(a){c.call(this),this.properties.text=a},type:"text"})}),h=new f;return this.Shape=g,this.root=h,this};a.exports=e},function(a,b){(function(a){b.extend=function(a,b){var c={};for(var d in a)a.hasOwnProperty(d)&&(c[d]=a[d]);if(null!=b)for(var e in b)b.hasOwnProperty(e)&&(c[e]=b[e]);return c},b.cssProps=function(a){var b=[];for(var c in a)a.hasOwnProperty(c)&&b.push(c+":"+a[c]);return b.join(";")},b.encodeHtmlEntity=function(a){for(var b=[],c=0,d=a.length-1;d>=0;d--)c=a.charCodeAt(d),b.unshift(c>128?["&#",c,";"].join(""):a[d]);return b.join("")},b.getNodeArray=function(b){var c=null;return"string"==typeof b?c=document.querySelectorAll(b):a.NodeList&&b instanceof a.NodeList?c=b:a.Node&&b instanceof a.Node?c=[b]:a.HTMLCollection&&b instanceof a.HTMLCollection?c=b:b instanceof Array?c=b:null===b&&(c=[]),c},b.imageExists=function(a,b){var c=new Image;c.onerror=function(){b.call(this,!1)},c.onload=function(){b.call(this,!0)},c.src=a},b.decodeHtmlEntity=function(a){return a.replace(/&#(\d+);/g,function(a,b){return String.fromCharCode(b)})},b.dimensionCheck=function(a){var b={height:a.clientHeight,width:a.clientWidth};return!(!b.height||!b.width)&&b}}).call(b,function(){return this}())},function(a){var b=function(){},c=Array.prototype.slice,d=function(a,d){var e=b.prototype="function"==typeof a?a.prototype:a,f=new b,g=d.apply(f,c.call(arguments,2).concat(e));if("object"==typeof g)for(var h in g)f[h]=g[h];if(!f.hasOwnProperty("constructor"))return f;var i=f.constructor;return i.prototype=f,i};d.defclass=function(a){var b=a.constructor;return b.prototype=a,b},d.extend=function(a,b){return d(a,function(a){return this.uber=a,b})},a.exports=d}])}),/*! +* ZeroClipboard +* The ZeroClipboard library provides an easy way to copy text to the clipboard using an invisible Adobe Flash movie and a JavaScript interface. +* Copyright (c) 2014 Jon Rohan, James M. 
Greene +* Licensed MIT +* http://zeroclipboard.org/ +* v1.3.5 +*/ +!function(a){"use strict";function b(a){return a.replace(/,/g,".").replace(/[^0-9\.]/g,"")}function c(a){return parseFloat(b(a))>=10}var d,e={bridge:null,version:"0.0.0",disabled:null,outdated:null,ready:null},f={},g=0,h={},i=0,j={},k=null,l=null,m=function(){var a,b,c,d,e="ZeroClipboard.swf";if(document.currentScript&&(d=document.currentScript.src));else{var f=document.getElementsByTagName("script");if("readyState"in f[0])for(a=f.length;a--&&("interactive"!==f[a].readyState||!(d=f[a].src)););else if("loading"===document.readyState)d=f[f.length-1].src;else{for(a=f.length;a--;){if(c=f[a].src,!c){b=null;break}if(c=c.split("#")[0].split("?")[0],c=c.slice(0,c.lastIndexOf("/")+1),null==b)b=c;else if(b!==c){b=null;break}}null!==b&&(d=b)}}return d&&(d=d.split("#")[0].split("?")[0],e=d.slice(0,d.lastIndexOf("/")+1)+e),e}(),n=function(){var a=/\-([a-z])/g,b=function(a,b){return b.toUpperCase()};return function(c){return c.replace(a,b)}}(),o=function(b,c){var d,e,f;return a.getComputedStyle?d=a.getComputedStyle(b,null).getPropertyValue(c):(e=n(c),d=b.currentStyle?b.currentStyle[e]:b.style[e]),"cursor"!==c||d&&"auto"!==d||(f=b.tagName.toLowerCase(),"a"!==f)?d:"pointer"},p=function(b){b||(b=a.event);var c;this!==a?c=this:b.target?c=b.target:b.srcElement&&(c=b.srcElement),K.activate(c)},q=function(a,b,c){a&&1===a.nodeType&&(a.addEventListener?a.addEventListener(b,c,!1):a.attachEvent&&a.attachEvent("on"+b,c))},r=function(a,b,c){a&&1===a.nodeType&&(a.removeEventListener?a.removeEventListener(b,c,!1):a.detachEvent&&a.detachEvent("on"+b,c))},s=function(a,b){if(!a||1!==a.nodeType)return a;if(a.classList)return a.classList.contains(b)||a.classList.add(b),a;if(b&&"string"==typeof b){var c=(b||"").split(/\s+/);if(1===a.nodeType)if(a.className){for(var d=" "+a.className+" ",e=a.className,f=0,g=c.length;g>f;f++)d.indexOf(" "+c[f]+" ")<0&&(e+=" "+c[f]);a.className=e.replace(/^\s+|\s+$/g,"")}else a.className=b}return a},t=function(a,b){if(!a||1!==a.nodeType)return a;if(a.classList)return a.classList.contains(b)&&a.classList.remove(b),a;if(b&&"string"==typeof b||void 0===b){var c=(b||"").split(/\s+/);if(1===a.nodeType&&a.className)if(b){for(var d=(" "+a.className+" ").replace(/[\n\t]/g," "),e=0,f=c.length;f>e;e++)d=d.replace(" "+c[e]+" "," ");a.className=d.replace(/^\s+|\s+$/g,"")}else a.className=""}return a},u=function(){var a,b,c,d=1;return"function"==typeof document.body.getBoundingClientRect&&(a=document.body.getBoundingClientRect(),b=a.right-a.left,c=document.body.offsetWidth,d=Math.round(b/c*100)/100),d},v=function(b,c){var d={left:0,top:0,width:0,height:0,zIndex:B(c)-1};if(b.getBoundingClientRect){var e,f,g,h=b.getBoundingClientRect();"pageXOffset"in a&&"pageYOffset"in a?(e=a.pageXOffset,f=a.pageYOffset):(g=u(),e=Math.round(document.documentElement.scrollLeft/g),f=Math.round(document.documentElement.scrollTop/g));var i=document.documentElement.clientLeft||0,j=document.documentElement.clientTop||0;d.left=h.left+e-i,d.top=h.top+f-j,d.width="width"in h?h.width:h.right-h.left,d.height="height"in h?h.height:h.bottom-h.top}return d},w=function(a,b){var c=null==b||b&&b.cacheBust===!0&&b.useNoCache===!0;return c?(-1===a.indexOf("?")?"?":"&")+"noCache="+(new Date).getTime():""},x=function(b){var c,d,e,f=[],g=[],h=[];if(b.trustedOrigins&&("string"==typeof b.trustedOrigins?g.push(b.trustedOrigins):"object"==typeof b.trustedOrigins&&"length"in b.trustedOrigins&&(g=g.concat(b.trustedOrigins))),b.trustedDomains&&("string"==typeof 
b.trustedDomains?g.push(b.trustedDomains):"object"==typeof b.trustedDomains&&"length"in b.trustedDomains&&(g=g.concat(b.trustedDomains))),g.length)for(c=0,d=g.length;d>c;c++)if(g.hasOwnProperty(c)&&g[c]&&"string"==typeof g[c]){if(e=E(g[c]),!e)continue;if("*"===e){h=[e];break}h.push.apply(h,[e,"//"+e,a.location.protocol+"//"+e])}return h.length&&f.push("trustedOrigins="+encodeURIComponent(h.join(","))),"string"==typeof b.jsModuleId&&b.jsModuleId&&f.push("jsModuleId="+encodeURIComponent(b.jsModuleId)),f.join("&")},y=function(a,b,c){if("function"==typeof b.indexOf)return b.indexOf(a,c);var d,e=b.length;for("undefined"==typeof c?c=0:0>c&&(c=e+c),d=c;e>d;d++)if(b.hasOwnProperty(d)&&b[d]===a)return d;return-1},z=function(a){if("string"==typeof a)throw new TypeError("ZeroClipboard doesn't accept query strings.");return a.length?a:[a]},A=function(b,c,d,e){e?a.setTimeout(function(){b.apply(c,d)},0):b.apply(c,d)},B=function(a){var b,c;return a&&("number"==typeof a&&a>0?b=a:"string"==typeof a&&(c=parseInt(a,10))&&!isNaN(c)&&c>0&&(b=c)),b||("number"==typeof N.zIndex&&N.zIndex>0?b=N.zIndex:"string"==typeof N.zIndex&&(c=parseInt(N.zIndex,10))&&!isNaN(c)&&c>0&&(b=c)),b||0},C=function(a,b){if(a&&b!==!1&&"undefined"!=typeof console&&console&&(console.warn||console.log)){var c="`"+a+"` is deprecated. See docs for more info:\n https://github.com/zeroclipboard/zeroclipboard/blob/master/docs/instructions.md#deprecations";console.warn?console.warn(c):console.log(c)}},D=function(){var a,b,c,d,e,f,g=arguments[0]||{};for(a=1,b=arguments.length;b>a;a++)if(null!=(c=arguments[a]))for(d in c)if(c.hasOwnProperty(d)){if(e=g[d],f=c[d],g===f)continue;void 0!==f&&(g[d]=f)}return g},E=function(a){if(null==a||""===a)return null;if(a=a.replace(/^\s+|\s+$/g,""),""===a)return null;var b=a.indexOf("//");a=-1===b?a:a.slice(b+2);var c=a.indexOf("/");return a=-1===c?a:-1===b||0===c?null:a.slice(0,c),a&&".swf"===a.slice(-4).toLowerCase()?null:a||null},F=function(){var a=function(a,b){var c,d,e;if(null!=a&&"*"!==b[0]&&("string"==typeof a&&(a=[a]),"object"==typeof a&&"length"in a))for(c=0,d=a.length;d>c;c++)if(a.hasOwnProperty(c)&&(e=E(a[c]))){if("*"===e){b.length=0,b.push("*");break}-1===y(e,b)&&b.push(e)}},b={always:"always",samedomain:"sameDomain",never:"never"};return function(c,d){var e,f=d.allowScriptAccess;if("string"==typeof f&&(e=f.toLowerCase())&&/^always|samedomain|never$/.test(e))return b[e];var g=E(d.moviePath);null===g&&(g=c);var h=[];a(d.trustedOrigins,h),a(d.trustedDomains,h);var i=h.length;if(i>0){if(1===i&&"*"===h[0])return"always";if(-1!==y(c,h))return 1===i&&c===g?"sameDomain":"always"}return"never"}}(),G=function(a){if(null==a)return[];if(Object.keys)return Object.keys(a);var b=[];for(var c in a)a.hasOwnProperty(c)&&b.push(c);return b},H=function(a){if(a)for(var b in a)a.hasOwnProperty(b)&&delete a[b];return a},I=function(){try{return document.activeElement}catch(a){}return null},J=function(){var a=!1;if("boolean"==typeof e.disabled)a=e.disabled===!1;else{if("function"==typeof ActiveXObject)try{new ActiveXObject("ShockwaveFlash.ShockwaveFlash")&&(a=!0)}catch(b){}!a&&navigator.mimeTypes["application/x-shockwave-flash"]&&(a=!0)}return a},K=function(a,b){return this instanceof K?(this.id=""+g++,h[this.id]={instance:this,elements:[],handlers:{}},a&&this.clip(a),"undefined"!=typeof b&&(C("new ZeroClipboard(elements, options)",N.debug),K.config(b)),this.options=K.config(),"boolean"!=typeof e.disabled&&(e.disabled=!J()),void(e.disabled===!1&&e.outdated!==!0&&null===e.bridge&&(e.outdated=!1,e.ready=!1,O()))):new 
K(a,b)};K.prototype.setText=function(a){return a&&""!==a&&(f["text/plain"]=a,e.ready===!0&&e.bridge&&"function"==typeof e.bridge.setText?e.bridge.setText(a):e.ready=!1),this},K.prototype.setSize=function(a,b){return e.ready===!0&&e.bridge&&"function"==typeof e.bridge.setSize?e.bridge.setSize(a,b):e.ready=!1,this};var L=function(a){e.ready===!0&&e.bridge&&"function"==typeof e.bridge.setHandCursor?e.bridge.setHandCursor(a):e.ready=!1};K.prototype.destroy=function(){this.unclip(),this.off(),delete h[this.id]};var M=function(){var a,b,c,d=[],e=G(h);for(a=0,b=e.length;b>a;a++)c=h[e[a]].instance,c&&c instanceof K&&d.push(c);return d};K.version="1.3.5";var N={swfPath:m,trustedDomains:a.location.host?[a.location.host]:[],cacheBust:!0,forceHandCursor:!1,zIndex:999999999,debug:!0,title:null,autoActivate:!0};K.config=function(a){if("object"==typeof a&&null!==a&&D(N,a),"string"!=typeof a||!a){var b={};for(var c in N)N.hasOwnProperty(c)&&(b[c]="object"==typeof N[c]&&null!==N[c]?"length"in N[c]?N[c].slice(0):D({},N[c]):N[c]);return b}if(N.hasOwnProperty(a))return N[a]},K.destroy=function(){K.deactivate();for(var a in h)if(h.hasOwnProperty(a)&&h[a]){var b=h[a].instance;b&&"function"==typeof b.destroy&&b.destroy()}var c=P(e.bridge);c&&c.parentNode&&(c.parentNode.removeChild(c),e.ready=null,e.bridge=null)},K.activate=function(a){d&&(t(d,N.hoverClass),t(d,N.activeClass)),d=a,s(a,N.hoverClass),Q();var b=N.title||a.getAttribute("title");if(b){var c=P(e.bridge);c&&c.setAttribute("title",b)}var f=N.forceHandCursor===!0||"pointer"===o(a,"cursor");L(f)},K.deactivate=function(){var a=P(e.bridge);a&&(a.style.left="0px",a.style.top="-9999px",a.removeAttribute("title")),d&&(t(d,N.hoverClass),t(d,N.activeClass),d=null)};var O=function(){var b,c,d=document.getElementById("global-zeroclipboard-html-bridge");if(!d){var f=K.config();f.jsModuleId="string"==typeof k&&k||"string"==typeof l&&l||null;var g=F(a.location.host,N),h=x(f),i=N.moviePath+w(N.moviePath,N),j=' ';d=document.createElement("div"),d.id="global-zeroclipboard-html-bridge",d.setAttribute("class","global-zeroclipboard-container"),d.style.position="absolute",d.style.left="0px",d.style.top="-9999px",d.style.width="15px",d.style.height="15px",d.style.zIndex=""+B(N.zIndex),document.body.appendChild(d),d.innerHTML=j}b=document["global-zeroclipboard-flash-bridge"],b&&(c=b.length)&&(b=b[c-1]),e.bridge=b||d.children[0].lastElementChild},P=function(a){for(var b=/^OBJECT|EMBED$/,c=a&&a.parentNode;c&&b.test(c.nodeName)&&c.parentNode;)c=c.parentNode;return c||null},Q=function(){if(d){var a=v(d,N.zIndex),b=P(e.bridge);b&&(b.style.top=a.top+"px",b.style.left=a.left+"px",b.style.width=a.width+"px",b.style.height=a.height+"px",b.style.zIndex=a.zIndex+1),e.ready===!0&&e.bridge&&"function"==typeof e.bridge.setSize?e.bridge.setSize(a.width,a.height):e.ready=!1}return this};K.prototype.on=function(a,b){var c,d,f,g={},i=h[this.id]&&h[this.id].handlers;if("string"==typeof a&&a)f=a.toLowerCase().split(/\s+/);else if("object"==typeof a&&a&&"undefined"==typeof b)for(c in a)a.hasOwnProperty(c)&&"string"==typeof c&&c&&"function"==typeof a[c]&&this.on(c,a[c]);if(f&&f.length){for(c=0,d=f.length;d>c;c++)a=f[c].replace(/^on/,""),g[a]=!0,i[a]||(i[a]=[]),i[a].push(b);g.noflash&&e.disabled&&T.call(this,"noflash",{}),g.wrongflash&&e.outdated&&T.call(this,"wrongflash",{flashVersion:e.version}),g.load&&e.ready&&T.call(this,"load",{flashVersion:e.version})}return this},K.prototype.off=function(a,b){var c,d,e,f,g,i=h[this.id]&&h[this.id].handlers;if(0===arguments.length)f=G(i);else 
if("string"==typeof a&&a)f=a.split(/\s+/);else if("object"==typeof a&&a&&"undefined"==typeof b)for(c in a)a.hasOwnProperty(c)&&"string"==typeof c&&c&&"function"==typeof a[c]&&this.off(c,a[c]);if(f&&f.length)for(c=0,d=f.length;d>c;c++)if(a=f[c].toLowerCase().replace(/^on/,""),g=i[a],g&&g.length)if(b)for(e=y(b,g);-1!==e;)g.splice(e,1),e=y(b,g,e);else i[a].length=0;return this},K.prototype.handlers=function(a){var b,c=null,d=h[this.id]&&h[this.id].handlers;if(d){if("string"==typeof a&&a)return d[a]?d[a].slice(0):null;c={};for(b in d)d.hasOwnProperty(b)&&d[b]&&(c[b]=d[b].slice(0))}return c};var R=function(b,c,d,e){var f=h[this.id]&&h[this.id].handlers[b];if(f&&f.length){var g,i,j,k=c||this;for(g=0,i=f.length;i>g;g++)j=f[g],c=k,"string"==typeof j&&"function"==typeof a[j]&&(j=a[j]),"object"==typeof j&&j&&"function"==typeof j.handleEvent&&(c=j,j=j.handleEvent),"function"==typeof j&&A(j,c,d,e)}return this};K.prototype.clip=function(a){a=z(a);for(var b=0;bd;d++)f=h[c[d]].instance,f&&f instanceof K&&g.push(f);return g};N.hoverClass="zeroclipboard-is-hover",N.activeClass="zeroclipboard-is-active",N.trustedOrigins=null,N.allowScriptAccess=null,N.useNoCache=!0,N.moviePath="ZeroClipboard.swf",K.detectFlashSupport=function(){return C("ZeroClipboard.detectFlashSupport",N.debug),J()},K.dispatch=function(a,b){if("string"==typeof a&&a){var c=a.toLowerCase().replace(/^on/,"");if(c)for(var e=d&&N.autoActivate===!0?S(d):M(),f=0,g=e.length;g>f;f++)T.call(e[f],c,b)}},K.prototype.setHandCursor=function(a){return C("ZeroClipboard.prototype.setHandCursor",N.debug),a="boolean"==typeof a?a:!!a,L(a),N.forceHandCursor=a,this},K.prototype.reposition=function(){return C("ZeroClipboard.prototype.reposition",N.debug),Q()},K.prototype.receiveEvent=function(a,b){if(C("ZeroClipboard.prototype.receiveEvent",N.debug),"string"==typeof a&&a){var c=a.toLowerCase().replace(/^on/,"");c&&T.call(this,c,b)}},K.prototype.setCurrent=function(a){return C("ZeroClipboard.prototype.setCurrent",N.debug),K.activate(a),this},K.prototype.resetBridge=function(){return C("ZeroClipboard.prototype.resetBridge",N.debug),K.deactivate(),this},K.prototype.setTitle=function(a){if(C("ZeroClipboard.prototype.setTitle",N.debug),a=a||N.title||d&&d.getAttribute("title")){var b=P(e.bridge);b&&b.setAttribute("title",a)}return this},K.setDefaults=function(a){C("ZeroClipboard.setDefaults",N.debug),K.config(a)},K.prototype.addEventListener=function(a,b){return C("ZeroClipboard.prototype.addEventListener",N.debug),this.on(a,b)},K.prototype.removeEventListener=function(a,b){return C("ZeroClipboard.prototype.removeEventListener",N.debug),this.off(a,b)},K.prototype.ready=function(){return C("ZeroClipboard.prototype.ready",N.debug),e.ready===!0};var T=function(a,g){a=a.toLowerCase().replace(/^on/,"");var h=g&&g.flashVersion&&b(g.flashVersion)||null,i=d,j=!0;switch(a){case"load":if(h){if(!c(h))return void T.call(this,"onWrongFlash",{flashVersion:h});e.outdated=!1,e.ready=!0,e.version=h}break;case"wrongflash":h&&!c(h)&&(e.outdated=!0,e.ready=!1,e.version=h);break;case"mouseover":s(i,N.hoverClass);break;case"mouseout":N.autoActivate===!0&&K.deactivate();break;case"mousedown":s(i,N.activeClass);break;case"mouseup":t(i,N.activeClass);break;case"datarequested":if(i){var k=i.getAttribute("data-clipboard-target"),l=k?document.getElementById(k):null;if(l){var m=l.value||l.textContent||l.innerText;m&&this.setText(m)}else{var n=i.getAttribute("data-clipboard-text");n&&this.setText(n)}}j=!1;break;case"complete":H(f),i&&i!==I()&&i.focus&&i.focus()}var o=i,p=[this,g];return 
R.call(this,a,o,p,j)};"function"==typeof define&&define.amd?define(["require","exports","module"],function(a,b,c){return k=c&&c.id||null,K}):"object"==typeof module&&module&&"object"==typeof module.exports&&module.exports&&"function"==typeof a.require?(l=module.id||null,module.exports=K):a.ZeroClipboard=K}(function(){return this}());var anchors=new AnchorJS;/*! + * JavaScript for Bootstrap's docs (http://getbootstrap.com) + * Copyright 2011-2016 Twitter, Inc. + * Licensed under the Creative Commons Attribution 3.0 Unported License. For + * details, see https://creativecommons.org/licenses/by/3.0/. + */ +!function(a){"use strict";a(function(){var b=a(window),c=a(document.body);c.scrollspy({target:".bs-docs-sidebar"}),b.on("load",function(){c.scrollspy("refresh")}),a('.bs-docs-container [href="#"]').click(function(a){a.preventDefault()}),setTimeout(function(){var b=a(".bs-docs-sidebar");b.affix({offset:{top:function(){var c=b.offset().top,d=parseInt(b.children(0).css("margin-top"),10),e=a(".bs-docs-nav").height();return this.top=c-e-d},bottom:function(){return this.bottom=a(".bs-docs-footer").outerHeight(!0)}}})},100),setTimeout(function(){a(".bs-top").affix()},100),function(){var b=a("#bs-theme-stylesheet"),c=a(".bs-docs-theme-toggle"),d=function(){b.attr("href",b.attr("data-href")),c.text("Disable theme preview"),localStorage.setItem("previewTheme",!0)};localStorage.getItem("previewTheme")&&d(),c.click(function(){var a=b.attr("href");a&&0!==a.indexOf("data")?(b.attr("href",""),c.text("Preview theme"),localStorage.removeItem("previewTheme")):d()})}(),a(".tooltip-demo").tooltip({selector:'[data-toggle="tooltip"]',container:"body"}),a(".popover-demo").popover({selector:'[data-toggle="popover"]',container:"body"}),a(".tooltip-test").tooltip(),a(".popover-test").popover(),a(".bs-docs-popover").popover(),a("#loading-example-btn").on("click",function(){var b=a(this);b.button("loading"),setTimeout(function(){b.button("reset")},3e3)}),a("#exampleModal").on("show.bs.modal",function(b){var c=a(b.relatedTarget),d=c.data("whatever"),e=a(this);e.find(".modal-title").text("New message to "+d),e.find(".modal-body input").val(d)}),a(".bs-docs-activate-animated-progressbar").on("click",function(){a(this).siblings(".progress").find(".progress-bar-striped").toggleClass("active")}),ZeroClipboard.config({moviePath:"/assets/flash/ZeroClipboard.swf",hoverClass:"btn-clipboard-hover"}),a(".highlight").each(function(){var b='
Copy
';a(this).before(b)});var d=new ZeroClipboard(a(".btn-clipboard")),e=a("#global-zeroclipboard-html-bridge");d.on("load",function(){e.data("placement","top").attr("title","Copy to clipboard").tooltip(),d.on("dataRequested",function(b){var c=a(this).parent().nextAll(".highlight").first();b.setText(c.text())}),d.on("complete",function(){e.attr("title","Copied!").tooltip("fixTitle").tooltip("show").attr("title","Copy to clipboard").tooltip("fixTitle")})}),d.on("noflash wrongflash",function(){a(".zero-clipboard").remove(),ZeroClipboard.destroy()})})}(jQuery),function(){"use strict";anchors.options.placement="left",anchors.add(".bs-docs-section > h1, .bs-docs-section > h2, .bs-docs-section > h3, .bs-docs-section > h4, .bs-docs-section > h5")}(); \ No newline at end of file diff --git a/docs/archive/1.0/sql/tutorial/js/jquery.min.js b/docs/archive/1.0/sql/tutorial/js/jquery.min.js new file mode 100644 index 00000000000..e836475870d --- /dev/null +++ b/docs/archive/1.0/sql/tutorial/js/jquery.min.js @@ -0,0 +1,5 @@ +/*! jQuery v1.12.4 | (c) jQuery Foundation | jquery.org/license */ +!function(a,b){"object"==typeof module&&"object"==typeof module.exports?module.exports=a.document?b(a,!0):function(a){if(!a.document)throw new Error("jQuery requires a window with a document");return b(a)}:b(a)}("undefined"!=typeof window?window:this,function(a,b){var c=[],d=a.document,e=c.slice,f=c.concat,g=c.push,h=c.indexOf,i={},j=i.toString,k=i.hasOwnProperty,l={},m="1.12.4",n=function(a,b){return new n.fn.init(a,b)},o=/^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g,p=/^-ms-/,q=/-([\da-z])/gi,r=function(a,b){return b.toUpperCase()};n.fn=n.prototype={jquery:m,constructor:n,selector:"",length:0,toArray:function(){return e.call(this)},get:function(a){return null!=a?0>a?this[a+this.length]:this[a]:e.call(this)},pushStack:function(a){var b=n.merge(this.constructor(),a);return b.prevObject=this,b.context=this.context,b},each:function(a){return n.each(this,a)},map:function(a){return this.pushStack(n.map(this,function(b,c){return a.call(b,c,b)}))},slice:function(){return this.pushStack(e.apply(this,arguments))},first:function(){return this.eq(0)},last:function(){return this.eq(-1)},eq:function(a){var b=this.length,c=+a+(0>a?b:0);return this.pushStack(c>=0&&b>c?[this[c]]:[])},end:function(){return this.prevObject||this.constructor()},push:g,sort:c.sort,splice:c.splice},n.extend=n.fn.extend=function(){var a,b,c,d,e,f,g=arguments[0]||{},h=1,i=arguments.length,j=!1;for("boolean"==typeof g&&(j=g,g=arguments[h]||{},h++),"object"==typeof g||n.isFunction(g)||(g={}),h===i&&(g=this,h--);i>h;h++)if(null!=(e=arguments[h]))for(d in e)a=g[d],c=e[d],g!==c&&(j&&c&&(n.isPlainObject(c)||(b=n.isArray(c)))?(b?(b=!1,f=a&&n.isArray(a)?a:[]):f=a&&n.isPlainObject(a)?a:{},g[d]=n.extend(j,f,c)):void 0!==c&&(g[d]=c));return g},n.extend({expando:"jQuery"+(m+Math.random()).replace(/\D/g,""),isReady:!0,error:function(a){throw new Error(a)},noop:function(){},isFunction:function(a){return"function"===n.type(a)},isArray:Array.isArray||function(a){return"array"===n.type(a)},isWindow:function(a){return null!=a&&a==a.window},isNumeric:function(a){var b=a&&a.toString();return!n.isArray(a)&&b-parseFloat(b)+1>=0},isEmptyObject:function(a){var b;for(b in a)return!1;return!0},isPlainObject:function(a){var b;if(!a||"object"!==n.type(a)||a.nodeType||n.isWindow(a))return!1;try{if(a.constructor&&!k.call(a,"constructor")&&!k.call(a.constructor.prototype,"isPrototypeOf"))return!1}catch(c){return!1}if(!l.ownFirst)for(b in a)return k.call(a,b);for(b in a);return void 
0===b||k.call(a,b)},type:function(a){return null==a?a+"":"object"==typeof a||"function"==typeof a?i[j.call(a)]||"object":typeof a},globalEval:function(b){b&&n.trim(b)&&(a.execScript||function(b){a.eval.call(a,b)})(b)},camelCase:function(a){return a.replace(p,"ms-").replace(q,r)},nodeName:function(a,b){return a.nodeName&&a.nodeName.toLowerCase()===b.toLowerCase()},each:function(a,b){var c,d=0;if(s(a)){for(c=a.length;c>d;d++)if(b.call(a[d],d,a[d])===!1)break}else for(d in a)if(b.call(a[d],d,a[d])===!1)break;return a},trim:function(a){return null==a?"":(a+"").replace(o,"")},makeArray:function(a,b){var c=b||[];return null!=a&&(s(Object(a))?n.merge(c,"string"==typeof a?[a]:a):g.call(c,a)),c},inArray:function(a,b,c){var d;if(b){if(h)return h.call(b,a,c);for(d=b.length,c=c?0>c?Math.max(0,d+c):c:0;d>c;c++)if(c in b&&b[c]===a)return c}return-1},merge:function(a,b){var c=+b.length,d=0,e=a.length;while(c>d)a[e++]=b[d++];if(c!==c)while(void 0!==b[d])a[e++]=b[d++];return a.length=e,a},grep:function(a,b,c){for(var d,e=[],f=0,g=a.length,h=!c;g>f;f++)d=!b(a[f],f),d!==h&&e.push(a[f]);return e},map:function(a,b,c){var d,e,g=0,h=[];if(s(a))for(d=a.length;d>g;g++)e=b(a[g],g,c),null!=e&&h.push(e);else for(g in a)e=b(a[g],g,c),null!=e&&h.push(e);return f.apply([],h)},guid:1,proxy:function(a,b){var c,d,f;return"string"==typeof b&&(f=a[b],b=a,a=f),n.isFunction(a)?(c=e.call(arguments,2),d=function(){return a.apply(b||this,c.concat(e.call(arguments)))},d.guid=a.guid=a.guid||n.guid++,d):void 0},now:function(){return+new Date},support:l}),"function"==typeof Symbol&&(n.fn[Symbol.iterator]=c[Symbol.iterator]),n.each("Boolean Number String Function Array Date RegExp Object Error Symbol".split(" "),function(a,b){i["[object "+b+"]"]=b.toLowerCase()});function s(a){var b=!!a&&"length"in a&&a.length,c=n.type(a);return"function"===c||n.isWindow(a)?!1:"array"===c||0===b||"number"==typeof b&&b>0&&b-1 in a}var t=function(a){var b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u="sizzle"+1*new Date,v=a.document,w=0,x=0,y=ga(),z=ga(),A=ga(),B=function(a,b){return a===b&&(l=!0),0},C=1<<31,D={}.hasOwnProperty,E=[],F=E.pop,G=E.push,H=E.push,I=E.slice,J=function(a,b){for(var c=0,d=a.length;d>c;c++)if(a[c]===b)return c;return-1},K="checked|selected|async|autofocus|autoplay|controls|defer|disabled|hidden|ismap|loop|multiple|open|readonly|required|scoped",L="[\\x20\\t\\r\\n\\f]",M="(?:\\\\.|[\\w-]|[^\\x00-\\xa0])+",N="\\["+L+"*("+M+")(?:"+L+"*([*^$|!~]?=)"+L+"*(?:'((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\"|("+M+"))|)"+L+"*\\]",O=":("+M+")(?:\\((('((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\")|((?:\\\\.|[^\\\\()[\\]]|"+N+")*)|.*)\\)|)",P=new RegExp(L+"+","g"),Q=new RegExp("^"+L+"+|((?:^|[^\\\\])(?:\\\\.)*)"+L+"+$","g"),R=new RegExp("^"+L+"*,"+L+"*"),S=new RegExp("^"+L+"*([>+~]|"+L+")"+L+"*"),T=new RegExp("="+L+"*([^\\]'\"]*?)"+L+"*\\]","g"),U=new RegExp(O),V=new RegExp("^"+M+"$"),W={ID:new RegExp("^#("+M+")"),CLASS:new RegExp("^\\.("+M+")"),TAG:new RegExp("^("+M+"|[*])"),ATTR:new RegExp("^"+N),PSEUDO:new RegExp("^"+O),CHILD:new RegExp("^:(only|first|last|nth|nth-last)-(child|of-type)(?:\\("+L+"*(even|odd|(([+-]|)(\\d*)n|)"+L+"*(?:([+-]|)"+L+"*(\\d+)|))"+L+"*\\)|)","i"),bool:new RegExp("^(?:"+K+")$","i"),needsContext:new RegExp("^"+L+"*[>+~]|:(even|odd|eq|gt|lt|nth|first|last)(?:\\("+L+"*((?:-\\d)?\\d*)"+L+"*\\)|)(?=[^-]|$)","i")},X=/^(?:input|select|textarea|button)$/i,Y=/^h\d$/i,Z=/^[^{]+\{\s*\[native \w/,$=/^(?:#([\w-]+)|(\w+)|\.([\w-]+))$/,_=/[+~]/,aa=/'|\\/g,ba=new 
RegExp("\\\\([\\da-f]{1,6}"+L+"?|("+L+")|.)","ig"),ca=function(a,b,c){var d="0x"+b-65536;return d!==d||c?b:0>d?String.fromCharCode(d+65536):String.fromCharCode(d>>10|55296,1023&d|56320)},da=function(){m()};try{H.apply(E=I.call(v.childNodes),v.childNodes),E[v.childNodes.length].nodeType}catch(ea){H={apply:E.length?function(a,b){G.apply(a,I.call(b))}:function(a,b){var c=a.length,d=0;while(a[c++]=b[d++]);a.length=c-1}}}function fa(a,b,d,e){var f,h,j,k,l,o,r,s,w=b&&b.ownerDocument,x=b?b.nodeType:9;if(d=d||[],"string"!=typeof a||!a||1!==x&&9!==x&&11!==x)return d;if(!e&&((b?b.ownerDocument||b:v)!==n&&m(b),b=b||n,p)){if(11!==x&&(o=$.exec(a)))if(f=o[1]){if(9===x){if(!(j=b.getElementById(f)))return d;if(j.id===f)return d.push(j),d}else if(w&&(j=w.getElementById(f))&&t(b,j)&&j.id===f)return d.push(j),d}else{if(o[2])return H.apply(d,b.getElementsByTagName(a)),d;if((f=o[3])&&c.getElementsByClassName&&b.getElementsByClassName)return H.apply(d,b.getElementsByClassName(f)),d}if(c.qsa&&!A[a+" "]&&(!q||!q.test(a))){if(1!==x)w=b,s=a;else if("object"!==b.nodeName.toLowerCase()){(k=b.getAttribute("id"))?k=k.replace(aa,"\\$&"):b.setAttribute("id",k=u),r=g(a),h=r.length,l=V.test(k)?"#"+k:"[id='"+k+"']";while(h--)r[h]=l+" "+qa(r[h]);s=r.join(","),w=_.test(a)&&oa(b.parentNode)||b}if(s)try{return H.apply(d,w.querySelectorAll(s)),d}catch(y){}finally{k===u&&b.removeAttribute("id")}}}return i(a.replace(Q,"$1"),b,d,e)}function ga(){var a=[];function b(c,e){return a.push(c+" ")>d.cacheLength&&delete b[a.shift()],b[c+" "]=e}return b}function ha(a){return a[u]=!0,a}function ia(a){var b=n.createElement("div");try{return!!a(b)}catch(c){return!1}finally{b.parentNode&&b.parentNode.removeChild(b),b=null}}function ja(a,b){var c=a.split("|"),e=c.length;while(e--)d.attrHandle[c[e]]=b}function ka(a,b){var c=b&&a,d=c&&1===a.nodeType&&1===b.nodeType&&(~b.sourceIndex||C)-(~a.sourceIndex||C);if(d)return d;if(c)while(c=c.nextSibling)if(c===b)return-1;return a?1:-1}function la(a){return function(b){var c=b.nodeName.toLowerCase();return"input"===c&&b.type===a}}function ma(a){return function(b){var c=b.nodeName.toLowerCase();return("input"===c||"button"===c)&&b.type===a}}function na(a){return ha(function(b){return b=+b,ha(function(c,d){var e,f=a([],c.length,b),g=f.length;while(g--)c[e=f[g]]&&(c[e]=!(d[e]=c[e]))})})}function oa(a){return a&&"undefined"!=typeof a.getElementsByTagName&&a}c=fa.support={},f=fa.isXML=function(a){var b=a&&(a.ownerDocument||a).documentElement;return b?"HTML"!==b.nodeName:!1},m=fa.setDocument=function(a){var b,e,g=a?a.ownerDocument||a:v;return g!==n&&9===g.nodeType&&g.documentElement?(n=g,o=n.documentElement,p=!f(n),(e=n.defaultView)&&e.top!==e&&(e.addEventListener?e.addEventListener("unload",da,!1):e.attachEvent&&e.attachEvent("onunload",da)),c.attributes=ia(function(a){return a.className="i",!a.getAttribute("className")}),c.getElementsByTagName=ia(function(a){return a.appendChild(n.createComment("")),!a.getElementsByTagName("*").length}),c.getElementsByClassName=Z.test(n.getElementsByClassName),c.getById=ia(function(a){return o.appendChild(a).id=u,!n.getElementsByName||!n.getElementsByName(u).length}),c.getById?(d.find.ID=function(a,b){if("undefined"!=typeof b.getElementById&&p){var c=b.getElementById(a);return c?[c]:[]}},d.filter.ID=function(a){var b=a.replace(ba,ca);return function(a){return a.getAttribute("id")===b}}):(delete d.find.ID,d.filter.ID=function(a){var b=a.replace(ba,ca);return function(a){var c="undefined"!=typeof a.getAttributeNode&&a.getAttributeNode("id");return 
c&&c.value===b}}),d.find.TAG=c.getElementsByTagName?function(a,b){return"undefined"!=typeof b.getElementsByTagName?b.getElementsByTagName(a):c.qsa?b.querySelectorAll(a):void 0}:function(a,b){var c,d=[],e=0,f=b.getElementsByTagName(a);if("*"===a){while(c=f[e++])1===c.nodeType&&d.push(c);return d}return f},d.find.CLASS=c.getElementsByClassName&&function(a,b){return"undefined"!=typeof b.getElementsByClassName&&p?b.getElementsByClassName(a):void 0},r=[],q=[],(c.qsa=Z.test(n.querySelectorAll))&&(ia(function(a){o.appendChild(a).innerHTML="",a.querySelectorAll("[msallowcapture^='']").length&&q.push("[*^$]="+L+"*(?:''|\"\")"),a.querySelectorAll("[selected]").length||q.push("\\["+L+"*(?:value|"+K+")"),a.querySelectorAll("[id~="+u+"-]").length||q.push("~="),a.querySelectorAll(":checked").length||q.push(":checked"),a.querySelectorAll("a#"+u+"+*").length||q.push(".#.+[+~]")}),ia(function(a){var b=n.createElement("input");b.setAttribute("type","hidden"),a.appendChild(b).setAttribute("name","D"),a.querySelectorAll("[name=d]").length&&q.push("name"+L+"*[*^$|!~]?="),a.querySelectorAll(":enabled").length||q.push(":enabled",":disabled"),a.querySelectorAll("*,:x"),q.push(",.*:")})),(c.matchesSelector=Z.test(s=o.matches||o.webkitMatchesSelector||o.mozMatchesSelector||o.oMatchesSelector||o.msMatchesSelector))&&ia(function(a){c.disconnectedMatch=s.call(a,"div"),s.call(a,"[s!='']:x"),r.push("!=",O)}),q=q.length&&new RegExp(q.join("|")),r=r.length&&new RegExp(r.join("|")),b=Z.test(o.compareDocumentPosition),t=b||Z.test(o.contains)?function(a,b){var c=9===a.nodeType?a.documentElement:a,d=b&&b.parentNode;return a===d||!(!d||1!==d.nodeType||!(c.contains?c.contains(d):a.compareDocumentPosition&&16&a.compareDocumentPosition(d)))}:function(a,b){if(b)while(b=b.parentNode)if(b===a)return!0;return!1},B=b?function(a,b){if(a===b)return l=!0,0;var d=!a.compareDocumentPosition-!b.compareDocumentPosition;return d?d:(d=(a.ownerDocument||a)===(b.ownerDocument||b)?a.compareDocumentPosition(b):1,1&d||!c.sortDetached&&b.compareDocumentPosition(a)===d?a===n||a.ownerDocument===v&&t(v,a)?-1:b===n||b.ownerDocument===v&&t(v,b)?1:k?J(k,a)-J(k,b):0:4&d?-1:1)}:function(a,b){if(a===b)return l=!0,0;var c,d=0,e=a.parentNode,f=b.parentNode,g=[a],h=[b];if(!e||!f)return a===n?-1:b===n?1:e?-1:f?1:k?J(k,a)-J(k,b):0;if(e===f)return ka(a,b);c=a;while(c=c.parentNode)g.unshift(c);c=b;while(c=c.parentNode)h.unshift(c);while(g[d]===h[d])d++;return d?ka(g[d],h[d]):g[d]===v?-1:h[d]===v?1:0},n):n},fa.matches=function(a,b){return fa(a,null,null,b)},fa.matchesSelector=function(a,b){if((a.ownerDocument||a)!==n&&m(a),b=b.replace(T,"='$1']"),c.matchesSelector&&p&&!A[b+" "]&&(!r||!r.test(b))&&(!q||!q.test(b)))try{var d=s.call(a,b);if(d||c.disconnectedMatch||a.document&&11!==a.document.nodeType)return d}catch(e){}return fa(b,n,null,[a]).length>0},fa.contains=function(a,b){return(a.ownerDocument||a)!==n&&m(a),t(a,b)},fa.attr=function(a,b){(a.ownerDocument||a)!==n&&m(a);var e=d.attrHandle[b.toLowerCase()],f=e&&D.call(d.attrHandle,b.toLowerCase())?e(a,b,!p):void 0;return void 0!==f?f:c.attributes||!p?a.getAttribute(b):(f=a.getAttributeNode(b))&&f.specified?f.value:null},fa.error=function(a){throw new Error("Syntax error, unrecognized expression: "+a)},fa.uniqueSort=function(a){var b,d=[],e=0,f=0;if(l=!c.detectDuplicates,k=!c.sortStable&&a.slice(0),a.sort(B),l){while(b=a[f++])b===a[f]&&(e=d.push(f));while(e--)a.splice(d[e],1)}return k=null,a},e=fa.getText=function(a){var b,c="",d=0,f=a.nodeType;if(f){if(1===f||9===f||11===f){if("string"==typeof a.textContent)return 
a.textContent;for(a=a.firstChild;a;a=a.nextSibling)c+=e(a)}else if(3===f||4===f)return a.nodeValue}else while(b=a[d++])c+=e(b);return c},d=fa.selectors={cacheLength:50,createPseudo:ha,match:W,attrHandle:{},find:{},relative:{">":{dir:"parentNode",first:!0}," ":{dir:"parentNode"},"+":{dir:"previousSibling",first:!0},"~":{dir:"previousSibling"}},preFilter:{ATTR:function(a){return a[1]=a[1].replace(ba,ca),a[3]=(a[3]||a[4]||a[5]||"").replace(ba,ca),"~="===a[2]&&(a[3]=" "+a[3]+" "),a.slice(0,4)},CHILD:function(a){return a[1]=a[1].toLowerCase(),"nth"===a[1].slice(0,3)?(a[3]||fa.error(a[0]),a[4]=+(a[4]?a[5]+(a[6]||1):2*("even"===a[3]||"odd"===a[3])),a[5]=+(a[7]+a[8]||"odd"===a[3])):a[3]&&fa.error(a[0]),a},PSEUDO:function(a){var b,c=!a[6]&&a[2];return W.CHILD.test(a[0])?null:(a[3]?a[2]=a[4]||a[5]||"":c&&U.test(c)&&(b=g(c,!0))&&(b=c.indexOf(")",c.length-b)-c.length)&&(a[0]=a[0].slice(0,b),a[2]=c.slice(0,b)),a.slice(0,3))}},filter:{TAG:function(a){var b=a.replace(ba,ca).toLowerCase();return"*"===a?function(){return!0}:function(a){return a.nodeName&&a.nodeName.toLowerCase()===b}},CLASS:function(a){var b=y[a+" "];return b||(b=new RegExp("(^|"+L+")"+a+"("+L+"|$)"))&&y(a,function(a){return b.test("string"==typeof a.className&&a.className||"undefined"!=typeof a.getAttribute&&a.getAttribute("class")||"")})},ATTR:function(a,b,c){return function(d){var e=fa.attr(d,a);return null==e?"!="===b:b?(e+="","="===b?e===c:"!="===b?e!==c:"^="===b?c&&0===e.indexOf(c):"*="===b?c&&e.indexOf(c)>-1:"$="===b?c&&e.slice(-c.length)===c:"~="===b?(" "+e.replace(P," ")+" ").indexOf(c)>-1:"|="===b?e===c||e.slice(0,c.length+1)===c+"-":!1):!0}},CHILD:function(a,b,c,d,e){var f="nth"!==a.slice(0,3),g="last"!==a.slice(-4),h="of-type"===b;return 1===d&&0===e?function(a){return!!a.parentNode}:function(b,c,i){var j,k,l,m,n,o,p=f!==g?"nextSibling":"previousSibling",q=b.parentNode,r=h&&b.nodeName.toLowerCase(),s=!i&&!h,t=!1;if(q){if(f){while(p){m=b;while(m=m[p])if(h?m.nodeName.toLowerCase()===r:1===m.nodeType)return!1;o=p="only"===a&&!o&&"nextSibling"}return!0}if(o=[g?q.firstChild:q.lastChild],g&&s){m=q,l=m[u]||(m[u]={}),k=l[m.uniqueID]||(l[m.uniqueID]={}),j=k[a]||[],n=j[0]===w&&j[1],t=n&&j[2],m=n&&q.childNodes[n];while(m=++n&&m&&m[p]||(t=n=0)||o.pop())if(1===m.nodeType&&++t&&m===b){k[a]=[w,n,t];break}}else if(s&&(m=b,l=m[u]||(m[u]={}),k=l[m.uniqueID]||(l[m.uniqueID]={}),j=k[a]||[],n=j[0]===w&&j[1],t=n),t===!1)while(m=++n&&m&&m[p]||(t=n=0)||o.pop())if((h?m.nodeName.toLowerCase()===r:1===m.nodeType)&&++t&&(s&&(l=m[u]||(m[u]={}),k=l[m.uniqueID]||(l[m.uniqueID]={}),k[a]=[w,t]),m===b))break;return t-=e,t===d||t%d===0&&t/d>=0}}},PSEUDO:function(a,b){var c,e=d.pseudos[a]||d.setFilters[a.toLowerCase()]||fa.error("unsupported pseudo: "+a);return e[u]?e(b):e.length>1?(c=[a,a,"",b],d.setFilters.hasOwnProperty(a.toLowerCase())?ha(function(a,c){var d,f=e(a,b),g=f.length;while(g--)d=J(a,f[g]),a[d]=!(c[d]=f[g])}):function(a){return e(a,0,c)}):e}},pseudos:{not:ha(function(a){var b=[],c=[],d=h(a.replace(Q,"$1"));return d[u]?ha(function(a,b,c,e){var f,g=d(a,null,e,[]),h=a.length;while(h--)(f=g[h])&&(a[h]=!(b[h]=f))}):function(a,e,f){return b[0]=a,d(b,null,f,c),b[0]=null,!c.pop()}}),has:ha(function(a){return function(b){return fa(a,b).length>0}}),contains:ha(function(a){return a=a.replace(ba,ca),function(b){return(b.textContent||b.innerText||e(b)).indexOf(a)>-1}}),lang:ha(function(a){return V.test(a||"")||fa.error("unsupported lang: "+a),a=a.replace(ba,ca).toLowerCase(),function(b){var c;do 