diff --git a/docs/README.md b/docs/README.md
index 5a59fff..7211800 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,45 +1,107 @@
-
+
+
+
+
 # DuckDB HTTP Server Extension
-This very experimental extension spawns an HTTP Server from within DuckDB serving query requests.
-The extension goal is to replace the functionality currently offered by [Quackpipe](https://github.com/metrico/quackpipe)
+This extension transforms **DuckDB** instances into tiny multi-player **HTTP OLAP API** services.
+It supports authentication _(Basic Auth or an `X-API-Key` token)_ and includes the _play_ SQL user interface.
+
+The goal of this extension is to replace the functionality currently offered by [quackpipe](https://github.com/metrico/quackpipe)
+
+### Features
-![image](https://github.com/user-attachments/assets/180fdcec-2cae-4909-a7a2-28333cd7dd44)
+- Turn any [DuckDB](https://duckdb.org) instance into an **HTTP OLAP API** Server
+- Use the embedded **Play User Interface** to query and visualize data
+- Pair with [chsql](https://community-extensions.duckdb.org/extensions/chsql.html) extension for **ClickHouse flavoured SQL**
+- Work with local and remote datasets including [MotherDuck](https://motherduck.com) 🐤
+- _100% Open Source, ready to use and extend by the Community!_
+
+
+![image](https://github.com/user-attachments/assets/e930a8d2-b3e4-454e-ba12-e5e91b30bfbe)
 #### Extension Functions
-- `httpserve_start(host, port)`
-- `httpserve_stop()`
+- `httpserve_start(host, port, auth)`: starts the server using provided parameters
+- `httpserve_stop()`: stops the server thread
-#### API Endpoints
-- `/` `GET`, `POST`
-  - `default_format`: Supports `JSONEachRow` or `JSONCompact`
-  - `query`: Supports DuckDB SQL queries
-- `/ping` `GET`
+#### Notes
-### Installation
+
+🛑 Run DuckDB in `-readonly` mode for enhanced security
+
+
+
+### 📦 [Installation](https://community-extensions.duckdb.org/extensions/httpserver.html)
 ```sql
 INSTALL httpserver FROM community;
 LOAD httpserver;
 ```
-### Usage
-Start the HTTP server providing the `host` and `port` parameters
+### 🔌 Usage
+Start the HTTP server providing the `host`, `port` and `auth` parameters.
+> * To disable authentication, pass an empty string as the `auth` parameter.
+> * To run the API in the foreground, set the environment variable `DUCKDB_HTTPSERVER_FOREGROUND=1`.
+
+#### Basic Auth
 ```sql
-D SELECT httpserve_start('0.0.0.0',9999);
-┌─────────────────────────────────────┐
-│  httpserve_start('0.0.0.0', 9999)   │
-│               varchar               │
-├─────────────────────────────────────┤
-│ HTTP server started on 0.0.0.0:9999 │
-└─────────────────────────────────────┘
+D SELECT httpserve_start('localhost', 9999, 'user:pass');
+
+┌─────────────────────────────────────────────────┐
+│ httpserve_start('localhost', 9999, 'user:pass') │
+│                     varchar                     │
+├─────────────────────────────────────────────────┤
+│ HTTP server started on localhost:9999           │
+└─────────────────────────────────────────────────┘
+```
+```bash
+curl -X POST -d "SELECT 'hello', version()" "http://user:pass@localhost:9999/"
+```
+
+#### Token Auth
+```sql
+D SELECT httpserve_start('localhost', 9999, 'secretkey');
+
+┌─────────────────────────────────────────────────┐
+│ httpserve_start('localhost', 9999, 'secretkey') │
+│                     varchar                     │
+├─────────────────────────────────────────────────┤
+│ HTTP server started on localhost:9999           │
+└─────────────────────────────────────────────────┘
+```
+
+Query your endpoint using the `X-API-Key` token:
+
+```bash
+curl -X POST --header "X-API-Key: secretkey" -d "SELECT 'hello', version()" "http://localhost:9999/"
 ```
-#### QUERY UI
+You can perform the same action from DuckDB using HTTP `extra_http_headers`:
+
+```sql
+D CREATE SECRET extra_http_headers (
+      TYPE HTTP,
+      EXTRA_HTTP_HEADERS MAP{
+          'X-API-Key': 'secretkey'
+      }
+  );
+
+D SELECT * FROM duck_flock('SELECT version()', ['http://localhost:9999']);
+┌─────────────┐
+│ "version"() │
+│   varchar   │
+├─────────────┤
+│ v1.1.1      │
+└─────────────┘
+```
+
+
+
+#### 👉 QUERY UI
 Browse to your endpoint and use the built-in quackplay interface _(experimental)_
+
 ![image](https://github.com/user-attachments/assets/0ee751d0-7360-4d3d-949d-3fb930634ebd)
-#### QUERY API
+#### 👉 QUERY API
 Query your API endpoint using curl GET/POST requests
 ```bash
@@ -72,7 +134,9 @@ curl -X POST -d "SELECT 'hello', version()" "http://localhost:9999/?default_form
 }
 ```
-You can also have DuckDB instances query each other using `NDJSON`
+#### 👉 CROSS-OVER EXAMPLES
+
+You can now have DuckDB instances query each other and... _themselves!_
 ```sql
 D LOAD json;
@@ -93,30 +157,53 @@ D SELECT * FROM read_json_auto('http://localhost:9999/?q=SELECT version()');
 └─────────────┘
 ```
+#### Flock Macro by @carlopi
+Check out this flocking macro from fellow _Italo-Amsterdammer_ @carlopi @ DuckDB Labs
+
+![image](https://github.com/user-attachments/assets/b409ec0e-86e0-4a8d-822c-377ddbae524d)
+
+* a DuckDB CLI, running httpserver extension
+* a DuckDB from Python, running httpserver extension
+* a DuckDB from the Web, querying all 3 DuckDB at the same time
+
+
+
-### Build steps
-Now to build the extension, run:
-```sh
-make
-```
-The main binaries that will be built are:
-```sh
-./build/release/duckdb
-./build/release/test/unittest
-./build/release/extension//.duckdb_extension
-```
-- `duckdb` is the binary for the duckdb shell with the extension code automatically loaded.
-- `unittest` is the test runner of duckdb. Again, the extension is already linked into the binary.
-- `.duckdb_extension` is the loadable binary as it would be distributed.
+### API Documentation
-## Running the extension
-To run the extension code, simply start the shell with `./build/release/duckdb`. This shell will have the extension pre-loaded.
+#### Endpoints Overview
-## Running the tests
-Different tests can be created for DuckDB extensions. The primary way of testing DuckDB extensions should be the SQL tests in `./test/sql`. These SQL tests can be run using:
-```sh
-make test
-```
+| Endpoint | Methods | Description |
+|----------|---------|-------------|
+| `/` | GET, POST | Query API endpoint |
+| `/ping` | GET | Health check endpoint |
+
+#### Detailed Endpoint Specifications
+
+##### Query API
+
+**Methods:** `GET`, `POST`
+
+**Parameters:**
+
+| Parameter | Description | Supported Values |
+|-----------|-------------|-------------------|
+| `default_format` | Specifies the output format | `JSONEachRow`, `JSONCompact` |
+| `query` | The DuckDB SQL query to execute | Any valid DuckDB SQL query |
+
+##### Notes
+
+- Ensure that your queries are properly formatted and escaped when sending them as part of the request.
+- The root endpoint (`/`) supports both GET and POST methods, but POST is recommended for complex queries or when the query length exceeds URL length limitations.
+- Always specify the `default_format` parameter to ensure consistent output formatting.
+
+
+
+##### :black_joker: Disclaimers
+
+[^1]: DuckDB ® is a trademark of DuckDB Foundation. All rights reserved by their respective owners. [^1]
+[^2]: ClickHouse ® is a trademark of ClickHouse Inc. No direct affiliation or endorsement. [^2]
+[^3]: Released under the MIT license. See LICENSE for details. All rights reserved by their respective owners. [^3]
diff --git a/src/httpserver_extension.cpp b/src/httpserver_extension.cpp
index 8e486f9..14ab49f 100644
--- a/src/httpserver_extension.cpp
+++ b/src/httpserver_extension.cpp
@@ -31,6 +31,7 @@ struct HttpServerState {
     std::atomic<bool> is_running;
     DatabaseInstance* db_instance;
     unique_ptr<Allocator> allocator;
+    std::string auth_token;
     HttpServerState() : is_running(false), db_instance(nullptr) {}
 };
@@ -131,6 +132,51 @@ static std::string ConvertResultToJSON(MaterializedQueryResult &result, ReqStats
     return json_output;
 }
+// New: Base64 decoding function
+std::string base64_decode(const std::string &in) {
+    std::string out;
+    std::vector<int> T(256, -1);
+    for (int i = 0; i < 64; i++)
+        T["ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"[i]] = i;
+
+    int val = 0, valb = -8;
+    for (unsigned char c : in) {
+        if (T[c] == -1) break;
+        val = (val << 6) + T[c];
+        valb += 6;
+        if (valb >= 0) {
+            out.push_back(char((val >> valb) & 0xFF));
+            valb -= 8;
+        }
+    }
+    return out;
+}
+
+// Auth Check
+bool IsAuthenticated(const duckdb_httplib_openssl::Request& req) {
+    if (global_state.auth_token.empty()) {
+        return true; // No authentication required if no token is set
+    }
+
+    // Check for X-API-Key header
+    auto api_key = req.get_header_value("X-API-Key");
+    if (!api_key.empty() && api_key == global_state.auth_token) {
+        return true;
+    }
+
+    // Check for Basic Auth
+    auto auth = req.get_header_value("Authorization");
+    if (!auth.empty() && auth.compare(0, 6, "Basic ") == 0) {
+        std::string decoded_auth = base64_decode(auth.substr(6));
+        if (decoded_auth == global_state.auth_token) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
+
 // Convert the query result to NDJSON (JSONEachRow) format
 static std::string ConvertResultToNDJSON(MaterializedQueryResult &result) {
     std::string ndjson_output;
@@ -210,6 +256,13 @@ static void HandleQuery(const string& query, duckdb_httplib_openssl::Response& r
 void HandleHttpRequest(const duckdb_httplib_openssl::Request& req, duckdb_httplib_openssl::Response& res) {
     std::string query;
+    // Check authentication
+    if (!IsAuthenticated(req)) {
+        res.status = 401;
+        res.set_content("Unauthorized", "text/plain");
+        return;
+    }
+
     // CORS allow
     res.set_header("Access-Control-Allow-Origin", "*");
     res.set_header("Access-Control-Allow-Methods", "GET, POST, OPTIONS, PUT");
@@ -297,7 +350,7 @@ void HandleHttpRequest(const duckdb_httplib_openssl::Request& req, duckdb_httpli
     }
 }
-void HttpServerStart(DatabaseInstance& db, string_t host, int32_t port) {
+void HttpServerStart(DatabaseInstance& db, string_t host, int32_t port, string_t auth = string_t()) {
     if (global_state.is_running) {
         throw IOException("HTTP server is already running");
     }
@@ -305,6 +358,7 @@ void HttpServerStart(DatabaseInstance& db, string_t host, int32_t port) {
     global_state.db_instance = &db;
     global_state.server = make_uniq<duckdb_httplib_openssl::Server>();
     global_state.is_running = true;
+    global_state.auth_token = auth.GetString();
     // CORS Preflight
     global_state.server->Options("/",
@@ -331,12 +385,41 @@ void HttpServerStart(DatabaseInstance& db, string_t host, int32_t port) {
     });
     string host_str = host.GetString();
-    global_state.server_thread = make_uniq<std::thread>([host_str, port]() {
+
+    const char* run_in_same_thread_env = std::getenv("DUCKDB_HTTPSERVER_FOREGROUND");
+    bool run_in_same_thread = (run_in_same_thread_env != nullptr && std::string(run_in_same_thread_env) == "1");
+
+    if (run_in_same_thread) {
+#ifdef _WIN32
+        throw IOException("Foreground mode not yet supported on WIN32 platforms.");
+#else
+        // POSIX signal handler for SIGINT (Linux/macOS)
+        signal(SIGINT, [](int) {
+            if (global_state.server) {
+                global_state.server->stop();
+            }
+            global_state.is_running = false; // Update the running state
+        });
+
+        // Run the server in the same thread
         if (!global_state.server->listen(host_str.c_str(), port)) {
             global_state.is_running = false;
             throw IOException("Failed to start HTTP server on " + host_str + ":" + std::to_string(port));
         }
-    });
+#endif
+
+        // The server has stopped (due to CTRL-C or other reasons)
+        global_state.is_running = false;
+    } else {
+        // Run the server in a dedicated thread (default)
+        global_state.server_thread = make_uniq<std::thread>([host_str, port]() {
+            if (!global_state.server->listen(host_str.c_str(), port)) {
+                global_state.is_running = false;
+                throw IOException("Failed to start HTTP server on " + host_str + ":" + std::to_string(port));
+            }
+        });
+    }
+
 }
 void HttpServerStop() {
@@ -361,17 +444,19 @@ static void HttpServerCleanup() {
 static void LoadInternal(DatabaseInstance &instance) {
     auto httpserve_start = ScalarFunction("httpserve_start",
-        {LogicalType::VARCHAR, LogicalType::INTEGER},
+        {LogicalType::VARCHAR, LogicalType::INTEGER, LogicalType::VARCHAR},
         LogicalType::VARCHAR,
         [&](DataChunk &args, ExpressionState &state, Vector &result) {
             auto &host_vector = args.data[0];
             auto &port_vector = args.data[1];
+            auto &auth_vector = args.data[2];
             UnaryExecutor::Execute<string_t, string_t>(
                 host_vector, result, args.size(),
                 [&](string_t host) {
                     auto port = ((int32_t*)port_vector.GetData())[0];
-                    HttpServerStart(instance, host, port);
+                    auto auth = ((string_t*)auth_vector.GetData())[0];
+                    HttpServerStart(instance, host, port, auth);
                     return StringVector::AddString(result, "HTTP server started on " + host.GetString() + ":" + std::to_string(port));
                 });
         });
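
The authentication check added in `src/httpserver_extension.cpp` can be exercised outside DuckDB. The following is a minimal sketch under stated assumptions, not the extension's code: `is_authorized` is a hypothetical stand-in for `IsAuthenticated` that takes the two header values as plain strings instead of a `duckdb_httplib_openssl::Request`, and `base64_decode` follows the same approach as the function in the diff, except that it uses an unsigned accumulator (a change made here so that long inputs cannot hit signed overflow).

```cpp
#include <string>
#include <vector>

// Decodes base64 the same way as the diff's base64_decode: it stops at the
// first character outside the alphabet, which includes the '=' padding.
// Assumption for this sketch: the accumulator is unsigned.
std::string base64_decode(const std::string &in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::vector<int> table(256, -1);
    for (int i = 0; i < 64; i++) {
        table[static_cast<unsigned char>(alphabet[i])] = i;
    }

    std::string out;
    unsigned int val = 0;
    int valb = -8;
    for (unsigned char c : in) {
        if (table[c] == -1) break;
        val = (val << 6) + static_cast<unsigned int>(table[c]);
        valb += 6;
        if (valb >= 0) {
            out.push_back(static_cast<char>((val >> valb) & 0xFF));
            valb -= 8;
        }
    }
    return out;
}

// Hypothetical stand-in for IsAuthenticated: the header values are passed
// in directly, and configured_auth plays the role of global_state.auth_token.
bool is_authorized(const std::string &api_key_header,
                   const std::string &authorization_header,
                   const std::string &configured_auth) {
    // An empty auth string disables authentication.
    if (configured_auth.empty()) {
        return true;
    }
    // X-API-Key: the header value must equal the configured string.
    if (!api_key_header.empty() && api_key_header == configured_auth) {
        return true;
    }
    // Basic Auth: the decoded "user:pass" must equal the configured string.
    if (!authorization_header.empty() &&
        authorization_header.compare(0, 6, "Basic ") == 0) {
        return base64_decode(authorization_header.substr(6)) == configured_auth;
    }
    return false;
}
```

With `auth` set to `user:pass`, the header `Authorization: Basic dXNlcjpwYXNz` is accepted, and so is `X-API-Key: user:pass`, because both paths compare against the same stored string.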