feat(bq,sf|h3,quadbin): add H3/QUADBIN_POLYFILL_TABLE (#447)
Showing 13 changed files with 604 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
## H3_POLYFILL_TABLE (BETA) | ||
|
||
```sql:signature | ||
H3_POLYFILL_TABLE(input_query, resolution, mode, output_table) | ||
``` | ||
|
||
**Description** | ||
|
||
Returns a table with the H3 cell indexes contained in the given geography at a given level of detail. Containment is determined by the mode: center, intersects, contains. All the attributes except the geography will be included in the output table, clustered by the h3 column. | ||
|
||
* `input_query`: `STRING` input data to polyfill. It must contain a column `geom` with the shape to cover. Additionally, other columns can be included. | ||
* `resolution`: `INT64` level of detail. The value must be between 0 and 15 ([H3 resolution table](https://h3geo.org/docs/core-library/restable)). | ||
* `mode`: `STRING` | ||
* `center` returns the indexes of the H3 cells whose centers intersect the input geography (polygon). The resulting H3 set does not fully cover the input geography; however, this is **significantly faster** than the other modes. This mode is not compatible with points or lines. Equivalent to [`H3_POLYFILL`](h3#h3_polyfill). | ||
* `intersects` returns the indexes of the H3 cells that intersect the input geography. The resulting H3 set will completely cover the input geography (point, line, polygon). | ||
* `contains` returns the indexes of the H3 cells that are entirely contained inside the input geography (polygon). This mode is not compatible with points or lines. | ||
* `output_table`: `STRING` name of the output table to store the results of the polyfill. | ||
|
||
Mode `center`: | ||
|
||
![](h3_polyfill_mode_center.png) | ||
|
||
Mode `intersects`: | ||
|
||
![](h3_polyfill_mode_intersects.png) | ||
|
||
Mode `contains`: | ||
|
||
![](h3_polyfill_mode_contains.png) | ||
|
||
**Output** | ||
|
||
The results are stored in the table named `<output_table>`, which contains the following columns: | ||
|
||
* `h3`: `STRING` the index of the H3 cell. | ||
* The rest of the columns included in `input_query` except `geom`. | ||
|
||
**Examples** | ||
|
||
```sql | ||
CALL carto.H3_POLYFILL_TABLE( | ||
"SELECT ST_GEOGFROMTEXT('POLYGON ((-3.71219873428345 40.413365349070865, -3.7144088745117 40.40965661286395, -3.70659828186035 40.409525904775634, -3.71219873428345 40.413365349070865))') AS geom", | ||
9, 'intersects', | ||
'<project>.<dataset>.<output_table>' | ||
); | ||
-- The table `<project>.<dataset>.<output_table>` will be created | ||
-- with column: h3 | ||
``` | ||
|
||
```sql | ||
CALL carto.H3_POLYFILL_TABLE( | ||
'SELECT geom, name, value FROM `<project>.<dataset>.<table>`', | ||
9, 'center', | ||
'<project>.<dataset>.<output_table>' | ||
); | ||
-- The table `<project>.<dataset>.<output_table>` will be created | ||
-- with columns: h3, name, value | ||
``` |
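The three modes documented above can be summarized as a small lookup. This is a minimal sketch, not part of the commit: `POLYFILL_MODES` and `describeMode` are hypothetical names, the predicate and cell-geometry pairing is read from the `__H3_POLYFILL_QUERY` function in this same commit, and the accepted geometry types come from the parameter descriptions.

```javascript
// Hypothetical summary of the three polyfill modes (names are illustrative only).
const POLYFILL_MODES = {
  // Cell center tested against the input geography; polygons only.
  center: { predicate: 'ST_INTERSECTS', cellGeometry: 'center', accepts: ['polygon'] },
  // Cell boundary tested for intersection; covers points, lines and polygons.
  intersects: { predicate: 'ST_INTERSECTS', cellGeometry: 'boundary', accepts: ['point', 'line', 'polygon'] },
  // Cell boundary must lie entirely inside the input geography; polygons only.
  contains: { predicate: 'ST_CONTAINS', cellGeometry: 'boundary', accepts: ['polygon'] }
};

function describeMode(mode) {
  // Use an own-property check so inherited keys such as 'toString' are rejected.
  if (!Object.prototype.hasOwnProperty.call(POLYFILL_MODES, mode)) {
    throw new Error('Invalid mode, should be center, intersects, or contains.');
  }
  const spec = POLYFILL_MODES[mode];
  return `${spec.predicate}(geom, cell ${spec.cellGeometry}); input: ${spec.accepts.join(', ')}`;
}

for (const mode of Object.keys(POLYFILL_MODES)) {
  console.log(`${mode}: ${describeMode(mode)}`);
}
```

Note that `center` and `intersects` share the same predicate and differ only in which cell geometry is tested, which is why `center` cannot fully cover the input.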
58 changes: 58 additions & 0 deletions
clouds/bigquery/modules/doc/quadbin/QUADBIN_POLYFILL_TABLE.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
## QUADBIN_POLYFILL_TABLE (BETA) | ||
|
||
```sql:signature | ||
QUADBIN_POLYFILL_TABLE(input_query, resolution, mode, output_table) | ||
``` | ||
|
||
**Description** | ||
|
||
Returns a table with the quadbin cell indexes contained in the given geography at a given level of detail. Containment is determined by the mode: center, intersects, contains. All the attributes except the geography will be included in the output table, clustered by the quadbin column. | ||
|
||
* `input_query`: `STRING` input data to polyfill. It must contain a column `geom` with the shape to cover. Additionally, other columns can be included. | ||
* `resolution`: `INT64` level of detail. The value must be between 0 and 26. | ||
* `mode`: `STRING` | ||
* `center` returns the indexes of the quadbin cells whose centers intersect the input geography (polygon). The resulting quadbin set does not fully cover the input geography; however, this is **significantly faster** than the other modes. This mode is not compatible with points or lines. Equivalent to [`QUADBIN_POLYFILL`](quadbin#quadbin_polyfill). | ||
* `intersects` returns the indexes of the quadbin cells that intersect the input geography. The resulting quadbin set will completely cover the input geography (point, line, polygon). | ||
* `contains` returns the indexes of the quadbin cells that are entirely contained inside the input geography (polygon). This mode is not compatible with points or lines. | ||
* `output_table`: `STRING` name of the output table to store the results of the polyfill. | ||
|
||
Mode `center`: | ||
|
||
![](quadbin_polyfill_mode_center.png) | ||
|
||
Mode `intersects`: | ||
|
||
![](quadbin_polyfill_mode_intersects.png) | ||
|
||
Mode `contains`: | ||
|
||
![](quadbin_polyfill_mode_contains.png) | ||
|
||
**Output** | ||
|
||
The results are stored in the table named `<output_table>`, which contains the following columns: | ||
|
||
* `quadbin`: `INT64` the index of the quadbin cell. | ||
* The rest of the columns included in `input_query` except `geom`. | ||
|
||
**Examples** | ||
|
||
```sql | ||
CALL carto.QUADBIN_POLYFILL_TABLE( | ||
"SELECT ST_GEOGFROMTEXT('POLYGON ((-3.71219873428345 40.413365349070865, -3.7144088745117 40.40965661286395, -3.70659828186035 40.409525904775634, -3.71219873428345 40.413365349070865))') AS geom", | ||
12, 'intersects', | ||
'<project>.<dataset>.<output_table>' | ||
); | ||
-- The table `<project>.<dataset>.<output_table>` will be created | ||
-- with column: quadbin | ||
``` | ||
|
||
```sql | ||
CALL carto.QUADBIN_POLYFILL_TABLE( | ||
'SELECT geom, name, value FROM `<project>.<dataset>.<table>`', | ||
12, 'center', | ||
'<project>.<dataset>.<output_table>' | ||
); | ||
-- The table `<project>.<dataset>.<output_table>` will be created | ||
-- with columns: quadbin, name, value | ||
``` |
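When picking a `resolution` between 0 and 26, it may help to see how quickly the number of cells grows. The sketch below assumes that quadbin cells form a quadtree, so each resolution step splits a cell into four children; `quadbinCellCount` is a hypothetical helper, not a toolbox function.

```javascript
// Assumption: quadbin is a quadtree, so there are 4^resolution cells worldwide.
function quadbinCellCount(resolution) {
  if (!Number.isInteger(resolution) || resolution < 0 || resolution > 26) {
    throw new Error('Invalid resolution, should be between 0 and 26.');
  }
  let count = 1;
  for (let i = 0; i < resolution; i++) {
    count *= 4; // each step splits every cell into four children
  }
  return count; // 4^26 = 2^52 still fits in a JavaScript safe integer
}

for (const resolution of [0, 12, 26]) {
  console.log(resolution, quadbinCellCount(resolution));
}
```

Each extra level multiplies the potential output size by four, so a resolution that is too fine for the input area can produce a very large table.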
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
---------------------------- | ||
-- Copyright (C) 2023 CARTO | ||
---------------------------- | ||
|
||
CREATE OR REPLACE FUNCTION `@@BQ_DATASET@@.__H3_POLYFILL_QUERY` | ||
( | ||
input_query STRING, | ||
resolution INT64, | ||
mode STRING, | ||
output_table STRING | ||
) | ||
RETURNS STRING | ||
DETERMINISTIC | ||
LANGUAGE js | ||
AS """ | ||
if (!['center', 'intersects', 'contains'].includes(mode)) { | ||
throw Error('Invalid mode, should be center, intersects, or contains.') | ||
} | ||
if (resolution < 0 || resolution > 15) { | ||
throw Error('Invalid resolution, should be between 0 and 15.') | ||
} | ||
output_table = output_table.replace(/`/g, '') | ||
const containmentFunction = (mode === 'contains') ? 'ST_CONTAINS' : 'ST_INTERSECTS' | ||
const cellFunction = (mode === 'center') ? '@@BQ_DATASET@@.H3_CENTER' : '@@BQ_DATASET@@.H3_BOUNDARY' | ||
return 'CREATE TABLE `' + output_table + '` CLUSTER BY (h3) AS\\n' + | ||
'WITH __input AS (' + input_query + '),\\n' + | ||
'__cells AS (SELECT h3, i.* FROM __input AS i,\\n' + | ||
'UNNEST(`@@BQ_DATASET@@.__H3_POLYFILL_INIT`(geom,`@@BQ_DATASET@@.__H3_POLYFILL_INIT_Z`(geom,' + resolution + '))) AS parent,\\n' + | ||
'UNNEST(`@@BQ_DATASET@@.H3_TOCHILDREN`(parent,' + resolution + ')) AS h3)\\n' + | ||
'SELECT * EXCEPT (geom) FROM __cells\\n' + | ||
'WHERE ' + containmentFunction + '(geom, `' + cellFunction + '`(h3));' | ||
"""; | ||
|
||
CREATE OR REPLACE PROCEDURE `@@BQ_DATASET@@.H3_POLYFILL_TABLE` | ||
( | ||
input_query STRING, | ||
resolution INT64, | ||
mode STRING, | ||
output_table STRING | ||
) | ||
BEGIN | ||
DECLARE polyfill_query STRING; | ||
|
||
-- Check if the output table already exists | ||
CALL `@@BQ_DATASET@@.__CHECK_TABLE`(output_table); | ||
|
||
SET polyfill_query = `@@BQ_DATASET@@.__H3_POLYFILL_QUERY`( | ||
input_query, | ||
resolution, | ||
mode, | ||
output_table | ||
); | ||
|
||
EXECUTE IMMEDIATE polyfill_query; | ||
END; |
59 changes: 59 additions & 0 deletions
clouds/bigquery/modules/sql/quadbin/QUADBIN_POLYFILL_TABLE.sql
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
---------------------------- | ||
-- Copyright (C) 2023 CARTO | ||
---------------------------- | ||
|
||
CREATE OR REPLACE FUNCTION `@@BQ_DATASET@@.__QUADBIN_POLYFILL_QUERY` | ||
( | ||
input_query STRING, | ||
resolution INT64, | ||
mode STRING, | ||
output_table STRING | ||
) | ||
RETURNS STRING | ||
DETERMINISTIC | ||
LANGUAGE js | ||
AS """ | ||
if (!['center', 'intersects', 'contains'].includes(mode)) { | ||
throw Error('Invalid mode, should be center, intersects, or contains.') | ||
} | ||
if (resolution < 0 || resolution > 26) { | ||
throw Error('Invalid resolution, should be between 0 and 26.') | ||
} | ||
output_table = output_table.replace(/`/g, '') | ||
const containmentFunction = (mode === 'contains') ? 'ST_CONTAINS' : 'ST_INTERSECTS' | ||
const cellFunction = (mode === 'center') ? '@@BQ_DATASET@@.QUADBIN_CENTER' : '@@BQ_DATASET@@.QUADBIN_BOUNDARY' | ||
return 'CREATE TABLE `' + output_table + '` CLUSTER BY (quadbin) AS\\n' + | ||
'WITH __input AS (' + input_query + '),\\n' + | ||
'__cells AS (SELECT quadbin, i.* FROM __input AS i,\\n' + | ||
'UNNEST(`@@BQ_DATASET@@.__QUADBIN_POLYFILL_INIT`(geom,`@@BQ_DATASET@@.__QUADBIN_POLYFILL_INIT_Z`(geom,' + resolution + '))) AS parent,\\n' + | ||
'UNNEST(`@@BQ_DATASET@@.QUADBIN_TOCHILDREN`(parent,' + resolution + ')) AS quadbin)\\n' + | ||
'SELECT * EXCEPT (geom) FROM __cells\\n' + | ||
'WHERE ' + containmentFunction + '(geom, `' + cellFunction + '`(quadbin));' | ||
"""; | ||
|
||
CREATE OR REPLACE PROCEDURE `@@BQ_DATASET@@.QUADBIN_POLYFILL_TABLE` | ||
( | ||
input_query STRING, | ||
resolution INT64, | ||
mode STRING, | ||
output_table STRING | ||
) | ||
BEGIN | ||
DECLARE polyfill_query STRING; | ||
|
||
-- Check if the output table already exists | ||
CALL `@@BQ_DATASET@@.__CHECK_TABLE`(output_table); | ||
|
||
SET polyfill_query = `@@BQ_DATASET@@.__QUADBIN_POLYFILL_QUERY`( | ||
input_query, | ||
resolution, | ||
mode, | ||
output_table | ||
); | ||
|
||
EXECUTE IMMEDIATE polyfill_query; | ||
END; |
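The H3 and quadbin query builders in this commit differ only in the column name, the function-name prefix, and the maximum resolution. The following sketch reproduces the string-building logic in plain JavaScript so the generated SQL can be inspected outside BigQuery. It is an illustration under stated assumptions, not the commit's implementation: `makePolyfillQueryBuilder` and its options are hypothetical, and `dataset` stands in for the `@@BQ_DATASET@@` placeholder.

```javascript
// Hypothetical factory: one builder per index type (names are illustrative only).
function makePolyfillQueryBuilder({ dataset, column, prefix, maxResolution }) {
  return function buildQuery(inputQuery, resolution, mode, outputTable) {
    if (!['center', 'intersects', 'contains'].includes(mode)) {
      throw new Error('Invalid mode, should be center, intersects, or contains.');
    }
    if (resolution < 0 || resolution > maxResolution) {
      throw new Error(`Invalid resolution, should be between 0 and ${maxResolution}.`);
    }
    // Strip backticks so the name can be re-quoted as a single identifier.
    const table = outputTable.replace(/`/g, '');
    const predicate = mode === 'contains' ? 'ST_CONTAINS' : 'ST_INTERSECTS';
    const cellFn = `${dataset}.${prefix}_${mode === 'center' ? 'CENTER' : 'BOUNDARY'}`;
    return [
      `CREATE TABLE \`${table}\` CLUSTER BY (${column}) AS`,
      `WITH __input AS (${inputQuery}),`,
      `__cells AS (SELECT ${column}, i.* FROM __input AS i,`,
      `UNNEST(\`${dataset}.__${prefix}_POLYFILL_INIT\`(geom,\`${dataset}.__${prefix}_POLYFILL_INIT_Z\`(geom,${resolution}))) AS parent,`,
      `UNNEST(\`${dataset}.${prefix}_TOCHILDREN\`(parent,${resolution})) AS ${column})`,
      'SELECT * EXCEPT (geom) FROM __cells',
      `WHERE ${predicate}(geom, \`${cellFn}\`(${column}));`
    ].join('\n');
  };
}

// 'carto' is an assumed dataset name used only for this example.
const buildH3 = makePolyfillQueryBuilder({ dataset: 'carto', column: 'h3', prefix: 'H3', maxResolution: 15 });
const buildQuadbin = makePolyfillQueryBuilder({ dataset: 'carto', column: 'quadbin', prefix: 'QUADBIN', maxResolution: 26 });

console.log(buildQuadbin('SELECT geom FROM `p.d.t`', 12, 'contains', '`p.d.out`'));
```

Because `input_query` is concatenated into the statement as-is, the generated SQL is only as trustworthy as the query string passed in.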
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
--------------------------------- | ||
-- Copyright (C) 2020-2021 CARTO | ||
--------------------------------- | ||
|
||
CREATE OR REPLACE PROCEDURE `@@BQ_DATASET@@.__CHECK_TABLE` | ||
(destination_table STRING) | ||
BEGIN | ||
DECLARE destination_parts DEFAULT (SELECT `@@BQ_DATASET@@.__TABLENAME_SPLIT`(destination_table)); | ||
DECLARE tables_metadata STRING; | ||
DECLARE table_name STRING; | ||
DECLARE num_tables INT64; | ||
|
||
IF destination_parts IS NULL OR destination_parts.table IS NULL OR destination_parts.dataset IS NULL THEN | ||
SELECT ERROR("The output table does not have a correct format, i.e. [projectID].dataset.tablename. Please, use a different output table name and try again."); | ||
END IF; | ||
|
||
SET table_name = destination_parts.table; | ||
SET tables_metadata = `@@BQ_DATASET@@.__TABLENAME_JOIN`((destination_parts.project, destination_parts.dataset, '__TABLES__')); | ||
|
||
EXECUTE IMMEDIATE FORMAT( | ||
''' | ||
SELECT COUNT(size_bytes) | ||
FROM %s | ||
WHERE table_id='%s' | ||
''', | ||
tables_metadata, | ||
table_name | ||
) INTO num_tables; | ||
|
||
IF num_tables > 0 THEN | ||
SELECT ERROR("The output table to store the tileset already exists. Please, use a different output table name and try again."); | ||
END IF; | ||
END; |
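To make the existence check easier to follow, here is a minimal JavaScript sketch of the metadata query that `__CHECK_TABLE` formats before running it. `joinTableName` and `buildTableExistsQuery` are hypothetical names that mirror `__TABLENAME_JOIN` and the `FORMAT` call; whitespace in the generated query is collapsed onto one line, and the project and dataset names are made up.

```javascript
// Mirrors __TABLENAME_JOIN: quote each part, omit the project when it is missing.
function joinTableName({ project, dataset, table }) {
  return project == null
    ? `\`${dataset}\`.\`${table}\``
    : `\`${project}\`.\`${dataset}\`.\`${table}\``;
}

// Mirrors the FORMAT call: count entries for the table in the dataset's __TABLES__ view.
function buildTableExistsQuery({ project, dataset, table }) {
  if (dataset == null || table == null) {
    throw new Error('The output table does not have a correct format, i.e. [projectID].dataset.tablename.');
  }
  const tablesMetadata = joinTableName({ project, dataset, table: '__TABLES__' });
  return `SELECT COUNT(size_bytes) FROM ${tablesMetadata} WHERE table_id='${table}'`;
}

console.log(buildTableExistsQuery({ project: 'my-project', dataset: 'my_dataset', table: 'out' }));
```

A count greater than zero means the table already exists, which is the case where the procedure raises an error instead of overwriting it.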
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
---------------------------- | ||
-- Copyright (C) 2021 CARTO | ||
---------------------------- | ||
|
||
CREATE OR REPLACE FUNCTION `@@BQ_DATASET@@.__TABLENAME_JOIN` | ||
(split_name STRUCT<project STRING, dataset STRING, table STRING>) | ||
RETURNS STRING | ||
AS ( | ||
IF( | ||
split_name.project IS NULL, | ||
FORMAT('`%s`.`%s`', split_name.dataset, split_name.table), | ||
FORMAT('`%s`.`%s`.`%s`', split_name.project, split_name.dataset, split_name.table) | ||
) | ||
); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
---------------------------- | ||
-- Copyright (C) 2021 CARTO | ||
---------------------------- | ||
|
||
CREATE OR REPLACE FUNCTION `@@BQ_DATASET@@.__TABLENAME_SPLIT` | ||
(qualified_name STRING) | ||
RETURNS STRUCT<project STRING, dataset STRING, table STRING> | ||
AS (( | ||
WITH unquoted AS (SELECT REPLACE(qualified_name, "`", "") AS name) | ||
|
||
SELECT AS STRUCT | ||
REGEXP_EXTRACT(name, r"^(.+)\..+\..+$") AS project, | ||
COALESCE(REGEXP_EXTRACT(name, r"^.+\.(.+)\..+$"), REGEXP_EXTRACT(name, r"^(.+)\..+$")) AS dataset, | ||
REGEXP_EXTRACT(name, r"^.+\.(.+)$") AS table | ||
FROM unquoted | ||
)); |
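The three regular expressions in `__TABLENAME_SPLIT` can be tried out in plain JavaScript. This is a sketch assuming the patterns behave the same way in JavaScript as in BigQuery for simple dotted names; `splitTableName` is a hypothetical name.

```javascript
// Sketch of __TABLENAME_SPLIT: strip backticks, then extract each part with a regex.
function splitTableName(qualifiedName) {
  const name = qualifiedName.replace(/`/g, '');
  // Return the first capture group, or null when the pattern does not match.
  const extract = (pattern) => {
    const match = name.match(pattern);
    return match ? match[1] : null;
  };
  return {
    // Only present when the name has at least three dot-separated parts.
    project: extract(/^(.+)\..+\..+$/),
    // Middle part of a three-part name, else the first part of a two-part name.
    dataset: extract(/^.+\.(.+)\..+$/) ?? extract(/^(.+)\..+$/),
    // Everything after the last dot; null when there is no dot at all.
    table: extract(/^.+\.(.+)$/)
  };
}

console.log(splitTableName('`my-project.my_dataset.my_table`'));
console.log(splitTableName('my_dataset.my_table'));
console.log(splitTableName('my_table'));
```

A bare table name yields `null` for every part, which is the case `__CHECK_TABLE` rejects with its format error.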
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
const { runQuery } = require('../../../common/test-utils'); | ||
|
||
const BQ_DATASET = process.env.BQ_DATASET; | ||
|
||
test('H3_POLYFILL_TABLE should generate the correct query', async () => { | ||
const query = `SELECT \`@@BQ_DATASET@@.__H3_POLYFILL_QUERY\`( | ||
'SELECT geom, name, value FROM \`<project>.<dataset>.<table>\`', | ||
12, 'center', | ||
'<project>.<dataset>.<output_table>' | ||
) AS output`; | ||
const rows = await runQuery(query); | ||
expect(rows.length).toEqual(1); | ||
expect(rows[0].output).toEqual(`CREATE TABLE \`<project>.<dataset>.<output_table>\` CLUSTER BY (h3) AS | ||
WITH __input AS (SELECT geom, name, value FROM \`<project>.<dataset>.<table>\`), | ||
__cells AS (SELECT h3, i.* FROM __input AS i, | ||
UNNEST(\`@@BQ_DATASET@@.__H3_POLYFILL_INIT\`(geom,\`@@BQ_DATASET@@.__H3_POLYFILL_INIT_Z\`(geom,12))) AS parent, | ||
UNNEST(\`@@BQ_DATASET@@.H3_TOCHILDREN\`(parent,12)) AS h3) | ||
SELECT * EXCEPT (geom) FROM __cells | ||
WHERE ST_INTERSECTS(geom, \`@@BQ_DATASET@@.H3_CENTER\`(h3));`.replace(/@@BQ_DATASET@@/g, BQ_DATASET)); | ||
}); |
20 changes: 20 additions & 0 deletions
clouds/bigquery/modules/test/quadbin/QUADBIN_POLYFILL_TABLE.test.js
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
const { runQuery } = require('../../../common/test-utils'); | ||
|
||
const BQ_DATASET = process.env.BQ_DATASET; | ||
|
||
test('QUADBIN_POLYFILL_TABLE should generate the correct query', async () => { | ||
const query = `SELECT \`@@BQ_DATASET@@.__QUADBIN_POLYFILL_QUERY\`( | ||
'SELECT geom, name, value FROM \`<project>.<dataset>.<table>\`', | ||
12, 'center', | ||
'<project>.<dataset>.<output_table>' | ||
) AS output`; | ||
const rows = await runQuery(query); | ||
expect(rows.length).toEqual(1); | ||
expect(rows[0].output).toEqual(`CREATE TABLE \`<project>.<dataset>.<output_table>\` CLUSTER BY (quadbin) AS | ||
WITH __input AS (SELECT geom, name, value FROM \`<project>.<dataset>.<table>\`), | ||
__cells AS (SELECT quadbin, i.* FROM __input AS i, | ||
UNNEST(\`@@BQ_DATASET@@.__QUADBIN_POLYFILL_INIT\`(geom,\`@@BQ_DATASET@@.__QUADBIN_POLYFILL_INIT_Z\`(geom,12))) AS parent, | ||
UNNEST(\`@@BQ_DATASET@@.QUADBIN_TOCHILDREN\`(parent,12)) AS quadbin) | ||
SELECT * EXCEPT (geom) FROM __cells | ||
WHERE ST_INTERSECTS(geom, \`@@BQ_DATASET@@.QUADBIN_CENTER\`(quadbin));`.replace(/@@BQ_DATASET@@/g, BQ_DATASET)); | ||
}); |