[SPARK-48479][SQL] Support creating scalar and table SQL UDFs in parser
### What changes were proposed in this pull request?

This PR adds support for creating user-defined SQL functions in the parser. The SQL syntax is:
```
CREATE [OR REPLACE] [TEMPORARY] FUNCTION [IF NOT EXISTS] [db_name.]function_name
([param_name param_type [COMMENT param_comment], ...])
RETURNS {ret_type | TABLE (ret_name ret_type [COMMENT ret_comment], ...)}
[function_properties] function_body;

function_properties:
  [NOT] DETERMINISTIC | COMMENT function_comment | [ CONTAINS SQL | READS SQL DATA ]

function_body:
  RETURN {expression | TABLE ( query )}
```
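
As an illustration of the scalar branch of the syntax above, here are two statements that should fit the grammar. The function names, the `demo_db` schema, and the parameter names are hypothetical and do not come from the PR. Since this change only covers the parser, treat these as a sketch of statements the parser is meant to accept, not as evidence that they execute end to end.

```sql
-- Hypothetical scalar UDF that uses the optional function properties.
CREATE FUNCTION IF NOT EXISTS demo_db.rect_area(
  width DOUBLE COMMENT 'Width of the rectangle',
  height DOUBLE COMMENT 'Height of the rectangle')
RETURNS DOUBLE
DETERMINISTIC
COMMENT 'Area of a rectangle'
CONTAINS SQL
RETURN width * height;

-- Hypothetical temporary scalar UDF with no function properties.
CREATE OR REPLACE TEMPORARY FUNCTION to_celsius(fahrenheit DOUBLE)
RETURNS DOUBLE
RETURN (fahrenheit - 32) * 5 / 9;
```

The first statement uses `IF NOT EXISTS` on a schema-qualified name and the second uses `OR REPLACE TEMPORARY` on an unqualified name. The examples deliberately avoid combining `OR REPLACE` with `IF NOT EXISTS`.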

### Why are the changes needed?

To support SQL user-defined functions.

### Does this PR introduce _any_ user-facing change?

Yes. This PR adds parser support for creating user-defined SQL functions.

### How was this patch tested?

New unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#46816 from allisonwang-db/spark-48479-sql-udf-parser.

Authored-by: allisonwang-db <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
allisonwang-db authored and cloud-fan committed Jun 20, 2024
1 parent 9eadb2c commit 0d9f8a1
Showing 13 changed files with 545 additions and 5 deletions.
14 changes: 14 additions & 0 deletions docs/sql-ref-ansi-compliance.md
@@ -426,6 +426,7 @@ Below is a list of all the keywords in Spark SQL.
|BY|non-reserved|non-reserved|reserved|
|BYTE|non-reserved|non-reserved|non-reserved|
|CACHE|non-reserved|non-reserved|non-reserved|
|CALLED|non-reserved|non-reserved|non-reserved|
|CASCADE|non-reserved|non-reserved|non-reserved|
|CASE|reserved|non-reserved|reserved|
|CAST|reserved|non-reserved|reserved|
@@ -452,6 +453,7 @@ Below is a list of all the keywords in Spark SQL.
|COMPUTE|non-reserved|non-reserved|non-reserved|
|CONCATENATE|non-reserved|non-reserved|non-reserved|
|CONSTRAINT|reserved|non-reserved|reserved|
|CONTAINS|non-reserved|non-reserved|non-reserved|
|COST|non-reserved|non-reserved|non-reserved|
|CREATE|reserved|non-reserved|reserved|
|CROSS|reserved|strict-non-reserved|reserved|
@@ -478,10 +480,12 @@ Below is a list of all the keywords in Spark SQL.
|DECLARE|non-reserved|non-reserved|non-reserved|
|DEFAULT|non-reserved|non-reserved|non-reserved|
|DEFINED|non-reserved|non-reserved|non-reserved|
|DEFINER|non-reserved|non-reserved|non-reserved|
|DELETE|non-reserved|non-reserved|reserved|
|DELIMITED|non-reserved|non-reserved|non-reserved|
|DESC|non-reserved|non-reserved|non-reserved|
|DESCRIBE|non-reserved|non-reserved|reserved|
|DETERMINISTIC|non-reserved|non-reserved|reserved|
|DFS|non-reserved|non-reserved|non-reserved|
|DIRECTORIES|non-reserved|non-reserved|non-reserved|
|DIRECTORY|non-reserved|non-reserved|non-reserved|
@@ -540,17 +544,20 @@ Below is a list of all the keywords in Spark SQL.
|INDEXES|non-reserved|non-reserved|non-reserved|
|INNER|reserved|strict-non-reserved|reserved|
|INPATH|non-reserved|non-reserved|non-reserved|
|INPUT|non-reserved|non-reserved|non-reserved|
|INPUTFORMAT|non-reserved|non-reserved|non-reserved|
|INSERT|non-reserved|non-reserved|reserved|
|INT|non-reserved|non-reserved|reserved|
|INTEGER|non-reserved|non-reserved|reserved|
|INTERSECT|reserved|strict-non-reserved|reserved|
|INTERVAL|non-reserved|non-reserved|reserved|
|INTO|reserved|non-reserved|reserved|
|INVOKER|non-reserved|non-reserved|non-reserved|
|IS|reserved|non-reserved|reserved|
|ITEMS|non-reserved|non-reserved|non-reserved|
|JOIN|reserved|strict-non-reserved|reserved|
|KEYS|non-reserved|non-reserved|non-reserved|
|LANGUAGE|non-reserved|non-reserved|reserved|
|LAST|non-reserved|non-reserved|non-reserved|
|LATERAL|reserved|strict-non-reserved|reserved|
|LAZY|non-reserved|non-reserved|non-reserved|
@@ -579,6 +586,7 @@ Below is a list of all the keywords in Spark SQL.
|MINUTE|non-reserved|non-reserved|non-reserved|
|MINUTES|non-reserved|non-reserved|non-reserved|
|MINUS|non-reserved|strict-non-reserved|non-reserved|
|MODIFIES|non-reserved|non-reserved|non-reserved|
|MONTH|non-reserved|non-reserved|non-reserved|
|MONTHS|non-reserved|non-reserved|non-reserved|
|MSCK|non-reserved|non-reserved|non-reserved|
@@ -623,6 +631,7 @@ Below is a list of all the keywords in Spark SQL.
|QUARTER|non-reserved|non-reserved|non-reserved|
|QUERY|non-reserved|non-reserved|non-reserved|
|RANGE|non-reserved|non-reserved|reserved|
|READS|non-reserved|non-reserved|non-reserved|
|REAL|non-reserved|non-reserved|reserved|
|RECORDREADER|non-reserved|non-reserved|non-reserved|
|RECORDWRITER|non-reserved|non-reserved|non-reserved|
@@ -638,6 +647,8 @@ Below is a list of all the keywords in Spark SQL.
|RESET|non-reserved|non-reserved|non-reserved|
|RESPECT|non-reserved|non-reserved|non-reserved|
|RESTRICT|non-reserved|non-reserved|non-reserved|
|RETURN|non-reserved|non-reserved|reserved|
|RETURNS|non-reserved|non-reserved|reserved|
|REVOKE|non-reserved|non-reserved|reserved|
|RIGHT|reserved|strict-non-reserved|reserved|
|RLIKE|non-reserved|non-reserved|non-reserved|
@@ -651,6 +662,7 @@ Below is a list of all the keywords in Spark SQL.
|SCHEMAS|non-reserved|non-reserved|non-reserved|
|SECOND|non-reserved|non-reserved|non-reserved|
|SECONDS|non-reserved|non-reserved|non-reserved|
|SECURITY|non-reserved|non-reserved|non-reserved|
|SELECT|reserved|non-reserved|reserved|
|SEMI|non-reserved|strict-non-reserved|non-reserved|
|SEPARATED|non-reserved|non-reserved|non-reserved|
@@ -668,6 +680,8 @@ Below is a list of all the keywords in Spark SQL.
|SORT|non-reserved|non-reserved|non-reserved|
|SORTED|non-reserved|non-reserved|non-reserved|
|SOURCE|non-reserved|non-reserved|non-reserved|
|SPECIFIC|non-reserved|non-reserved|reserved|
|SQL|reserved|non-reserved|reserved|
|START|non-reserved|non-reserved|reserved|
|STATISTICS|non-reserved|non-reserved|non-reserved|
|STORED|non-reserved|non-reserved|non-reserved|
@@ -146,6 +146,7 @@ BUCKETS: 'BUCKETS';
BY: 'BY';
BYTE: 'BYTE';
CACHE: 'CACHE';
CALLED: 'CALLED';
CASCADE: 'CASCADE';
CASE: 'CASE';
CAST: 'CAST';
@@ -172,6 +173,7 @@ COMPENSATION: 'COMPENSATION';
COMPUTE: 'COMPUTE';
CONCATENATE: 'CONCATENATE';
CONSTRAINT: 'CONSTRAINT';
CONTAINS: 'CONTAINS';
COST: 'COST';
CREATE: 'CREATE';
CROSS: 'CROSS';
@@ -198,10 +200,12 @@ DECIMAL: 'DECIMAL';
DECLARE: 'DECLARE';
DEFAULT: 'DEFAULT';
DEFINED: 'DEFINED';
DEFINER: 'DEFINER';
DELETE: 'DELETE';
DELIMITED: 'DELIMITED';
DESC: 'DESC';
DESCRIBE: 'DESCRIBE';
DETERMINISTIC: 'DETERMINISTIC';
DFS: 'DFS';
DIRECTORIES: 'DIRECTORIES';
DIRECTORY: 'DIRECTORY';
@@ -260,17 +264,20 @@ INDEX: 'INDEX';
INDEXES: 'INDEXES';
INNER: 'INNER';
INPATH: 'INPATH';
INPUT: 'INPUT';
INPUTFORMAT: 'INPUTFORMAT';
INSERT: 'INSERT';
INTERSECT: 'INTERSECT';
INTERVAL: 'INTERVAL';
INT: 'INT';
INTEGER: 'INTEGER';
INTO: 'INTO';
INVOKER: 'INVOKER';
IS: 'IS';
ITEMS: 'ITEMS';
JOIN: 'JOIN';
KEYS: 'KEYS';
LANGUAGE: 'LANGUAGE';
LAST: 'LAST';
LATERAL: 'LATERAL';
LAZY: 'LAZY';
@@ -298,6 +305,7 @@ MILLISECOND: 'MILLISECOND';
MILLISECONDS: 'MILLISECONDS';
MINUTE: 'MINUTE';
MINUTES: 'MINUTES';
MODIFIES: 'MODIFIES';
MONTH: 'MONTH';
MONTHS: 'MONTHS';
MSCK: 'MSCK';
@@ -342,6 +350,7 @@ PURGE: 'PURGE';
QUARTER: 'QUARTER';
QUERY: 'QUERY';
RANGE: 'RANGE';
READS: 'READS';
REAL: 'REAL';
RECORDREADER: 'RECORDREADER';
RECORDWRITER: 'RECORDWRITER';
@@ -356,6 +365,8 @@ REPLACE: 'REPLACE';
RESET: 'RESET';
RESPECT: 'RESPECT';
RESTRICT: 'RESTRICT';
RETURN: 'RETURN';
RETURNS: 'RETURNS';
REVOKE: 'REVOKE';
RIGHT: 'RIGHT';
RLIKE: 'RLIKE' | 'REGEXP';
@@ -369,6 +380,7 @@ SECOND: 'SECOND';
SECONDS: 'SECONDS';
SCHEMA: 'SCHEMA';
SCHEMAS: 'SCHEMAS';
SECURITY: 'SECURITY';
SELECT: 'SELECT';
SEMI: 'SEMI';
SEPARATED: 'SEPARATED';
@@ -387,6 +399,8 @@ SOME: 'SOME';
SORT: 'SORT';
SORTED: 'SORTED';
SOURCE: 'SOURCE';
SPECIFIC: 'SPECIFIC';
SQL: 'SQL';
START: 'START';
STATISTICS: 'STATISTICS';
STORED: 'STORED';