-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add json_array_length Spark function #11946
base: main
Are you sure you want to change the base?
feat: Add json_array_length Spark function #11946
Conversation
a971121
to
a5c0d3b
Compare
✅ Deploy Preview for meta-velox canceled.
|
✅ Deploy Preview for meta-velox canceled.
|
struct JsonArrayLengthFunction { | ||
VELOX_DEFINE_FUNCTION_TYPES(T); | ||
|
||
FOLLY_ALWAYS_INLINE bool call(int32_t& len, const arg_type<Json>& json) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion:
Change const arg_type<Json>&
to const arg_type<Varchar>&
and remove the above corresponding header.
I think JsonType is just for presto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution!
@@ -24,6 +25,8 @@ void registerJsonFunctions(const std::string& prefix) { | |||
{prefix + "get_json_object"}); | |||
registerFunction<JsonObjectKeysFunction, Array<Varchar>, Varchar>( | |||
{prefix + "json_object_keys"}); | |||
registerFunction<JsonArrayLengthFunction, int32_t, Varchar>( | |||
{prefix + "json_array_length"}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid duplicating code, we would recommend extracting the implementation to the 'velox/functions/lib/' and registering it with corresponding return types for Presto and Spark if the semantics are the same (corner cases may need to be verified).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rui-mo The input type is a bit different, one is Json
, another is Varchar
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rui-mo removed duplicated code. Please take a look again
json_array_length
in Spark is similar as Presto but has difference in return type, Presto is int64_t, Spark is int32_t.Examples:
Spark doc: https://spark.apache.org/docs/latest/api/sql/#json_array_length