-
Notifications
You must be signed in to change notification settings - Fork 466
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
10 changed files
with
328 additions
and
49 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
--- | ||
layout: page | ||
title: Gluten For Flink with Velox Backend | ||
nav_order: 1 | ||
parent: Getting-Started | ||
--- | ||
|
||
# Supported Version | ||
|
||
| Type | Version | | ||
|-------|------------------------------| | ||
| Flink | 1.20 | | ||
| OS | Ubuntu20.04/22.04, Centos7/8 | | ||
| jdk | openjdk11/jdk17 | | ||
| scala | 2.12 | | ||
|
||
# Prerequisite | ||
|
||
Currently, with static build Gluten+Flink+Velox backend supports all the Linux OSes, but is only tested on **Ubuntu20.04/Ubuntu22.04/Centos7/Centos8**. With dynamic build, Gluten+Velox backend support **Ubuntu20.04/Ubuntu22.04/Centos7/Centos8** and their variants. | ||
|
||
Currently, the officially supported Flink versions are 1.20.*. | ||
|
||
We need to set up the `JAVA_HOME` env. Currently, Gluten supports **java 11** and **java 17**. | ||
|
||
**For x86_64** | ||
|
||
```bash | ||
## make sure jdk8 is used | ||
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64 | ||
export PATH=$JAVA_HOME/bin:$PATH | ||
``` | ||
|
||
**For aarch64** | ||
|
||
```bash | ||
## make sure jdk8 is used | ||
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-arm64 | ||
export PATH=$JAVA_HOME/bin:$PATH | ||
``` | ||
|
||
**Get gluten** | ||
|
||
```bash | ||
## config maven, like proxy in ~/.m2/settings.xml | ||
|
||
## fetch gluten code | ||
git clone https://github.com/apache/incubator-gluten.git | ||
``` | ||
|
||
# Build Gluten Flink with Velox Backend | ||
|
||
``` | ||
cd /path/to/gluten/gluten-flink | ||
mvn clean package | ||
``` | ||
|
||
## Dependency library deployment | ||
|
||
Gluten for Flink depends on [Velox4j](https://github.com/velox4j/velox4j) to call velox. So you need to get the Velox4j packages and used them with gluten. | ||
Velox4j jar available now is velox4j-0.1.0-SNAPSHOT.jar. | ||
|
||
## Submit the Flink SQL job | ||
|
||
Submit test script from `flink run`. You can use the `StreamSQLExample` as an example. | ||
|
||
### Flink local cluster | ||
``` | ||
var parquet_file_path = "/PATH/TO/TPCH_PARQUET_PATH" | ||
var gluten_root = "/PATH/TO/GLUTEN" | ||
``` | ||
|
||
After deploying flink binaries, please add gluten-flink jar to flink library path, | ||
including gluten-flink-runtime-1.4.0.jar, gluten-flink-loader-1.4.0.jar and Velox4j jars above. | ||
And make them loaded before flink libraries. | ||
Then you can go to flink binary path and use the below scripts to | ||
submit the example job. | ||
|
||
```bash | ||
bin/start-cluster.sh | ||
bin/flink run -d -m 0.0.0.0:8080 \ | ||
-c org.apache.flink.table.examples.java.basics.StreamSQLExample \ | ||
lib/flink-examples-table_2.12-1.20.1.jar | ||
``` | ||
|
||
Then you can get the result in `log/flink-*-taskexecutor-*.out`. | ||
And you can see an operator named `gluten-cal` from the web frontend of your flink job. | ||
|
||
### Flink Yarn per job mode | ||
|
||
TODO | ||
|
||
## Notes: | ||
Now both Gluten for Flink and Velox4j have not a bundled jar including all jar depends on. | ||
So you may have to add these jars by yourself, which may including guava-33.4.0-jre.jar, jackson-core-2.18.0.jar, | ||
jackson-databind-2.18.0.jar, jackson-datatype-jdk8-2.18.0.jar, jackson-annotations-2.18.0.jar, arrow-memory-core-18.1.0.jar, | ||
arrow-memory-unsafe-18.1.0.jar, arrow-vector-18.1.0.jar, flatbuffers-java-24.3.25.jar, arrow-format-18.1.0.jar, arrow-c-data-18.1.0.jar. | ||
We will supply bundled jars soon. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Gluten Flink Project | ||
Gluten for Flink is under developing now, you can refer to [user guide](../docs/get-started/Flink.md) for a quick usage. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
52 changes: 52 additions & 0 deletions
52
gluten-flink/planner/src/main/java/org/apache/gluten/rexnode/LogicalTypeConverter.java
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one or more | ||
* contributor license agreements. See the NOTICE file distributed with | ||
* this work for additional information regarding copyright ownership. | ||
* The ASF licenses this file to You under the Apache License, Version 2.0 | ||
* (the "License"); you may not use this file except in compliance with | ||
* the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package org.apache.gluten.rexnode; | ||
|
||
import io.github.zhztheplayer.velox4j.type.IntegerType; | ||
import io.github.zhztheplayer.velox4j.type.Type; | ||
import org.apache.flink.table.types.logical.BigIntType; | ||
import org.apache.flink.table.types.logical.IntType; | ||
import org.apache.flink.table.types.logical.LogicalType; | ||
import org.apache.flink.table.types.logical.RowType; | ||
import org.apache.flink.table.types.logical.VarCharType; | ||
|
||
import java.util.List; | ||
import java.util.stream.Collectors; | ||
|
||
/** Convertor to convert Flink LogicalType to velox data Type */ | ||
public class LogicalTypeConverter { | ||
|
||
public static Type toVLType(LogicalType logicalType) { | ||
if (logicalType instanceof RowType) { | ||
RowType flinkRowType = (RowType) logicalType; | ||
List<Type> fieldTypes = flinkRowType.getChildren().stream(). | ||
map(LogicalTypeConverter::toVLType). | ||
collect(Collectors.toList()); | ||
return new io.github.zhztheplayer.velox4j.type.RowType( | ||
flinkRowType.getFieldNames(), | ||
fieldTypes); | ||
} else if (logicalType instanceof IntType) { | ||
return new IntegerType(); | ||
} else if (logicalType instanceof BigIntType) { | ||
return new io.github.zhztheplayer.velox4j.type.BigIntType(); | ||
} else if (logicalType instanceof VarCharType) { | ||
return new io.github.zhztheplayer.velox4j.type.VarCharType(); | ||
} else { | ||
throw new RuntimeException("Unsupported logical type: " + logicalType); | ||
} | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
48 changes: 48 additions & 0 deletions
48
gluten-flink/planner/src/main/java/org/apache/gluten/rexnode/Utils.java
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one or more | ||
* contributor license agreements. See the NOTICE file distributed with | ||
* this work for additional information regarding copyright ownership. | ||
* The ASF licenses this file to You under the Apache License, Version 2.0 | ||
* (the "License"); you may not use this file except in compliance with | ||
* the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
package org.apache.gluten.rexnode; | ||
|
||
import io.github.zhztheplayer.velox4j.serializable.ISerializableRegistry; | ||
import io.github.zhztheplayer.velox4j.variant.VariantRegistry; | ||
import org.apache.flink.table.types.logical.LogicalType; | ||
import org.apache.flink.table.types.logical.RowType; | ||
|
||
import java.util.List; | ||
|
||
/** Utility to store some useful functions. */ | ||
public class Utils { | ||
|
||
private static boolean registryInitialized = false; | ||
// Get names for project node. | ||
public static List<String> getNamesFromRowType(LogicalType logicalType) { | ||
if (logicalType instanceof RowType) { | ||
RowType rowType = (RowType) logicalType; | ||
return rowType.getFieldNames(); | ||
} else { | ||
throw new RuntimeException("Output type is not row type: " + logicalType); | ||
} | ||
} | ||
|
||
// Init serialize related registries. | ||
public static void registerRegistry() { | ||
if (!registryInitialized) { | ||
registryInitialized = true; | ||
VariantRegistry.registerAll(); | ||
ISerializableRegistry.registerAll(); | ||
} | ||
} | ||
} |
Oops, something went wrong.