
Merge pull request #8133 from NvTimLiu/release-tmp
Merge branch 'branch-23.04' to main
NvTimLiu authored Apr 18, 2023
2 parents 9b37954 + 448207f commit d5acb6b
Showing 645 changed files with 17,936 additions and 6,994 deletions.
8 changes: 4 additions & 4 deletions .github/workflows/auto-merge.yml
@@ -18,7 +18,7 @@ name: auto-merge HEAD to BASE
 on:
   pull_request_target:
     branches:
-      - branch-23.02
+      - branch-23.04
     types: [closed]

 jobs:
@@ -29,13 +29,13 @@ jobs:
     steps:
       - uses: actions/checkout@v3
         with:
-          ref: branch-23.02 # force to fetch from latest upstream instead of PR ref
+          ref: branch-23.04 # force to fetch from latest upstream instead of PR ref

       - name: auto-merge job
         uses: ./.github/workflows/auto-merge
         env:
           OWNER: NVIDIA
           REPO_NAME: spark-rapids
-          HEAD: branch-23.02
-          BASE: branch-23.04
+          HEAD: branch-23.04
+          BASE: branch-23.06
           AUTOMERGE_TOKEN: ${{ secrets.AUTOMERGE_TOKEN }} # use to merge PR
6 changes: 3 additions & 3 deletions .github/workflows/blossom-ci.yml
@@ -1,4 +1,4 @@
-# Copyright (c) 2020-2022, NVIDIA CORPORATION.
+# Copyright (c) 2020-2023, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -96,10 +96,10 @@ jobs:
           java-version: 8

       # add blackduck properties https://synopsys.atlassian.net/wiki/spaces/INTDOCS/pages/631308372/Methods+for+Configuring+Analysis#Using-a-configuration-file
+      # currently hardcode projects here to avoid intermittent mvn scan failures
       - name: Setup blackduck properties
         run: |
-          PROJECTS=$(mvn -am dependency:tree | grep maven-dependency-plugin | awk '{ out="com.nvidia:"$(NF-1);print out }' | grep rapids | xargs | sed -e 's/ /,/g')
-          echo detect.maven.build.command="-pl=$PROJECTS -am" >> application.properties
+          echo detect.maven.build.command="-pl=com.nvidia:rapids-4-spark-parent,com.nvidia:rapids-4-spark-sql_2.12 -am" >> application.properties
           echo detect.maven.included.scopes=compile >> application.properties
       - name: Run blossom action
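For reference, a minimal sketch of what the `Setup blackduck properties` step above leaves behind; the expected file contents follow directly from the two `echo` lines (invoking `cat` this way is just an illustration):

```bash
# Inspect the properties file written by the step above.
cat application.properties
# detect.maven.build.command=-pl=com.nvidia:rapids-4-spark-parent,com.nvidia:rapids-4-spark-sql_2.12 -am
# detect.maven.included.scopes=compile
```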
7 changes: 5 additions & 2 deletions .github/workflows/mvn-verify-check.yml
@@ -1,4 +1,4 @@
-# Copyright (c) 2022, NVIDIA CORPORATION.
+# Copyright (c) 2022-2023, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -46,7 +46,10 @@ jobs:
           . jenkins/version-def.sh
           svArrBodyNoSnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":false}" "${SPARK_SHIM_VERSIONS_NOSNAPSHOTS_TAIL[@]}")
           svArrBodyNoSnapshot=${svArrBodyNoSnapshot:1}
-          svArrBodySnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" "${SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]}")
+          # do not add empty snapshot versions
+          if [ ${#SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]} -gt 0 ]; then
+            svArrBodySnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" "${SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]}")
+          fi
           # add snapshot versions which are not in snapshot property in pom file
           svArrBodySnapshot+=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" 340)
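A standalone illustration of why the new guard is needed (array name shortened; plain bash semantics, independent of this repository): `printf` applies its format once even when the array expands to zero words, which would otherwise inject a bogus empty-version entry into the build matrix:

```bash
#!/usr/bin/env bash
# With an empty array, "${arr[@]}" expands to zero words, but printf still
# applies its format once, substituting an empty string for %s:
arr=()
printf ',{"spark-version":"%s","isSnapshot":true}\n' "${arr[@]}"
# -> ,{"spark-version":"","isSnapshot":true}

# With the guard, nothing is emitted for an empty array:
if [ ${#arr[@]} -gt 0 ]; then
  printf ',{"spark-version":"%s","isSnapshot":true}\n' "${arr[@]}"
fi
```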
219 changes: 218 additions & 1 deletion CHANGELOG.md

Large diffs are not rendered by default.

72 changes: 63 additions & 9 deletions CONTRIBUTING.md
@@ -152,7 +152,7 @@ To this end in a pre-production build you can set the Boolean property

 The time saved is more significant if you are merely changing
 the `aggregator` module, or the `dist` module, or just incorporating changes from
-[spark-rapids-jni](https://github.com/NVIDIA/spark-rapids-jni/blob/branch-23.02/CONTRIBUTING.md#local-testing-of-cross-repo-contributions-cudf-spark-rapids-jni-and-spark-rapids)
+[spark-rapids-jni](https://github.com/NVIDIA/spark-rapids-jni/blob/branch-23.04/CONTRIBUTING.md#local-testing-of-cross-repo-contributions-cudf-spark-rapids-jni-and-spark-rapids)

 For example, to quickly repackage `rapids-4-spark` after the
 initial `./build/buildall` you can iterate by invoking
@@ -186,22 +186,38 @@ The following acronyms may appear in directory names:
 |cdh |Cloudera CDH|321cdh |Cloudera CDH Spark based on Apache Spark 3.2.1|

 The version-specific directory names have one of the following forms / use cases:
-- `src/main/312/scala` contains Scala source code for a single Spark version, 3.1.2 in this case
-- `src/main/312+-apache/scala` contains Scala source code for *upstream* **Apache** Spark builds,
+
+#### Version range directories
+
+The following source directory system is deprecated. See below and [shimplify.md][1].
+
+* `src/main/312/scala` contains Scala source code for a single Spark version, 3.1.2 in this case
+* `src/main/312+-apache/scala` contains Scala source code for *upstream* **Apache** Spark builds,
   only beginning with version Spark 3.1.2, and + signifies there is no upper version boundary
   among the supported versions
-- `src/main/311until320-all` contains code that applies to all shims between 3.1.1 *inclusive*,
+* `src/main/311until320-all` contains code that applies to all shims between 3.1.1 *inclusive*,
   3.2.0 *exclusive*
-- `src/main/pre320-treenode` contains shims for the Catalyst `TreeNode` class before the
+* `src/main/pre320-treenode` contains shims for the Catalyst `TreeNode` class before the
   [children trait specialization in Apache Spark 3.2.0](https://issues.apache.org/jira/browse/SPARK-34906).
-- `src/main/post320-treenode` contains shims for the Catalyst `TreeNode` class after the
+* `src/main/post320-treenode` contains shims for the Catalyst `TreeNode` class after the
   [children trait specialization in Apache Spark 3.2.0](https://issues.apache.org/jira/browse/SPARK-34906).

 For each Spark shim, we use Ant path patterns to compute the property
 `spark${buildver}.sources` in [sql-plugin/pom.xml](./sql-plugin/pom.xml) that is
 picked up as additional source code roots. When possible, path patterns are reused using
 the conventions outlined in the pom.

+#### Simplified version directory structure
+
+Going forward, new shim files should be added under:
+
+* `src/main/spark${buildver}`, example: `src/main/spark330db`
+* `src/test/spark${buildver}`, example: `src/test/spark340`
+
+with a special shim descriptor as a Scala/Java comment. See [shimplify.md][1].
+
+[1]: ./docs/dev/shimplify.md

 ### Setting up an Integrated Development Environment

 Our project currently uses `build-helper-maven-plugin` for shimming against conflicting definitions of superclasses
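As a side note on the Ant-path-pattern mechanism described in the hunk above, here is a minimal sketch of how one could inspect the computed source roots for a shim; the `help:evaluate` invocation and module layout are assumptions for illustration, not part of this change:

```bash
# Print the extra source roots computed for a given shim. For buildver=312
# the text above says the build resolves the property spark312.sources
# defined in sql-plugin/pom.xml.
mvn -q -Dbuildver=312 help:evaluate \
  -Dexpression=spark312.sources -DforceStdout -pl sql-plugin
```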
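Separately, a hypothetical sketch of the shim descriptor comment mentioned in the new "Simplified version directory structure" section above (file path and payload invented for illustration; [shimplify.md](./docs/dev/shimplify.md) is the authoritative reference):

```bash
# A new shim source under src/main/spark340 would start with a descriptor
# comment; peek at the first lines of a (hypothetical) shim file:
head -3 sql-plugin/src/main/spark340/scala/com/nvidia/spark/rapids/shims/ExampleShim.scala
# /*** spark-rapids-shim-json-lines
# {"spark": "340"}
# spark-rapids-shim-json-lines ***/
```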
@@ -238,7 +254,12 @@ Known Issues:

 * There is a known issue that the test sources added via the `build-helper-maven-plugin` are not handled
   [properly](https://youtrack.jetbrains.com/issue/IDEA-100532). The workaround is to mark the affected folders,
-  such as `tests/src/test/320+-noncdh-nondb` manually as `Test Sources Root`
+  such as
+
+  * `tests/src/test/320+-noncdh-nondb`
+  * `tests/src/test/spark340`
+
+  manually as `Test Sources Root`

 * There is a known issue where, even after selecting a different Maven profile in the Maven submenu,
   the source folders from a previously selected profile may remain active. As a workaround,
@@ -264,7 +285,7 @@ interested in. For example, to generate the Bloop projects for the Spark 3.2.0 dependency
 just for the production code run:

 ```shell script
-mvn install ch.epfl.scala:maven-bloop_2.13:1.4.9:bloopInstall -pl aggregator -am \
+mvn install ch.epfl.scala:bloop-maven-plugin:bloopInstall -pl aggregator -am \
     -DdownloadSources=true \
     -Dbuildver=320 \
     -DskipTests \
@@ -296,7 +317,7 @@ You can now open the spark-rapids as a

 Read on for VS Code Scala Metals instructions.

-# Bloop, Scala Metals, and Visual Studio Code
+#### Bloop, Scala Metals, and Visual Studio Code

 _Last tested with 1.63.0-insider (Universal) Commit: bedf867b5b02c1c800fbaf4d6ce09cefba_

@@ -338,6 +359,29 @@ jps -l
 72349 scala.meta.metals.Main
 ```

+##### Known Issues
+
+###### java.lang.RuntimeException: boom
+
+The Metals background compilation status may appear to reset to 0% after reaching 99%,
+accompanied by the peculiar error message [`java.lang.RuntimeException: boom`][3]. You can
+work around it by making sure the Metals server (the Bloop client) and the Bloop server are
+both running on Java 11+.
+
+1. To this end, make sure that the Bloop projects are generated using Java 11+:
+
+   ```bash
+   JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64 \
+   mvn install ch.epfl.scala:bloop-maven-plugin:bloopInstall \
+     -DdownloadSources=true \
+     -Dbuildver=331 \
+     -Dskip -DskipTests -Dmaven.javadoc.skip
+   ```
+
+1. Add [`metals.javaHome`][4] to the VS Code preferences to point to Java 11+.
+
+[3]: https://github.com/sourcegraph/scip-java/blob/b7d268233f1a303f66b6d9804a68f64b1e5d7032/semanticdb-javac/src/main/java/com/sourcegraph/semanticdb_javac/SemanticdbTaskListener.java#L76
+
+[4]: https://github.com/scalameta/metals-vscode/pull/644/files#diff-04bba6a35cad1c794cbbe677678a51de13441b7a6ee8592b7b50be1f05c6f626R132
 #### Other IDEs
 We welcome pull requests with tips on how to set up your favorite IDE!
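
For the `metals.javaHome` step above, a minimal sketch (the Linux settings-file path and JDK location are assumptions; the settings key itself comes from the linked metals-vscode change):

```bash
# Check whether Metals already points at a Java 11+ home in VS Code's user
# settings (Linux path assumed; macOS keeps the file under
# ~/Library/Application Support/Code/User). The line to add would be:
#   "metals.javaHome": "/usr/lib/jvm/java-11-openjdk-amd64"
grep -n "metals.javaHome" "$HOME/.config/Code/User/settings.json"
```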

@@ -481,6 +525,16 @@ You can confirm that the update actually has happened by either inspecting its effect with
 `git diff` first or simply reexecuting `git commit` right away. The second time no file
 modification should be triggered by the copyright year update hook and the commit should succeed.
+
+There is a known issue for macOS users if they use the default version of `sed`. The copyright update
+script may fail and generate an unexpected file named `source-file-E`. As a workaround, please
+install GNU sed:
+
+```bash
+brew install gnu-sed
+# and add it to PATH to make it the default sed for your shell
+export PATH="/usr/local/opt/gnu-sed/libexec/gnubin:$PATH"
+```
 ### Pull request status checks

 A pull request should pass all status checks before it can be merged.

 #### signoff check
