[SPARK-46773][BUILD][CONNECT] Change to use include-list to `generate assemblyExcludedJars` for the connect server module

### What changes were proposed in this pull request?
This PR switches to an include-list to generate the `assemblyExcludedJars` option for the connect server module, so that `sbt assembly` and `maven-shade-plugin` package the same jars. The exclude list is dropped because it requires more configuration: over 40 additional jars would need to be excluded.

### Why are the changes needed?
To make `sbt assembly` and `maven-shade-plugin` package the same jars.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manually checked the list of jars packaged by `sbt assembly`.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#44798 from LuciferYang/SPARK-46773.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
LuciferYang authored and dongjoon-hyun committed Jan 19, 2024
1 parent 3d395a6 commit 39f8e1a
Showing 1 changed file with 12 additions and 6 deletions.
18 changes: 12 additions & 6 deletions project/SparkBuild.scala
@@ -752,14 +752,20 @@ object SparkConnect {
     // Exclude `scala-library` from assembly.
     (assembly / assemblyPackageScala / assembleArtifact) := false,

-    // Exclude `pmml-model-*.jar`, `scala-collection-compat_*.jar`,`jsr305-*.jar` and
-    // `netty-*.jar` and `unused-1.0.0.jar` from assembly.
+    // SPARK-46733: Include `spark-connect-*.jar`, `unused-*.jar`,`guava-*.jar`,
+    // `failureaccess-*.jar`, `annotations-*.jar`, `grpc-*.jar`, `protobuf-*.jar`,
+    // `gson-*.jar`, `error_prone_annotations-*.jar`, `j2objc-annotations-*.jar`,
+    // `animal-sniffer-annotations-*.jar`, `perfmark-api-*.jar`,
+    // `proto-google-common-protos-*.jar` in assembly.
+    // This needs to be consistent with the content of `maven-shade-plugin`.
     (assembly / assemblyExcludedJars) := {
       val cp = (assembly / fullClasspath).value
-      cp filter { v =>
-        val name = v.data.getName
-        name.startsWith("pmml-model-") || name.startsWith("scala-collection-compat_") ||
-          name.startsWith("jsr305-") || name.startsWith("netty-") || name == "unused-1.0.0.jar"
+      val validPrefixes = Set("spark-connect", "unused-", "guava-", "failureaccess-",
+        "annotations-", "grpc-", "protobuf-", "gson", "error_prone_annotations",
+        "j2objc-annotations", "animal-sniffer-annotations", "perfmark-api",
+        "proto-google-common-protos")
+      cp filterNot { v =>
+        validPrefixes.exists(v.data.getName.startsWith)
       }
     },
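
For readers who want to try the include-list logic outside sbt, here is a minimal sketch of the same `filterNot` idea in plain Scala. It assumes plain file-name strings in place of sbt's classpath entries. The object name `IncludeListSketch`, the helper `excludedJars`, and the sample jar names and versions are hypothetical and not part of the commit; only the prefixes are copied from the diff.

```scala
// Sketch only: plain file names stand in for sbt's classpath entries.
object IncludeListSketch {
  // Prefixes copied from the diff; jars matching one of them stay in the assembly.
  val validPrefixes: Set[String] = Set("spark-connect", "unused-", "guava-", "failureaccess-",
    "annotations-", "grpc-", "protobuf-", "gson", "error_prone_annotations",
    "j2objc-annotations", "animal-sniffer-annotations", "perfmark-api",
    "proto-google-common-protos")

  // Hypothetical helper: returns every jar name that matches none of the
  // allowed prefixes, i.e. the jars that would be excluded from the assembly.
  def excludedJars(classpath: Seq[String]): Seq[String] =
    classpath.filterNot(name => validPrefixes.exists(prefix => name.startsWith(prefix)))

  def main(args: Array[String]): Unit = {
    // Hypothetical jar names, for illustration only.
    val classpath = Seq(
      "guava-32.0.0.jar",
      "netty-common-4.1.0.jar",
      "grpc-api-1.59.0.jar",
      "jsr305-3.0.0.jar")
    // Prints: List(netty-common-4.1.0.jar, jsr305-3.0.0.jar)
    println(excludedJars(classpath))
  }
}
```

With an include-list, a jar that newly appears on the classpath is excluded by default, whereas the previous exclude list needed an explicit entry for each jar to leave out.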
