Merge pull request #26 from jiayuasu/master
Update GeoSpark 0.3.2
jiayuasu authored Oct 26, 2016
2 parents 6411d55 + 1f066ad commit 5da8403
Showing 507 changed files with 1,516 additions and 85,516 deletions.
38 changes: 25 additions & 13 deletions README.md
@@ -10,22 +10,20 @@ GeoSpark artifacts are hosted in Maven Central. You can add a Maven dependency w
```
groupId: org.datasyslab
artifactId: geospark
version: 0.3.1
version: 0.3.2
```
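
As a rough sketch, the coordinates above could be declared in a project's `pom.xml` along the following lines. The GeoSpark entry uses the coordinates listed in this README. The Spark entry is an assumption: it copies the `spark-core_2.11` 2.0.1 `provided`-scope dependency that appears in this commit's `dependency-reduced-pom.xml`, and it may need a different artifact or version in your environment.

```xml
<!-- Sketch of a dependencies fragment for a project's pom.xml. -->
<dependencies>
  <!-- GeoSpark coordinates as listed in this README (version 0.3.2). -->
  <dependency>
    <groupId>org.datasyslab</groupId>
    <artifactId>geospark</artifactId>
    <version>0.3.2</version>
  </dependency>
  <!-- Assumption: Spark is supplied by the cluster at runtime, so it is
       marked "provided". Adjust the Scala suffix and version to match
       your own Spark installation. -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.0.1</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```

Marking Spark as `provided` keeps it out of the packaged application jar, which is the usual choice when the program is launched with spark-submit.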

## Version information
## Version information ([Full List](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Full-Version-Release-notes))


| Version | Summary |
|:----------------: |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 0.3.2 | Functionality enhancement: 1. [JTSplus Spatial Objects](https://github.com/jiayuasu/JTSplus) now carry the original input data. Each object stores "UserData" and provides getter and setter. 2. Add a new SpatialRDD constructor to transform a regular data RDD to a spatial partitioned SpatialRDD. |
| 0.3.1 | Bug fix: Support Apache Spark 2.X version, fix a bug which results in inaccurate results when doing join query, add more unit test cases |
| 0.3 | Major updates: Significantly shorten query time on spatial join for skewed data; Support load balanced spatial partitioning methods (also serve as the global index); Optimize code for iterative spatial data mining |
| 0.2 | Improve code structure and refactor API |
| 0.1 | Support spatial range, join and Knn |
| Master branch | even with 0.3.1 |
| Master branch | even with 0.3.2 |
| Spark 1.X branch | even with 0.3.1 but only supports Apache Spark 1.X |


## How to get started (For Scala and Java developers)


@@ -34,15 +32,15 @@ version: 0.3.1

1. Apache Spark 2.X releases (Apache Spark 1.X releases support available in GeoSpark for Spark 1.X branch)
2. JDK 1.7
3. Compiled GeoSpark jar (Run 'mvn clean install' at source code folder or Download [pre-compiled GeoSpark jar](https://github.com/DataSystemsLab/GeoSpark/releases) under "Release" tag).
4. You might need to modify the dependencies in "POM.xml" and make it consistent with your environment.
3. You might need to modify the dependencies in "POM.xml" and make it consistent with your environment.

Note: The GeoSpark Master branch supports Apache Spark 2.X releases, and the GeoSpark for Spark 1.X branch supports Apache Spark 1.X releases. Please use the branch that matches your Spark version.

### How to use GeoSpark APIs in an interactive Spark shell (Scala)

1. Have your Spark cluster ready.
2. Run Spark shell with GeoSpark as a dependency.
2. Download [pre-compiled GeoSpark jar](https://github.com/DataSystemsLab/GeoSpark/releases) under "Release" tag.
3. Run Spark shell with GeoSpark as a dependency.

`
./bin/spark-shell --jars GeoSpark_COMPILED.jar
@@ -53,8 +51,7 @@ Note: GeoSpark Master branch supports Apache Spark 2.X releases and GeoSpark for
### How to use GeoSpark APIs in a self-contained Spark application (Scala and Java)

1. Create your own Apache Spark project in Scala or Java
2. Download GeoSpark source code or download [pre-compiled GeoSpark jar](https://github.com/DataSystemsLab/GeoSpark/releases) under "Release" tag.
3. Put GeoSpark source code with your own code and compile together. Or add GeoSpark.jar into your local compilation dependency.
2. Add GeoSpark Maven coordinates into your project dependencies.
4. You can now use GeoSpark APIs in your Spark program!
5. Use spark-submit to submit your compiled self-contained Spark program.

@@ -97,7 +94,7 @@ Two pairs of longitude and latitude present the vertexes lie on the diagonal of

Each tuple can contain an unlimited number of points.

##Supported data format
## Supported data format
GeoSpark supports Comma-Separated Values ("csv"), Tab-Separated Values ("tsv"), Well-Known Text ("wkt"), and GeoJSON ("geojson") as input formats. When calling a constructor, users only need to specify the input format as Splitter and, if necessary, the start column of the spatial info in each tuple as Offset.

## Important features
@@ -131,6 +128,21 @@ Jia Yu, Jinxuan Wu, Mohamed Sarwat. ["GeoSpark: A Cluster Computing Framework fo
GeoSpark makes use of JTS Plus (an extended JTS Topology Suite Version 1.14) for some geometrical computations.

Please refer to [JTS Topology Suite website](http://tsusiatsoftware.net/jts/main.html) and [JTS Plus](https://github.com/jiayuasu/JTSplus) for more details.

## Thanks for the help from GeoSpark community
We appreciate the help and suggestions from the following GeoSpark users (the list keeps growing):

* @gaufung
* @lrojas94
* @mdespriee
* @sabman
* @samchorlton
* @Tsarazin
* @TBuc
* ...



## Contact

### Contributors
@@ -140,7 +152,7 @@ Please refer to [JTS Topology Suite website](http://tsusiatsoftware.net/jts/main

* [Mohamed Sarwat](http://faculty.engineering.asu.edu/sarwat/) (Email: [email protected])

###Project website
### Project website
Please visit [GeoSpark project website](http://geospark.datasyslab.org) for the latest news and releases.

### DataSys Lab
260 changes: 237 additions & 23 deletions dependency-reduced-pom.xml
@@ -2,10 +2,36 @@
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.datasyslab</groupId>
<artifactId>geospark-precompiled</artifactId>
<name>GeoSpark</name>
<version>0.3.1</version>
<description>Spark-2.X</description>
<artifactId>geospark</artifactId>
<name>${project.groupId}:${project.artifactId}</name>
<version>0.3.2</version>
<description>Geospatial extension for Apache Spark</description>
<url>http://geospark.datasyslab.org/</url>
<developers>
<developer>
<name>Jia Yu</name>
<email>[email protected]</email>
<organization>Arizona State University Data Systems Lab</organization>
<organizationUrl>http://www.datasyslab.org/</organizationUrl>
</developer>
<developer>
<name>Mohamed Sarwat</name>
<email>[email protected]</email>
<organization>Arizona State University Data Systems Lab</organization>
<organizationUrl>http://www.datasyslab.org/</organizationUrl>
</developer>
</developers>
<licenses>
<license>
<name>MIT license</name>
<url>https://opensource.org/licenses/MIT</url>
</license>
</licenses>
<scm>
<connection>scm:git:[email protected]:DataSystemsLab/GeoSpark.git</connection>
<developerConnection>scm:git:[email protected]:DataSystemsLab/GeoSpark.git</developerConnection>
<url>[email protected]:DataSystemsLab/GeoSpark.git</url>
</scm>
<build>
<resources>
<resource>
@@ -14,32 +40,24 @@
</resources>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<groupId>org.jacoco</groupId>
<artifactId>jacoco-maven-plugin</artifactId>
<version>0.7.5.201505241946</version>
<artifactId>maven-source-plugin</artifactId>
<executions>
<execution>
<id>attach-sources</id>
<goals>
<goal>prepare-agent</goal>
</goals>
</execution>
<execution>
<id>report</id>
<phase>test</phase>
<goals>
<goal>report</goal>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<artifactId>maven-shade-plugin</artifactId>
<version>2.1</version>
@@ -90,6 +108,202 @@
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.1</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<artifactId>avro-mapred</artifactId>
<groupId>org.apache.avro</groupId>
</exclusion>
<exclusion>
<artifactId>chill_2.11</artifactId>
<groupId>com.twitter</groupId>
</exclusion>
<exclusion>
<artifactId>chill-java</artifactId>
<groupId>com.twitter</groupId>
</exclusion>
<exclusion>
<artifactId>xbean-asm5-shaded</artifactId>
<groupId>org.apache.xbean</groupId>
</exclusion>
<exclusion>
<artifactId>hadoop-client</artifactId>
<groupId>org.apache.hadoop</groupId>
</exclusion>
<exclusion>
<artifactId>spark-launcher_2.11</artifactId>
<groupId>org.apache.spark</groupId>
</exclusion>
<exclusion>
<artifactId>spark-network-common_2.11</artifactId>
<groupId>org.apache.spark</groupId>
</exclusion>
<exclusion>
<artifactId>spark-network-shuffle_2.11</artifactId>
<groupId>org.apache.spark</groupId>
</exclusion>
<exclusion>
<artifactId>spark-unsafe_2.11</artifactId>
<groupId>org.apache.spark</groupId>
</exclusion>
<exclusion>
<artifactId>jets3t</artifactId>
<groupId>net.java.dev.jets3t</groupId>
</exclusion>
<exclusion>
<artifactId>curator-recipes</artifactId>
<groupId>org.apache.curator</groupId>
</exclusion>
<exclusion>
<artifactId>javax.servlet-api</artifactId>
<groupId>javax.servlet</groupId>
</exclusion>
<exclusion>
<artifactId>commons-lang3</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>commons-math3</artifactId>
<groupId>org.apache.commons</groupId>
</exclusion>
<exclusion>
<artifactId>jsr305</artifactId>
<groupId>com.google.code.findbugs</groupId>
</exclusion>
<exclusion>
<artifactId>slf4j-api</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
<exclusion>
<artifactId>jul-to-slf4j</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
<exclusion>
<artifactId>jcl-over-slf4j</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
<exclusion>
<artifactId>log4j</artifactId>
<groupId>log4j</groupId>
</exclusion>
<exclusion>
<artifactId>slf4j-log4j12</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
<exclusion>
<artifactId>compress-lzf</artifactId>
<groupId>com.ning</groupId>
</exclusion>
<exclusion>
<artifactId>snappy-java</artifactId>
<groupId>org.xerial.snappy</groupId>
</exclusion>
<exclusion>
<artifactId>lz4</artifactId>
<groupId>net.jpountz.lz4</groupId>
</exclusion>
<exclusion>
<artifactId>RoaringBitmap</artifactId>
<groupId>org.roaringbitmap</groupId>
</exclusion>
<exclusion>
<artifactId>commons-net</artifactId>
<groupId>commons-net</groupId>
</exclusion>
<exclusion>
<artifactId>scala-library</artifactId>
<groupId>org.scala-lang</groupId>
</exclusion>
<exclusion>
<artifactId>json4s-jackson_2.11</artifactId>
<groupId>org.json4s</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-client</artifactId>
<groupId>org.glassfish.jersey.core</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-common</artifactId>
<groupId>org.glassfish.jersey.core</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-server</artifactId>
<groupId>org.glassfish.jersey.core</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-container-servlet</artifactId>
<groupId>org.glassfish.jersey.containers</groupId>
</exclusion>
<exclusion>
<artifactId>jersey-container-servlet-core</artifactId>
<groupId>org.glassfish.jersey.containers</groupId>
</exclusion>
<exclusion>
<artifactId>mesos</artifactId>
<groupId>org.apache.mesos</groupId>
</exclusion>
<exclusion>
<artifactId>netty-all</artifactId>
<groupId>io.netty</groupId>
</exclusion>
<exclusion>
<artifactId>netty</artifactId>
<groupId>io.netty</groupId>
</exclusion>
<exclusion>
<artifactId>stream</artifactId>
<groupId>com.clearspring.analytics</groupId>
</exclusion>
<exclusion>
<artifactId>metrics-core</artifactId>
<groupId>io.dropwizard.metrics</groupId>
</exclusion>
<exclusion>
<artifactId>metrics-jvm</artifactId>
<groupId>io.dropwizard.metrics</groupId>
</exclusion>
<exclusion>
<artifactId>metrics-json</artifactId>
<groupId>io.dropwizard.metrics</groupId>
</exclusion>
<exclusion>
<artifactId>metrics-graphite</artifactId>
<groupId>io.dropwizard.metrics</groupId>
</exclusion>
<exclusion>
<artifactId>jackson-module-scala_2.11</artifactId>
<groupId>com.fasterxml.jackson.module</groupId>
</exclusion>
<exclusion>
<artifactId>ivy</artifactId>
<groupId>org.apache.ivy</groupId>
</exclusion>
<exclusion>
<artifactId>oro</artifactId>
<groupId>oro</groupId>
</exclusion>
<exclusion>
<artifactId>pyrolite</artifactId>
<groupId>net.razorvine</groupId>
</exclusion>
<exclusion>
<artifactId>py4j</artifactId>
<groupId>net.sf.py4j</groupId>
</exclusion>
<exclusion>
<artifactId>spark-tags_2.11</artifactId>
<groupId>org.apache.spark</groupId>
</exclusion>
<exclusion>
<artifactId>unused</artifactId>
<groupId>org.spark-project.spark</groupId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>