Merge pull request #48 from jiayuasu/master
Push GeoSpark 0.5.0: Babylon Visualization
jiayuasu authored Jan 19, 2017
2 parents ddc4f70 + c9c3301 commit ec6a2a2
Showing 6 changed files with 108 additions and 23 deletions.
46 changes: 34 additions & 12 deletions README.md
@@ -4,37 +4,48 @@

GeoSpark is listed as **Infrastructure Project** on **Apache Spark Official Third Party Project Page** ([http://spark.apache.org/third-party-projects.html](http://spark.apache.org/third-party-projects.html))

-GeoSpark is a cluster computing system for processing large-scale spatial data. GeoSpark extends Apache Spark with a set of out-of-the-box Spatial Resilient Distributed Datasets (SRDDs) that efficiently load, process, and analyze large-scale spatial data across machines. GeoSpark provides APIs for Apache Spark programmer to easily develop their spatial analysis programs with Spatial Resilient Distributed Datasets (SRDDs) which have in house support for geometrical and distance operations.
+GeoSpark is a cluster computing system for processing large-scale spatial data. GeoSpark extends Apache Spark with a set of out-of-the-box Spatial Resilient Distributed Datasets (SRDDs) that efficiently load, process, and analyze large-scale spatial data across machines. GeoSpark provides APIs for Apache Spark programmer to easily develop their spatial analysis programs with Spatial Resilient Distributed Datasets (SRDDs) which have in house support for geometrical and Spatial Queries (Range, K Nearest Neighbors, Join).



GeoSpark artifacts are hosted in Maven Central. You can add a Maven dependency with the following coordinates:

The following version supports Apache Spark 2.X versions:

```
groupId: org.datasyslab
artifactId: geospark
-version: 0.4.0
+version: 0.5.0
```

The following version supports Apache Spark 1.X versions:

```
groupId: org.datasyslab
artifactId: geospark
-version: 0.4.0-spark-1.x
+version: 0.5.0-spark-1.x
```



## Version information ([Full List](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Full-Version-Release-notes))


| Version | Summary |
|:----------------: |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| 0.4.0| **Major updates:** ([Example](https://github.com/DataSystemsLab/GeoSpark/blob/master/src/main/java/org/datasyslab/geospark/showcase/Example.java)) 1. Refactor constructor API usage. 2. Simplify Spatial Join Query API. 3. Add native support for LineStringRDD; **Functionality enhancement:** 1. Release the persist function back to users. 2. Add more exception explanations.
-| 0.3.2 | Functionality enhancement: 1. [JTSplus Spatial Objects](https://github.com/jiayuasu/JTSplus) now carry the original input data. Each object stores "UserData" and provides getter and setter. 2. Add a new SpatialRDD constructor to transform a regular data RDD to a spatial partitioned SpatialRDD. |
-| 0.3.1 | Bug fix: Support Apache Spark 2.X version, fix a bug which results in inaccurate results when doing join query, add more unit test cases |
-| 0.3 | Major updates: Significantly shorten query time on spatial join for skewed data; Support load balanced spatial partitioning methods (also serve as the global index); Optimize code for iterative spatial data mining ||
+| 0.5.0| **Major updates:** We are pleased to announce the initial version of [Babylon](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon), a large-scale in-memory geospatial visualization system extending GeoSpark. Babylon and GeoSpark are integrated together. You can just import GeoSpark and enjoy! More details are available here: [Babylon GeoSpatial Visualization](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon);
+| 0.4.0| **Major updates:** ([Example](https://github.com/DataSystemsLab/GeoSpark/blob/master/src/main/java/org/datasyslab/geospark/showcase/Example.java)) 1. Refactor constructor API usage. 2. Simplify Spatial Join Query API. 3. Add native support for LineStringRDD; **Functionality enhancement:** 1. Release the persist function back to users. 2. Add more exception explanations.|

## News
### Babylon Visualization Framework on GeoSpark is now available!
Babylon is a large-scale in-memory geospatial visualization system.

Babylon provides native support for general cartographic design by extending GeoSpark to process large-scale spatial data. It can visualize Spatial RDDs and Spatial Queries and render super high resolution images in parallel.

Babylon and GeoSpark are integrated together. You just need to import GeoSpark and enjoy them! More details are available here: [Babylon GeoSpatial Visualization](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon)

### Babylon Gallery
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/usrail.png" width="250">
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/nycheatmap.png" width="250">
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/ustweet.png" width="250">

## How to get started (For Scala and Java developers)

@@ -81,7 +92,7 @@ Note: GeoSpark Master branch supports Apache Spark 2.X releases and GeoSpark for

## Scala and Java API usage

-Please refer [GeoSpark Scala and Java API Usage](http://www.public.asu.edu/~jiayu2/geospark/javadoc/index.html)
+Please refer to [GeoSpark Scala and Java API Usage](http://www.public.asu.edu/~jiayu2/geospark/javadoc/)


## Spatial Resilient Distributed Datasets (SRDDs)
@@ -91,7 +102,7 @@ GeoSpark extends RDDs to form Spatial RDDs (SRDDs) and efficiently partitions SR
**Supported Spatial RDDs: PointRDD, RectangleRDD, PolygonRDD, LineStringRDD**

## Supported data format
-GeoSpark supports Comma-Separated Values (**CSV**), Tab-separated values (**TSV**), Well-Known Text (**WKT**), and **GeoJSON** as the input formats. Users only need to specify input format as Splitter and the start and end offset (if necessary) of spatial fields in one row when call Constructors.
+GeoSpark supports Comma-Separated Values (**CSV**), Tab-separated values (**TSV**), Well-Known Text (**WKT**), and **GeoJSON** as the input formats. Users only need to specify input format as Splitter and the start and end offset (if necessary) of spatial fields in one row when call Constructors. GeoSpark also takes **any user-supplied format mapper function** to support the desired format.

## Important features

@@ -109,8 +120,19 @@ GeoSpark currently provides native support for Inside, Overlap, DatasetBoundary,

### Spatial Operation

-GeoSpark so far provides spatial range query, join query and KNN query in SRDDs.
+GeoSpark so far provides **Spatial Range Query**, **Spatial Join Query**, and **Spatial K Nearest Neighbors Query**.

# Babylon Visualization Framework on GeoSpark
Babylon is a large-scale in-memory geospatial visualization system.

Babylon provides native support for general cartographic design by extending GeoSpark to process large-scale spatial data. It can visualize Spatial RDDs and Spatial Queries and render super high resolution images in parallel.

Babylon and GeoSpark are integrated together. You just need to import GeoSpark and enjoy them! More details are available here: [Babylon GeoSpatial Visualization](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon)

## Babylon Gallery
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/usrail.png" width="250">
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/nycheatmap.png" width="250">
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/ustweet.png" width="250">

## Publication

63 changes: 63 additions & 0 deletions src/main/java/org/datasyslab/babylon/README.md
@@ -0,0 +1,63 @@
# BABYLON
**Babylon is a large-scale in-memory geospatial visualization system**

**Babylon** provides native support for general cartographic design by extending **GeoSpark** to process large-scale spatial data.
## Babylon Gallery
### Scatter Plot: USA mainland rail network
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/usrail.png" width="500">
### Heat Map: New York City Taxi Trips (with a given map background)
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/nycheatmap.png" width="500">
### Choropleth Map + Overlay Operator: USA mainland tweets per USA county (Spatial Join Query)
<img src="http://www.public.asu.edu/~jiayu2/geospark/picture/ustweet.png" width="500">

## Main Features

### Extensible Visualization operator

* Support super high resolution image generation: parallel map image rendering
* Visualize Spatial RDD and Spatial Queries (Spatial Range, Spatial K Nearest Neighbors, Spatial Join)
* Customizable: Can be customized to any user-supplied colors or coloring rule
* Extensible: Can be extended to any visualization effect

### Overlay Operator
Overlay one map layer with many other map layers!

### Various Image filter
* Gaussian Blur
* Box Blur
* Emboss
* Outline
* Sharpen
* More!

You can also build your own image filter by easily extending the photo filter!

### Various Image Type
* PNG
* JPEG
* GIF

You can also support your desired image type by easily extending the photo filter!


### Current Visualization effect

* Scatter Plot
* Heat Map
* Choropleth Map
* More!

You can also build your own self-designed effects by easily extending the visualization operator!

### Example
Here is [a runnable single-machine example](https://github.com/jiayuasu/GeoSpark/blob/master/src/main/java/org/datasyslab/babylon/showcase/Example.java). You can clone this repository and run it directly on your local machine!

### Scala and Java API
Please refer to [Babylon Scala and Java API](http://www.public.asu.edu/~jiayu2/geospark/javadoc/latest/).
### Supported Spatial Objects and Input format

All spatial objects and input formats supported by GeoSpark

## Contributor
* [Jia Yu](http://www.public.asu.edu/~jiayu2/) ([email protected])
* [Mohamed Sarwat](http://faculty.engineering.asu.edu/sarwat/) ([email protected])
22 changes: 11 additions & 11 deletions src/main/java/org/datasyslab/babylon/core/OverlayOperator.java
@@ -24,14 +24,14 @@ public class OverlayOperator {
     public BufferedImage backImage;

     /** The distributed back image. */
-    public JavaPairRDD<Integer,BufferedImage> distributedBackImage;
+    public JavaPairRDD<Integer,ImageSerializableWrapper> distributedBackImage;

     /**
      * Instantiates a new overlay operator.
      *
      * @param distributedBackImage the distributed back image
      */
-    public OverlayOperator(JavaPairRDD<Integer,BufferedImage> distributedBackImage)
+    public OverlayOperator(JavaPairRDD<Integer,ImageSerializableWrapper> distributedBackImage)
     {
         this.distributedBackImage = distributedBackImage;
     }
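The hunk above replaces `BufferedImage` with `ImageSerializableWrapper` as the value type of the pair RDD. `java.awt.image.BufferedImage` does not implement `java.io.Serializable`, which is presumably why a wrapper is needed before image partitions can be shuffled between executors. The wrapper's source is not part of the hunks shown here, so the following is only a minimal sketch of how such a wrapper could work. The class name `SerializableImageSketch`, the PNG encoding, and the `roundTrip` helper are assumptions made for illustration; only the public `image` field mirrors the diff, which reads the wrapped image via `.image`.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import javax.imageio.ImageIO;

/** Hypothetical stand-in for ImageSerializableWrapper; not the project's actual class. */
class SerializableImageSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Transient because BufferedImage itself is not Serializable. */
    public transient BufferedImage image;

    SerializableImageSketch(BufferedImage image) {
        this.image = image;
    }

    /** Custom hook: encode the image as PNG bytes (assumed format) and write them length-prefixed. */
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "png", buffer)) {
            throw new IOException("no PNG writer available");
        }
        byte[] bytes = buffer.toByteArray();
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    /** Custom hook: read the length-prefixed PNG bytes and decode them back into an image. */
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        image = ImageIO.read(new ByteArrayInputStream(bytes));
    }

    /** Illustrative helper: serialize and deserialize in memory, as a shuffle would do over the network. */
    static SerializableImageSketch roundTrip(SerializableImageSketch wrapper) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(wrapper);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableImageSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        BufferedImage tile = new BufferedImage(8, 8, BufferedImage.TYPE_INT_ARGB);
        tile.setRGB(2, 3, 0xFF00FF00);
        SerializableImageSketch copy = roundTrip(new SerializableImageSketch(tile));
        System.out.println(copy.image.getWidth() + "x" + copy.image.getHeight());
    }
}
```

PNG is used in this sketch because it is lossless and keeps the alpha channel; a lossy format would change pixel values on every shuffle. The sketch does not handle a `null` image.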
@@ -52,17 +52,17 @@ public OverlayOperator(BufferedImage backImage)
      * @param distributedFontImage the distributed font image
      * @return true, if successful
      */
-    public boolean JoinImage(JavaPairRDD<Integer,BufferedImage> distributedFontImage)
+    public boolean JoinImage(JavaPairRDD<Integer,ImageSerializableWrapper> distributedFontImage)
     {
-        this.distributedBackImage = this.distributedBackImage.cogroup(distributedFontImage).mapToPair(new PairFunction<Tuple2<Integer,Tuple2<Iterable<BufferedImage>,Iterable<BufferedImage>>>,Integer,BufferedImage>()
+        this.distributedBackImage = this.distributedBackImage.cogroup(distributedFontImage).mapToPair(new PairFunction<Tuple2<Integer,Tuple2<Iterable<ImageSerializableWrapper>,Iterable<ImageSerializableWrapper>>>,Integer,ImageSerializableWrapper>()
         {
             @Override
-            public Tuple2<Integer, BufferedImage> call(
-                    Tuple2<Integer, Tuple2<Iterable<BufferedImage>, Iterable<BufferedImage>>> imagePair)
+            public Tuple2<Integer, ImageSerializableWrapper> call(
+                    Tuple2<Integer, Tuple2<Iterable<ImageSerializableWrapper>, Iterable<ImageSerializableWrapper>>> imagePair)
                     throws Exception {
                 int imagePartitionId = imagePair._1;
-                Iterator<BufferedImage> backImageIterator = imagePair._2._1.iterator();
-                Iterator<BufferedImage> frontImageIterator = imagePair._2._2.iterator();
+                Iterator<ImageSerializableWrapper> backImageIterator = imagePair._2._1.iterator();
+                Iterator<ImageSerializableWrapper> frontImageIterator = imagePair._2._2.iterator();
                 if(backImageIterator.hasNext()==false)
                 {
                     throw new Exception("[OverlayOperator][JoinImage] The back image iterator didn't get any image partitions.");
@@ -71,8 +71,8 @@ public Tuple2<Integer, BufferedImage> call(
                 {
                     throw new Exception("[OverlayOperator][JoinImage] The front image iterator didn't get any image partitions.");
                 }
-                BufferedImage backImage = backImageIterator.next();
-                BufferedImage frontImage = frontImageIterator.next();
+                BufferedImage backImage = backImageIterator.next().image;
+                BufferedImage frontImage = frontImageIterator.next().image;
                 if(backImage.getWidth()!=frontImage.getWidth()||backImage.getHeight()!=frontImage.getHeight())
                 {
                     throw new Exception("[OverlayOperator][JoinImage] The two given image don't have the same width or the same height.");
@@ -83,7 +83,7 @@ public Tuple2<Integer, BufferedImage> call(
                 Graphics graphics = combinedImage.getGraphics();
                 graphics.drawImage(backImage, 0, 0, null);
                 graphics.drawImage(frontImage, 0, 0, null);
-                return new Tuple2(imagePartitionId,combinedImage);
+                return new Tuple2<Integer, ImageSerializableWrapper>(imagePartitionId,new ImageSerializableWrapper(combinedImage));
             }
         });
         return true;
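The per-partition overlay step in `JoinImage` (check that both images have the same size, then draw the back image followed by the front image onto a combined canvas) can be tried without Spark. The sketch below isolates that step under a few assumptions: the class name `OverlaySketch` and the method `overlay` are hypothetical, the combined canvas is created as `TYPE_INT_ARGB` because the line that allocates `combinedImage` falls outside the hunks shown, and an `IllegalArgumentException` stands in for the generic `Exception` thrown in the diff.

```java
import java.awt.Graphics;
import java.awt.image.BufferedImage;

/** Hypothetical, Spark-free illustration of the overlay step; not part of the commit. */
class OverlaySketch {

    /** Draws back first, then front on top, so transparent front pixels let the back show through. */
    static BufferedImage overlay(BufferedImage back, BufferedImage front) {
        if (back.getWidth() != front.getWidth() || back.getHeight() != front.getHeight()) {
            throw new IllegalArgumentException("The two images must have the same width and height.");
        }
        // Assumed canvas type: ARGB keeps transparency in the combined result.
        BufferedImage combined =
                new BufferedImage(back.getWidth(), back.getHeight(), BufferedImage.TYPE_INT_ARGB);
        Graphics graphics = combined.getGraphics();
        try {
            graphics.drawImage(back, 0, 0, null);
            graphics.drawImage(front, 0, 0, null);
        } finally {
            // Release the graphics context once drawing is finished.
            graphics.dispose();
        }
        return combined;
    }

    public static void main(String[] args) {
        BufferedImage back = new BufferedImage(2, 2, BufferedImage.TYPE_INT_ARGB);
        BufferedImage front = new BufferedImage(2, 2, BufferedImage.TYPE_INT_ARGB);
        for (int x = 0; x < 2; x++) {
            for (int y = 0; y < 2; y++) {
                back.setRGB(x, y, 0xFF0000FF); // opaque blue background
            }
        }
        front.setRGB(0, 0, 0xFFFF0000); // one opaque red pixel, the rest stays transparent
        BufferedImage combined = overlay(back, front);
        System.out.println(Integer.toHexString(combined.getRGB(0, 0)));
        System.out.println(Integer.toHexString(combined.getRGB(1, 1)));
    }
}
```

The drawing order matters: whichever image is drawn last ends up on top, which matches the back-then-front order in the hunk above.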
