[Feat] add selection and brushing for Geojson layer #3

Open
wants to merge 3 commits into
base: xli-add-support-geoarrow


@lixun910 lixun910 commented Sep 21, 2023

This PR adds selection and brushing for the Geojson layer:

  1. Add geometric centroids / mean centers to the Geojson layer
  • mean centers for Polygon and MultiPolygon
  • mean centers for LineString and MultiLineString (interpolate the point at 50% of the length)
  2. Create a spatial index using the geometric centroids
  • Flatbush index
  3. Enable the polygon filter on the Geojson layer; use the spatial index to prefilter the datacontainer
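
Steps 2 and 3 above can be sketched as a two-stage filter: a bounding-box query against an index of centers, followed by an exact point-in-polygon test on the candidates only. This is a minimal sketch and not the PR's code. The names `CenterIndex`, `bruteForceIndex`, `pointInRing`, and `filterByPolygon` are hypothetical, and the brute-force index only stands in for Flatbush so the example stays self-contained (Flatbush's `search(minX, minY, maxX, maxY)` returns candidate item indices in the same way). Holes in the filter polygon are ignored here.

```typescript
type Point = [number, number];

// Hypothetical minimal index interface; Flatbush exposes a search() of this shape.
interface CenterIndex {
  search(minX: number, minY: number, maxX: number, maxY: number): number[];
}

// Stand-in index (linear scan) used only to keep this sketch self-contained.
function bruteForceIndex(centers: Point[]): CenterIndex {
  return {
    search: (minX, minY, maxX, maxY) => {
      const hits: number[] = [];
      centers.forEach(([x, y], i) => {
        if (x >= minX && x <= maxX && y >= minY && y <= maxY) hits.push(i);
      });
      return hits;
    },
  };
}

// Ray-casting point-in-polygon test for a single ring (assumption: no holes).
function pointInRing(p: Point, ring: Point[]): boolean {
  const [px, py] = p;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      yi > py !== yj > py && px < ((xj - xi) * (py - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Stage 1: prefilter by the filter polygon's bounding box via the index.
// Stage 2: keep only candidates whose center is inside the polygon.
function filterByPolygon(centers: Point[], index: CenterIndex, ring: Point[]): number[] {
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const [x, y] of ring) {
    minX = Math.min(minX, x);
    minY = Math.min(minY, y);
    maxX = Math.max(maxX, x);
    maxY = Math.max(maxY, y);
  }
  return index
    .search(minX, minY, maxX, maxY)
    .filter((i) => pointInRing(centers[i], ring));
}

// Usage: four feature centers, triangular filter polygon.
const centers: Point[] = [[1, 1], [6, 6], [9, 0.5], [20, 20]];
const triangle: Point[] = [[0, 0], [10, 0], [0, 10]];
const selected = filterByPolygon(centers, bruteForceIndex(centers), triangle);
```

The prefilter discards `[20, 20]` without running the exact test; `[6, 6]` passes the bounding box but fails the point-in-polygon check.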

http://www.faqs.org/faqs/graphics/algorithms-faq/

Performance tests:

selection

  • data: ~1 million polygons Utah house footprint

  • centroids/mean centers

  • Spatial index

Flatbush: 163ms

Summary

C++ code is typically 2x to 6x faster than JS code with the same implementation. Boost::Geometry (C++) is the fastest implementation tested. The centroid function in GEOS, however, is not optimized and is slow (see the explanation in the notes below: Mean center and centroid).

  • 1 million polygons Utah house footprint

    • centroids/mean centers
      • c/c++:
        • GEOS (Mass Centroids): 1112 ms
        • Boost (Mass Centroids): 21 ms
        • Mean Centers: 61 ms
        • Mass Centroids: 398 ms
      • js:
        • polylabel: 8036 ms
        • Mean Centers: 233 ms
        • Turf Mean Centers: 446 ms
        • Turf Mass Centroids: 778 ms
  • 3085 US counties

    • c/c++
      • GEOS: 56 ms
      • BOOST: 0.20 ms
      • Mean Centers: 0.56 ms
      • Centroids: 25 ms
    • js
      • Mean Centers: 2.25 ms
      • Turf Mean Centers: 5.2 ms
      • Turf Mass Centroids: 56 ms

Note: Why is the GEOS centroid() function slow?

The GEOS centroid() function performs an additional calculation for the fallback case when the polygon area is 0. This additional calculation is quite expensive, as it computes the total length of all the edges and returns the point at length/2 as the centroid.
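
The "point at length/2" idea described above is the same operation this PR uses for line strings (interpolate the point at 50% of the length). A minimal sketch of it follows. It is illustrative only and is neither GEOS code nor the PR's code; the name `pointAtHalfLength` is hypothetical. It needs one pass to sum segment lengths and a second pass to locate the halfway segment.

```typescript
type Point = [number, number];

// Returns the point located at half of the total length of the path `coords`.
function pointAtHalfLength(coords: Point[]): Point {
  if (coords.length === 0) throw new Error("empty coordinate list");

  // Pass 1: per-segment lengths and total length.
  const segLengths: number[] = [];
  let total = 0;
  for (let i = 1; i < coords.length; i++) {
    const len = Math.hypot(
      coords[i][0] - coords[i - 1][0],
      coords[i][1] - coords[i - 1][1]
    );
    segLengths.push(len);
    total += len;
  }

  // Pass 2: walk the segments until half of the total length is consumed.
  let remaining = total / 2;
  for (let i = 0; i < segLengths.length; i++) {
    const len = segLengths[i];
    if (len > 0 && remaining <= len) {
      const t = remaining / len; // fraction along segment i
      const [x0, y0] = coords[i];
      const [x1, y1] = coords[i + 1];
      return [x0 + t * (x1 - x0), y0 + t * (y1 - y0)];
    }
    remaining -= len;
  }

  // Reached only when the total length is 0 (single or coincident vertices).
  return coords[0];
}

// Usage: an L-shaped line of total length 6; the halfway point is on the second segment.
const mid = pointAtHalfLength([[0, 0], [2, 0], [2, 4]]);
```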

This PR adds a polygon filter based on mean centers for GeoJsonLayer. Mean centers are cheap to compute (much faster than mass centers or geometric centers), although a mean center is more affected by points that lie far from the center of the shape (see notes below).

[image: mean-centers]

Notes:

Mean center vs centroid (mass center):

Mean center and centroid are two different ways to represent the center of a geometric shape.
The mean center is the average of all the vertices of the shape. To compute it, add up the x-coordinates of all the vertices and divide by the number of vertices, then do the same for the y-coordinates.
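
A minimal sketch of that computation for a single polygon ring follows. The name `meanCenter` is hypothetical, and dropping the duplicated closing vertex of a GeoJSON ring is an assumption made here, not something the PR states.

```typescript
type Point = [number, number];

// Mean center: the plain average of the ring's vertices.
function meanCenter(ring: Point[]): Point {
  if (ring.length === 0) throw new Error("empty ring");

  // Assumption: GeoJSON rings repeat the first vertex at the end; skip the
  // duplicate so it is not counted twice.
  const first = ring[0];
  const last = ring[ring.length - 1];
  const closed =
    ring.length > 1 && first[0] === last[0] && first[1] === last[1];
  const n = closed ? ring.length - 1 : ring.length;

  let sumX = 0;
  let sumY = 0;
  for (let i = 0; i < n; i++) {
    sumX += ring[i][0];
    sumY += ring[i][1];
  }
  return [sumX / n, sumY / n];
}

// Usage: a 4 x 2 rectangle given as a closed ring.
const rectCenter = meanCenter([[0, 0], [4, 0], [4, 2], [0, 2], [0, 0]]);
```

The cost is a single pass over the vertices with two additions each, which is consistent with the mean-center timings reported above being much lower than the centroid timings.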

The centroid (a.k.a. the center of mass, or center of gravity) of a polygon can be computed as the weighted sum of the centroids of a partition of the polygon into triangles. The centroid of a triangle is simply the average of its three vertices, i.e., it has coordinates (x1 + x2 + x3)/3 and (y1 + y2 + y3)/3. This suggests first triangulating the polygon, then forming a sum of the centroids of each triangle, weighted by the area of each triangle, the whole sum normalized by the total polygon area.
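
The triangle-based description above can be sketched as follows for a single ring without holes. This is an illustrative sketch, not the PR's, Turf's, GEOS's, or Boost's implementation; the name `polygonCentroid` is hypothetical. It uses a triangle fan from the first vertex with signed areas, which is one simple way to realize the partition and also handles concave rings because negatively oriented triangles subtract.

```typescript
type Point = [number, number];

// Area-weighted centroid of a simple ring via a triangle fan from ring[0].
function polygonCentroid(ring: Point[]): Point {
  if (ring.length < 3) throw new Error("need at least 3 vertices");
  const [x0, y0] = ring[0];
  let areaSum = 0;
  let cx = 0;
  let cy = 0;

  for (let i = 1; i < ring.length - 1; i++) {
    const [x1, y1] = ring[i];
    const [x2, y2] = ring[i + 1];
    // Signed area of triangle (ring[0], ring[i], ring[i + 1]).
    const area = ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2;
    areaSum += area;
    // Triangle centroid = average of its three vertices, weighted by area.
    cx += (area * (x0 + x1 + x2)) / 3;
    cy += (area * (y0 + y1 + y2)) / 3;
  }

  // Zero total area: the centroid is undefined and a fallback would be needed.
  if (areaSum === 0) throw new Error("zero-area polygon");
  return [cx / areaSum, cy / areaSum];
}

// Usage: a 4 x 2 rectangle with one extra vertex on its right edge.
const centroid = polygonCentroid([[0, 0], [4, 0], [4, 1], [4, 2], [0, 2], [0, 0]]);
```

For this ring the centroid stays at the rectangle's center, (2, 1), while the plain average of the five distinct vertices is (2.4, 1): the extra vertex shifts the mean center but not the centroid.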

In general, the mean center and the centroid differ. The mean center is pulled toward vertices that lie far from the center of the shape, while the centroid is weighted by area and is pulled toward the parts of the shape that cover the most area.


@lixun910 lixun910 changed the base branch from master to xli-add-support-geoarrow September 28, 2023 21:05