Docs/notebooks #102

Merged · 7 commits · Dec 16, 2024

2 changes: 1 addition & 1 deletion Makefile
@@ -2,7 +2,7 @@
all: build serve

build:
jupyter-book build . --config docs/_config.yml --toc docs/_toc.yml
sphinx-build docs _build/html -b html

serve:
open _build/html/index.html
5 changes: 3 additions & 2 deletions docs/_toc.yml
@@ -25,6 +25,7 @@ parts:
- caption: Notebook Examples
numbered: False
chapters:
- file: user-docs/space2stats_py_package_demo.ipynb
- file: user-docs/space2stats_api_demo.ipynb
- file: user-docs/space2stats_api_demo_R.Rmd
- file: user-docs/space2stats_api_demo_R.md
- file: user-docs/space2stats_floods.ipynb
- file: user-docs/space2stats_py_package_demo.ipynb
10 changes: 8 additions & 2 deletions docs/readme.md
@@ -245,7 +245,7 @@ The expected response is a JSON containing the hexagon ID and the requested fields

### `/aggregate`

The summary endpoint is very similar to the summary endpoint, but it returns an aggregate statistic for the entire area, based on an additional `aggregation type` function ('sum', 'avg', 'count', 'max' or 'min'). The request body is the same as the summary endpoint, with the addition of the `aggregation_type` field.
The aggregate endpoint is very similar to the summary endpoint, but it returns an aggregate statistic for the entire area, based on an additional `aggregation_type` function ('sum', 'avg', 'count', 'max', or 'min'). The request body is the same as the summary endpoint's, with the addition of the `aggregation_type` field.

This example uses an admin-1 province boundary from GeoBoundaries, retrieved as a `geopandas` GeoDataFrame (Python) or a `simple feature` object (R).
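For orientation, here is a minimal sketch of such a request in Python. It is not the readme's own GeoBoundaries example: the rectangular AOI is borrowed from the R demo elsewhere in this PR, and the payload simply adds `aggregation_type` to the summary request body.

```python
# Minimal sketch of an /aggregate request (assumes the same base URL and
# request schema as the summary examples; the AOI below is illustrative).
import requests

BASE_URL = "https://space2stats.ds.io"

aoi = {
    "type": "Feature",
    "properties": None,  # null properties, as in the R demo
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [29.038924, -2.310523],
            [29.038924, -4.468958],
            [30.850461, -4.468958],
            [30.850461, -2.310523],
            [29.038924, -2.310523],
        ]],
    },
}

payload = {
    "aoi": aoi,
    "spatial_join_method": "centroid",
    "fields": ["sum_pop_2020"],
    "aggregation_type": "sum",  # one of 'sum', 'avg', 'count', 'max', 'min'
}

resp = requests.post(f"{BASE_URL}/aggregate", json=payload)
resp.raise_for_status()
print(resp.json())  # e.g. {'sum_pop_2020': ...}
```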

@@ -352,9 +352,15 @@ The expected response is a JSON containing the requested aggregate statistic for
{'sum_pop_2020': 1374175.833772784}
```

## Notebook Examples

- [**API Demo (Python)**](user-docs/space2stats_api_demo.ipynb)
- [**API Demo (R)**](user-docs/space2stats_api_demo_R.md)
- [**Exploring Flood Exposure (Python Functions)**](user-docs/space2stats_floods.ipynb)

## StatsTable Python Package

In addition to the API, the `StatsTable` python package is being developed to enable faster database queries and scale research applications. The package currently supports a similar set of functions as the API (_fields_, _summaries_, and _aggregate_).
In addition to the API, the `StatsTable` Python package provides the API's underlying functionality as a set of functions (_fields_, _summaries_, and _aggregate_). The package lets researchers work with the Space2Stats database directly, run faster queries, and scale research applications.
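The package interface is not documented in this section, so the following is a purely hypothetical usage sketch: the import path, the `StatsTable` class, its `connect`, `fields`, `summaries`, and `aggregate` methods, and every parameter are assumptions chosen to mirror the API, not the package's actual signatures.

```python
# Hypothetical sketch only: import path, class, method names, and parameters
# are assumptions that mirror the API, not the package's documented interface.
from space2stats import StatsTable  # assumed import path

aoi = {  # illustrative AOI reused from the API examples
    "type": "Feature",
    "properties": None,
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[29.038924, -2.310523], [29.038924, -4.468958],
                         [30.850461, -4.468958], [30.850461, -2.310523],
                         [29.038924, -2.310523]]],
    },
}

# Assumed: database credentials are supplied as described in the note below.
with StatsTable.connect() as stats:
    print(stats.fields())  # list available variables
    summaries = stats.summaries(aoi, spatial_join_method="centroid",
                                fields=["sum_pop_2020"])
    total = stats.aggregate(aoi, spatial_join_method="centroid",
                            fields=["sum_pop_2020"], aggregation_type="sum")
```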

```{note}
This package is still under development. Currently, users need to set credential parameters to connect to the database. Reach out to [email protected] to request credentials.
```
285 changes: 98 additions & 187 deletions docs/user-docs/space2stats_api_demo.ipynb

Large diffs are not rendered by default.

32 changes: 21 additions & 11 deletions docs/user-docs/space2stats_api_demo_R.Rmd
@@ -1,7 +1,10 @@
---
title: "Space2Stats API Demo in R"
output: html_notebook
output:
html_document:
df_print: paged
---
# R Example

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
@@ -47,6 +50,11 @@ print(unlist(available_fields))
## Define Area of Interest (AOI)

```{r}
minx <- 29.038924
miny <- -4.468958
maxx <- 30.850461
maxy <- -2.310523

# Define Area of Interest (AOI) with NULL for properties to ensure it's treated as a valid dictionary
aoi <- list(
type = "Feature",
@@ -55,11 +63,11 @@ aoi <- list(
type = "Polygon",
coordinates = list(
list(
c(33.78593974945852, 5.115816884114494),
c(33.78593974945852, -4.725410543134203),
c(41.94362577283266, -4.725410543134203),
c(41.94362577283266, 5.115816884114494),
c(33.78593974945852, 5.115816884114494)
c(minx, maxy),
c(minx, miny),
c(maxx, miny),
c(maxx, maxy),
c(minx, maxy)
)
)
)
@@ -88,11 +96,14 @@ resp <- req |> req_perform()
summary_data <- resp |> resp_body_string() |> fromJSON(flatten = TRUE)

# Extract coordinates and convert to a spatial data frame (sf object)
summary_data$x <- sapply(summary_data$geometry.coordinates, function(x) unlist(x)[1])
summary_data$y <- sapply(summary_data$geometry.coordinates, function(x) unlist(x)[2])
summary_data <- summary_data %>%
mutate(
x = sapply(geometry, function(g) fromJSON(g)$coordinates[1]),
y = sapply(geometry, function(g) fromJSON(g)$coordinates[2])
)

# Convert to sf, drop extra geometry fields
gdf <- st_as_sf(summary_data, coords = c("x", "y"), crs = 4326)[, c(1, 2, 5)]
gdf <- st_as_sf(summary_data, coords = c("x", "y"), crs = 4326)
```

## Visualization
@@ -121,6 +132,5 @@ leaflet(gdf) %>%
addLegend(
pal = custom_pal, values = gdf$sum_pop_2020, title = "Population 2020 (Custom Binned Scale)",
opacity = 1
) %>%
setView(lng = 37.5, lat = 0, zoom = 6) # Center the map based on AOI
)
```
136 changes: 136 additions & 0 deletions docs/user-docs/space2stats_api_demo_R.md
@@ -0,0 +1,136 @@
---
title: "Space2Stats API Demo in R"
output:
  html_document:
    df_print: paged
---
# R Example

```R name="setup" tags=["remove_cell"]
knitr::opts_chunk$set(echo = TRUE)
library(httr2)
library(jsonlite)
library(sf)
library(dplyr)
library(leaflet)
library(viridis)
```

## Set Up API Endpoints

```R
base_url <- "https://space2stats.ds.io"
fields_endpoint <- paste0(base_url, "/fields")
summary_endpoint <- paste0(base_url, "/summary")
```

## Fetch Available Fields

```R
# Set up the request to fetch available fields
req <- request(base_url) |>
  req_url_path_append("fields") # Append the correct endpoint

# Perform the request and get the response
resp <- req |> req_perform()

# Check the status code
if (resp_status(resp) != 200) {
  stop("Failed to get fields: ", resp_body_string(resp))
}

# Parse the response body as JSON
available_fields <- resp |> resp_body_json()

# Print the available fields in a simplified format
print("Available Fields:")
print(unlist(available_fields))
```

## Define Area of Interest (AOI)

```R
minx <- 29.038924
miny <- -4.468958
maxx <- 30.850461
maxy <- -2.310523

# Define the Area of Interest (AOI); properties is left NULL so the feature serializes as valid GeoJSON
aoi <- list(
  type = "Feature",
  properties = NULL, # Empty properties
  geometry = list(
    type = "Polygon",
    coordinates = list(
      list(
        c(minx, maxy),
        c(minx, miny),
        c(maxx, miny),
        c(maxx, maxy),
        c(minx, maxy)
      )
    )
  )
)
```

## Request Summary Data

```R
request_payload <- list(
  aoi = aoi,
  spatial_join_method = "centroid",
  fields = list("sum_pop_2020"),
  geometry = "point"
)

# Set up the base URL and create the request
req <- request(base_url) |>
  req_url_path_append("summary") |>
  req_body_json(request_payload)

# Perform the request and get the response
resp <- req |> req_perform()

# Turn response into a data frame
summary_data <- resp |> resp_body_string() |> fromJSON(flatten = TRUE)

# Extract coordinates and convert to a spatial data frame (sf object)
summary_data <- summary_data %>%
  mutate(
    x = sapply(geometry, function(g) fromJSON(g)$coordinates[1]),
    y = sapply(geometry, function(g) fromJSON(g)$coordinates[2])
  )

# Convert to an sf object using the extracted coordinates
gdf <- st_as_sf(summary_data, coords = c("x", "y"), crs = 4326)
```

## Visualization

```R

# Replace NA values in sum_pop_2020 with 0
gdf$sum_pop_2020[is.na(gdf$sum_pop_2020)] <- 0

# Create a custom binned color palette with non-uniform breaks:
# zero gets its own bin, graduated bins run up to 200,000, and a final bin covers larger values
breaks <- c(0, 1, 1000, 10000, 50000, 100000, 200000, max(gdf$sum_pop_2020))

custom_pal <- colorBin(palette = c("lightgray", "yellow", "orange", "red", "purple", "blue"),
                       domain = gdf$sum_pop_2020, bins = breaks)

# Create the leaflet map with custom binned coloring
leaflet(gdf) %>%
  addTiles() %>% # Add default OpenStreetMap map tiles
  addCircleMarkers(
    radius = 3, # Adjust size as needed
    color = ~custom_pal(sum_pop_2020),
    stroke = FALSE, fillOpacity = 0.7,
    popup = ~paste("Hex ID:", hex_id, "<br>", "Population 2020:", sum_pop_2020) # Add a popup with details
  ) %>%
  addLegend(
    pal = custom_pal, values = gdf$sum_pop_2020, title = "Population 2020 (Custom Binned Scale)",
    opacity = 1
  )
```