Commit

Merge pull request #102 from worldbank/docs/notebooks
Docs/notebooks
bpstewar authored Dec 16, 2024
2 parents be2ea27 + 38f2f14 commit 2cfed4b
Showing 11 changed files with 3,968 additions and 577 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -2,7 +2,7 @@
all: build serve

build:
- jupyter-book build . --config docs/_config.yml --toc docs/_toc.yml
+ sphinx-build docs _build/html -b html

serve:
open _build/html/index.html
5 changes: 3 additions & 2 deletions docs/_toc.yml
@@ -25,6 +25,7 @@ parts:
- caption: Notebook Examples
numbered: False
chapters:
- - file: user-docs/space2stats_py_package_demo.ipynb
- file: user-docs/space2stats_api_demo.ipynb
- - file: user-docs/space2stats_api_demo_R.Rmd
+ - file: user-docs/space2stats_api_demo_R.md
+ - file: user-docs/space2stats_floods.ipynb
+ - file: user-docs/space2stats_py_package_demo.ipynb
10 changes: 8 additions & 2 deletions docs/readme.md
@@ -245,7 +245,7 @@ The expected response is a JSON containing the hexagon ID and the requested fiel

### `/aggregate`

- The summary endpoint is very similar to the summary endpoint, but it returns an aggregate statistic for the entire area, based on an additional `aggregation type` function ('sum', 'avg', 'count', 'max' or 'min'). The request body is the same as the summary endpoint, with the addition of the `aggregation_type` field.
+ The aggregate endpoint is very similar to the summary endpoint, but it returns an aggregate statistic for the entire area, based on an additional `aggregation type` function ('sum', 'avg', 'count', 'max' or 'min'). The request body is the same as the summary endpoint, with the addition of the `aggregation_type` field.

This example uses an admin-1 province boundary from GeoBoundaries, retrieved as a `geopandas geodataframe` or `simple feature` (r).
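
As a rough illustration of the request body described above, the sketch below assembles an `/aggregate` payload in Python. It only builds and prints the JSON; it does not call the API. The helper name `build_aggregate_payload` and the placeholder square AOI are hypothetical and stand in for the admin-1 boundary. The key names (`aoi`, `spatial_join_method`, `fields`, `aggregation_type`) and the `"centroid"` join method are taken from this readme and the R demo in this commit; whether the endpoint accepts other keys is not assumed here.

```python
import json

# Aggregation functions listed in the readme text above.
ALLOWED_AGGREGATIONS = ("sum", "avg", "count", "max", "min")


def build_aggregate_payload(aoi, fields, aggregation_type,
                            spatial_join_method="centroid"):
    """Hypothetical helper: summary-style body plus `aggregation_type`."""
    if aggregation_type not in ALLOWED_AGGREGATIONS:
        raise ValueError(
            f"aggregation_type must be one of {ALLOWED_AGGREGATIONS}, "
            f"got {aggregation_type!r}"
        )
    return {
        "aoi": aoi,
        "spatial_join_method": spatial_join_method,
        "fields": list(fields),
        "aggregation_type": aggregation_type,
    }


# Placeholder AOI (a unit square), used here instead of a real boundary.
aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0.0, 1.0], [0.0, 0.0], [1.0, 0.0],
                         [1.0, 1.0], [0.0, 1.0]]],
    },
}

payload = build_aggregate_payload(aoi, ["sum_pop_2020"], "sum")
print(json.dumps(payload, indent=2))
```

With the `requests` library, a body like this would typically be sent as `requests.post(base_url + "/aggregate", json=payload)`, where `base_url` is the API root used in the demos.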

@@ -352,9 +352,15 @@ The expected response is a JSON containing the requested aggregate statistic for
{'sum_pop_2020': 1374175.833772784}
```

+ ## Notebook Examples
+
+ - [**API Demo (Python)**](user-docs/space2stats_api_demo.ipynb)
+ - [**API Demo (R)**](user-docs/space2stats_api_demo_R.md)
+ - [**Exploring Flood Exposure (Python Functions)**](user-docs/space2stats_floods.ipynb)
+
## StatsTable Python Package

- In addition to the API, the `StatsTable` python package is being developed to enable faster database queries and scale research applications. The package currently supports a similar set of functions as the API (_fields_, _summaries_, and _aggregate_).
+ In addition to the API, the `StatsTable` python package provides the API's underlying functionality as a set of functions (_fields_, _summaries_, and _aggregate_). The package enables researchers to work with the Space2Stats database directly and conduct faster queries and scale research applications.

```{note}
This package is still under development. Currently, users need to set credential parameters to connect to the database. Reach out to [email protected] to request credentials.
285 changes: 98 additions & 187 deletions docs/user-docs/space2stats_api_demo.ipynb

Large diffs are not rendered by default.

32 changes: 21 additions & 11 deletions docs/user-docs/space2stats_api_demo_R.Rmd
@@ -1,7 +1,10 @@
---
title: "Space2Stats API Demo in R"
- output: html_notebook
+ output:
+   html_document:
+     df_print: paged
---
+ # R Example

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
@@ -47,6 +50,11 @@ print(unlist(available_fields))
## Define Area of Interest (AOI)

```{r}
+ minx <- 29.038924
+ miny <- -4.468958
+ maxx <- 30.850461
+ maxy <- -2.310523
# Define Area of Interest (AOI) with NULL for properties to ensure it's treated as a valid dictionary
aoi <- list(
type = "Feature",
@@ -55,11 +63,11 @@ aoi <- list(
type = "Polygon",
coordinates = list(
list(
- c(33.78593974945852, 5.115816884114494),
- c(33.78593974945852, -4.725410543134203),
- c(41.94362577283266, -4.725410543134203),
- c(41.94362577283266, 5.115816884114494),
- c(33.78593974945852, 5.115816884114494)
+ c(minx, maxy),
+ c(minx, miny),
+ c(maxx, miny),
+ c(maxx, maxy),
+ c(minx, maxy)
)
)
)
@@ -88,11 +96,14 @@ resp <- req |> req_perform()
summary_data <- resp |> resp_body_string() |> fromJSON(flatten = TRUE)
# Extract coordinates and convert to a spatial data frame (sf object)
- summary_data$x <- sapply(summary_data$geometry.coordinates, function(x) unlist(x)[1])
- summary_data$y <- sapply(summary_data$geometry.coordinates, function(x) unlist(x)[2])
+ summary_data <- summary_data %>%
+ mutate(
+ x = sapply(geometry, function(g) fromJSON(g)$coordinates[1]),
+ y = sapply(geometry, function(g) fromJSON(g)$coordinates[2])
+ )
# Convert to sf, drop extra geometry fields
- gdf <- st_as_sf(summary_data, coords = c("x", "y"), crs = 4326)[, c(1, 2, 5)]
+ gdf <- st_as_sf(summary_data, coords = c("x", "y"), crs = 4326)
```

## Visualization
@@ -121,6 +132,5 @@ leaflet(gdf) %>%
addLegend(
pal = custom_pal, values = gdf$sum_pop_2020, title = "Population 2020 (Custom Binned Scale)",
opacity = 1
- ) %>%
- setView(lng = 37.5, lat = 0, zoom = 6) # Center the map based on AOI
+ )
```
136 changes: 136 additions & 0 deletions docs/user-docs/space2stats_api_demo_R.md
@@ -0,0 +1,136 @@
---
title: "Space2Stats API Demo in R"
output:
html_document:
df_print: paged
---
# R Example

```R name="setup" tags=["remove_cell"]
knitr::opts_chunk$set(echo = TRUE)
library(httr2)
library(jsonlite)
library(sf)
library(dplyr)
library(leaflet)
library(viridis)
```

## Set Up API Endpoints

```R
base_url <- "https://space2stats.ds.io"
fields_endpoint <- paste0(base_url, "/fields")
summary_endpoint <- paste0(base_url, "/summary")
```

## Fetch Available Fields

```R
# Set up the request to fetch available fields
req <- request(base_url) |>
req_url_path_append("fields") # Append the correct endpoint

# Perform the request and get the response
resp <- req |> req_perform()

# Check the status code
if (resp_status(resp) != 200) {
stop("Failed to get fields: ", resp_body_string(resp))
}

# Parse the response body as JSON
available_fields <- resp |> resp_body_json()

# Print the available fields in a simplified format
print("Available Fields:")
print(unlist(available_fields))
```

## Define Area of Interest (AOI)

```R
minx <- 29.038924
miny <- -4.468958
maxx <- 30.850461
maxy <- -2.310523

# Define Area of Interest (AOI) with NULL for properties to ensure it's treated as a valid dictionary
aoi <- list(
type = "Feature",
properties = NULL, # Empty properties
geometry = list(
type = "Polygon",
coordinates = list(
list(
c(minx, maxy),
c(minx, miny),
c(maxx, miny),
c(maxx, maxy),
c(minx, maxy)
)
)
)
)
```
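
For readers following along in Python rather than R, the same bounding-box AOI could be sketched roughly as follows. The function name `bbox_to_feature` is hypothetical, and using an empty dict for `properties` is an assumption about how the empty R value should be mirrored. The coordinates and the vertex order are copied from the R chunk above.

```python
import json


def bbox_to_feature(minx, miny, maxx, maxy):
    """Hypothetical helper: GeoJSON Feature for a bounding-box polygon."""
    if minx >= maxx or miny >= maxy:
        raise ValueError("expected minx < maxx and miny < maxy")
    # Same vertex order as the R chunk; the ring ends where it starts.
    ring = [
        [minx, maxy],
        [minx, miny],
        [maxx, miny],
        [maxx, maxy],
        [minx, maxy],
    ]
    return {
        "type": "Feature",
        "properties": {},  # assumption: empty properties object
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


aoi = bbox_to_feature(29.038924, -4.468958, 30.850461, -2.310523)
print(json.dumps(aoi, indent=2))
```

GeoJSON positions are ordered longitude first, then latitude, which is why `minx`/`maxx` come before `miny`/`maxy` in each vertex.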

## Request Summary Data

```R
request_payload <- list(
aoi = aoi,
spatial_join_method = "centroid",
fields = list("sum_pop_2020"),
geometry = "point"
)

# Set up the base URL and create the request
req <- request(base_url) |>
req_url_path_append("summary") |>
req_body_json(request_payload)

# Perform the request and get the response
resp <- req |> req_perform()

# Turn response into a data frame
summary_data <- resp |> resp_body_string() |> fromJSON(flatten = TRUE)

# Extract coordinates and convert to a spatial data frame (sf object)
summary_data <- summary_data %>%
mutate(
x = sapply(geometry, function(g) fromJSON(g)$coordinates[1]),
y = sapply(geometry, function(g) fromJSON(g)$coordinates[2])
)

# Convert to sf, drop extra geometry fields
gdf <- st_as_sf(summary_data, coords = c("x", "y"), crs = 4326)
```
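
The coordinate-extraction step above can also be sketched in Python. This is a minimal illustration that assumes, as the R code implies, that each returned row carries its point geometry as a GeoJSON string in a `geometry` column. The helper `point_xy` is hypothetical, and the two sample rows (including their `hex_id` and population values) are made up for the example rather than taken from a real API response.

```python
import json


def point_xy(geometry):
    """Hypothetical helper: return (x, y) from a GeoJSON Point string or dict."""
    geom = json.loads(geometry) if isinstance(geometry, str) else geometry
    if geom.get("type") != "Point":
        raise ValueError(f"expected a Point geometry, got {geom.get('type')!r}")
    x, y = geom["coordinates"][:2]
    return float(x), float(y)


# Made-up rows shaped like the summary response assumed above.
rows = [
    {"hex_id": "example-a", "sum_pop_2020": 120.5,
     "geometry": '{"type": "Point", "coordinates": [29.5, -3.0]}'},
    {"hex_id": "example-b", "sum_pop_2020": 0.0,
     "geometry": '{"type": "Point", "coordinates": [30.1, -2.75]}'},
]

# Add x/y columns, mirroring the mutate() call in the R chunk.
for row in rows:
    row["x"], row["y"] = point_xy(row["geometry"])

print([(row["hex_id"], row["x"], row["y"]) for row in rows])
```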

## Visualization

```R

# Replace NA values in sum_pop_2020 with 0
gdf$sum_pop_2020[is.na(gdf$sum_pop_2020)] <- 0

# Create a custom binned color palette with non-uniform breaks
# For example: 0 (distinct color), 1-200000 (gradient), 200001+ (another color)
breaks <- c(0, 1, 1000, 10000, 50000, 100000, 200000, max(gdf$sum_pop_2020))

custom_pal <- colorBin(palette = c("lightgray", "yellow", "orange", "red", "purple", "blue"),
domain = gdf$sum_pop_2020, bins = breaks)

# Create the leaflet map with custom binned coloring
leaflet(gdf) %>%
addTiles() %>% # Add default OpenStreetMap map tiles
addCircleMarkers(
radius = 3, # Adjust size as needed
color = ~custom_pal(sum_pop_2020),
stroke = FALSE, fillOpacity = 0.7,
popup = ~paste("Hex ID:", hex_id, "<br>", "Population 2020:", sum_pop_2020) # Add a popup with details
) %>%
addLegend(
pal = custom_pal, values = gdf$sum_pop_2020, title = "Population 2020 (Custom Binned Scale)",
opacity = 1
)
```