
Commit ce4a30c

committed
add readme example
1 parent 2e3486c commit ce4a30c

7 files changed

+224
-5
lines changed

figures/readme-grid-designation-1.png


figures/readme-polygon-1.png


paper.md

+224-5
Original file line number | Diff line number | Diff line change
@@ -300,30 +300,249 @@ Using a Google Form we got an overview of participants' interest in the differen
300300

301301
## Results
302302
### Collaboration
303-
Tasks were efficiently distributed among the participants ([Fig. 3](#Figure_3)). In total, we collaborated with fourteen people, pushing 209 commits to the main branch and 300 commits across all branches. On main, 56 files were changed, with 2,856 additions and 373 deletions. By the end of the hackathon, all R CMD checks passed and we had a code coverage of 67%.
303+
Tasks were efficiently distributed among the participants ([Fig. 3](#Figure_3)). In total, we collaborated with fourteen people, pushing 209 commits to the main branch and 300 commits across all branches. On main, 56 files were changed, with 2,856 additions and 373 deletions. By the end of the hackathon, we had a functional pkgdown website ([Fig. 4](#Figure_4)), all R CMD checks passed, and we had a code coverage of 67%.
304304

305305
![Scrum board progress during code development. Categories from left to right: 'Ice Box', 'In Progress', 'Review', and 'Complete'. Day 1 was mainly introduction and discussion. Day 2-3 mainly code development. Day 4 was primarily review and pull request merging. Coding ended before the final presentations on day 4 in the afternoon.](./figures/scrum_board.jpg){#Figure_3 .Figure}
306306

307+
![Overview of the gcube pkgdown website.](./figures/pkgdown_site.png){#Figure_4 .Figure}
308+
307309
### Package development
308-
Everything that was done by the end: pkgdown website with README, functions with documentation and examples; repo with code coverage, etc.
310+
The package was renamed to **gcube**, which stands for ‘generate cube’, since it can be used to generate biodiversity data cubes from minimal input.
309311

310-
([Fig. 4](#Figure_4))
312+
``` {#Code_4 .r .Code}
313+
simulate_occurrences(
314+
plgn,
315+
initial_average_abundance = 50,
316+
spatial_autocorr = c("random", "clustered"),
317+
n_time_points = 1,
318+
temporal_function = NA,
319+
...,
320+
seed = NA
321+
)
322+
```
311323

312-
![Overview of the gcube pkgdown website.](./figures/pkgdown_site.png){#Figure_4 .Figure}
324+
``` {#Code_5 .r .Code}
325+
sample_observations(
326+
occurrences,
327+
detection_probability = 1,
328+
sampling_bias = c("no_bias", "polygon", "manual"),
329+
bias_area = NA,
330+
bias_strength = 1,
331+
bias_weights = NA,
332+
seed = NA
333+
)
334+
```
313335

314336
### Incorporation of virtual species to the simulation workflow
315337
- incorporation of project 8 and framework for virtualspecies
338+
316339
- summarised at a follow-up meeting with the people who worked on it
317340

341+
## gcube workflow example
342+
This is a basic example from the README that shows the workflow for simulating a biodiversity data cube with the **gcube** package. It is not the exact README example from the hackathon, but a cleaned-up version from the week after. It uses the same code as developed during the hackathon; at that time we did not have enough time to create a clean README example.
343+
344+
The workflow is divided into three steps or processes:
345+
346+
1. Occurrence process
347+
2. Detection process
348+
3. Grid designation process
349+
350+
The functions are set up such that a single polygon as input is enough
351+
to go through this workflow using default arguments. The user can change
352+
these arguments to allow for more flexibility.
353+
354+
``` r
355+
# Load packages
356+
library(gcube)
357+
358+
library(sf) # working with spatial objects
359+
library(dplyr) # data wrangling
360+
library(ggplot2) # visualisation with ggplot
361+
```
362+
363+
We create a random polygon as input.
364+
365+
``` r
366+
# Create a polygon to simulate occurrences
367+
polygon <- st_polygon(list(cbind(c(5, 10, 8, 2, 3, 5), c(2, 1, 7, 9, 5, 2))))
368+
369+
# Visualise
370+
ggplot() +
371+
geom_sf(data = polygon) +
372+
theme_minimal()
373+
```
374+
375+
![](./figures/readme-polygon-1.png){width=500px}
376+
377+
**1. Occurrence process**
378+
379+
We generate occurrence points within the polygon using the
380+
`simulate_occurrences()` function. These are the “real” occurrences of
381+
the species, whether we have observed them or not. In the
382+
`simulate_occurrences()` function, the user can specify different levels
383+
of spatial clustering, and can define the trend change of the species
384+
over time.
385+
386+
``` r
387+
# Simulate occurrences within polygon
388+
occurrences_df <- simulate_occurrences(
389+
plgn = polygon,
390+
seed = 123)
391+
#> [using unconditional Gaussian simulation]
392+
393+
# Visualise
394+
ggplot() +
395+
geom_sf(data = polygon) +
396+
geom_sf(data = occurrences_df) +
397+
theme_minimal()
398+
```
399+
![](./figures/readme-simulate-occurrences-1.png){width=500px}
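
To make the idea of the occurrence process concrete, the following is a minimal base R sketch of the simplest case: points placed uniformly at random inside the example polygon by rejection sampling. It is not the gcube implementation (the output above indicates that `simulate_occurrences()` uses a Gaussian simulation), and the helper `point_in_polygon()` and the target of 50 points are assumptions made for illustration only.

``` r
# Hypothetical helper (not part of gcube): ray-casting point-in-polygon test.
# px, py: coordinates of one point; vx, vy: polygon vertices (closed ring).
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does the edge between vertex j and vertex i straddle the height py?
    crosses <- (vy[i] > py) != (vy[j] > py)
    if (crosses) {
      # x coordinate where the edge crosses the horizontal line y = py
      x_cross <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      # Count only crossings to the right of the point
      if (px < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Vertices of the example polygon used in this workflow
vx <- c(5, 10, 8, 2, 3, 5)
vy <- c(2, 1, 7, 9, 5, 2)

# Rejection sampling: draw from the bounding box, keep points inside the polygon
set.seed(123)
n_target <- 50  # assumed number of occurrences, for illustration only
pts <- matrix(NA_real_, nrow = 0, ncol = 2)
while (nrow(pts) < n_target) {
  x <- runif(1, min(vx), max(vx))
  y <- runif(1, min(vy), max(vy))
  if (point_in_polygon(x, y, vx, vy)) pts <- rbind(pts, c(x, y))
}
```

Rejection sampling is wasteful for polygons that fill little of their bounding box, but it keeps the sketch short and makes the uniformity of the resulting pattern easy to reason about.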
400+
401+
**2. Detection process**
402+
403+
In this step we define the sampling process, based on the detection
404+
probability of the species and the sampling bias. This is done using the
405+
`sample_observations()` function. The default sampling bias is
406+
`"no_bias"`, but bias can also be introduced using a polygon or a grid.
407+
408+
``` r
409+
# Detect occurrences
410+
detections_df_raw <- sample_observations(
411+
occurrences = occurrences_df,
412+
detection_probability = 0.5,
413+
seed = 123)
414+
415+
# Visualise
416+
ggplot() +
417+
geom_sf(data = polygon) +
418+
geom_sf(data = detections_df_raw,
419+
aes(colour = sampling_status)) +
420+
theme_minimal()
421+
```
422+
423+
![](./figures/readme-detect-occurrences-1.png){width=500px}
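
As a rough illustration of what a detection probability means in this step, the sketch below treats every occurrence as an independent Bernoulli trial in base R. This is an assumption about the unbiased case, not a description of how `sample_observations()` is implemented; the label `"undetected"` and the number of occurrences are likewise hypothetical.

``` r
# Minimal sketch (not the gcube implementation): each occurrence is detected
# independently with the same probability.
set.seed(123)
n_occurrences <- 50           # assumed number of simulated occurrences
detection_probability <- 0.5  # same value as in the example above

# One Bernoulli draw per occurrence
detected <- rbinom(n_occurrences, size = 1, prob = detection_probability) == 1

# "detected" matches the value filtered on in the workflow;
# "undetected" is an assumed label for the remaining occurrences
sampling_status <- ifelse(detected, "detected", "undetected")
table(sampling_status)
```

With a detection probability of 0.5, roughly half of the occurrences are expected to end up as detected, with the exact number varying from seed to seed.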
424+
425+
We select the detected occurrences and add coordinate uncertainty to these
426+
observations. This can be done using the `add_coordinate_uncertainty()`
427+
function.
428+
429+
``` r
430+
# Select detected occurrences only
431+
detections_df <- detections_df_raw %>%
432+
dplyr::filter(sampling_status == "detected")
433+
434+
# Add coordinate uncertainty
435+
set.seed(123)
436+
coord_uncertainty_vec <- rgamma(nrow(detections_df), shape = 2, rate = 6)
437+
observations_df <- add_coordinate_uncertainty(
438+
observations = detections_df,
439+
coords_uncertainty_meters = coord_uncertainty_vec)
440+
441+
# Create an sf object with uncertainty circles to visualise
442+
buffered_observations <- st_buffer(
443+
observations_df,
444+
observations_df$coordinateUncertaintyInMeters)
445+
446+
# Visualise
447+
ggplot() +
448+
geom_sf(data = polygon) +
449+
geom_sf(data = buffered_observations,
450+
fill = alpha("firebrick", 0.3)) +
451+
geom_sf(data = observations_df, colour = "firebrick") +
452+
theme_minimal()
453+
```
454+
455+
![](./figures/readme-uncertainty-occurrences-1.png){width=500px}
456+
457+
**3. Grid designation process**
458+
459+
Finally, observations are designated to a grid to create an occurrence
460+
cube. We create a grid over the spatial extent using
461+
`sf::st_make_grid()`.
462+
463+
``` r
464+
# Define a grid over spatial extent
465+
grid_df <- st_make_grid(
466+
buffered_observations,
467+
square = TRUE,
468+
cellsize = c(1.2, 1.2)
469+
) %>%
470+
st_sf() %>%
471+
mutate(intersect = as.vector(st_intersects(geometry, polygon,
472+
sparse = FALSE))) %>%
473+
dplyr::filter(intersect == TRUE) %>%
474+
dplyr::select(-"intersect")
475+
```
476+
477+
To create an occurrence cube, `grid_designation()` will randomly take a
478+
point within the uncertainty circle around the observations. These
479+
points can be extracted by setting the argument `aggregate = FALSE`.
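
The random draw within an uncertainty circle can be sketched in base R as follows. The sketch assumes the point is drawn uniformly over the area of the circle, which is an assumption and may differ from what `grid_designation()` does internally; `sample_in_circle()` and the three example observations are hypothetical.

``` r
# Hypothetical helper (not part of gcube): draw one point uniformly at random
# within a circle of the given radius around each observation.
sample_in_circle <- function(x, y, radius) {
  n <- length(x)
  angle <- runif(n, 0, 2 * pi)
  # The square root makes the density uniform over the area of the disc;
  # without it, points would cluster near the centre
  r <- radius * sqrt(runif(n))
  data.frame(x = x + r * cos(angle), y = y + r * sin(angle))
}

set.seed(123)
obs_x <- c(4, 6, 7)              # made-up observation coordinates
obs_y <- c(3, 5, 4)
uncertainty <- c(0.2, 0.5, 0.1)  # made-up coordinate uncertainties

new_pts <- sample_in_circle(obs_x, obs_y, uncertainty)

# Distance between each original observation and its sampled point
dist_moved <- sqrt((new_pts$x - obs_x)^2 + (new_pts$y - obs_y)^2)
```

Each sampled point lies at most one uncertainty radius away from its observation, so an observation near a cell border can be designated to a neighbouring cell.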
480+
481+
``` r
482+
# Create occurrence cube
483+
occurrence_cube_df <- grid_designation(
484+
observations = observations_df,
485+
grid = grid_df,
486+
seed = 123)
487+
488+
# Get sampled points within uncertainty circle
489+
sampled_points <- grid_designation(
490+
observations = observations_df,
491+
grid = grid_df,
492+
aggregate = FALSE,
493+
seed = 123)
494+
495+
# Visualise grid designation
496+
ggplot() +
497+
geom_sf(data = occurrence_cube_df, linewidth = 1) +
498+
geom_sf_text(data = occurrence_cube_df, aes(label = n)) +
499+
geom_sf(data = buffered_observations,
500+
fill = alpha("firebrick", 0.3)) +
501+
geom_sf(data = sampled_points, colour = "blue") +
502+
geom_sf(data = observations_df, colour = "firebrick") +
503+
labs(x = "", y = "", fill = "n") +
504+
theme_minimal()
505+
```
506+
507+
![](./figures/readme-grid-designation-1.png){width=500px}
508+
509+
The output gives the number of observations per grid cell and minimal
510+
coordinate uncertainty per grid cell.
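
The aggregation behind these two summaries can be sketched in base R with a regular square grid. This is only an illustration under stated assumptions: the observations are made up, cells are indexed by integer division of the coordinates rather than by an `sf` grid, and the column names `n` and `min_coord_uncertainty` simply mirror those used in the plots of this example.

``` r
# Made-up observations with a coordinate uncertainty each
obs <- data.frame(
  x = c(2.1, 2.9, 4.0, 5.5, 5.9),
  y = c(1.3, 1.9, 3.1, 3.3, 3.5),
  uncertainty = c(0.4, 0.1, 0.3, 0.6, 0.2)
)

# Assign each observation to a square grid cell (same cell size as above)
cellsize <- 1.2
obs$cell_col <- floor(obs$x / cellsize)
obs$cell_row <- floor(obs$y / cellsize)

# Number of observations per cell
counts <- aggregate(uncertainty ~ cell_col + cell_row, data = obs, FUN = length)
names(counts)[3] <- "n"

# Minimal coordinate uncertainty per cell
min_unc <- aggregate(uncertainty ~ cell_col + cell_row, data = obs, FUN = min)
names(min_unc)[3] <- "min_coord_uncertainty"

# One row per occupied grid cell
cube <- merge(counts, min_unc, by = c("cell_col", "cell_row"))
cube
```

Only occupied cells appear in this sketch; empty cells would have to be added separately if a complete grid is needed.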
511+
512+
``` r
513+
# Visualise minimal coordinate uncertainty
514+
ggplot() +
515+
geom_sf(data = occurrence_cube_df, aes(fill = min_coord_uncertainty),
516+
alpha = 0.5, linewidth = 1) +
517+
geom_sf_text(data = occurrence_cube_df, aes(label = n)) +
518+
scale_fill_continuous(type = "viridis") +
519+
labs(x = "", y = "") +
520+
theme_minimal()
521+
```
522+
523+
![](./figures/readme-visualise-designation-1.png){width=500px}
524+
318525
## Discussion
319526

320527

321528
## Conclusions and future work
322529

323530

531+
- multispecies
532+
533+
- implement virtual species
534+
535+
- unit tests for all functions
536+
537+
- documentation complete
538+
539+
- issues, e.g. bugs (CRS), improvements (column names), enhancements (spatial pattern, spatiotemporal connection)
540+
324541
## Links to software
325542
-gcube repo
326-
-Provide the commit from the end of the hackathon
543+
544+
-Provide the commit hash from the end of the hackathon: a reference point to know which version of the software was reviewed.
545+
https://github.com/b-cubed-eu/gcube/commit/6cceb2b229ac25d1df47a9c3a2e20b464f827e18
327546

328547

329548
## Acknowledgements
