Mutating deployment duration #107

Open
sannegovaert opened this issue Jul 17, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@sannegovaert
Member

sannegovaert commented Jul 17, 2024

Part of a deployment can be invalid. For example, a storm brings down the tree holding the camera trap, and all observations recorded after that event are invalid.
This has implications for:

  • deploymentEnd
  • observations and media
  • metadata

There are two options:

  • users code this themselves, and can be helped with a vignette
  • a new function to mutate deployments
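
To make the second option concrete, here is a rough sketch of what such a function could do. It is not part of camtrapdp: `truncate_deployment()` is a hypothetical name, it works on plain data frames rather than on the data package object, and it drops media by `timestamp` rather than through their link to the removed observations. Updating the metadata is left out.

```r
# Hypothetical helper (not a camtrapdp function): shorten one deployment and
# drop its observations and media recorded at or after the new end.
truncate_deployment <- function(deployments, observations, media,
                                deployment_id, new_end) {
  stopifnot(inherits(new_end, "POSIXct"), length(new_end) == 1)
  stopifnot(deployment_id %in% deployments$deploymentID)

  # 1. Update deploymentEnd for the affected deployment only.
  is_target <- deployments$deploymentID == deployment_id
  deployments$deploymentEnd[is_target] <- new_end

  # 2. Drop observations of that deployment ending at or after new_end.
  drop_obs <- observations$deploymentID == deployment_id &
    observations$eventEnd >= new_end
  observations <- observations[!(drop_obs %in% TRUE), , drop = FALSE]

  # 3. Drop media of that deployment captured at or after new_end
  #    (assumption: filtering by timestamp is acceptable here).
  drop_media <- media$deploymentID == deployment_id &
    media$timestamp >= new_end
  media <- media[!(drop_media %in% TRUE), , drop = FALSE]

  list(deployments = deployments, observations = observations, media = media)
}

# Toy data (made up for illustration), all datetimes in UTC.
utc <- function(x) as.POSIXct(x, tz = "UTC")
toy_deployments <- data.frame(
  deploymentID = c("a", "b"),
  deploymentEnd = utc(c("2021-04-18 21:25:00", "2020-07-01 09:41:41"))
)
toy_observations <- data.frame(
  deploymentID = c("a", "a", "b"),
  eventEnd = utc(c("2021-03-31 22:59:21", "2021-04-18 21:25:00",
                   "2020-06-30 10:00:00"))
)
toy_media <- data.frame(
  deploymentID = c("a", "a", "b"),
  timestamp = utc(c("2021-03-31 22:59:21", "2021-04-18 21:25:00",
                    "2020-06-30 10:00:00"))
)

clean <- truncate_deployment(
  toy_deployments, toy_observations, toy_media,
  deployment_id = "a",
  new_end = utc("2021-04-02 15:31:00")
)
```

Returning the three tables as a list keeps the sketch independent of the package internals; a real implementation would presumably take and return the data package object and refresh the temporal and taxonomic scopes.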

Use case:

library(camtrapdp)

raw_data <- example_dataset()
#> ✔ Updating temporal and spatial scopes in metadata based on data.
deploymentID_storm <- "62c200a9"
new_deploymentEnd <- as.POSIXct("2021-04-02 17:31:00", tz = "Europe/Brussels")

# inspect raw data
deployments(raw_data) %>% 
  dplyr::select(deploymentID, deploymentStart, deploymentEnd)
#> # A tibble: 4 × 3
#>   deploymentID deploymentStart     deploymentEnd      
#>   <chr>        <dttm>              <dttm>             
#> 1 00a2c20d     2020-05-30 02:57:37 2020-07-01 09:41:41
#> 2 29b7d356     2020-07-29 05:29:41 2020-08-08 04:20:40
#> 3 577b543a     2020-06-19 21:00:00 2020-06-28 23:33:22
#> 4 62c200a9     2021-03-27 20:38:18 2021-04-18 21:25:00
observations(raw_data) %>% 
  dplyr::filter(deploymentID == deploymentID_storm) %>% 
  dplyr::summarise(last_eventEnd = max(eventEnd))
#> # A tibble: 1 × 1
#>   last_eventEnd      
#>   <dttm>             
#> 1 2021-04-18 21:25:00
media(raw_data) %>% 
  dplyr::filter(deploymentID == deploymentID_storm) %>% 
  dplyr::summarise(last_timestamp = max(timestamp))
#> # A tibble: 1 × 1
#>   last_timestamp     
#>   <dttm>             
#> 1 2021-04-18 21:25:00

# mutate deploymentEnd
clean_data <- raw_data
camtrapdp::deployments(clean_data) <-
  camtrapdp::deployments(clean_data) %>%
  dplyr::mutate(
    deploymentEnd =
      as.POSIXct(
        dplyr::if_else(
          deploymentID == deploymentID_storm,
          new_deploymentEnd,
          deploymentEnd
        )
      )
  )

# Identify observations and media to remove
to_remove <-
  clean_data %>%
  camtrapdp::filter_observations(
    deploymentID == deploymentID_storm,
    eventEnd >= new_deploymentEnd
  )
#> Warning: There was 1 warning in `dplyr::filter()`.
#> ℹ In argument: `eventEnd >= new_deploymentEnd`.
#> Caused by warning in `.check_tzones()`:
#> ! 'tzone' attributes are inconsistent

observations_to_remove <- observations(to_remove)
media_to_remove <- media(to_remove)

# Update observations and media
observations(clean_data) <- 
  observations(clean_data) %>% 
  dplyr::anti_join(observations_to_remove)
#> Joining with `by = join_by(observationID, deploymentID, mediaID, eventID,
#> eventStart, eventEnd, observationLevel, observationType, cameraSetupType,
#> scientificName, count, lifeStage, sex, behavior, individualID,
#> individualPositionRadius, individualPositionAngle, individualSpeed, bboxX,
#> bboxY, bboxWidth, bboxHeight, classificationMethod, classifiedBy,
#> classificationTimestamp, classificationProbability, observationTags,
#> observationComments, taxon.taxonID, taxon.taxonRank, taxon.vernacularNames.eng,
#> taxon.vernacularNames.nld)`

media(clean_data) <-
  media(clean_data) %>% 
  dplyr::anti_join(media_to_remove)
#> Joining with `by = join_by(mediaID, deploymentID, captureMethod, timestamp,
#> filePath, filePublic, fileName, fileMediatype, exifData, favorite,
#> mediaComments, eventID)`

# inspect clean data
deployments(clean_data) %>% 
  dplyr::select(deploymentID, deploymentStart, deploymentEnd)
#> # A tibble: 4 × 3
#>   deploymentID deploymentStart     deploymentEnd      
#>   <chr>        <dttm>              <dttm>             
#> 1 00a2c20d     2020-05-30 02:57:37 2020-07-01 11:41:41
#> 2 29b7d356     2020-07-29 05:29:41 2020-08-08 06:20:40
#> 3 577b543a     2020-06-19 21:00:00 2020-06-29 01:33:22
#> 4 62c200a9     2021-03-27 20:38:18 2021-04-02 17:31:00

observations(clean_data) %>% 
  dplyr::filter(deploymentID == deploymentID_storm) %>% 
  dplyr::summarise(last_eventEnd = max(eventEnd))
#> # A tibble: 1 × 1
#>   last_eventEnd      
#>   <dttm>             
#> 1 2021-03-31 22:59:21

media(clean_data) %>% 
  dplyr::filter(deploymentID == deploymentID_storm) %>% 
  dplyr::summarise(last_timestamp = max(timestamp))
#> # A tibble: 1 × 1
#>   last_timestamp     
#>   <dttm>             
#> 1 2021-03-31 22:59:21

# The last observation is a couple of days before the storm (new deploymentEnd)

# (Updating metadata in the assignment functions is not yet merged into the main branch)
clean_data <- clean_data %>% 
  camtrapdp:::update_temporal() %>% 
  camtrapdp:::update_taxonomic()

Created on 2024-10-15 with reprex v2.1.1
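
A note on the `'tzone' attributes are inconsistent` warning and on the printed deploymentEnd of the three untouched deployments, which moves by two hours (e.g. 09:41:41 becomes 11:41:41): both are consistent with `new_deploymentEnd` carrying a different time zone attribute than the data, so the instants are unchanged and only the display differs. Assuming the example dataset stores datetimes in UTC (an assumption, not verified here), a minimal base R sketch of aligning the attribute before mutating could look like this:

```r
# Same instant, two time zone attributes (base R only).
new_end_brussels <- as.POSIXct("2021-04-02 17:31:00", tz = "Europe/Brussels")

# Re-express the instant in UTC without changing it.
new_end_utc <- new_end_brussels
attr(new_end_utc, "tzone") <- "UTC"

# Both objects refer to the same moment in time ...
same_instant <- as.numeric(new_end_brussels) == as.numeric(new_end_utc)

# ... but they print with different wall-clock times.
format(new_end_brussels, "%Y-%m-%d %H:%M:%S %Z")
format(new_end_utc, "%Y-%m-%d %H:%M:%S %Z")
```

Using `new_end_utc` in place of `new_deploymentEnd` in the `if_else()` and filter steps should keep the column in its original time zone, provided the assumption about UTC holds.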

@peterdesmet
Member

I understand this as a theoretical use case. Is it also one provided by a user? I wonder whether such data cleaning should be handled in a data management system like Agouti or in camtrapdp.

@peterdesmet peterdesmet added the enhancement New feature or request label Aug 28, 2024
@sannegovaert
Member Author

It is a real use case from @bramdhondt.

@bramdhondt

If this refers to the deployment I think it does, the real-world scenario was even "worse": after the tree and camera went down, the camera was taken to a nearby office desk, where it kept filming the employee sitting at his computer for some days :-)

@sannegovaert
Member Author

A feature request about this issue has been logged in the Agouti repository by @peterdesmet.
