New function merge_camtrapdp() #112

Open
wants to merge 125 commits into
base: main

sannegovaert (Member) commented Jul 25, 2024

fix #75

Remarks

  • Assume that locationIDs and individualIDs need not be unique, as the same location or individual can be used in different data packages.
  • Assume that duplicated IDs occur between packages, not within a package.
  • If duplicates are present, all values of that identifier get a prefix.
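
The prefixing rule in the last remark can be illustrated with a minimal sketch. This is not the PR's implementation: the vectors `ids_x` and `ids_y` and the prefixes `"x_"` and `"y_"` are hypothetical placeholders, and the check only looks for values shared between the two packages, in line with the second assumption.

```r
# Hypothetical identifiers from two packages; "dep2" occurs in both
ids_x <- c("dep1", "dep2")
ids_y <- c("dep2", "dep3")

# Duplicates are assumed to occur between packages, not within one
if (length(intersect(ids_x, ids_y)) > 0) {
  # All values get a prefix, not only the clashing ones
  # ("x_" and "y_" are illustrative prefixes)
  ids_x <- paste0("x_", ids_x)
  ids_y <- paste0("y_", ids_y)
}

merged_ids <- c(ids_x, ids_y)
```

Prefixing every value, rather than only the clashing ones, keeps identifiers from the same package recognisable as a group after the merge.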

Helper functions
I created six helper functions. They could probably be simplified or reduced in number.
They currently live in utils.R.

| function | functionality in merge_camtrapdp() | suggestion |
| --- | --- | --- |
| check_duplicate_ids() | add prefix to identifiers with duplicates | helper function @ utils.R |
| add_prefix() | add prefix to identifiers with duplicates | helper function @ utils.R |
| normalize_list() | remove duplicated elements of lists | helper function @ utils.R or separate file.R |
| is_subset() | remove duplicated elements of lists | helper function @ utils.R or separate file.R |
| update_unique() | remove duplicated elements of lists | helper function @ utils.R or separate file.R |
| remove_duplicates() | remove duplicated elements of lists | helper function @ utils.R or separate file.R |


codecov bot commented Jul 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.91%. Comparing base (4e52a04) to head (5e7360f).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #112      +/-   ##
==========================================
+ Coverage   99.89%   99.91%   +0.01%     
==========================================
  Files          23       24       +1     
  Lines         983     1211     +228     
==========================================
+ Hits          982     1210     +228     
  Misses          1        1              


@peterdesmet peterdesmet marked this pull request as ready for review October 17, 2024 12:28
eventID = FALSE)

deploymentIDs <- c(
unique(purrr::pluck(deployments(x), "deploymentID")),

Can deploymentID be missing from deployments(x)? If it should never be missing, purrr::chuck() makes this line a bit more defensive: unlike purrr::pluck(), which returns NULL for an absent element, chuck() throws an error.

Suggested change
- unique(purrr::pluck(deployments(x), "deploymentID")),
+ unique(purrr::chuck(deployments(x), "deploymentID")),
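
To make the difference concrete, here is a small sketch on a toy data frame. `deployments_x` is a hypothetical stand-in for `deployments(x)` that deliberately lacks a `deploymentID` column. It only assumes that purrr is installed.

```r
# Toy stand-in for deployments(x), without a deploymentID column
deployments_x <- data.frame(locationID = c("loc1", "loc2"))

# pluck() silently returns NULL when the element is absent,
# so unique() on it is NULL too and the problem goes unnoticed
plucked <- unique(purrr::pluck(deployments_x, "deploymentID"))

# chuck() raises an error instead; it is caught here only to show
# that one is thrown
chucked <- tryCatch(
  purrr::chuck(deployments_x, "deploymentID"),
  error = function(e) "error"
)
```

With `pluck()`, a missing column would quietly contribute nothing to the combined ID vector, so the duplicate check would pass for the wrong reason.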


deploymentIDs <- c(
unique(purrr::pluck(deployments(x), "deploymentID")),
unique(purrr::pluck(deployments(y), "deploymentID"))

Suggested change
- unique(purrr::pluck(deployments(y), "deploymentID"))
+ unique(purrr::chuck(deployments(y), "deploymentID"))

Comment on lines +61 to +62
unique(purrr::pluck(media(x), "mediaID")),
unique(purrr::pluck(media(y), "mediaID"))

Suggested change
- unique(purrr::pluck(media(x), "mediaID")),
- unique(purrr::pluck(media(y), "mediaID"))
+ unique(purrr::chuck(media(x), "mediaID")),
+ unique(purrr::chuck(media(y), "mediaID"))

Comment on lines +65 to +66
unique(purrr::pluck(observations(x), "observationID")),
unique(purrr::pluck(observations(y), "observationID"))

Suggested change
- unique(purrr::pluck(observations(x), "observationID")),
- unique(purrr::pluck(observations(y), "observationID"))
+ unique(purrr::chuck(observations(x), "observationID")),
+ unique(purrr::chuck(observations(y), "observationID"))

Comment on lines +69 to +70
unique(purrr::pluck(media(x), "eventID")),
unique(purrr::pluck(media(y), "eventID"))

Suggested change
- unique(purrr::pluck(media(x), "eventID")),
- unique(purrr::pluck(media(y), "eventID"))
+ unique(purrr::chuck(media(x), "eventID")),
+ unique(purrr::chuck(media(y), "eventID"))

)

# Check for duplicates
if (any(duplicated(deploymentIDs))) {result$deploymentID <- TRUE}

Use anyDuplicated() here instead of any(duplicated())?
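
A quick sketch of how the two checks compare, using an illustrative `deploymentIDs` vector rather than data from the PR. Note that `anyDuplicated()` returns an integer index (0 when there are no duplicates), so it needs a comparison against 0 to yield a logical.

```r
# Illustrative IDs; "dep2" is duplicated
deploymentIDs <- c("dep1", "dep2", "dep2")

# Current approach: build a full logical vector, then collapse it
has_dup_any <- any(duplicated(deploymentIDs))

# Alternative: anyDuplicated() returns the index of the first duplicate,
# or 0 if there is none
has_dup_index <- anyDuplicated(deploymentIDs) > 0
```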

# Add unique resources from y
y_unique_resources <-
y_additional_resources[!y_additional_resources %in% duplicated_names]
purrr::map(y_unique_resources, function(resource_name) {

Could purrr::imap() make this a bit easier instead of having to create the index in the anonymous function?
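
As a rough illustration of the idea, the sketch below contrasts the two approaches. The `resource_names` vector and the function bodies are hypothetical and much simpler than the code in the PR.

```r
# Illustrative resource names, not taken from the PR
resource_names <- c("deployments", "media", "observations")

# map approach: the position has to be looked up inside the function
with_map <- purrr::map_chr(resource_names, function(resource_name) {
  index <- which(resource_names == resource_name)
  paste0(index, ": ", resource_name)
})

# imap approach: the position (or the name, for named input) is passed
# as the second argument
with_imap <- purrr::imap_chr(resource_names, function(resource_name, index) {
  paste0(index, ": ", resource_name)
})
```

Both calls build the same labels, but the `imap` variant does not depend on the values in `resource_names` being unique for the lookup to work.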

unique_data_list <-
purrr::map(unique_data, function(x) {
x <- as.list(x)
x[!sapply(x, is.na)]

sapply() is not type safe, and will set off ropensci review alarms!
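
One type-stable alternative is `vapply()`, which declares the expected result type and length up front (`purrr::map_lgl()` would be another option). A minimal sketch, using an illustrative list `x` that is not taken from the PR:

```r
# Illustrative record with one missing value
x <- list(name = "fox", count = NA, sex = "female")

# vapply() errors if is.na() ever returns something other than one logical
keep <- !vapply(x, is.na, logical(1))
x_clean <- x[keep]

# On empty input vapply() still returns a logical vector, whereas
# sapply() returns an empty list, which `!` cannot negate
empty_vapply <- vapply(list(), is.na, logical(1))
empty_sapply <- sapply(list(), is.na)
```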

PietrH commented Nov 11, 2024

I only had a brief look, but a more in-depth review by Peter is coming up.

Successfully merging this pull request may close these issues.

Create merge() to merge datasets
3 participants