Skip to content

Consider dot-prefixing all parameter names in functions where we take in user-controlled named ... #162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
brookslogan opened this issue Jul 22, 2022 · 4 comments
Labels
op-semantics Operational semantics; many potentially breaking changes here P1 medium priority

Comments

@brookslogan
Copy link
Contributor

brookslogan commented Jul 22, 2022

Arose from this discussion. When the user can specify desired output columns or arguments to their own functions in ..., there is the danger of partial [see clarification below; this can be the source of problems, but not our current problems] argument name matching assigning args the user meant to be part of ... to one of our built-in parameters, and getting incorrect output or cryptic errors. Ryan has already encountered this once. The approach of dplyr and purrr is to put a dot prefix on all their built-in parameters, as users typically won't use dots as the first character in their own column or function parameter names. We should consider doing the same in epi_slide, epix_slide, and any other similar functions.

@brookslogan brookslogan added P1 medium priority op-semantics Operational semantics; many potentially breaking changes here labels Jul 22, 2022
@brookslogan
Copy link
Contributor Author

brookslogan commented Jul 30, 2022

(<- expand for details) Partial matching only appears to be an issue for parameters we declare before the dots (`x`, `f`, and `n` in the slide functions). For parameters declared after the dots, only exact matches are a concern.
dots_first = function(..., abcde=2) {
  cat("abcde:\n")
  print(abcde)
  cat("list(...):\n")
  print(list(...))
}
dots_last = function(abcde=2, ...) {
  cat("abcde:\n")
  print(abcde)
  cat("list(...):\n")
  print(list(...))
}
# `dots_first` avoids partial-match problems of `dots_last`:
dots_first(abc=10)
#> abcde:
#> [1] 2
#> list(...):
#> $abc # <------------- placed `abc` in dots, the more reasonable place
#> [1] 10
dots_last(abc=10)
#> abcde:  # <------------- matched `abc` to `abcde` parameter, less reasonable
#> [1] 10
#> list(...):
#> list()
# but of course doesn't avoid (rarer) exact-match issues. Regardless of where
# `...` appears in the function signature, calling with `abcde=10` will match to
# an `abcde` parameter rather than placing `abcde=10` into `...`. If `abcde` is
# a popular column name or `f` parameter name (e.g., `n`), then these exact
# matches get in the way.
dots_first(abcde=10) 
#> abcde:
#> [1] 10
#> list(...):
#> list()
dots_last(abcde=10)
#> abcde:
#> [1] 10
#> list(...):
#> list()

Created on 2022-07-30 by the reprex package (v2.0.1) (then annotated)

Since we keep most parameters to the right of ..., partial matching doesn't look like the cause of issues we have encountered so far, because all parameters to the left of ... have had only a single character each (x, f, n). Instead, the issue is probably because these argument names are popular as column/parameter names, and we encountered issues with an exact name conflict.

The above is just for clarification; it still looks like prefixing parameter names with dots, especially the currently-single-character parameter names, could help reduce the frequency of name conflicts. (Although we might want to think about whether the user will want to use f=purrr::map or other functions that take dot-prefixed arg names; those could still be a source of conflicts.)

@brookslogan
Copy link
Contributor Author

#161 sort of interacts with this: if before and after are placed at the same location as n was, then we could run into issues with the partial name matching (e.g., a = , b = ). But even though the details of the problems are changed up a bit, prefixing our parameter names with a dot still seems like a resolution/improvement.

@brookslogan brookslogan changed the title Consider dot-prefixing all parameter names in functions where we take in user-controlled ... Consider dot-prefixing all parameter names in functions where we take in user-controlled named ... Jun 28, 2023
@brookslogan
Copy link
Contributor Author

note: mutate now has .before and .after arguments of its own controlling output column placement, and this may lead to confusion for people who learned dplyr more recently.

@dshemetov
Copy link
Contributor

dshemetov commented Oct 3, 2024

Let's call this done in #477 and open a new issue if we want to cover more of the package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
op-semantics Operational semantics; many potentially breaking changes here P1 medium priority
Projects
None yet
Development

No branches or pull requests

3 participants