Releases: r-lib/tidyselect
tidyselect 1.2.1
tidyselect 1.2.0
New features
- New tidyselect_data_proxy() and tidyselect_data_has_predicates()
  allow tidyselect to work with custom input types (#242).
- New eval_relocate() for moving a selection. This powers
  dplyr::relocate() (#232).
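
As a rough illustration of eval_relocate(), the sketch below moves columns in a small data frame. The data frame `df` and its column names are made up for the example, and rlang is assumed to be installed to provide expr().

```r
library(tidyselect)
library(rlang)  # for expr()

# Hypothetical example data, not taken from the release notes.
df <- data.frame(w = 1, x = 2, y = 3, z = 4)

# With neither `before` nor `after`, the selection moves to the front.
front <- eval_relocate(expr(y), df)

# Move `w` so that it sits right after `y`.
after_y <- eval_relocate(expr(w), df, after = expr(y))

# Both results are named vectors of locations that cover every column once,
# so they can be used directly to reorder the data frame.
names(df[front])
names(df[after_y])
```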
Lifecycle changes
- Using all_of() outside of a tidyselect context is now deprecated (#269).
  In the future it will error to be consistent with any_of().
- Use of .data in tidyselect expressions is now deprecated to more cleanly
  separate tidy-select from data-masking. Replace .data$x with "x" and
  .data[[var]] with any_of(var) or all_of(var) (#169).
- Use of bare predicates (not wrapped in where()) and indirection (without
  using all_of()) have been formally deprecated (#317).
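
The replacements described above might look as follows. This is only a sketch that calls eval_select() directly on built-in datasets; the variable `var` is an assumption for the example, and the same selection expressions can be passed to dplyr::select().

```r
library(tidyselect)
library(rlang)  # for expr()

var <- "cyl"  # hypothetical column name held in a variable

# Deprecated form              Preferred form
#   .data$mpg                    "mpg"
#   .data[[var]]                 all_of(var) or any_of(var)
#   is.numeric                   where(is.numeric)
#   var (bare indirection)       all_of(var)

by_name      <- eval_select(expr("mpg"), mtcars)
by_indirect  <- eval_select(expr(all_of(var)), mtcars)
by_predicate <- eval_select(expr(where(is.numeric)), iris)

names(by_name)
names(by_indirect)
names(by_predicate)
```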
Minor improvements and bug fixes
- Selection language:
  - any_of() generates a more informative error if you supply too many
    arguments (#241).
  - all_of() (like any_of()) returns an integer vector to make it easier
    to combine in functions (#270, #294). It also fails when it can't find
    variables even when strict = FALSE.
  - matches() recognises and correctly uses stringr pattern objects
    (stringr::regex(), stringr::fixed(), etc) (#238). It also now
    works with named vectors (#250).
  - num_range() gains a suffix argument (#229).
  - where() is now exported, like all other select helpers (#201),
    and gives more informative errors (#236).
- eval_select() with include now preserves the order of the variables
  if they're present in the selection (#224).
- eval_select() always returns a named vector, even when renaming is not
  permitted (#220).
- eval_select() and eval_relocate() gain a new allow_empty argument which
  makes it possible to forbid empty selections with allow_empty = FALSE
  (#252).
- eval_select(allow_rename = FALSE) no longer fails with empty
  selections (#221, @eutwt) or with predicate functions (#225). It now
  properly fails with partial renaming (#305).
- The peek_var() error now generates a hyperlink to the docs with recent
  RStudio (#289).
- vars_pull() generates more informative error messages (#234, #258, #318)
  and gains error_call and error_arg arguments.
- Errors produced by tidyselect should now be more informative. Evaluation
  errors are now chained, with the child error call set to the error_call
  argument of eval_select() and eval_rename(). We've also improved
  backtraces of base errors, and done better at propagating the root
  error_call to vctrs input checkers.
- tidyselect_verbosity is no longer used; deprecation messaging is now
  controlled by lifecycle_verbosity like all other packages (#317).
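
A few of the changes listed above can be tried out with a small sketch like the one below. The data frame `df` and its column names are invented for the example, and the strings stored by tryCatch() are placeholders rather than tidyselect's actual error messages.

```r
library(tidyselect)
library(rlang)  # for expr()

# Hypothetical example data.
df <- data.frame(x1_kg = 1, x2_kg = 2, x3_lb = 3)

# num_range() with the `suffix` argument matches x1_kg and x2_kg.
kg <- eval_select(expr(num_range("x", 1:2, suffix = "_kg")), df)

# The result is a named vector even when renaming is not permitted.
loc <- eval_select(expr(x3_lb), df, allow_rename = FALSE)

# allow_empty = FALSE turns an empty selection into an error.
empty_result <- tryCatch(
  {
    eval_select(expr(starts_with("zzz")), df, allow_empty = FALSE)
    "no error"
  },
  error = function(e) "empty selection rejected"
)
```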
tidyselect 1.1.2
- Fix for CRAN checks.
- Better compatibility with rlang 1.0.0 errors. More to come soon.
tidyselect 1.1.1
- Fix for CRAN checks.
- tidyselect has been re-licensed as MIT (#217).
tidyselect 1.1.0
- Predicate functions must now be wrapped with where().

      iris %>% select(where(is.factor))

  We made this change to avoid puzzling error messages when a variable
  is unexpectedly missing from the data frame and there is a
  corresponding function in the environment:

      # Attempts to invoke `data()` function
      data.frame(x = 1) %>% select(data)

  Now tidyselect will correctly complain about a missing variable
  rather than trying to invoke a function.

  For compatibility we will support predicate functions whose names
  start with is for 1 version.
- eval_select() gains an allow_rename argument. If set to FALSE,
  renaming variables with the c(foo = bar) syntax is an error.
  This is useful to implement purely selective behaviour (#178).
- Fixed issue preventing repeated deprecation messages when
  tidyselect_verbosity is set to "verbose" (#184).
- any_of() now preserves the order of the input variables (#186).
- The return value of eval_select() is now always named, even when
  inputs are constant (#173).
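
The allow_rename and any_of() items above could be exercised along these lines. This is a minimal sketch: `df` is a made-up data frame, and the strings stored by tryCatch() are placeholders, not the real error text.

```r
library(tidyselect)
library(rlang)  # for expr()

# Hypothetical example data.
df <- data.frame(x = 1, y = 2, z = 3)

# any_of() keeps the order of the input vector, not the column order,
# and skips names that are not present.
ord <- eval_select(expr(any_of(c("z", "x", "missing"))), df)

# With allow_rename = FALSE, the c(new = old) syntax is an error.
rename_result <- tryCatch(
  {
    eval_select(expr(c(foo = x)), df, allow_rename = FALSE)
    "no error"
  },
  error = function(e) "renaming rejected"
)

# Constant inputs such as a bare position still give a named result.
by_position <- eval_select(expr(2), df)
```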
tidyselect 1.0.0
This is the 1.0.0 release of tidyselect. It features a more solidly
defined and implemented syntax, support for predicate functions, new
boolean operators, and much more.
Documentation
- New Get started vignette for client packages. Read it with
  vignette("tidyselect") or at
  https://tidyselect.r-lib.org/articles/tidyselect.html.
- The definition of the tidyselect language has been consolidated. A
  technical description is now available:
  https://tidyselect.r-lib.org/articles/syntax.html.
Breaking changes
- Selecting non-column variables with bare names now triggers an
  informative message suggesting to use all_of() instead. Referring
  to contextual objects with a bare name is brittle because it might
  be masked by a data frame column. Using all_of() is safe (#76).

tidyselect now uses vctrs for validating inputs. These changes may
reveal programming errors that were previously silent. They may also
cause failures if your unit tests make faulty assumptions about the
content of error messages created in tidyselect:

- Out-of-bounds errors are thrown when a name doesn't exist or a
  location is too large for the input.
- Logical vectors now fail properly.
- Selected variables now must be unique. It was previously possible to
  return duplicate selections in some circumstances.
- The input names can no longer contain NA values.

Note that we recommend testthat::verify_output() for monitoring
error messages thrown from packages that you don't control. Unlike
expect_error(), verify_output() does not cause CMD check failures
when error messages have changed. See
https://www.tidyverse.org/blog/2019/11/testthat-2-3-0/ for more
information.
Syntax
- The boolean operators can now be used to create selections (#106).

  - ! negates a selection.
  - | takes the union of two selections.
  - & takes the intersection of two selections.

  These patterns can currently be achieved using -, c() and
  intersect() respectively. The boolean operators should be more
  intuitive to use.

  Many thanks to Irene Steves (@isteves) for suggesting this UI.
- You can now use predicate functions in selection contexts:

      iris %>% select(is.factor)
      iris %>% select(is.factor | is.numeric)

  This feature is not available in functions that use the legacy
  interface of tidyselect. These need to be updated to use
  the new eval_select() function instead of vars_select().
- Unary - inside nested c() is now consistently treated as syntax for
  set difference (#130).
- Improved support for named elements. It is now possible to assign
  the same name to multiple elements, if the input data structure
  doesn't require unique names (i.e. anything but a data frame).
- The selection engine has been rewritten to support a clearer
  separation between data-expressions (calls to :, -, and c) and
  env-expressions (anything else). This means you can now safely use
  expressions of the type:

      data %>% select(1:ncol(data))
      data %>% pivot_longer(1:ncol(data))

  Even if the data frame data contains a column also named data,
  the subexpression ncol(data) is still correctly evaluated.
  The data:ncol(data) expression is equivalent to 2:3 because
  data is looked up in the relevant context without ambiguity:

      data <- tibble(foo = 1, data = 2, bar = 3)
      data %>% dplyr::select(data:ncol(data))
      #> # A tibble: 1 x 2
      #>    data   bar
      #>   <dbl> <dbl>
      #> 1     2     3

  While the example above is a bit contrived, there are many realistic
  cases where these changes make it easier to write safe code:

      select_from <- function(data, var) {
        data %>% dplyr::select({{ var }} : ncol(data))
      }
      data %>% select_from(data)
      #> # A tibble: 1 x 2
      #>    data   bar
      #>   <dbl> <dbl>
      #> 1     2     3
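
To make the boolean operators from the list above concrete, here is a small sketch that evaluates a few selections against the built-in iris data with eval_select(). It assumes rlang is available for expr(); the same expressions can be used inside dplyr::select().

```r
library(tidyselect)
library(rlang)  # for expr()

# & takes the intersection of two selections.
sepal_width <- eval_select(expr(starts_with("Sepal") & ends_with("Width")), iris)

# ! negates a selection.
not_sepal <- eval_select(expr(!starts_with("Sepal")), iris)

# | takes the union of two selections.
petal_or_width <- eval_select(expr(starts_with("Petal") | ends_with("Width")), iris)

names(sepal_width)
names(not_sepal)
names(petal_or_width)
```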
User-facing improvements
- The new selection helpers all_of() and any_of() are strict
  variants of one_of(). The former always fails if some variables
  are unknown, while the latter does not. all_of() is safer to use
  when you expect all selected variables to exist. any_of() is
  useful in other cases, for instance to ensure variables are selected
  out:

      vars <- c("Species", "Genus")
      iris %>% dplyr::select(-any_of(vars))

  Note that all_of() and any_of() are a bit more conservative in
  their function signature than one_of(): they do not accept dots.
  The equivalent of one_of("a", "b") is all_of(c("a", "b")).
- Selection helpers like all_of() and starts_with() are now
  available in all selection contexts, even when they haven't been
  attached to the search path. The most visible consequence of this
  change is that it is now easier to use selection functions without
  attaching the host package:

      # Before
      dplyr::select(mtcars, dplyr::starts_with("c"))

      # After
      dplyr::select(mtcars, starts_with("c"))

  It is still recommended to export the helpers from your package so
  that users can easily look up the documentation with ?.
- starts_with(), ends_with(), contains(), and matches() now
  accept vector inputs (#50). For instance these are now equivalent
  ways of selecting all variables that start with either "a" or "b":

      starts_with(c("a", "b"))
      starts_with("a") | starts_with("b")

- matches() has new argument perl to allow for Perl-like regular
  expressions (@fmichonneau, #71).
- Better support for selecting with S3 vectors. For instance, factors
  are treated as characters.
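
The difference between all_of() and any_of(), and the vector inputs of starts_with(), can be checked with a sketch like the following. It uses the built-in iris data; the string stored by tryCatch() is a placeholder, not tidyselect's actual error message.

```r
library(tidyselect)
library(rlang)  # for expr()

vars <- c("Species", "Genus")  # "Genus" is not a column of iris

# any_of() silently skips unknown names.
found <- eval_select(expr(any_of(vars)), iris)

# all_of() fails if any name is unknown.
strict_result <- tryCatch(
  {
    eval_select(expr(all_of(vars)), iris)
    "no error"
  },
  error = function(e) "unknown column rejected"
)

# A vector input and the | operator select the same columns.
by_vector <- eval_select(expr(starts_with(c("Sepal", "Petal"))), iris)
by_union  <- eval_select(expr(starts_with("Sepal") | starts_with("Petal")), iris)
```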
API
New eval_select() and eval_rename() functions for client
packages. These replace vars_select() and vars_rename(), which are
now deprecated. These functions:

- Take the full data rather than just names. This makes it possible to
  use function predicates in selection context.
- Return a numeric vector of locations rather than a vector of
  names. This makes it possible to use tidyselect with inputs that
  support duplicate names, like regular vectors.
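
A client package might wrap eval_select() roughly as sketched below. The function name pick_columns() is hypothetical and only illustrates the two points above: the full data is passed in, and the result is a named vector of locations.

```r
library(tidyselect)
library(rlang)  # for expr() and set_names()

# Hypothetical client function, not part of tidyselect.
pick_columns <- function(.data, ...) {
  # Wrap the dots in a single c(...) selection expression.
  sel <- expr(c(...))
  # eval_select() receives the full data and returns named locations.
  loc <- eval_select(sel, data = .data)
  # Subset by location, then apply the (possibly new) names.
  set_names(.data[loc], names(loc))
}

picked <- pick_columns(mtcars, miles = mpg, starts_with("c"))
names(picked)
```

Because the locations carry names, renaming with the `new = old` syntax needs no separate code path in the client function.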
Other features and fixes
- The .strict argument of vars_select() now works more robustly
  and consistently.
- Using arithmetic operators in selection context now fails more
  informatively (#84).
- It is now possible to select columns in data frames containing
  duplicate variables (#94). However, the duplicates can't be part of
  the final selection.
- eval_rename() no longer ignores the names of unquoted character
  vectors of length 1 (#79).
- eval_rename() now fails when a variable is renamed to an existing
  name (#70).
- eval_rename() has better support for existing duplicates (but
  creating new duplicates is an error).
- eval_select(), eval_rename() and vars_pull() now detect
  missing values uniformly (#72).
- vars_pull() now includes the faulty expression in error messages.
- The performance issues of eval_rename() with many arguments have
  been fixed. This makes dplyr::rename_all() with many columns much
  faster (@zkamvar, #92).
- tidyselect is now much faster with many columns, thanks to a
  performance fix in rlang::env_bind() as well as internal fixes.
- vars_select() ignores vectors with only zeros (#82).
tidyselect 0.2.5
This is a maintenance release for compatibility with rlang 0.3.0.
tidyselect 0.2.4
- Fixed a warning that occurred when a vector of column positions was
  supplied to vars_select() or functions depending on it such as
  tidyr::gather() (#43 and tidyverse/tidyr#374).
- Fixed compatibility issue with rlang 0.2.0 (#51).
tidyselect 0.2.3
- Internal fixes in anticipation of using tidyselect within dplyr.
- vars_select() and vars_rename() now correctly support unquoting
  character vectors that have names.
- vars_select() now ignores missing variables.
tidyselect 0.2.2
Maintenance release: dplyr is now correctly mentioned as a suggested
package.