Feature Request: Map one Factor to Another #144

Open
billdenney opened this issue Jul 29, 2018 · 2 comments
Labels
feature a feature request or enhancement

Comments

@billdenney
Contributor

I often need to apply the levels of one factor to another variable, which either is already a factor or needs to be made into one.

Would you consider including something like the following?

fct_mirror <- function(.f, .x) {
  factor(.x, levels=levels(.f), ordered=is.ordered(.f))
}

(This would solve an issue similar to tidyverse/dplyr#3731 by allowing fct_mirror(my_factor, case_when(...)).)
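To make the proposed behaviour concrete, here is a minimal sketch using base R only. `fct_mirror` is the function proposed in this issue, not an existing forcats function, and the `dose` and `new_values` vectors are hypothetical example data.

```r
# fct_mirror as proposed in this issue (not part of forcats)
fct_mirror <- function(.f, .x) {
  factor(.x, levels = levels(.f), ordered = is.ordered(.f))
}

# Hypothetical example data: an ordered factor and a plain character vector
dose <- factor(c("Low", "High", "Medium"),
               levels = c("Low", "Medium", "High"),
               ordered = TRUE)
new_values <- c("High", "Low", "Placebo")

mirrored <- fct_mirror(dose, new_values)

levels(mirrored)      # same levels, in the same order, as `dose`
is.ordered(mirrored)  # orderedness is copied from `dose`
is.na(mirrored)       # "Placebo" is not a level of `dose`, so it becomes NA
```

Note that any value in `.x` that is not already a level of `.f` silently becomes `NA`, which may or may not be the desired behaviour.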

@billdenney changed the title from "Feature Request: Map a Factor to Another" to "Feature Request: Map one Factor to Another" on Jul 29, 2018
@hadley
Member

hadley commented Jul 29, 2018

I think the problem is a bit underspecified currently. I can see at least three things you might want to do:

  • create a new factor (e.g. original motivation from SO)
  • change levels of existing factor
  • change values of existing factor

Your proposal would help with the last, but not the others. I'd like to fully explore the problem space before thinking about a solution.
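As a rough illustration of the three cases listed above, here is a sketch in base R. The vectors `f` and `day` are hypothetical, and this is only one way of reading each case, not a statement of how forcats would handle them.

```r
# Hypothetical starting factor with an unused level "c"
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))

# 1. Create a new factor from a non-factor variable
day <- c(-1, 1, 1)
period <- factor(ifelse(day < 0, "Baseline", "Treatment"),
                 levels = c("Baseline", "Treatment"))

# 2. Change the levels of an existing factor
#    (every value that used level "a" now uses level "A")
f_levels <- f
levels(f_levels)[levels(f_levels) == "a"] <- "A"

# 3. Change the values of an existing factor while keeping its levels
#    (the new value must already be one of the levels)
f_values <- f
f_values[2] <- "c"
```

In case 2 the set of levels changes and the values follow; in case 3 the levels stay fixed and only individual values change.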

@billdenney
Contributor Author

You're right, the original SO question was different; it seems like a mix of fct_relevel and case_when.

Changing the levels of an existing factor seems like the job of fct_recode or fct_expand, but maybe you have a different use case in mind.

My code above is intended to allow remapping one factor onto another, which is a common need when working with clinical trial data:

library(dplyr)
#> Warning: package 'dplyr' was built under R version 3.4.4
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
d <- expand.grid(Treatment_Original=factor(c("Treatment 1", "Treatment 2"), ordered=TRUE),
                 Day=c(-1, 1))

# Fails because "Baseline" and `Treatment_Original` are different classes
d %>%
  mutate(Treatment_Baseline=case_when(Day == -1~"Baseline",
                                      TRUE~Treatment_Original))
#> Warning: package 'bindrcpp' was built under R version 3.4.4
#> Error in mutate_impl(.data, dots): Evaluation error: must be type character, not integer.
# Succeeds, but loses the fact that I want a factor for `Treatment_Baseline`
d %>%
  mutate(Treatment_Baseline=case_when(Day == -1~"Baseline",
                                      TRUE~as.character(Treatment_Original)))
#>   Treatment_Original Day Treatment_Baseline
#> 1        Treatment 1  -1           Baseline
#> 2        Treatment 2  -1           Baseline
#> 3        Treatment 1   1        Treatment 1
#> 4        Treatment 2   1        Treatment 2
# Succeeds, but is cumbersome
d %>%
  mutate(Treatment_Baseline=factor(
    case_when(Day == -1~"Baseline",
              TRUE~as.character(Treatment_Original)),
    levels=c("Baseline", levels(Treatment_Original)),
    ordered=is.ordered(Treatment_Original)))
#>   Treatment_Original Day Treatment_Baseline
#> 1        Treatment 1  -1           Baseline
#> 2        Treatment 2  -1           Baseline
#> 3        Treatment 1   1        Treatment 1
#> 4        Treatment 2   1        Treatment 2

(This use case is made more relevant if #138 is implemented.)
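For comparison, here is a sketch of how the cumbersome version might look with `fct_mirror`. It assumes the function exactly as proposed at the top of this issue, uses base R `ifelse()` in place of `case_when()` so that it runs without dplyr, and introduces a hypothetical `template` factor to carry the extra "Baseline" level. Without that step, "Baseline" would become `NA` because it is not a level of `Treatment_Original`.

```r
# fct_mirror as proposed in this issue (not part of forcats)
fct_mirror <- function(.f, .x) {
  factor(.x, levels = levels(.f), ordered = is.ordered(.f))
}

d <- expand.grid(
  Treatment_Original = factor(c("Treatment 1", "Treatment 2"), ordered = TRUE),
  Day = c(-1, 1)
)

# Hypothetical helper object: an empty factor that carries the desired
# levels ("Baseline" first) and the orderedness of the original factor
template <- factor(
  character(0),
  levels = c("Baseline", levels(d$Treatment_Original)),
  ordered = is.ordered(d$Treatment_Original)
)

# Build the character values first, then mirror the template's structure
d$Treatment_Baseline <- fct_mirror(
  template,
  ifelse(d$Day == -1, "Baseline", as.character(d$Treatment_Original))
)

levels(d$Treatment_Baseline)
is.ordered(d$Treatment_Baseline)
```

The need for `template` suggests that a version of the function accepting additional levels might fit this use case better, but that is a design question for the maintainers.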
