Impute empty values as na in the variable browser #700

Draft
wants to merge 3 commits into
base: main

Conversation

@vedhav vedhav commented Feb 28, 2024

Closes #697

@vedhav vedhav added bug Something isn't working core labels Feb 28, 2024
#' @returns (`vector`) a vector with empty strings imputed as `NA`, if provided.
#' @keywords internal
impute_blanks_as_na <- function(var) {
  var <- as.vector(var)
Contributor

Dropping factor?

Contributor Author

@vedhav vedhav Feb 29, 2024

Yeah, this was the main issue that was raised. If a factor has a level that is "" (or a character vector contains ""), that value will NOT be considered NA; dropping the factor was the only way to treat those values as NA.
I'm still keeping the PR as a draft because it is unclear from the request whether we need to be doing this at all.
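
The behaviour described in this comment can be reproduced in base R. The following is a minimal sketch with made-up values; only the `as.vector()` call comes from the PR diff quoted above, and the replacement step is an assumption about how the blanks would then be set to `NA`:

```r
# A factor with an empty-string level: "" is an ordinary level, not NA.
f <- factor(c("A", "", "B", NA))
is.na(f)            # FALSE FALSE FALSE  TRUE
"" %in% levels(f)   # TRUE

# as.vector() drops the factor and returns a plain character vector.
v <- as.vector(f)

# Assumed follow-up step: mark the blanks as NA.
# %in% is used instead of == so that existing NAs do not produce NA indices.
v[v %in% ""] <- NA
is.na(v)            # FALSE  TRUE FALSE  TRUE
```
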

Contributor

Will this not influence the plotting, like change the order of categories on the axis or in the color key?
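
As a sketch of the concern raised here (made-up levels, base R only): rebuilding a factor after `as.vector()` falls back to sorted levels, whereas setting the `""` level to `NA` keeps the original order. The second route is offered as an assumption about a possible alternative, not something the PR does:

```r
# Levels deliberately not in alphabetical order, plus an empty-string level.
f <- factor(c("low", "", "high", "mid"), levels = c("low", "mid", "high", ""))

# Route 1: drop the factor, blank out "", rebuild.
# The rebuilt factor uses sorted unique values as levels.
v <- as.vector(f)
v[v %in% ""] <- NA
rebuilt <- factor(v)
levels(rebuilt)   # "high" "low"  "mid"

# Route 2: set the "" level itself to NA.
# The level is removed, its values become NA, and the other levels keep their order.
g <- f
levels(g)[levels(g) == ""] <- NA
levels(g)         # "low"  "mid"  "high"
is.na(g)          # FALSE  TRUE FALSE FALSE
```
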

Contributor Author

We explicitly change the NAs to <Missing> before we plot them. Previously the label was NA with a different color. Now we display it as <Missing>, which is in line with tern::df_explicit_na():

r$> df_explicit_na(data.frame(col = c("A", "B", NA, "C")))
        col
1         A
2         B
3 <Missing>
4         C

image

Again, this is very much a feature request. Nothing is clear as of now, so the PR is still a draft and we can talk about what we want.
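
To make the `<Missing>` relabelling concrete, here is a rough base R approximation. `explicit_missing()` is a hypothetical helper written for illustration; it is not tern's implementation and only mimics the idea of turning `NA` (and, as an added assumption, empty strings) into an explicit level:

```r
# Hypothetical helper: turn NA and "" into an explicit "<Missing>" level.
explicit_missing <- function(x, na_level = "<Missing>") {
  # Keep the existing level order for factors; otherwise use sorted unique values.
  lv <- if (is.factor(x)) levels(x) else sort(unique(as.character(x)))
  lv <- setdiff(lv, c("", na_level))

  chr <- as.character(x)
  chr[is.na(chr) | chr == ""] <- na_level

  # Put the explicit missing level last so it sits at the end of axes and legends.
  factor(chr, levels = c(lv, na_level))
}

x <- explicit_missing(c("A", "B", NA, "", "C"))
levels(x)   # "A" "B" "C" "<Missing>"
table(x)    # both the NA and the "" are counted under <Missing>
```

With both kinds of blanks mapped to the same level, a count table and a bar plot built from `x` would agree on the number of missing values.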

@m7pr
Contributor

m7pr commented Feb 29, 2024

.

#' @param var (`vector`) a vector of any type and length
#' @returns (`vector`) a vector with empty strings imputed as `NA`, if provided.
#' @keywords internal
impute_blanks_as_na <- function(var) {
Contributor

@chlebowa chlebowa Feb 29, 2024

teal.modules.general Imports tern. Why write a new function rather than use one from the dependency?

Contributor Author

I don't think there is such an imputation function in tern. However, it can be used when replacing NA with <Missing> 👍🏽

Contributor Author

Honestly, I don't think we should even be imputing things. The true bug is that the data had empty strings, which should be the app developer's responsibility to fix before passing the data into the teal app.
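
If the cleanup is left to the app developer, it might look like the following sketch, run before the data reaches the app. The data frame, its column names, and `blank_to_na()` are all hypothetical:

```r
# Hypothetical example data; column names are made up.
df <- data.frame(
  arm = factor(c("A", "", "B")),
  site = c("x", "y", ""),
  age = c(30, NA, 41),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert empty strings to NA, leaving other types untouched.
blank_to_na <- function(col) {
  if (is.factor(col)) {
    # Removing the "" level turns its values into NA and keeps the factor.
    levels(col)[levels(col) == ""] <- NA
  } else if (is.character(col)) {
    col[col %in% ""] <- NA
  }
  col
}

# df[] <- keeps the data.frame structure while replacing every column.
df[] <- lapply(df, blank_to_na)
colSums(is.na(df))   # one missing value in each column
```
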

Contributor

> I don't think there is such an imputation function in tern

Look again ?tern::df_explicit_na 😉

Contributor Author

Oh great! I'll use it. Thanks!

Development

Successfully merging this pull request may close these issues.

View Variables: Left hand vs. bar plot table "Missing" counts don't match
3 participants