have stat layers preserve variables passed to aesthetics #132

Open
Yingjie4Science opened this issue Jun 26, 2024 · 5 comments

@Yingjie4Science

Hi @corybrunson I have a similar but slightly different question: in your last example here, is it possible to only show the labels on the last axis, i.e., "ms460_NSA".

I tried label = ifelse(survey == "ms460_NSA" & after_stat(n) > 10, after_stat(n), NA), but got the error "object 'survey' not found".

Originally posted by @Yingjie4Science in #114 (comment)

@corybrunson
Owner

Hi @Yingjie4Science, thanks for checking. I believe the reason this syntax doesn't work is that the use of after_stat() controls the entire expression passed to label, not just the part contained in after_stat(). The variable survey is not preserved by StatFlow, so it's not recognized. Instead, you can use the variable x, to which survey is passed, though since x is made numeric you'll need to know what number it corresponds to:

library(ggalluvial)
#> Loading required package: ggplot2
# rightward flow aesthetics for vaccine survey data, with cubic flows
data(vaccinations)
vaccinations$response <- factor(vaccinations$response,
                                rev(levels(vaccinations$response)))
# annotate fixed-width ribbons with counts
ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           weight = freq, fill = response)) +
  geom_lode() + geom_flow(curve_type = "cubic") +
  geom_stratum(alpha = 0) +
  geom_text(
    stat = "flow",
    aes(
      label = ifelse(x == 3, after_stat(n), NA),
      hjust = (after_stat(flow) == "to")
    )
  )
#> Warning: Removed 44 rows containing missing values or values outside the scale range
#> (`geom_text()`).

Created on 2024-06-27 with reprex v2.1.0
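
Rather than hard-coding the number 3, the position of an axis can be looked up from the levels of the variable mapped to x. The following is only a sketch of that idea: the helper names survey_levels and last_axis are hypothetical, and it assumes the discrete x scale numbers the levels of survey in their usual order.

```r
library(ggalluvial)
data(vaccinations)

# Levels of the variable mapped to x, in the order a discrete scale uses them
survey_levels <- levels(factor(vaccinations$survey))
# Numeric position of the axis to annotate (hypothetical helper variable)
last_axis <- match("ms460_NSA", survey_levels)

p <- ggplot(vaccinations,
            aes(x = survey, stratum = response, alluvium = subject,
                weight = freq, fill = response)) +
  geom_lode() + geom_flow(curve_type = "cubic") +
  geom_stratum(alpha = 0) +
  geom_text(
    stat = "flow",
    aes(
      # x here is the numeric position computed by the stat, not `survey`
      label = ifelse(x == last_axis, after_stat(n), NA),
      hjust = (after_stat(flow) == "to")
    )
  )
```

Printing p should draw the same plot as above, with the axis chosen by name instead of by number.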

Maybe it would be worthwhile to have the Stat*s preserve the variables passed to aesthetics. I'll leave this issue open as a reminder to try that.

@corybrunson changed the title from "annotate ribbons with counts on one of the Axes" to "have stat layers preserve variables passed to aesthetics" on Jun 27, 2024
@Yingjie4Science
Author

Yingjie4Science commented Jun 28, 2024

Thank you @corybrunson ! It's good to know that x is made numeric.

I have two follow-up questions related to the annotations.

  1. Is it possible to remove the text labels on the left side (see attached screenshot - labels in blue box) when x == 2?
  2. The current stat can add % as labels, but the % is calculated by lumping all strata together. Is it possible to label the % within each stratum? (see an example in the screenshot, text in red)
ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           weight = freq, fill = response)) +
  geom_lode() + geom_flow(curve_type = "cubic") +
  geom_stratum(alpha = 0) +
  geom_text(
    stat = "flow",
    aes(
      # label = ifelse(x == 2, after_stat(n), NA),
      label = ifelse(x == 2, scales::percent(after_stat(prop), accuracy = 0.1), NA),
      hjust = (after_stat(flow) == "to")
    )
  )

[Screenshot: Weixin Screenshot_20240627233609]

@corybrunson
Owner

Hi @Yingjie4Science, i think (1) can be done by additionally conditioning the labels on after_stat(flow) == "to" (or against after_stat(flow) == "from"). Please report back on whether that works, or i can try it later.
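
A minimal sketch of the conditioning suggested for (1), under one assumption: since hjust = (after_stat(flow) == "to") right-aligns the "to" labels, those are taken to be the ones drawn on the left side of an axis, so keeping only flow == "from" at x == 2 should drop them. Swap "from" for "to" if the other side is wanted.

```r
library(ggalluvial)
data(vaccinations)

p <- ggplot(vaccinations,
            aes(x = survey, stratum = response, alluvium = subject,
                weight = freq, fill = response)) +
  geom_lode() + geom_flow(curve_type = "cubic") +
  geom_stratum(alpha = 0) +
  geom_text(
    stat = "flow",
    aes(
      # keep a label only at the second axis and only for outgoing flows
      label = ifelse(x == 2 & after_stat(flow) == "from",
                     scales::percent(after_stat(prop), accuracy = 0.1), NA),
      hjust = (after_stat(flow) == "to")
    )
  )
```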

I don't think (2) has a straightforward solution. It might also be something to implement as an additional computed variable, maybe stratum_count or just sum for the total of count within each stratum?

@Yingjie4Science
Author

Hi @corybrunson Thanks! The first solution works perfectly.

I am still struggling with (2). Are you suggesting we add an extra column to the data frame? I am not sure how to access that column and use it in the label argument.

@corybrunson
Owner

@Yingjie4Science i apologize, i think i lost track of this exchange as other obligations piled up.

Regarding (2), i tried to write up my own understanding of computed variables here. Please let me know if the idea is clear. My proposal for (2) is then to add a new computed variable for within-stratum sums or proportions. This could be done quickly; i just need to think through the conventions (i.e. what to call these new columns) and consequences (i.e. make sure they don't introduce backward incompatibilities).
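
To make the idea of within-stratum proportions concrete, here is a rough sketch of the arithmetic done outside ggalluvial, directly on the data. It is not an existing computed variable and not a proposed implementation: the column names stratum_n and stratum_prop are hypothetical, and it assumes that each subject has one row per survey and that the proportion of interest is each flow's share of the stratum it arrives at on the last axis.

```r
library(ggalluvial)
data(vaccinations)

v <- vaccinations[, c("subject", "survey", "response", "freq")]
# Pair each subject's response at the second survey with that at the third
from <- v[v$survey == "ms432_NSA", ]
to   <- v[v$survey == "ms460_NSA", ]
flows <- merge(from, to, by = "subject", suffixes = c("_from", "_to"))

# Weighted size of each flow between the two axes
flow_n <- aggregate(freq_from ~ response_from + response_to,
                    data = flows, FUN = sum)
names(flow_n)[3] <- "n"

# Total of n within each receiving stratum, and each flow's share of it
# (stratum_n and stratum_prop are hypothetical names)
flow_n$stratum_n <- ave(as.numeric(flow_n$n), flow_n$response_to, FUN = sum)
flow_n$stratum_prop <- flow_n$n / flow_n$stratum_n
```

Within each level of response_to, the values of stratum_prop sum to 1, which is the behavior a within-stratum computed variable would presumably need to have.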
