-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inf test fails #172
base: master
Are you sure you want to change the base?
Inf test fails #172
Conversation
@tdhock That sounds reasonable. But it might be a bit tricky. We will have to edit the We check for domains after saving the layers, which might require some attention. I could look into it further, if you would like me to. Quick question: Was this handled by If yes, we could save some time and edit the data at the same place. This should work for non-update cases at least... |
for this example we have
|
however in this example, in this case, how are the domains computed? |
I would think that after computing domains, and before saving the data to tsv files, we should replace the infinite values with the corresponding finite value that is at the edge of the domain |
and yes I think this was handled by coord_transform in old ggplot2 however I don't think we should go back to using that, since it does not work with update_axes. now I think we should be systematically using your compute_domains |
So when we have no axis-updates, as in this example, we could just replace the values accordingly before saving the layer. So I suggested that we could do it at the same place the But this creates a problem for when we do have axis-updates, since we check for that after the layer data has been saved. This is problematic since the A straightforward solution could be to move the saveLayer call after the whole axis update code. If we do proceed with this solution, the This is surprisingly more complicated than I thought it would be... |
@tdhock I tried implementing the above solution when there are no axes updates, and unfortunately it seems like it will get complicated to deal with the |
thanks for trying to implement that solution. what exactly was the problem that you encountered? it seems to me that it should be possible to do the following: (1) use |
by the way there is a workaround: just replace the Inf values with finite values in the data set or aes definition. It is annoying and the values are not plotted on the edge of the panel, but it works ... for example here is an animint I presented last Friday http://bl.ocks.org/tdhock/raw/9311ca39d643d127e04a088814c81ee1/ where the second plot should have Inf values but I replaced them with finite values https://github.com/tdhock/PeakSegFPOP-paper/blob/d53e389bb86a5646d44c9fffbf72a0911409714f/figure-large-margin.R#L475 |
@tdhock In the case when
I think this is the way to go. I will try and see how this works out.. Regarding the replacing of values in the original dataset, I think it would be worthwhile to handle this in the animint code, as some users may find it bothersome to replace these values beforehand. As an alternative, is it possible to handle this in the renderer? |
thanks for clarifying the problem. that makes sense that we can't do it entirely in saveLayer, since we do not have access to the scale limits (domains) there. and yes I agree it is worthwhile to handle it in the animint code, as that is what ggplot2 does. it is definitely not possible to handle this only in the renderer, since Inf values in the tsv files are read into JavaScript as NaN. |
@tdhock I have made some changes that I want to be reviewed:
With these changes, the previous example seems to work well: https://bl.ocks.org/faizan-khan-iit/a58ad738149c016b485af3e697ae6102 |
Hey Faizan thanks for all of these changes, especially for cleaning up the parsePlot / meta mess. About point 3. can you please explain why it is not possible to handle the Inf values in the same way for updating and non-updating plots? It seems to me that non-updating plots should be a special case. (only 1 set of axis ranges instead of several) |
Well the internals look a bit better now. I will try and clean up more afterwards. Regarding the case of non-updating plots as a particular case of updating plots with only one axis range: When I implemented the axis updates, I edited the compiler so that the non-updating plots skip the domain computation (see this line), since that was redundant as the axis ranges had already been calculated before, and also because the renderer would have to display the same axis range every time. So when replacing the I guess what you would like is a single function that handles both the cases at the same place. If that is so, I will need to edit the current code that handles the axis updates. What I was suggesting was that I could just add another function to handle |
Thanks for your very clear explanation Faizan. However I'm not sure to have understood the point about "the renderer would have to display the same axis range every time." It seems to me that if there is only one axis domain/range, then the renderer should just render that axis at the beginning, and changing the selection should not change the axis. Am I missing something? So I guess we are confronted with a design choice: either (1) compute axis domains for updating and non-updating plots differently. Advantage: it uses the range that has already computed when building the ggplot, so it avoids repetition. Disadvantage: we have to implement Inf handling (and possibly other things later) twice -- once for updating and once for non-updating plots. (2) compute axis domains the same way for updating and non-updating plots, which means that we ignore the domain from the built ggplot, and instead always use your domain computation code. Advantage: we only need to implement Inf handling once. Disadvantage: it may be a little bit slower for non-updating plots. Do you see any other advantages/disadvantages to add to these choices? I think design choice (1) will be easier to maintain in the long run, since there is just once place in the code that we will know is responsible for domain computation. |
I think your comment pretty much sums up the situation. I was also thinking about the point you raised and I think that having the code for replacing
No need to worry regarding this. It won't happen. It was a bit of mistake on my part. I had forgotten that I had already integrated a check in the renderer regarding this. Regarding the pros and cons of the choices, choice (2) will not be slower once the plot has been generated. Of course the compiler will take a bit more time. Also, there are concerns with maintenance in both cases. Any bugs in Inf value code will have to be corrected at two different places, which is a disadvantage for (1). So finally, I think despite introducing a bit of redundant code, choice (1) is better, but only slightly. |
ok, thanks for the clarification. but actually I made a mistake, I meant to say that choice (2) will be easier to maintain in the long run, since there is just once place in the code that we will know is responsible for domain computation for all plots. haha. does that change your opinion? |
and I agree with your comment that "having the code for replacing Inf at two different places is a bit of bad programming practice" |
Ok then. I will alter the implementation of the axes updates a bit (choice 2). It should benefit us in the future and is a better programming practice. As for point 4 regarding the geoms, do you have any comments, or should I just mirror the |
great. about point 4, I think "the Inf values are just replaced with the most extreme values in the panel" is almost the same thing that ggplot2 does. Rather than the most extreme finite data point value, I think it should be the most extreme domain value (which should be a bit more extreme than the most extreme finite data point value). Do you see any geoms for which this would not work? |
To be fair, I have not worked with Inf values much, so I do not yet know how different geoms respond to it in |
I tried to create various cases for |
that sounds great. Thanks Faizan. |
This branch implements a test that fails for infinite data. This used to be working when we saved the built data to tsv files (before axis updates). See screenshot below.
Note also that the DOM panel shows that the JavaScript code is not able to tell the difference between positive and negative infinity (they are both read as NaN).
To fix this I propose to modify the compiler to replace Inf/-Inf values with the appropriate finite values that correspond to the panel border positions (scale limits). These scale limits will be different for different showSelected subsets, if axis_updates is on. @faizan-khan-iit do you have any ideas about where to change the compiler in order to achieve this?
TODO add a test for infinite values in a plot with axis updates.