Thoughts about our approach #3
-
@almarklein Thanks for writing this up. You bring up some good points, but I'm having trouble distinguishing the differences between your preferred approach and the API approaches you argue against. I'll try to organize this, but I'm not super confident it'll happen...
My Understanding
My understanding of something like a vispy 2.0 was that it is one way of doing things (sure, call it an API) where different "graphics backends" are implemented by meeting a minimal set of requirements. You could call this set of requirements an "aligned" API. These requirements were supposed to be implemented as "primitive" Visuals. These should be things that any library should reasonably be able to implement. Any extra features can be implemented as separate Visuals for better performance, but theoretically any high level Visual can be made from low-level Visuals. This may be hard to implement but it also should limit differences from backend to backend. I guess this could be thought of as VisPy is the frontend API, the primitive Visuals are the "middleware", and the graphics backends are...the backend API. The backend APIs don't have to be the same or even similar as long as we can implement the "middleware".
Against unique and aligned
A big part of this proposal was to update VisPy even if that is only the front-facing part of the proposal so that the backend libraries can get improved. While the VisPy API might change, the overall concepts are similar to what already exists. Users who are familiar with VisPy shouldn't have to change much about how they do things, but will suddenly get to use newer technologies (vulkan, wgpu).
When you get to the above point it doesn't seem like you are doing things to better the APIs between the libraries. Each library ends up doing its own thing and being slightly different which leaves users in the same spot as if you/we had tried to make a central API. An API that defines how differences are to be handled may work better than one that says "yeah there will be differences". Also for the current libraries being discussed (datoviz, pygfx/wgpu) are we expecting major differences in functionality? They draw stuff/things/Visuals on the screen and they do it efficiently.
At what point is a similar API just a single API? These "separate" APIs would probably have shared/common code so would that go in VisPy? How much low-level access do you expect a user to need? If this is a shared high-level API then there should be a high-level access point for getting to those low-level things.
Previous work
What you talked about is something the pyviz.org group (https://pyviz.org/) originally attempted to nail down early on, but I can't seem to find the pull requests from back then. The idea was to define the behavior for very high level interfaces of various visualization libraries (ex. a "save" function that can save to an image on disk). If every library had these high level interfaces then it reduces the learning curve for users switching between them. Found the github repository: https://github.com/pyviz/spec/pulls
Other questions
-
@djhoese thanks for the reply.
IIRC the proposal started out as an API for graphics backends (in particular Datoviz, Pygfx and Vispy.visuals). And I think that if this were the scope, option (1) that I mentioned (an API that is the intersection of functionality, so it always works on all backends) is viable, because the overlap is quite large. And also because none have a high-level API yet :) It seems, however, that along the way the scope has increased, because at some point we included at least Matplotlib. This is when it gets tricky, and then the worries that I expressed apply. The fact that we're now discussing this is a sign that it's not very clear what the problem statement, goals, and scope are :)
This is a valid point. I've come to believe that there is no perfect solution. But I do believe that individual but similar APIs are preferable to a library/API that pretends to expose a single API while in fact it's a mix of APIs, parts of which may or may not work on a specific backend. It reminds me of OpenGL ;) You could consider this as an argument to keep the scope confined to graphics APIs.
All of it :) But seriously, while some users will only use the high level API, most users will want more control at some point. What would that look like? Would they have to drop the high level API completely? With the "aligned APIs" idea, each library can make this transition in its own (relatively smooth) way. Though I can also see this working with a unified API if it's constrained to graphics APIs.
Oh wow, I did not know this. One major difficulty with such an attempt, I realize now, is that it would be hard to convince devs of other libraries to play along and implement your dictated API :)
-
Cyrille, Nico and I discussed this this morning. We came up with some definitions (#5) to hopefully help make these discussions less ambiguous. I updated some of the terms that I used in the original post. I'll also try to summarize my thoughts here (read #5 first):
If we would make a unified-plotting-api using only gpu-viz-libraries as backends, I think there'd be enough overlap, so that the plotting-api can be the intersection. I think this could work.
If instead we would aim to make a unified-plotting-api using generic viz-libraries as backends, then the overlap between these backends would be much smaller. If this unified-plotting-api were an intersection of the features, it would be very limited, and getting traction would be hard, because ... why would anyone use it? If instead this unified-plotting-api exposed the union of features, you get into a situation where, depending on what features are used, user code may only work on specific backends.
That last point was (as I found out by thinking it through) the source of an uneasy gut feeling that I had, and the main reason for starting this discussion. I don't believe a feature matrix in the docs would help. See more arguments in my initial post. I'll try to word this in the form of a proposal:
-
I have been thinking about our goals, and believe that the way we have now formulated them is flawed. It would be good to take a step back, try to define what the problem is that we're trying to solve, and move from there. Below I'll argue why I think that the way we have formulated our approach is problematic, and propose a slightly different approach. If you squint it's still the same, depending on how literally you take the original wording :)
What is the problem we're trying to solve?
In my words: the situation is that we have an ecosystem with many different, awesome visualization libraries that are each good at certain tasks, but none of them provides a complete package for all the visualization needs of a scientist. The problem that this poses for the scientist is that multiple APIs must be learned in order to fulfil the different needs. It's also hard to try out different tools because of the cognitive burden of learning a new API.
What is our goal?
We formulated our goal earlier as: To design a new high-level API for scientific visualization, and to implement that API for different backends, including Datoviz, Pygfx & Matplotlib.
This formulation already includes the approach. Perhaps we can phrase it in more general terms: To provide a way for a user to use the different visualization libraries in a way that minimizes the cognitive load of switching between them.
What is our approach to realize this goal?
Some possibilities:
The way we've formulated it now seems to point towards (1) or (2). Spoiler: I'm going to argue that (1) and (2) are not a good approach.
Dealing with feature incompatibilities
edit: in this section I assumed the viz-libraries were not restricted to gpu-viz-libraries. If they are, the intersection is much more complete, and we should probably just do that.
One of the biggest difficulties of this project, I think, is how to deal with incompatibilities between the viz-libraries. Each library will have its own set of features and abstractions.
The intersection of those feature sets is quite narrow. If we confine ourselves to only the intersection (option (1)), our API has little advantage; from the POV of each viz-library, our API is less capable than the plotting-api of the library itself. This will not help in getting traction. Plus, one important point of this project was to let users make use of the different strengths of the viz-libraries, because none of them scratches all itches.
Therefore, creating a unified-plotting-api that covers the intersection of features (i.e. works on all backends) is not very useful.
This means that, to be useful, the new unified-plotting-api would have to provide the user with features that are not available on all viz-libraries. In other words, there is a chance that the code that a user writes does not work on certain backends, works on only one specific backend, or does not work at all because no backend covers all of the API that is being used (e.g. a pie chart and volume rendering).
This is a problem, since it somewhat defeats the purpose of this project. Earlier, a user would use different tools with different APIs to build different kinds of visualizations. Now there would be one API, but you'd need to learn (or carefully keep track of) which parts of the API can be used where. How are we going to make it clear what parts of the API work where? We can provide a compatibility matrix in the docs, but do we expect users to keep that on the side the whole time?
This also poses a problem when sharing code (e.g. online). If you copy code from StackOverflow, that code may only work on a specific backend (which you may not have installed). Moreover, if you combine multiple code samples, there is a chance that one sample requires one backend while another sample requires another, causing the combined code to not work at all.
In short: option (2) would be something that looks like a single API, but it's not: it would be a union of APIs, and you'd always have to be aware of which viz-library (backend) you intend to run your code on.
Therefore, creating an API that covers the union of features (i.e. may or may not work on a certain backend) is also not very useful.
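To make the intersection/union trade-off concrete, here is a minimal sketch using Python sets. The backend names come from this discussion, but the feature lists are made up for illustration and do not reflect what each library actually supports. The helper names (`intersection_api`, `union_api`, `backends_for`) are hypothetical as well.

```python
# Hypothetical feature sets, for illustration only.
# They are NOT an accurate inventory of what each library supports.
FEATURES = {
    "matplotlib": {"line", "scatter", "image", "pie_chart"},
    "datoviz": {"line", "scatter", "image", "volume"},
    "pygfx": {"line", "scatter", "image", "volume", "mesh"},
}


def intersection_api(features):
    """Option (1): expose only what every backend supports."""
    return set.intersection(*features.values())


def union_api(features):
    """Option (2): expose everything that any backend supports."""
    return set.union(*features.values())


def backends_for(script_features, features):
    """Return the backends that can run a script using the given features."""
    return sorted(
        name
        for name, supported in features.items()
        if script_features <= supported  # subset test
    )


# Option (1) is always portable, but small.
print(sorted(intersection_api(FEATURES)))

# Option (2) is large, but portability depends on what a script uses.
print(backends_for({"line", "volume"}, FEATURES))
print(backends_for({"pie_chart", "volume"}, FEATURES))
```

With these made-up sets, a script that combines a pie chart with volume rendering is valid against the union API, yet `backends_for` returns an empty list for it: no single backend can run it. That is the failure mode described above.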
Unique but aligned APIs
I think we may have to drop the idea of the unified-plotting-api (one API with multiple backends), and replace it with the idea of having multiple aligned plotting-apis (aligned as in made similar).
We could come up with a "proposed API specification", and each viz-library implements its own variant of that (subset/superset). They will be considered individual APIs, each with their own docs, although they will have certain parts in common.
Users import an API specific to one viz-library at the top of their code. This makes it clear what viz-library this code will run on. The GUI will help the user (docs, autocompletion, etc.) in a way specific to that API, and the user will use http://rtd.org/specific-viz-library/high-level-api. The function calls between different APIs often look the same, but return values would be objects specific to the viz-library, allowing the user to drop into the lower-level mechanics of that library if more control is needed.
With a bit of willpower, simple viz code can be made to work on different viz-libraries by only changing the import statement. But this is not the main point. The point is that users are able to understand code from different viz-libraries quickly, because they use the same constructs (e.g. for interaction callbacks), and similar function calls.
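A rough sketch of what "aligned" could look like in practice, kept self-contained by simulating two plotting-apis in one file. Everything here is hypothetical: `FastLine`, `PaperLine`, `fast_api`, `paper_api`, and the module names in the comments are stand-ins, not the API of any existing library.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class FastLine:
    """Stand-in for an object returned by a GPU-oriented library."""
    xs: list
    ys: list

    def set_gpu_buffer_size(self, n):
        # Library-specific, lower-level control (hypothetical).
        self.buffer_size = n
        return self


@dataclass
class PaperLine:
    """Stand-in for an object returned by a publication-oriented library."""
    xs: list
    ys: list

    def set_dpi(self, dpi):
        # Library-specific, lower-level control (hypothetical).
        self.dpi = dpi
        return self


def _fast_plot(xs, ys):
    return FastLine(list(xs), list(ys))


def _paper_plot(xs, ys):
    return PaperLine(list(xs), list(ys))


# Each namespace plays the role of one library's own plotting-api.
# Both follow the same (hypothetical) spec: plot(xs, ys).
fast_api = SimpleNamespace(plot=_fast_plot)
paper_api = SimpleNamespace(plot=_paper_plot)

# In real code, this is the one line that would change, e.g.:
#   import fastviz.plotting as plt    # hypothetical module
#   import paperviz.plotting as plt   # hypothetical module
plt = fast_api

# The aligned part: the call looks the same in both APIs.
line = plt.plot([0, 1, 2], [0, 1, 4])

# The unique part: the return value is library-specific, so the user
# can drop down to that library's lower-level mechanics.
line.set_gpu_buffer_size(1024)
print(type(line).__name__)
```

The `plot` call is portable between the two namespaces, while `set_gpu_buffer_size` is not. The import line already tells the reader which library the code targets, so that non-portability is visible rather than hidden.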
This helps break down the silos between the different viz libraries. A scientist can write high performance code (via e.g. Datoviz or pygfx), create publication quality figures for a paper (via e.g. MPL), and create a visualization for a blog (via e.g. plotly), using APIs that are different, but familiar.
Summarizing
I think that the way that we have initially formulated our goal/approach may have been too ambitious. Not because it will be hard to do, but because it will be impossible to do without making concessions that undermine our initial purpose.
We should reconsider the idea of creating a unified-plotting-api with existing viz-libraries as backends, and instead look at creating multiple "aligned" plotting-apis, that have some parts that are equal, use the same familiar constructs, and can still leverage the power of the specific viz-library that they are part of.
There may also be other solutions that I have not thought of.