Proposal: replace asyncio with anyio #166
Labels
discussion
Needs more discussion
help wanted
Extra attention is needed
investigate
Needs investigaton or experimentation
Milestone
Yesterday I added the pull request #165 to start a discussion on supporting the trio ecosystem, preferably using anyio. @Cito invited me to create an issue as a place for discussing the implications of such a step.
I hope to get some feedback from you wether you think supporting trio in graphql-core is a good idea. My reasoning and approach for the pull request is outlined below
The basics
What is structured concurrency and trio?
This is best answered by the trio project, I can recommend reading their documentation, especially their design principles. There are also a few blog posts that shed light on the reasoning behind trio, e.g. structured concurrency and cancellation.
Basically, trio implements an event loop for asynchronous code execution, just like the asyncio standard library. However, there are some important differences. The following two have probably the most impact:
1.) You cannot start async tasks whenever you want. All tasks have a parent, if you want to start multiple coroutines, you need a
Nursery
. The nursery guarantees that all tasks are done when exiting its scope. This makes it much easier to reason about async code: If you await an async function, there will be no background tasks running when the function returns. Only exception: passing in a nursery explicitely to start the tasks. Then the tasks may run until the end of the nursery scope.2.) Level-based cancellation. Cancellation is done via scopes. Once a scope is cancelled, all its children scopes are cancelled as well. It also stays cancelled: After cancellation, any checkpoint (see trio docs for details, basically any call into trio like
trio.sleep
) raisesCancelled
. This makes it easier to get cancellation right. Asyncio only ever sends oneCancelledError
. It is often caught with good intentions but not raised again in some circumstances.These two changes alone make it much easier to write correct asynchronous code.
What is anyio?
The anyio project Allows to use the same structured concurrency and cancellation behavior as trio (trio's
Nursery
is aTaskGroup
, etc.). However, it can run on two different backends: trio and standard asyncio.This makes anyio ideal for two use cases:
1.) Iteratively replacing asyncio code with anyio code. At any point the code should still work, even if only partially replaced.
2.) Writing libraries that support trio and asyncio
Why support trio in graphql-core?
If you ever had to create or work in a large scale asynchronous Python application, you will probably have spent hours fixing tasks that should have closed but are instead still blocking some resource or preventing the event loop shutdown. Mostly in your own code, sometimes in third-party libraries. The trio (and anyio) way of tackling asynchronous code can help tremendously when using asynchronous code at a larger scale.
The graphql-core library is well-tested and takes great care to handle asynchronous tasks and cancellation correctly. Therefore, this change will have no benefit for the graphql-core codebase, it will not make it simpler or easier to understand. However, it allows users to make the choice between asyncio and trio, opening up to the growing trio ecosystem. This could also potentially increase usage of the graphql-core library. Both are worthy goals in my opinion.
How to do it?
If I have you convinced that supporting trio is not a bad idea in itself, the question comes down to: How to do that in graphql-core without too many downsides?
Why Anyio?
If you want to avoid additional libraries, you can potentially support trio by adding a lot of if-else blocks to your code, using either an asyncio or a trio implementation. This has a few downsides:
Another option would be trio-asyncio, which runs two event loops and contains tools to stitch it together. However, its documentation already states it is intended for applications that need to use libraries from trio and asyncio. It seems more like a stopgap solution if you really need to combine two libraries that have conflicting requirements.
That is why I used the anyio library in the pull request. It makes it easier to write code that works with asyncio and trio. Anyio is also a promise to users of the graphql-core library: The library is built using structured concurrency, you can be sure that tasks are cleaned up correctly. Of course there is one major downside: graphql-core now requires the anyio library.
@Cito already expressed concern about adding an additional library:
Regarding support I was sceptical to include anyio in my own code as well. However, looking at my dependency tree I found that I was inadvertently using it: Starlette can run on asyncio and trio, using anyio (encode/starlette#1157). As Starlette is rather widespread I believe there is sufficient interest to keep anyio going. Maybe you know additional libraries that rely on anyio? If enough libraries rely on it, it is unlikely to simply disappear.
Worst case scenario: anyio support is dropped and their is no replacement. In this case you could decide to drop trio support again, at the cost of angry users, who want to have a graphql server with their trio event loop.
Introducing new bugs when using anyio is of course possible. As the anyio API is very close to trio, I expect it to work rather well with trio. The trio backend in anyio is often only a thin wrapper around the corresponding trio object (see https://github.com/agronholm/anyio/blob/master/src/anyio/_backends/_trio.py).
However, emulating the trio behavior in asyncio code is more difficult. Especially since the cancellation works differently. There might very well cases when things do not work as expected when including it here.
The only way to get some ease of mind would be sufficient tests. While my pull request passes the graphql-core tests, I did not specifically test for incompatibilities. You can help out here. If you are using graphql-core in your project and would like to see trio support: take my pull request and test it. See if it introduces any problems with existing code. If this issue gathers enough interest I might be motivated to write additional tests explicitely searching for changed behavior and incompatibilities.
Performance considerations
Using anyio instead of asyncio can potentially cost performance, as it is an additional wrapper on top of asyncio (or trio). The question are: How much? Is it worth it?
The Starlette pull request (encode/starlette#1157) found a 4.5% drop in their performance score, but this has hardly any meaning for a different library.
graphql-core itself has only one async benchmark. I included benchmarks in the pull request, I could not see a performance impact.
But standard deviation was rather large, and a single benchmark is hardly enough here. The best way forward would be additional benchmarks. Again, if this issue gathers enough interest I might be motivated to put some more work in.
The fate of SimplePubSub?
Edit: SimplePubSub is not part of the official API, we can just change it keep the tests similar to their GraphQL.js counterpart, see comment
This discussion is only relevant if we actually want to go through with it and adopt anyio. But since it cost most of the time when porting graphql-core to anyio, I will include it here.
Basically, I did not port SimplePubSub to anyio. Its API is not compatible. Instead of changing its API I added a warning about its reliance on asyncio and replaced its usage in the test code with a completely new implementation called
MemoryObjectBroadcastStream
. Its interface is in line with the anyio streams and works well in the test suite.It would also be possible to change the SimplePubSub API or come up with something completely different. SimplePubSub is not advertised much in the documentation and its description is rather sparse. I would be interested to hear if this class is actually used much outside of graphql-core and in which way.
Exception groups
Again, only a relevant discussion if anyio is actually used. Generally, trio supports a
MultiError
(ExceptionGroup
in anyio), which does exactly as advertised: If aNursery
(TaskGroup
) has multiple tasks ending with an exception it collects them and raises all of them. These are difficult to handle and would break the graphql-core API if propagated to the caller. Handling these cleanly in trio ultimately lead to the newexcept*
language feature in Python 3.11 (see https://peps.python.org/pep-0654/). That is, handling them in Python versions <3.11 is annoying.Asyncio uses a different approach in its
gather
function: Raise the first exception.To avoid unnecessary surprises for users of the graphql-core library I opted for handling all exception groups internally in the pull request. Only one of the errors is actually raised to the caller. Of course, we might loose some information here, just as we do with
asyncio.gather
. Any opinions on this matter?Python version
Using anyio would bump the minimum Python version from 3.6.0 to 3.6.2. @Cito plans to drop support for 3.6 anyway, so this does not seem much of an issue. But it would probably be a good idea to delay the inclusion of anyio until Python 3.6 support is dropped.
The text was updated successfully, but these errors were encountered: