Support GraphQL #653

Open
davidbrochart opened this issue Jan 5, 2022 · 7 comments

Comments

@davidbrochart
Contributor

See jupyterlab/jupyterlab#11789

Problem

JupyterLab constantly polls the server to retrieve information about:

  • files and directories (contents API)
  • running terminals (terminals API)
  • running notebooks (sessions API)
  • running kernels (kernels API)

It's not optimal because it might:

  • make useless requests (when there is no update)
  • make requests long after the update happened
  • get back more information than needed
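
To illustrate the difference between polling and a server push, here is a minimal, stdlib-only sketch. It is not Jupyter Server code: `KernelRegistry`, `poll` and `listen` are hypothetical names, and an `asyncio.Queue` stands in for whatever transport (for example a GraphQL subscription) would carry the notifications.

```python
import asyncio


class KernelRegistry:
    """Hypothetical stand-in for server-side state (not a Jupyter Server class)."""

    def __init__(self):
        self.kernels = []
        self._subscribers = []

    def subscribe(self):
        # Each subscriber gets its own queue of pushed updates.
        queue = asyncio.Queue()
        self._subscribers.append(queue)
        return queue

    def start_kernel(self, name):
        self.kernels.append(name)
        # Push a snapshot to every subscriber as soon as the state changes.
        for queue in self._subscribers:
            queue.put_nowait(list(self.kernels))


async def poll(registry, interval, rounds):
    """Polling client: asks on a timer, whether or not anything changed."""
    responses = []
    for _ in range(rounds):
        responses.append(list(registry.kernels))
        await asyncio.sleep(interval)
    return responses


async def listen(queue, expected_updates):
    """Push client: wakes up only when the server reports a change."""
    updates = []
    for _ in range(expected_updates):
        updates.append(await queue.get())
    return updates


async def main():
    registry = KernelRegistry()
    queue = registry.subscribe()
    poller = asyncio.create_task(poll(registry, interval=0.01, rounds=5))
    listener = asyncio.create_task(listen(queue, expected_updates=1))
    await asyncio.sleep(0.025)  # the single state change happens mid-run
    registry.start_kernel("python3")
    return await poller, await listener


polled, pushed = asyncio.run(main())
```

In this toy run the polling client makes five requests to observe a single change, while the subscribed client receives exactly one message, sent when the change happens.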

Proposed Solution

Would it make sense to support a GraphQL API? I remember there has been some work on this already, but I can't find it.

@blink1073
Contributor

AFAIK @saulshanabrook did all of that work on his rtc branch

@saulshanabrook
Contributor

@davidbrochart The demo code I was working on (https://github.com/saulshanabrook/rtc/tree/graphql/packages/jupyter-graphql) worked for a subset of the Jupyter Server and used subscriptions to push the analysis of kernel messages to the server, keeping the state there.

I stopped working on it due to pressure to prioritize a working RTC implementation.

@echarles
Member

echarles commented Jan 6, 2022

Having the client pull that information is really something we should move away from, in favor of something more event-based, pushed from the server.

If GraphQL can bring that, it would be wonderful. This should be an addition to all the existing APIs, not a replacement, to ensure backwards compatibility.

> I stopped working on it due to pressure to prioritize a working RTC implementation.

@saulshanabrook Yeah, we had discussed that. My understanding is that GraphQL still makes sense even without the RTC aspects, which are now implemented via CRDT. But from what I see, not all RTC needs should/could be covered by CRDT, which is focused on pure documents. E.g. the RTC event "Open a notebook" could be fulfilled by GraphQL...? (just thinking out loud)

@davidbrochart
Contributor Author

Thanks @blink1073, @saulshanabrook and @echarles for the feedback.
I think GraphQL is helpful not only for handling notifications pushed from the server (and removing the polling from the client), but also for giving clients more flexibility as to which information they request.
If it can be useful for RTC, that would be another reason to support it, but I'm not sure how. I will look at Saul's work to try and have a better understanding.
Maybe Jupyverse would be a good place to start experimenting with a GraphQL API, because FastAPI makes it easy to use any ASGI-compatible GraphQL library. I don't know if it's as easy with Tornado.
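
As a rough illustration of the "which information they request" point, here is a stdlib-only sketch of client-driven field selection. It is not a GraphQL implementation and uses no GraphQL library: the `select` helper, the selection format and the sample session record are all made up for illustration. With a real schema, the same request would be written as a GraphQL query naming just those fields.

```python
def select(record, selection):
    """Return only the requested fields of a nested dict.

    `selection` maps a field name to True (take the value as is) or to a
    nested selection (recurse into a dict or a list of dicts).
    """
    result = {}
    for field, sub in selection.items():
        value = record[field]
        if sub is True:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [select(item, sub) for item in value]
        else:
            result[field] = select(value, sub)
    return result


# Illustrative data only, loosely shaped like a session record.
session = {
    "id": "abc",
    "path": "analysis.ipynb",
    "type": "notebook",
    "kernel": {
        "id": "k1",
        "name": "python3",
        "execution_state": "idle",
        "connections": 2,
    },
}

# The client asks only for what it will display.
wanted = {"path": True, "kernel": {"execution_state": True}}
trimmed = select(session, wanted)
```

Here the client gets back two values instead of the whole record, which is the "get back more information than needed" problem in reverse.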

@echarles
Member

echarles commented Jan 6, 2022

> Maybe Jupyverse would be a good place to start experimenting with a GraphQL API, because FastAPI makes it easy to use any ASGI-compatible GraphQL library. I don't know if it's as easy with Tornado.

As @saulshanabrook pointed out, GraphQL on Tornado is already implemented at https://github.com/saulshanabrook/rtc/tree/graphql/packages/jupyter-graphql

I would favor experimenting on top of the existing Jupyter Server instead of Jupyverse to deliver value as soon as possible to existing Jupyter Server frontends.

@saulshanabrook
Contributor

Yeah, the implementation I was working on works as an extension on top of Jupyter Server, which allows clients to connect either with the existing endpoints or by using GraphQL. It uses the same in-memory data structures as the server, to allow both simultaneously.

See for example https://github.com/saulshanabrook/rtc/blob/graphql/packages/jupyter-graphql/jupyter_graphql/jupyter_server_extension.py which adds a Jupyter Server extension for GraphQL as well as the GraphQL playground.

The Services class takes the Jupyter Server services and adds listeners to keep its own data structures up to date.
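
The general listener pattern can be sketched as follows. This is only an illustration of the idea and not the actual Services class from the linked branch: `TerminalManager`, `Mirror` and the event names are hypothetical.

```python
class TerminalManager:
    """Hypothetical server-side service that lets others observe changes."""

    def __init__(self):
        self._terminals = {}
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def create(self, name):
        self._terminals[name] = {"name": name}
        self._notify("created", name)

    def delete(self, name):
        del self._terminals[name]
        self._notify("deleted", name)

    def _notify(self, event, name):
        # Tell every registered listener what just happened.
        for callback in self._listeners:
            callback(event, name)


class Mirror:
    """Keeps its own view of the terminals, updated only through events."""

    def __init__(self, manager):
        self.names = set()
        self.log = []
        manager.add_listener(self._on_event)

    def _on_event(self, event, name):
        self.log.append((event, name))
        if event == "created":
            self.names.add(name)
        elif event == "deleted":
            self.names.discard(name)


manager = TerminalManager()
mirror = Mirror(manager)
manager.create("1")
manager.create("2")
manager.delete("1")
```

Because the mirror is driven by events rather than by re-reading the manager, it is also a natural place from which to forward changes to subscribed clients.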

I gave a demo of the working code in an RTC meeting a while ago: https://youtu.be/fRlVawMDVMk?t=608

@bollwyvl
Contributor

Hey, y'all! Hooray GraphQL!

With another jupyter-graphql, we got up to some fairly interesting demos. I particularly liked:

  • integration with graphql-voyager: having an accurate, well-typed schema that happens to generate interactive documentation is ❤️‍🔥.
  • wrapping nbconvert in a subscription, so you could emit a live-updating view of a rendered notebook.

If I were doing it again, I would not use the graphene ORM magic, but instead ariadne, as @saulshanabrook did, or tartiflette... whichever seemed more robust/maintained/extensible. As they are both schema-driven, it would be relatively straightforward to do a bakeoff. And the schema part is the big win, as it mostly avoids things like #518. Indeed, the types that come out of GraphQL are about as expressive as TypeScript, and go beyond JSON Schema... certainly robust enough to generate either... or a bunch of other things.

At the time, extensible GraphQL schema wasn't really A Thing, but now that schema federation is better defined, I'd probably lean towards that. The magic here would be the ability to reuse core Jupyter types on top of other GraphQL-enabled apps such as GitLab or Dagster.

There's also some of @rgbkrk's work on Node-based stuff.

> ASGI-compatible

It's great for Python to define a semi-formalized thing, and indeed, I feel like adopting the ASGI model would be a step forward rather than requiring Tornado or FastAPI... but the long con of Jupyter infrastructure can't be Python-only. Getting things like #518 under control so folk could really explore alternate high-performance (or lower-resource) implementations would pay off handsomely.

> easy with Tornado.

It's entirely possible to shoehorn an ASGI app in-loop with Tornado. I think this is critical for an extensible system that can also take advantage of all of the existing (and future) services a Jupyter Server + extensions might provide.
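
For context on why this is feasible: an ASGI application is just an async callable taking `scope`, `receive` and `send`, so anything that can drive that callable from an event loop can host it. The stdlib-only sketch below shows a bare app and a tiny in-process driver. The `call` helper is hypothetical and stands in for a real ASGI server; an adapter that runs such an app inside Tornado's loop is assumed, not shown.

```python
import asyncio
import json


async def app(scope, receive, send):
    """A bare ASGI application: no Tornado, FastAPI, or other framework."""
    assert scope["type"] == "http"
    body = json.dumps({"path": scope["path"]}).encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call(asgi_app, path):
    """Hypothetical in-process driver standing in for a real ASGI server."""
    sent = []

    async def receive():
        # A single, empty request body.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "path": path,
        "query_string": b"",
        "headers": [],
    }
    await asgi_app(scope, receive, send)
    return sent


messages = asyncio.run(call(app, "/graphql"))
```

The driver collects two messages, the response start and the response body, which is all a host needs to translate back into its own HTTP response type.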

5 participants