This repository has been archived by the owner on Apr 13, 2021. It is now read-only.

fork & merge threads: staging->master update pattern #19

Closed
100ideas opened this issue Feb 7, 2020 · 4 comments

Comments

100ideas commented Feb 7, 2020

Problem: how to implement a "draft" or "staging" system for a Textile thread. In draft mode, local changes to a thread or threads drive the UI as normal without changing the original threads. If the user commits the draft, the original threads are updated and the changes are transmitted across the network. Otherwise the draft/staging branch is discarded.

More generally, I'd like to be able to:

  1. take some Threads
  2. rewind them to some prior state
  3. fork them
  4. make some changes
  5. merge them back into the original threads.

Describe the solution you'd like
A Thread or set of dependent Threads can be transiently forked or copied into a new set of Threads. The app + UI switches to using these, which allows the frontend<->textile data bindings to remain the same. At some point, these threads are merged back into the originals, or discarded. Either way, the app switches back to using the original threads. Perhaps the merge operation works simply by updating the original Threads with just the HEAD values from the draft threads, i.e. no history is merged.
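A minimal sketch of the rewind, fork, edit, and HEAD-only merge steps described above, assuming a thread can be modelled as an append-only log of key/value writes. `LogThread`, `forkAt`, `head`, and `mergeHeadFrom` are hypothetical names for illustration, not part of Textile's API. Deletes are not modelled, and on merge the fork's value simply wins.

```typescript
// Hypothetical in-memory model of a thread as an append-only log.
// None of these names come from Textile's API.
type Entry = { key: string; value: string };

class LogThread {
  private entries: Entry[];

  constructor(entries: Entry[] = []) {
    this.entries = entries.slice(); // own copy, so forks never share a log
  }

  append(key: string, value: string): void {
    this.entries.push({ key, value });
  }

  size(): number {
    return this.entries.length;
  }

  // Steps 2 and 3: rewind to the first `upTo` entries and fork from there.
  forkAt(upTo: number = this.entries.length): LogThread {
    return new LogThread(this.entries.slice(0, upTo));
  }

  // Fold the log into its current (HEAD) values; the last write per key wins.
  head(): Record<string, string> {
    const state: Record<string, string> = {};
    for (const e of this.entries) state[e.key] = e.value;
    return state;
  }

  // Step 5: append only the fork's HEAD values that differ from ours.
  // The fork's intermediate history is not copied.
  mergeHeadFrom(fork: LogThread): void {
    const mine = this.head();
    const theirs = fork.head();
    for (const key of Object.keys(theirs)) {
      const value = theirs[key];
      if (value !== undefined && mine[key] !== value) this.append(key, value);
    }
  }
}

const originalThread = new LogThread();
originalThread.append("title", "Untitled");
originalThread.append("body", "hello");

const draftThread = originalThread.forkAt();  // fork from the current state
draftThread.append("body", "hello, wor");     // step 4: intermediate edit
draftThread.append("body", "hello, world");   // step 4: final edit
originalThread.mergeHeadFrom(draftThread);    // only the final body value is appended
```

Discarding the draft is just dropping `draftThread`; the original log is never touched until `mergeHeadFrom` runs.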

As an example, consider a textile webapp, say a draft-js text editor with support for images. It uses textile for "everything" (replicated data store + redux event-sourcing style bindings for the frontend view), making use of several Threads in the process.

Goal: implement a local-only "draft stage" that lets the user make changes to their document & files via the UI, see the results, then optionally commit or revert them. Conceptually this is like git staging. Draft changes do not need to be synchronized across the network, although that would be cool.

Describe alternatives you've considered
Instead of implementing this directly with Threads, an app could implement a client-only data layer on top of textile that shadows the Threads. In draft mode, the UI makes changes to this draft data layer and previews the results. To commit the draft, the shadow layer is applied to the appropriate Textile threads. Upsides to this approach: it may be simpler than the solution proposed above. Downsides: client-side "shadow threads" require their own UI & data bindings for draft mode. Should shadow threads provide the same API as the textile client, or a subset of it?
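One way to picture this alternative, as a sketch only: if the shadow layer implements the same small interface as the live store, the UI bindings do not need a separate draft code path. `KeyValueStore`, `MemoryStore`, and `ShadowStore` are hypothetical names; `MemoryStore` merely stands in for whatever wraps the real Textile client.

```typescript
// Hypothetical interface; Textile's client API is not modelled here.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Stand-in for the live, replicated store.
class MemoryStore implements KeyValueStore {
  private data: Record<string, string> = {};

  get(key: string): string | undefined {
    return this.data[key];
  }

  set(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Shadows `base`: writes stay in `pending`, reads fall through to `base`.
class ShadowStore implements KeyValueStore {
  private pending: Record<string, string> = {};

  constructor(private base: KeyValueStore) {}

  get(key: string): string | undefined {
    const staged = this.pending[key];
    return staged !== undefined ? staged : this.base.get(key);
  }

  set(key: string, value: string): void {
    this.pending[key] = value;
  }

  // Keys that the draft has touched so far.
  changedKeys(): string[] {
    return Object.keys(this.pending);
  }

  // Apply the staged writes to the base store, then clear the draft.
  commit(): void {
    for (const key of Object.keys(this.pending)) {
      const value = this.pending[key];
      if (value !== undefined) this.base.set(key, value);
    }
    this.pending = {};
  }

  // Drop the staged writes without touching the base store.
  discard(): void {
    this.pending = {};
  }
}
```

Because both classes satisfy `KeyValueStore`, entering draft mode amounts to handing the UI a `ShadowStore` wrapped around the live store instead of the live store itself.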

Additional context
I originally asked about this on Textile slack chat. Below is a transcript:


@100ideas: I'd like to use textile threads to drive my JS application and provide data versioning & replication. The app is a computational notebook similar to jupyter. One of the major use cases I want to support is transiently forking part of the state (computational graph), merging in edits, then recomputing the results and showing how they differ from the original (parent) state. This is to help the user iteratively explore the effects of changing their program's structure and its data dependencies - they may want to undo it all, or only commit and merge the changes after experimenting a bit.

the 2019 whitepaper describes textile's store -> events -> dispatcher -> reducers (& datastore) -> event bus -> log service state update + control loop.

Would it be possible to add special dispatchers + reducers that short-circuit the event-bus and instead only generate temporary local state? Partially inside textile and partially outside of it? I am thinking this might be a nice way to shadow the app's state with a lightweight, transient, non-replicated draft of changes to state and to use this for rendering previews of what the draft changes would do. Or would it be better to build this completely on top of textile, in another separate layer (that perhaps still listens to certain events)?
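A rough sketch of what such a short-circuit could look like, assuming a dispatcher that runs reducers and then optionally hands the event to a bus. `Dispatcher`, `StoreEvent`, and the `publish` callback are hypothetical and do not reproduce Textile's dispatcher or event bus.

```typescript
// Hypothetical names; this does not reproduce Textile's dispatcher or event bus.
type StoreEvent = { key: string; value: string };
type State = Record<string, string>;
type Reducer = (state: State, event: StoreEvent) => void;

class Dispatcher {
  private reducers: Reducer[] = [];
  readonly state: State = {};

  // `publish` stands in for the event bus; omit it for a local-only dispatcher.
  constructor(private publish?: (event: StoreEvent) => void) {}

  register(reducer: Reducer): void {
    this.reducers.push(reducer);
  }

  dispatch(event: StoreEvent): void {
    // Reducers always run, so local state and previews update either way.
    for (const reduce of this.reducers) reduce(this.state, event);
    // Only a dispatcher with a bus forwards the event for replication.
    if (this.publish) this.publish(event);
  }
}

// The same reducer serves both the replicated and the draft dispatcher.
const applyWrite: Reducer = (state, event) => {
  state[event.key] = event.value;
};

const outbox: StoreEvent[] = []; // stand-in for events sent to peers
const replicated = new Dispatcher((event) => outbox.push(event));
const draftOnly = new Dispatcher();
replicated.register(applyWrite);
draftOnly.register(applyWrite);

replicated.dispatch({ key: "cell-1", value: "x = 1" });
draftOnly.dispatch({ key: "cell-1", value: "x = 2" }); // never reaches outbox
```

The draft dispatcher produces the same kind of state as the replicated one, so a preview can be rendered from `draftOnly.state` and compared against `replicated.state`.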

@carsonfarmer So, this is a much bigger question that relates to how developers can (and will be able to) use threads.

Right now, we essentially have two API levels, one is the Store API, which is what you mentioned above. Complete with existing event codecs etc, with reference implementation in Go, and js-wrappers available to play with.

The second API is the Thread-level API (currently called the Network service layer). This is really just the thread part of threads, the network/data transport protocol so to speak. But developers can have access to this level, which means, you have a lot more control over what eventually gets pushed to peers. This level might be where you'd integrate your special dispatchers and reducers (which we wrap into event codecs).

The other thing you can do is as you suggest, build out some application level logic that takes 'official' store-level data, and generates drafts (maybe even using the same store, but a different keyspace), before committing them to the store.

We actually envisage different store level apis emerging... so while not everyone is going to want the document/model-based framework we've implemented in our reference implementation, the network service layer is pretty agnostic to these details... which is what the codecs help with.

So, long story short... I think you are on the right track... the 'easy' way (from threads perspective) would be to build on top of threads 'as is', with application logic to control draft states. The 'hard' way (again, from threads perspective) would be to build your own 'store'-level APIs, and use the service-level APIs to do the actual hard p2p bits...

Hopefully that stream-of-consciousness response is helpful in your research? Note that js-threads is under heavy development right now, so any input or ideas you have here to make it easier for an external dev to start playing with this stuff are super valuable. Follow along here: https://github.com/textileio/js-threads

@100ideas Thank you @carsonfarmer. I still have only the vaguest sense of how to structure a textile app. I guess my next step would be to run the js-foldersync demo and start tweaking it to get some hands-on experience.

If so, I will try the "easy" approach (layer on top) as you described it above.

Just to restate the goal:

  1. consider a textile webapp, say draft-js text editor w/ support for images. It uses textile for "everything" (replicated data store + redux event sourcing style bindings for frontend view)

  2. implement a local only "draft stage" that lets the user make changes to the document via the ui, see the results, then optionally commit them or revert them. conceptually like git staging.

since the way the user will interact with the draft is the exact same as with the original document, and since in "normal mode" the app translates these interactions into textile commands which then update the view (all via Store API), it feels like it would be good to also use textile and its data-view bindings for the draft mode.

You suggested this might be possible with the Store API by providing a different keyspace. Are there any examples or docs that outline how a textile app would use two different keyspaces to fork and merge threads that have the same schema?

@carsonfarmer Cool, yes let’s do a ticket, especially because this is pretty new ground for me/us. The key space thing was more riffing off the top of my head than a fleshed-out idea, but essentially, this is like having two separate ‘stores’ or threads with the same data. One for staging, and one for ‘real’.

The way you might then structure it is that you have a store in which you store every update etc, and then only ‘commit’ the good/viable updates to the ‘master’ store. Very much like a git flow. These two stores could actually be two threads, with the same schemas. You could in theory then have some readers following both stores, and some only following master. Say your device peers sync everything, but your collaborators only see the committed stuff?

So really, it’s just two or more threads, with app logic to help the user decide which commits make it to master.
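A minimal sketch of that two-thread arrangement, assuming each thread is just an ordered list of updates with the same shape. `StagedWorkflow`, `record`, `promote`, and `viewOf` are hypothetical names, and replication is not modelled; the point is only where the app logic that picks commits for master would sit.

```typescript
// Hypothetical model: a "thread" is an ordered list of updates.
type Update = { id: number; key: string; value: string };

class StagedWorkflow {
  readonly staging: Update[] = [];
  readonly master: Update[] = [];
  private promoted: Record<number, boolean> = {};
  private nextId = 1;

  // Every update is recorded in the staging thread.
  record(key: string, value: string): Update {
    const update: Update = { id: this.nextId++, key, value };
    this.staging.push(update);
    return update;
  }

  // App logic (`accept`) decides which staged updates make it to master.
  // Returns how many updates were promoted by this call.
  promote(accept: (update: Update) => boolean): number {
    let count = 0;
    for (const update of this.staging) {
      if (!this.promoted[update.id] && accept(update)) {
        this.master.push(update);
        this.promoted[update.id] = true;
        count++;
      }
    }
    return count;
  }
}

// What a reader following a given thread would see; the last write per key wins.
function viewOf(thread: Update[]): Record<string, string> {
  const state: Record<string, string> = {};
  for (const update of thread) state[update.key] = update.value;
  return state;
}
```

A device peer would render `viewOf(workflow.staging)`, while a collaborator following only master would render `viewOf(workflow.master)`.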

@100ideas hmm, that sounds good. And it would perhaps be useful to occasionally switch to a fresh "master" thread initialized from the commit thread, to provide something akin to log compaction

@carsonfarmer Also just FYI, thread forking, snapshots, and even merging are things we’re actively researching/planning for, but they haven’t yet made it into the white paper. See for example some earlier stuff here: textileio/go-threads#229

100ideas commented Feb 7, 2020

Incidentally, I looked into trying to create a draft mechanism for an Automerge / Hypermerge document store. Automerge doesn't directly support forking and architecturally favors convergence of replicas over other use cases. According to one of the automerge devs:

@MartinKl:
Hi @100ideas, at the moment branches/forks of documents are not well supported. You can get some of the way by maintaining two separate documents, and applying some of the changes to only one of the documents. But there is no built-in support for diffing two documents constructed this way.

I don't think Hypermerge does this either. We did experiment with branching on the Pixelpusher project (https://medium.com/@pvh/pixelpusher-real-time-peer-to-peer-collaboration-with-react-7c7bc8ecbf74) by maintaining separate docs for each branch as described, but the performance was not great.

Notes & code here: https://repl.it/@100ideas/playing-with-automerge

@carsonfarmer

Hi @100ideas, super long delay in updates and responding. But wondering if some recent changes here might get you closer to what you're looking for now. So by separating out the 'Store' from the 'Database', you can have a 'local only' 'Store' for any staged changes (using its own dispatcher), and then when finalized, push the accepted changes onto a 'live' thread for replication etc.

@carsonfarmer

Happy to discuss these ideas further with you on or offline!

carsonfarmer added a commit that referenced this issue May 4, 2020
@carsonfarmer

Closing this as "stale". In fact, ongoing work as part of The Great Refactor (#414) should cover this usage pattern quite nicely.
