
Content management: managing state #14

Open
@enykeev

Problems:

  • Action runners run whatever code happens to be on disk at the moment of the run.
  • We don't have a story for how entry point code ends up on disk for a remote runner.
  • We are assuming that multiple users will never edit the same pack at the same time.
  • We have no way of knowing the state of the filesystem at the moment of execution, and no guarantee that the action did what it was supposed to do.
  • Our dependency management story so far revolves around failing if the current version of the pack doesn't match the required one.

My proposal is to give every component its own state and the ability to mutate it between a shared set of states. As far as implementation goes, git fits the bill perfectly and we already picked it to be at the core of our pack management solution.

What we're now calling /opt/stackstorm/packs should be replaced with a set of bare git repositories. These repositories should accept pushes from the user and be programmatically pulled by the pack management module (it might be st2api or a designated component) during pack update. Bare in this case means that the repo has no working copy: it has no current state, no one can edit it manually, and nothing needs to be stashed in order to switch to another state.

Instead, every component should have its own /opt/stackstorm/packs, which it (or a pack management component of choice) is responsible for keeping up to date with the main repository.

ActionRunner would be able to check out a particular git hash for every execution it is about to run, giving us the ability to define the exact version of the code we want to run. It would allow us to do dependency management per execution, and it would also allow multiple users to run different versions of the same pack on the same machine. The repo should be fetched asynchronously, and the cost of checking out a new hash is relatively low. We would still have a problem with potential venv and system dependency mismatches, though. The first can be solved by having a set of virtualenvs addressed by the hash of requirements.txt. The second would most likely require us to run ActionRunners inside containers and manage those containers inside st2. Neither of the two is covered by this proposal.
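To make the "virtualenvs addressed by the hash of requirements.txt" idea a bit more concrete, here is a minimal sketch of how the address could be derived. The root directory, the function names and the normalisation rules (dropping blank lines and full-line comments) are all assumptions for illustration; nothing like this exists in st2 today.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical location; the proposal does not say where shared venvs would live.
VENV_ROOT = PurePosixPath("/opt/stackstorm/virtualenvs-by-hash")


def requirements_digest(requirements_text: str) -> str:
    """Return a stable SHA-256 digest for the contents of a requirements.txt."""
    lines = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and full-line comments do not change
        # what gets installed, so they should not change the address.
        if line and not line.startswith("#"):
            lines.append(line)
    canonical = "\n".join(lines)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def venv_path_for(requirements_text: str, root: PurePosixPath = VENV_ROOT) -> PurePosixPath:
    """Map a requirements.txt to the directory of the venv that satisfies it."""
    return root / requirements_digest(requirements_text)
```

Two revisions of a pack with the same effective requirements would then share one virtualenv, and a revision that bumps a dependency would get a fresh one instead of mutating an environment other executions may still be using.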

Sensors are very similar to ActionRunners in what they require from the code, but they restart far less frequently and may start duplicating events if more than one instance is running at a time. For sensors, the version we want to run should be defined upfront, and managing it is entirely up to the user.

Rules, Aliases and Triggers are similar to Sensors in the sense that it is impossible to predict how multiple instances would react in certain situations. Defining the version of the code during registration (or switching between versions with an API call) would at least give us insight into the state of the system at any given moment, and would allow us to build some kind of collaboration tool that lets multiple users work on the same instance without interfering with each other.

As far as registration goes, we would need to figure out how we're going to approach having multiple versions of the meta. We can load every version of the meta into the db, or load only the one considered master at the given moment and read specific versions from disk on demand, or pretty much anything in between. I don't see how we can decide that without profiling.

Another benefit of keeping the core repository bare is that it allows us to expose it through the API or SSH, so users can push the versions of the packs they've been working on from their own machines. A user would no longer need to log in to the st2 instance to edit a pack. And thanks to ActionRunner state, the user should be able to run the specific version of the action they just uploaded.

Giving every ActionRunner its own state also solves the problem of code distribution in an HA environment. As soon as new code has been pushed, all the runners would get a request to fetch it from the core repo. Synchronizing the data may delay executions that are waiting for a particular version of the code to arrive, but having no consistency in what is being executed by a node seems miles worse to me.
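The "execution waits for its version of the code to arrive" part could be sketched roughly like this. The `RevisionTracker` class and its method names are hypothetical; the sketch only covers the in-process bookkeeping and assumes some other piece of the runner does the actual fetch and then reports which revisions are available.

```python
import threading


class RevisionTracker:
    """Tracks which revisions have been fetched into this runner's staging."""

    def __init__(self):
        self._cond = threading.Condition()
        self._available = set()

    def mark_fetched(self, revision):
        # Called by the (hypothetical) fetcher once a revision is on disk.
        with self._cond:
            self._available.add(revision)
            self._cond.notify_all()

    def wait_for(self, revision, timeout=None):
        # Blocks until the revision is available or the timeout expires.
        # Returns True if the revision arrived, False on timeout.
        with self._cond:
            return self._cond.wait_for(
                lambda: revision in self._available, timeout=timeout
            )
```

An execution would call `wait_for(revision, timeout=...)` before checking out the code and fail (or be rescheduled) if it returns False, which keeps the delay bounded while still guaranteeing that nothing runs against the wrong version.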

If we decide that we want to conserve disk space, we need to come up with a synchronization mechanism. We can register rules and triggers from the same storage provided we do it serially. We may not be able to avoid races between multiple action runners reading the same storage.
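One possible shape for the "do it serially" part is an advisory file lock that every component takes before touching the shared staging. This is only a sketch: the lock file, `staging_lock` and `register_serially` are made-up names, and `fcntl.flock` is Unix-only and advisory, so it only helps if every component cooperates.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def staging_lock(lock_path, blocking=True):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        # In non-blocking mode this raises BlockingIOError if the lock is held.
        fcntl.flock(fd, flags)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def register_serially(lock_path, register_fns):
    """Run registration callables one after another while holding the lock."""
    results = []
    with staging_lock(lock_path):
        for fn in register_fns:
            results.append(fn())
    return results
```

This covers registration, where a short exclusive section is acceptable. It does not solve the action runner case: runners that need different revisions checked out in the same working copy would still have to queue behind each other.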

One of the requirements was that st2 should still be able to work the way it does right now, without all this fancy new stuff. Potentially, we can just point all the individual stagings to the same folder and check a config flag to skip all the synchronization routines. Personally, I don't think the "old way" of using st2 should stick around for long, considering that we definitely won't be able to backport all the new features to the old single-state approach.
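As a rough illustration of the compatibility switch, the staging location and the decision to synchronize could both hang off a single flag. The flag, the staging root and the function names below are assumptions; only /opt/stackstorm/packs comes from the current layout.

```python
from pathlib import PurePosixPath

LEGACY_PACKS_DIR = PurePosixPath("/opt/stackstorm/packs")
# Hypothetical root for per-component stagings.
STAGING_ROOT = PurePosixPath("/opt/stackstorm/staging")


def staging_dir_for(component: str, per_component_state: bool) -> PurePosixPath:
    """Return the directory a component should load pack code from."""
    if not per_component_state:
        # Old behaviour: every component shares the same folder.
        return LEGACY_PACKS_DIR
    return STAGING_ROOT / component / "packs"


def needs_sync(per_component_state: bool) -> bool:
    """Synchronization routines only make sense when stagings are separate."""
    return per_component_state
```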

There are still a ton of tiny details to work out, such as how the new pluggable runners fit into the picture, how configs should be distributed, permission control, connectivity implications and so on. But in broad strokes:

  • switch from using files to working with repositories in package management (clone, fetch)
  • every component should have its own staging or a way of synchronizing access to a common staging
  • ActionRunner should be able to run a specific version of the code defined in work order
  • decide how we are going to approach registering multiple versions of the meta, so that st2api and st2reactor get a proper set of parameters for the work order
  • expose the core repository through the API or SSH so users can push their code to the st2 instance
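To tie the first three points together, here is a small sketch of a work order that pins a revision, and of the git commands a runner might issue to get its own staging to that revision. `WorkOrder`, `git_plan`, the `<pack>.git` naming and the directory layout are hypothetical; the git subcommands themselves (`clone`, `fetch`, `checkout --detach`) are standard. The function only builds the commands, it does not run them.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List


@dataclass(frozen=True)
class WorkOrder:
    action_ref: str  # e.g. "mypack.deploy"
    pack: str        # pack the action belongs to
    revision: str    # git commit hash the execution must run against


def git_plan(order: WorkOrder, core_repo_root: str, staging_root: str,
             staging_exists: bool) -> List[List[str]]:
    """Return the git commands that bring a runner's staging to order.revision."""
    # Assumption: one bare repository per pack, named <pack>.git.
    core = str(PurePosixPath(core_repo_root) / (order.pack + ".git"))
    staging = str(PurePosixPath(staging_root) / order.pack)
    plan = []
    if staging_exists:
        # The runner already has a clone; just pull in new objects.
        plan.append(["git", "-C", staging, "fetch", "origin"])
    else:
        plan.append(["git", "clone", core, staging])
    # Detached checkout: the staging follows the work order, not a branch.
    plan.append(["git", "-C", staging, "checkout", "--detach", order.revision])
    return plan
```

Each entry can be handed to `subprocess.run(cmd, check=True)`. Since fetching is supposed to happen asynchronously, the fetch step would normally be done ahead of time, leaving only the cheap checkout on the execution path.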

The task could (and probably should) be broken down further into smaller steps, though we need to keep in mind that we shouldn't expose our users to changes in their workflow too often, as they may easily become fatigued.
