Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add generators #52

Merged
merged 5 commits into from
Feb 27, 2021
Merged

Add generators #52

merged 5 commits into from
Feb 27, 2021

Conversation

Backfighter
Copy link
Contributor

I'm sorry it took so long but I've finally finished an intial version of the generator feature.
Generators allow dynamically generating files for plugins. They are executable files placed under /usr/share/holo/generators. They get passed the environment variable $OUT containing a folder to which they can write generated files. Plugins will then pick up this files as if they were placed under /usr/share/holo/.

Fixes #51

@coveralls
Copy link

coveralls commented Aug 8, 2020

Coverage Status

Coverage decreased (-0.3%) to 81.014% when pulling dc9965e on Backfighter:master into 32d511c on holocm:master.

Copy link
Contributor

@majewsky majewsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for putting this on the back burner for so long. I have focused on reading the implementation parts (rather than the test parts) for now. In general, this looks pretty good so far. Besides the more specific nitpicks down below, I have one big gripe: The output format for generators deviates significantly from that for entities, even though it could be extremely similar. For example:

$ holo scan

Running generator example.sh
         found at /usr/share/holo/generators/example.sh

ls: /foo: no such file or directory
!! exit status 1

or even just

$ holo scan

Running generator /usr/share/holo/generators/example.sh

ls: /foo: no such file or directory
!! exit status 1

Like with entities, the prologue paragraph ("Running generator...") would only be shown if the generator produces stdout/stderr or a non-zero exit status, so during normal operation this would be entirely silent. (When no errors occur, holo apply only reports when system configuration actually gets touched, and holo scan only lists entities.)

Please consider this change. It would make generators feel much more like first-class citizens.

env := os.Environ()
env = append(
env,
fmt.Sprintf("OUT=%s", targetDir),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generators should also get HOLO_RESOURCE_DIR and HOLO_CACHE_DIR to ensure that they don't hardcode /usr/share/holo or /tmp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HOLO_RESOURCE_DIR is usually adjusted to be plugin specific. What would be the content of it for generators? Correct me if I am wrong, but isn't /use/share/holo currently hard coded in plugin.go? In other words the plugins are flexible to where their resource files live, but holo isn't. We should potentially introduce a HOLO_RESOURCE_ROOT.

}

func getGeneratorCacheDir() (string, error) {
path, err := prepareDir(RootDirectory(), "/var/tmp/holo/generated")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since generated resource files are extremely transient, /tmp looks to be more appropriate than /var/tmp. I'd prefer having all generated files inside $HOLO_CACHE_DIR (like in the fallback path of this method) to simplify cleanup and to ensure that we don't accidentally reuse old files from a previous run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted generated files to be cacheable (let the generator decide whether to regenerate or not) and avoid excessive regeneration. But thinking about it this leads to a problem when a generator just stops to generate a certain file it previously generated. Would you say we leave it to the generator to maintain his own cache and just always clean the generated files?

@majewsky
Copy link
Contributor

This has obviously been on my list for quite a while now, and upon first looking at the diff, I wondered why I kept putting this off for so long. But once I got into it, I found that my anticipation of dread was correct. I was hoping to get this feature done and shipped in a single evening, but it's 1AM now and I need to get some rest. My current progress is on the generators branch; the commit messages may be quite insightful if you wonder what's going on. In short, I keep discovering more and more ramifications and cross-interactions of this feature with every other part of Holo. One thing is for sure: If this gets merged, it's going to be Holo 3.0 because there are at least two backwards-incompatible changes in there.

@Backfighter
Copy link
Contributor Author

I see two alternatives to actually copying the files:

  1. Use hardlinks. This will obviously only work on file level.
  2. Require plugins to follow symlinks. Sadly it seems that will open another can of worms in the form of loop detection on traversal. Golang has no file walk with symlinks for exactly that reason.

Just throwing these out there in case you haven't considered them yet.

Just to clarify: The only thing missing is to adjust the tests to the new implementation. Is that correct?

Another thing I wanted to mention: Do you see any way to keep a persistant cache for generators? Since generators are executed at every run of holo it would be good to avoid unnecessary regeneration of the same files in order to reduce execution time.

@majewsky
Copy link
Contributor

majewsky commented Feb 27, 2021

Use hardlinks.

The cache dir is in /tmp which is most likely on tmpfs, so that's not an option. (This ties into the point below about caching generator results.)

Require plugins to follow symlinks.

My primary concern with this feature ist that it increases the complexity of Holo's execution model quite a bit, so I'm actively trying to avoid adding extra complexity. As you said, working with symlinks brings lots of extra complexity and extra gotchas, so I would like to not do that unless absolutely necessary. holo-files only deals with symlinks because it is a valid usecase to want to deploy symlinks (real-life example for reference).

Just throwing these out there in case you haven't considered them yet.

Much appreciated. I'm fairly positive that I will have considered all consequences and gotchas with this feature by the time I merge it, though.

Just to clarify: The only thing missing is to adjust the tests to the new implementation. Is that correct?

For the most part, yes. I also want to extend CLI selectors to generated resource files. Right now, you can say holo apply /usr/share/holo/files/20-example/etc/foo.conf to apply all entities backed by that static resource file. I want to have holo apply /usr/share/holo/generators/foo.sh to apply all entities produced by that generator, and something like holo apply /usr/share/holo/generators/foo.sh::files/20-example/etc/foo.conf to apply entities backed by that generated resource file (plus proper autocomplete support, obviously).

Do you see any way to keep a persistant cache for generators?

I removed that part from the spec and implementation for the same reason as above, to reduce complexity both in implementation and mental model. If caching saves half a second, but increases the chance of bugs and gotchas, I'll go without caching. To witness, holo-files scans the entire resource dir every time it runs apply for a single entity. It's not that bad in practice because of the kernel's filesystem cache.

EDIT: I just realized that a persistent cache would introduce two other problems. First, it would mean that holo scan (and thus also all shell completions) would need to acquire a lock file to avoid concurrent access to the generator cache from different runs of Holo. Second, a persistent cache would cause problems when holo scan is run as non-root (e.g. during shell completions), but the persistent cache was created with root permissions and cannot be read from or written to by the non-privileged Holo run.

@majewsky majewsky merged commit dc9965e into holocm:master Feb 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle plugins sequentially in apply
3 participants