Support accessing remote registries via ssh #589

Merged
Changes from 69 commits
72 commits
636ddb9
Support accessing remote registries via ssh
muffato Sep 4, 2022
5ca9a19
Factored out a function that tells whether a registry path is local
muffato Sep 5, 2022
4cdc206
Make sure the URL is used, not self.source which could be a local path
muffato Sep 5, 2022
d5569d5
Make sure the URL is used, not self.source which could be a local path
muffato Sep 5, 2022
1e755a3
Make self.source include the subdir from the start. Allows implementi…
muffato Sep 5, 2022
724eee8
bugfix: this should be the local registry
muffato Sep 5, 2022
bef33d6
Made VersionControl generic enough to deal by itself with GitHub and …
muffato Sep 5, 2022
ede973f
Not needed ?
muffato Sep 5, 2022
5d7087f
Factored out and parametrised the code that builds the URL of the con…
muffato Sep 5, 2022
1cadf59
The check already happens in _update_cache()
muffato Sep 5, 2022
db2a14f
Moved is_path_local to shpc.utils
muffato Sep 6, 2022
8c1ad75
Added a safeguard to prevent cloning multiple times
muffato Sep 6, 2022
06fd1a6
clone() is actually only supported by VersionControl
muffato Sep 6, 2022
5d290f7
No need to yield self.source in iter_modules since it's constant and …
muffato Sep 7, 2022
3d3e6b9
syntactic sugar
muffato Sep 9, 2022
289000a
It's more practical to yield the registry object (provider) rathe…
muffato Sep 9, 2022
d356ab1
Optimised the "update all" mode by directly using Result objects from…
muffato Sep 9, 2022
3666a97
Clones only ever exist within a function
muffato Sep 9, 2022
0e34cc3
Optimised iter_modules method for remote registries (using the cache)
muffato Sep 9, 2022
700b9e3
Moved back iter_modules to Filesystem since VersionControl has its ow…
muffato Sep 9, 2022
58b6af9
Stopped using self.source in VersionControl, to avoid confusion with …
muffato Sep 9, 2022
8190884
url, not source, is to be used for remote registries
muffato Sep 9, 2022
98259c1
Cannot do these heuristics as we need to report unexisting local paths
muffato Sep 9, 2022
4797224
str.split can limit the number of splits
muffato Sep 10, 2022
3d9cae2
The main Registry object, not the settings, should decide whether the…
muffato Sep 6, 2022
54d45ad
To avoid duplicating the code that assesses whether a path or local o…
muffato Sep 6, 2022
59c01cc
The parent class shouldn't know that much about the subclasses
muffato Sep 6, 2022
db48626
Restored back the automatic addition of https://
muffato Sep 6, 2022
583e7d4
Restructured to avoid an unnecessary else
muffato Sep 7, 2022
bd67e13
shpc convention: no else when the then ends with a return
muffato Sep 7, 2022
fb2ed7c
Unnecessary due to operator precedence rule
muffato Sep 7, 2022
b54bbfe
Added a cache in `library_url`
muffato Sep 7, 2022
5c912a6
Fixed the implementation of the cache in VersionControl.exists
muffato Sep 7, 2022
99a2d57
exists has its own implementation in VersionControl, so this implemen…
muffato Sep 7, 2022
486f12b
iter_registry is basically iter_modules with an extra filter
muffato Sep 7, 2022
0b65fa6
Yield relative paths rather than full paths since *all* consumers nee…
muffato Sep 7, 2022
b064026
Proper method to cleanup a clone
muffato Sep 7, 2022
bc24016
Increased the symmetry to simplify the maintainability
muffato Sep 11, 2022
8768237
NotImplementedError is more useful than pass
muffato Sep 17, 2022
4170484
The tuplized version is not the preference here
muffato Sep 17, 2022
be3793f
Easier to understand
muffato Sep 17, 2022
468afdf
Made the clone return a Filesystem object independent from VersionCon…
muffato Sep 17, 2022
bc2fe99
Extra comment
muffato Nov 5, 2022
b836cd8
Back to a subclass of VersionControl for each forge
muffato Nov 6, 2022
e506ee3
Pre-parse the URL
muffato Nov 6, 2022
b263272
VersionControl should not be directly used
muffato Nov 6, 2022
9c03796
bugfix
muffato Nov 6, 2022
b1a2ea8
Forgot to build the registry object
muffato Nov 6, 2022
492bcb3
typo
muffato Nov 6, 2022
b6fbd50
Maybe this one will work ?
muffato Nov 6, 2022
8ff0c77
Forgot to change this signature
muffato Nov 6, 2022
153bd39
Renamed the variable for clarity
vsoch Nov 7, 2022
da752f3
typo
muffato Nov 7, 2022
653d467
Removing yaml because it's the only file we have for a container
muffato Nov 7, 2022
b71fb93
Defensive programming: local could still be None
muffato Nov 7, 2022
f069f48
bugfix: iter_modules needs to yield paths to container.yaml
muffato Nov 7, 2022
ccdab16
Moved the cleanup call up to _sync()
muffato Nov 7, 2022
c5b4cb9
bugfix: iter_modules now returns path to the container.yaml
muffato Nov 7, 2022
9402e11
Revert "bugfix: iter_modules needs to yield paths to container.yaml"
muffato Nov 7, 2022
687c076
Revert "bugfix: iter_modules now returns path to the container.yaml"
muffato Nov 7, 2022
e744af2
The temp directory may have been deleted in the meantime
muffato Nov 7, 2022
e37a252
Need to check here too that the clone still exists
muffato Nov 7, 2022
ca5b632
Also need to reset self._clone if the directory is gone
muffato Nov 7, 2022
f1c5445
More checks on local and remote
muffato Nov 7, 2022
7bb857b
It makes more sense to cleanup tmplocal than self, and it works becau…
muffato Nov 7, 2022
043532b
Moved this to the parent class
muffato Nov 7, 2022
1334b64
Another implementation that doesn't make it too obvious the base-clas…
muffato Nov 7, 2022
e925aa1
Silly typo: self._clone is a Filesystem object, not a string
muffato Nov 7, 2022
7ad7729
No colon
muffato Nov 7, 2022
455f10d
You shall use American spelling
muffato Nov 7, 2022
89975c1
Added a test to showcase ssh
muffato Nov 8, 2022
6eaadfb
black
muffato Nov 8, 2022
14 changes: 9 additions & 5 deletions shpc/main/client.py
@@ -124,12 +124,16 @@ def update(self, name=None, dryrun=False, filters=None):
"""
# No name provided == "update all"
if name:
modules = [name]
# find the module in the registries. _load_container
# calls `container.ContainerConfig(result)` like below
configs = [self._load_container(name)]
else:
modules = [x[1] for x in list(self.registry.iter_modules())]

for module_name in modules:
config = self._load_container(module_name)
# directly iterate over the content of the registry
configs = []
for result in self.registry.iter_registry():
configs.append(container.ContainerConfig(result))
# do the update
for config in configs:
config.update(dryrun=dryrun, filters=filters)

def test(
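
The two branches of `update` above can be sketched with stand-in classes. This is a minimal illustration, not shpc code: `FakeConfig`, `FakeRegistry`, and `collect_configs` are hypothetical names, and plain strings stand in for the result objects that `iter_registry` yields.

```python
class FakeConfig:
    """Hypothetical stand-in for container.ContainerConfig."""

    def __init__(self, result):
        self.result = result
        self.updated = False

    def update(self, dryrun=False, filters=None):
        # The real method would check for new tags; here we only record the call
        self.updated = True


class FakeRegistry:
    """Hypothetical registry whose iter_registry yields results (plain strings here)."""

    def __init__(self, results):
        self.results = results

    def iter_registry(self):
        yield from self.results

    def find(self, name):
        return name if name in self.results else None


def collect_configs(registry, name=None):
    # A single name: look it up. No name ("update all"): wrap every result directly.
    if name:
        return [FakeConfig(registry.find(name))]
    return [FakeConfig(result) for result in registry.iter_registry()]


registry = FakeRegistry(["quay.io/a/b", "ghcr.io/c/d"])
configs = collect_configs(registry)
for config in configs:
    config.update(dryrun=True)
```

In the "update all" case the results are wrapped as they are yielded, so no second lookup by module name is needed.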
23 changes: 13 additions & 10 deletions shpc/main/modules/base.py
@@ -172,7 +172,12 @@ def add(self, image, module_name=None, **kwargs):
"""
Add a container to the registry to enable install.
"""
self.settings.ensure_filesystem_registry()
local_registry = self.registry.filesystem_registry

if not local_registry:
logger.exit(
"This command is only supported for a filesystem registry! Add one or use --registry."
)

# Docker module name is always the same namespace as the image
if image.startswith("docker"):
@@ -185,7 +190,7 @@ def add(self, image, module_name=None, **kwargs):

# Assume adding to default registry
dest = os.path.join(
self.settings.filesystem_registry,
local_registry.source,
module_name.split(":")[0],
"container.yaml",
)
@@ -235,10 +240,9 @@ def docgen(self, module_name, registry=None, out=None, branch="main"):
aliases = config.get_aliases()
template = self.template.load("docs.md")
registry = registry or defaults.github_url
github_url = "%s/blob/%s/%s/container.yaml" % (registry, branch, module_name)
raw_github_url = shpc.main.registry.get_module_config_url(
registry, module_name, branch
)
remote = self.registry.get_registry(registry, tag=branch)
github_url = remote.get_container_url(module_name)
raw_github_url = remote.get_raw_container_url(module_name)

# Currently one doc is rendered for all containers
result = template.render(
@@ -306,10 +310,9 @@ def _get_module_lookup(self, base, filename, pattern=None):
A shared function to get a lookup of installed modules or registry entries
"""
modules = {}
for fullpath in utils.recursive_find(base, pattern):
if fullpath.endswith(filename):
module_name, version = os.path.dirname(fullpath).rsplit(os.sep, 1)
module_name = module_name.replace(base, "").strip(os.sep)
for relpath in utils.recursive_find(base, pattern):
if relpath.endswith(filename):
module_name, version = os.path.dirname(relpath).rsplit(os.sep, 1)
if module_name not in modules:
modules[module_name] = set()
modules[module_name].add(version)
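
Because `recursive_find` now yields paths relative to `base`, the lookup no longer has to strip the base prefix. A minimal sketch of that grouping step follows; `build_lookup` is a hypothetical helper, and the `<module>/<version>/<filename>` layout and the `module.lua` file name are assumptions made for the example.

```python
import os


def build_lookup(relpaths, filename):
    """Group versions by module name from paths shaped like <module>/<version>/<filename>."""
    modules = {}
    for relpath in relpaths:
        if not relpath.endswith(filename):
            continue
        # Split on the last separator only: everything before it is the module name
        module_name, version = os.path.dirname(relpath).rsplit(os.sep, 1)
        modules.setdefault(module_name, set()).add(version)
    return modules


relpaths = [
    os.path.join("quay.io", "biocontainers", "samtools", "1.15", "module.lua"),
    os.path.join("quay.io", "biocontainers", "samtools", "1.16", "module.lua"),
    os.path.join("python", "3.10", "module.lua"),
    os.path.join("python", "3.10", "README.md"),
]
lookup = build_lookup(relpaths, "module.lua")
```

With absolute paths, the same `rsplit` would leave the base directory inside `module_name`, which is why the old code needed the extra `replace(base, "")` line.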
78 changes: 45 additions & 33 deletions shpc/main/registry/__init__.py
@@ -14,7 +14,7 @@
from shpc.main.settings import SettingsBase

from .filesystem import Filesystem, FilesystemResult
from .remote import GitHub, GitLab, get_module_config_url
from .remote import GitHub, GitLab


def update_container_module(module, from_path, existing_path):
@@ -23,13 +23,12 @@ def update_container_module(module, from_path, existing_path):
"""
if not os.path.exists(existing_path):
shpc.utils.mkdir_p(existing_path)
for filename in shpc.utils.recursive_find(from_path):
relative_path = filename.replace(from_path, "").strip("/")
for relative_path in shpc.utils.recursive_find(from_path):
to_path = os.path.join(existing_path, relative_path)
if os.path.exists(to_path):
shutil.rmtree(to_path)
shpc.utils.mkdir_p(os.path.dirname(to_path))
shutil.copy2(filename, to_path)
shutil.copy2(os.path.join(from_path, relative_path), to_path)


class Registry:
@@ -44,21 +43,29 @@ def __init__(self, settings=None):
# and they must exist.
self.registries = [self.get_registry(r) for r in self.settings.registry]

@property
def filesystem_registry(self):
"""
Return the first found filesystem registry.
"""
for registry in self.registries:
if isinstance(registry, Filesystem):
return registry

def exists(self, name):
"""
Determine if a module name *exists* in any local registry, return path
Determine if a module name *exists* in any registry, return the first one
"""
for reg in self.registries:
if reg.exists(name):
return os.path.join(reg.source, name)
return reg

def iter_registry(self, filter_string=None):
"""
Iterate over all known registries defined in settings.
"""
for reg in self.registries:
for entry in reg.iter_registry(filter_string=filter_string):
yield entry
yield from reg.iter_registry(filter_string=filter_string)

def find(self, name, path=None):
"""
@@ -80,19 +87,19 @@ def iter_modules(self):
"""
Iterate over modules found across the registry
"""
for reg in self.registries:
for registry, module in reg.iter_modules():
for registry in self.registries:
for module in registry.iter_modules():
yield registry, module

def get_registry(self, source):
def get_registry(self, source, **kwargs):
"""
A registry is a local or remote registry.

We can upgrade from, or otherwise list
"""
for Registry in PROVIDERS:
if Registry.matches(source):
return Registry(source)
return Registry(source, **kwargs)
raise ValueError("No matching registry provider for %s" % source)

def sync(
@@ -128,20 +135,10 @@ def _sync(
local=None,
sync_registry=None,
):
# Registry to sync from
sync_registry = sync_registry or self.settings.sync_registry

# Create a remote registry with settings preference
Remote = GitHub if "github.com" in sync_registry else GitLab
remote = Remote(sync_registry, tag=tag)
local = self.get_registry(local or self.settings.filesystem_registry)

# We sync to our first registry - if not filesystem, no go
if not local.is_filesystem_registry:
logger.exit(
"sync is only supported for a remote to a filesystem registry: %s"
% sync_registry.source
)
remote = self.get_registry(
sync_registry or self.settings.sync_registry, tag=tag
)

# Upgrade the current registry from the remote
self.sync_from_remote(
@@ -152,6 +149,8 @@ def _sync(
add_new=add_new,
local=local,
)

# Cleanup the remote once we've done the sync
remote.cleanup()

def sync_from_remote(
@@ -163,26 +162,39 @@ def sync_from_remote(
If the registry module is not installed, we install to the first
filesystem registry found in the list.
"""
updates = False

## First get a valid local Registry
# A local (string) path provided
if local and isinstance(local, str) and os.path.exists(local):
if local and isinstance(local, str):
if not os.path.exists(local):
logger.exit("The path %s doesn't exist." % local)
local = Filesystem(local)

# No local registry provided, use default
if not local:
local = Filesystem(self.settings.filesystem_registry)
local = self.filesystem_registry
# We sync to our first registry - if not filesystem, no go
if not local:
logger.exit("No local registry to sync to. Check the shpc settings.")

tmpdir = remote.source
if tmpdir.startswith("http") or not os.path.exists(tmpdir):
tmpdir = remote.clone()
if not isinstance(local, Filesystem):
logger.exit("Can only synchronise to a local file system, not to %s." % local)

## Then a valid remote Registry
if not remote:
logger.exit("No remote provided. Cannot sync.")

if not isinstance(remote, Filesystem):
# Instantiate a local registry, which will have to be cleaned up
remote = remote.clone()

# These are modules to update
for regpath, module in remote.iter_modules():
updates = False
for module in remote.iter_modules():
if name and module != name:
continue

from_path = os.path.join(regpath, module)
from_path = os.path.join(remote.source, module)
existing_path = local.exists(module)

# If we have an existing module and we want to replace all files
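
The changes to this file rely on two small patterns: `get_registry` picks the first provider class whose `matches` accepts the source and forwards extra keyword arguments such as `tag`, and `filesystem_registry` returns the first `Filesystem` instance among the configured registries. A self-contained sketch of both follows. The `Remote` class and its prefix check are assumptions for illustration only and do not reflect how shpc's `GitHub` and `GitLab` providers match URLs.

```python
import os
import tempfile


class Filesystem:
    def __init__(self, source):
        self.source = os.path.abspath(source)

    @classmethod
    def matches(cls, source):
        return os.path.exists(source)


class Remote:
    """Hypothetical remote provider; the prefix test is an assumption."""

    def __init__(self, source, tag=None):
        self.url = source
        self.tag = tag

    @classmethod
    def matches(cls, source):
        return source.startswith(("https://", "ssh://"))


PROVIDERS = [Filesystem, Remote]


def get_registry(source, **kwargs):
    # The first provider class that recognises the source wins
    for provider in PROVIDERS:
        if provider.matches(source):
            return provider(source, **kwargs)
    raise ValueError("No matching registry provider for %s" % source)


def first_filesystem_registry(registries):
    # Falls through and returns None when no filesystem registry is configured
    for registry in registries:
        if isinstance(registry, Filesystem):
            return registry


with tempfile.TemporaryDirectory() as tmp:
    registries = [
        get_registry("https://example.org/some/registry", tag="main"),
        get_registry(tmp),
    ]
    local = first_filesystem_registry(registries)
```

Returning `None` when nothing matches lets callers such as `add` and `sync_from_remote` decide how to report the missing local registry.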
30 changes: 18 additions & 12 deletions shpc/main/registry/filesystem.py
@@ -75,20 +75,31 @@ def override_exists(self, tag):


class Filesystem(Provider):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.source = os.path.abspath(self.source)
def __init__(self, source):
if not self.matches(source):
raise ValueError(
"Filesystem registry source must exist on the filesystem. Got %s"
% source
)
self.source = os.path.abspath(source)

@classmethod
def matches(cls, source):
return os.path.exists(source) or source == "."

def exists(self, name):
return os.path.exists(os.path.join(self.source, name))

def iter_modules(self):
"""
yield module names
"""
# Find modules based on container.yaml
for filename in shpc.utils.recursive_find(self.source, "container.yaml"):
module = os.path.dirname(filename).replace(self.source, "").strip(os.sep)
module = os.path.dirname(filename)
if not module:
continue
yield self.source, module
yield module

def find(self, name):
"""
@@ -110,14 +121,9 @@ def iter_registry(self, filter_string=None):
"""
Iterate over content in filesystem registry.
"""
for filename in shpc.utils.recursive_find(self.source):
if not filename.endswith("container.yaml"):
continue
module_name = (
os.path.dirname(filename).replace(self.source, "").strip(os.sep)
)

for module_name in self.iter_modules():
# If the user has provided a filter, honor it
if filter_string and not re.search(filter_string, module_name):
continue
filename = os.path.join(self.source, module_name)
yield FilesystemResult(module_name, filename)
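
After this change `iter_registry` is `iter_modules` plus an optional regular-expression filter, and both work with module names relative to the registry root. A rough standalone sketch, assuming `os.walk` in place of `shpc.utils.recursive_find` and plain tuples in place of `FilesystemResult`:

```python
import os
import re
import tempfile


def iter_modules(source):
    """Yield module names: directories (relative to source) that hold a container.yaml."""
    for root, _dirs, files in os.walk(source):
        if "container.yaml" in files:
            module = os.path.relpath(root, source)
            # A container.yaml at the top level has no module name, so skip it
            if module == os.curdir:
                continue
            yield module


def iter_registry(source, filter_string=None):
    """Yield (module_name, path) pairs, optionally filtered by a regular expression."""
    for module_name in iter_modules(source):
        if filter_string and not re.search(filter_string, module_name):
            continue
        yield module_name, os.path.join(source, module_name)


with tempfile.TemporaryDirectory() as source:
    for module in ("python", os.path.join("quay.io", "samtools")):
        os.makedirs(os.path.join(source, module))
        open(os.path.join(source, module, "container.yaml"), "w").close()
    all_modules = sorted(iter_modules(source))
    filtered = [name for name, _path in iter_registry(source, "^quay")]
```

Keeping the traversal in one place means the "skip empty module name" rule and the `container.yaml` convention are defined once.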
56 changes: 31 additions & 25 deletions shpc/main/registry/provider.py
@@ -5,6 +5,8 @@

import os

import shpc.utils


class Result:
@property
@@ -32,36 +34,40 @@ class Provider:
A general provider should retrieve and provide registry files.
"""

def __init__(self, source, *args, **kwargs):
if not (source.startswith("https://") or os.path.exists(source)):
raise ValueError(
"Registry source must exist on the filesystem or be given as https://."
)
self.source = source

def exists(self, name):
return os.path.exists(os.path.join(self.source, name))

@property
def is_filesystem_registry(self):
return not self.source.startswith("http") and os.path.exists(self.source)

@property
def name(self):
return self.__class__.__name__.lower()

@classmethod
def matches(cls, source_url: str):
pass
def matches(cls, source):
"""
Returns true if this class understands the source
"""
raise NotImplementedError

def find(self, name):
pass
"""
Returns a Result object if the module can be found in the registry
"""
raise NotImplementedError

def exists(self, name):
"""
Returns true if the module can be found in the registry
"""
raise NotImplementedError

def cleanup(self):
pass
"""
Cleanup the registry
"""
raise NotImplementedError

def iter_registry(self):
pass
def iter_registry(self, filter_string=None):
"""
Iterates over the modules of this registry (that match the filter, if
provided) as Result instances
"""
raise NotImplementedError

def iter_modules(self):
pass
"""
Iterates over the module names of this registry
"""
raise NotImplementedError
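
The switch from `pass` to `raise NotImplementedError` in the base class turns a forgotten override into an immediate error instead of a silent `None`. A small sketch of the difference, where `Incomplete` is a hypothetical subclass written only for this example:

```python
class Provider:
    """Base class: methods a subclass must supply raise NotImplementedError."""

    @property
    def name(self):
        return self.__class__.__name__.lower()

    def exists(self, name):
        raise NotImplementedError

    def iter_modules(self):
        raise NotImplementedError


class Incomplete(Provider):
    """Hypothetical subclass that forgot to implement iter_modules."""

    def exists(self, name):
        return name == "python"


registry = Incomplete()
try:
    registry.iter_modules()
    outcome = "silently returned None"
except NotImplementedError:
    outcome = "raised NotImplementedError"
```

With `pass`, the same call would return `None` and the failure would only surface later, for example when a caller tries to iterate over the result.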