Registry design (blueprint) #1
Comments
One thought: I think the registry should be app-centric, not developer-centric. That is, I prefer the way Ruby, Debian, Haskell, and Python do it:
Their namespaces are centered around the artifacts produced in the ecosystem. Each entry has a number of owners or admins who can update it. Doing it like this avoids the situation where a project like ZeroVM creates a |
About uploading more than one zapp with a given version number: I agree with @pkit when he said on IRC that we should be less permissive and not allow uploading more than one zapp with a given version number. PyPI will not allow you to do this and I believe developers in general will consider it a mistake if we allow it. There are very strong traditions in Debian and other distributions for requiring a new version number with every change to the published source. One area where this is important is security updates: if package content can be updated without changing the version number, well then it becomes a nightmare to figure out if you have vulnerable software on your system. |
What @pkit suggested was incrementing the revision number (1.0.0-4, where 4 is the revision). In principle, I'm fine with that, and it would be a reasonable alternative to the timestamp. However, the timestamp might be better for managing revisions: with a timestamp, there is no need to figure out what the previous revision was and ensure that newer revisions have a higher number. The timestamp sorts this out automatically.
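To illustrate the point about timestamps ordering themselves, here is a minimal sketch. It assumes the `YYYYmmddHHMMSS` format from the spec; the function names are hypothetical and not part of any existing ZPA code.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"  # assumed format, e.g. 20140630185610


def timestamp_revision(moment=None):
    """Return a fixed-width revision string (defaults to the current UTC time)."""
    moment = moment or datetime.now(timezone.utc)
    return moment.strftime(TIMESTAMP_FORMAT)


def full_version(version, moment=None):
    """Join a developer-chosen version with a timestamp revision."""
    return f"{version}-{timestamp_revision(moment)}"


first = full_version("1.0.0", datetime(2014, 6, 30, 12, 34, 56))
second = full_version("1.0.0", datetime(2014, 7, 7, 0, 11, 22))

# Fixed-width timestamps compare correctly as plain strings, so nobody
# has to look up the previous revision before publishing a new one.
assert first < second
```

With running integers, by contrast, something has to know the previous revision, and plain string comparison breaks once the revision reaches two digits ("10" sorts before "9").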
That's why I suggested for the ZPA to do the revision/timestamp incrementing automatically. If the developer wants to make minor changes without rolling a whole new version number, the new revision needs an updated number. We could make them do it manually, but why not automate it? Personally, I find this to be one of the more annoying aspects of Debian packaging, one which I automate as much as possible.
Agreed. In fact, I never suggested that we allow package changes without a new version/revision number. |
That's a fair point, although I think that making that work on Swift/ZeroCloud would be more difficult; having separate namespaces for each user solves a lot of quota and permissions issues for us, almost completely out of the box. I'm open to suggestions on how the app-centric approach could be implemented, though. I'd like to hear from some others. If there is an overwhelming consensus to do it this way, we can figure out how to make it work.
@larsbutler it's very important that the revision number (be it a timestamp or a running integer) is assigned by the client and not by the server.
It can be implemented by giving each account write permission to a specific app container.
Hmm... on second thought, it was kind of stupid to just allow users to write any stuff to a container.
Constantine Peresypkin writes:
Agreed. I never envisioned giving users Swift credentials and letting
I'm unsure what you're proposing here? The way I imagine the system What is lacking for this to work is mostly the ability to invoke a zapp |
@larsbutler You write
It is my impression that the problems you see solved by Swift aren't the difficult or important problems. Implementing quotas can be done in many ways and I don't think that's the core problem we're solving here. |
Fair enough. So in that case, how do we enforce the rule that new versions bear a newer version/revision number? Can we enforce that (in the way that Launchpad does, for example)?
Dependencies between what exactly? Dependencies between zapps? |
Right now you cannot "upload through zapp" anything. But it can be arranged either on job-description level or on the "helper middleware" level.
You still need users to authenticate themselves. And the trivial approach: users must have a Zebra account.
We have the ability to invoke zapp anonymously. We just need to sort out the correct permission level for that.
I think the current Swift permissions are too limited and carry too much legacy baggage, like "referrer" or "rlisting". On the other hand, we may want backwards compatibility here. On yet another hand, the whole auth is external to Swift, and "Swift ACL" is just a recommendation, as even keystone and tempauth already differ slightly.
Lars Butler writes:
The problem is simply that you cannot change the version number of a If there is a problem here, then I feel that solving it is outside the
Then I misunderstood the example where you had geomet version 0.1.9 |
I never said whether they were difficult or not, but I think they do need to be solved, and solving them in this way (with out-of-the-box functionality) reduces the amount of work we have to do. It is my impression that you underestimate the amount of work it will take to build something like that from scratch.
First of all, I never said it was the "core problem"; it's "a problem", one of the many that need to be solved in order to build this thing. What would you perceive to be the core problem, then? (Just saying "that's not the core problem" is not helpful or constructive, without offering your ideas about the core problem.)
If we store stuff in registry by invoking a zapp it can enforce "no overwrite" rule by using specific headers (
Yep |
Right, there's only one official package per version (version being 0.1.9, for example). If I have a 0.1.9-1 and I upload a 0.1.9-2, 0.1.9-2 should be the new canonical package for the 0.1.9 version (it might include some security patches). When 0.1.9-2 is uploaded, it should effectively replace 0.1.9-1, BUT because Swift is only eventually consistent, 0.1.9-1 will be deleted at some later point in time, with no guarantee about when that happens. So technically, multiple revisions of 0.1.9 can exist in the storage system at a given time, but for all intents and purposes, there is only one: the latest one. See the point about deleting old revisions in the section "Searching and listing packages".
There is no need to delete the old one in the generic case. We can just make sure that the action "download package 0.1.9" will choose the latest one.
Yeah, that's technically true. I was thinking of doing that more as a housecleaning; if the download action will never grab an old version, why keep it around? |
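The "always pick the latest one" behaviour discussed above could be sketched as follows. This is only an illustration under assumptions: it relies on the `<zapp-name>-<version>-<timestamp>.zapp` naming from the spec, it takes a plain list of object names rather than a real Swift container listing, and `latest_revision` is a hypothetical helper.

```python
def latest_revision(object_names, zapp_name, version):
    """Return the newest object name for a given zapp version, or None.

    Assumes names of the form <zapp-name>-<version>-<timestamp>.zapp with a
    fixed-width timestamp, so the lexicographically largest name is the newest.
    """
    prefix = f"{zapp_name}-{version}-"
    candidates = [
        name for name in object_names
        if name.startswith(prefix) and name.endswith(".zapp")
    ]
    return max(candidates) if candidates else None


listing = [
    "geomet-0.1.9-20140630123456.zapp",
    "geomet-0.1.9-20140707001122.zapp",
    "geomet-0.2.0-20140801120000.zapp",
]

# Older revisions may still be present in storage; they are simply never chosen.
assert latest_revision(listing, "geomet", "0.1.9") == "geomet-0.1.9-20140707001122.zapp"
```

Under this approach, deleting old revisions becomes pure housekeeping: a stale revision that has not been removed yet does not change what a download returns.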
You don't change the entire version number only through external means; what I'm proposing is that the developer still chooses when to increment the version number (x.x.x), just not the revision. In this case, the revision would be more of an internal artifact to keep track of what is newer and what is older.
A fair point. We can make developers do it themselves. As long as we have a clear rule about version increments and a way to enforce it, I'm fine with this.
Can you please elaborate on that? If you can provide some more details, I'll edit the spec and put it in. |
The biggest unknown I see is how to let the server-side code accept files, check them, and put them into Swift. Today, the output objects from a given job are fixed before the job starts, but the way I think about it, the ZeroVM job that inspects the tarball would need to decide which output objects the tarball is to be stored in. AFAIK, that isn't supported today, so I'm unsure how we would do this. One option might be to use the tempurl feature: that way the registry zapp can give clients a token that allows them to make a specific PUT request. That could enforce that uploads go where we want them. We would have no idea what the user uploads, though. So something like my original scheme might be needed:
There are obvious pitfalls here: someone needs to clean up at various stages if the client goes away before finishing all steps. I'm also not sure if we can allow anonymous people to invoke zapps and still restrict them to only invoke zapps using pre-defined job descriptions. So there are still some unknowns here: hence me thinking that this is where you'll end up with most of the effort. |
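For reference, the tempurl idea mentioned above could look roughly like this. The sketch follows the signing scheme documented for Swift's TempURL middleware (an HMAC-SHA1 over the method, expiry time, and object path); the key, the account path, and the `make_temp_url` helper are placeholders for illustration, and a real deployment may be configured with a different digest.

```python
import hmac
from hashlib import sha1
from time import time


def make_temp_url(key, method, path, ttl_seconds, now=None):
    """Return a path plus query string permitting one method on one object.

    key:  the account's temp URL key (placeholder value in this sketch)
    path: the object path, e.g. /v1/AUTH_account/container/object
    now:  current Unix time; injectable to keep the function deterministic
    """
    current = time() if now is None else now
    expires = int(current + ttl_seconds)
    hmac_body = f"{method}\n{expires}\n{path}"
    signature = hmac.new(key.encode(), hmac_body.encode(), sha1).hexdigest()
    return f"{path}?temp_url_sig={signature}&temp_url_expires={expires}"


# Hypothetical account, container, and key, for illustration only.
upload_url = make_temp_url(
    key="secret-key",
    method="PUT",
    path="/v1/AUTH_larsbutler/zpa/geomet/geomet-0.1.9-20140707001122.zapp",
    ttl_seconds=60,
    now=1000,
)
```

Because the signature covers both the method and the exact object path, the registry zapp would decide where the upload lands. As noted above, it still says nothing about what the client actually uploads.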
If we will have dependencies it will matter. If we won't - why do we need a registry? :)
If you do a PUT with |
It depends on how granular you want your dependency specification to be. If I want 0.1.9 as a dependency, would I be allowed to specify 0.1.9-2? I was thinking that we wouldn't do this; instead, one would specify just 0.1.9 and get the latest revision of 0.1.9, whatever happens to be available.
Yes, probably it's a good idea. And also any other variant. |
Okay, agreed. |
What you're saying here is (apparently) that the full version number (0.1.9-2) isn't the version number of the software. Instead it's something else: an internal version number of the registry. This means that you allow people to upload different packages and still give them the same version number (0.1.9). That should not be allowed, and I think you also think so, based on what you said earlier. I think you should avoid over-thinking this part. Let users decide on version numbers and their semantics. Let the registry maintain a version->zapp mapping, with the constraint that version numbers are unique per zapp. Those are the semantics developers are used to from other package indexes. As for dependencies between zapps: we've talked about this before, and zapps were designed to be self-contained. I would also like to see something like libraries in the future, but that's still far away. Even when we have some notion of libraries, I expect it to be the clients that download the dependencies. So let the clients decide how they want to resolve
Okay, fair enough. |
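The "version->zapp mapping with unique version numbers per zapp" proposed above can be sketched as a small in-memory structure. This is purely illustrative: `Registry`, `VersionExistsError`, and the stored location strings are hypothetical, and a real registry would keep this state in Swift rather than in a dict.

```python
class VersionExistsError(Exception):
    """Raised when a zapp version has already been published."""


class Registry:
    def __init__(self):
        # Maps zapp name -> {version string -> storage location}
        self._zapps = {}

    def publish(self, name, version, location):
        """Record a new version; refuse to overwrite an existing one."""
        versions = self._zapps.setdefault(name, {})
        if version in versions:
            raise VersionExistsError(f"{name} {version} is already published")
        versions[version] = location

    def lookup(self, name, version):
        """Return the storage location for an exact version."""
        return self._zapps[name][version]


registry = Registry()
registry.publish("geomet", "0.1.9", "geomet/geomet-0.1.9.zapp")

try:
    registry.publish("geomet", "0.1.9", "geomet/geomet-0.1.9-again.zapp")
except VersionExistsError:
    pass  # a changed package needs a new version number
```

The registry here attaches no meaning to the version string; it only enforces uniqueness, which leaves versioning semantics to the developer.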
Okay, I think I've received enough feedback to fix/rewrite some parts of the spec. Let me take another stab at this and see where we land. |
I've created this issue to serve as a blueprint for the implementation of the "zapp registry".
For ease of editing and to show history, I've moved the spec to this gist: https://gist.github.com/larsbutler/10a6355169f2404d8959
I've updated and cleaned up the spec per the discussion on this page.
Name
I propose ZPA (ZeroVM Package Archive).
Platform
The assumption so far is that we will build this on Swift+ZeroCloud, and many of the functions of the ZPA will be written as zapps. Dogfooding is one of the reasons for this. Another reason is that Swift provides a horizontally scalable storage system that can store millions of files. Since the ZPA is intended to be the central repository for developers to publish their zapps, it must be capable of operating at this kind of scale.
If one were to build the ZPA from scratch, probably ~80% of the work would be focused purely on storage. With a platform like Swift, a lot of that is solved for us.
To make this work, we will need to write some custom zapps to handle various client requests and backend processing tasks. Some changes to ZeroCloud middleware may also be required.
General requirements
[...] container, owned by the user
Publishing a zapp
[...] larsbutler is the user's name or ID and zpa is a special Swift container
[...] zapp.yaml file
[...] http://example.com/larsbutler/zpa/<zapp-name>/<zapp-name>-<version>-<timestamp>.zapp, where <zapp-name> and <version> are extracted from the zapp.yaml meta section, and <timestamp> is automatically computed by the publishing zapp, in the format YYYYmmddHHMMSS (20140630185610, for example)
[...] <version>, the old one will be marked for deletion (a change that will take time to propagate through the system, because of the eventual consistency of Swift)
Example:
The result will be:
http://example.com/larsbutler/zpa/geomet/geomet-0.1.9-20140707001122.zapp (latest 0.1.9 package)
http://example.com/larsbutler/zpa/geomet/geomet-0.1.9-20140630123456.zapp (marked for deletion)
http://example.com/larsbutler/zpa/geomet/geomet-0.2.0-20140801120000.zapp (latest 0.2.0 package)
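The naming convention above can be expressed as a small helper. This is a sketch rather than the publishing zapp itself: the function name and its parameters are hypothetical, and the name and version are passed in directly instead of being read from the zapp.yaml meta section.

```python
from datetime import datetime


def zapp_object_url(base_url, user, name, version, published_at):
    """Build the published URL for a zapp.

    Layout: <base>/<user>/zpa/<zapp-name>/<zapp-name>-<version>-<timestamp>.zapp
    with the timestamp rendered as YYYYmmddHHMMSS.
    """
    timestamp = published_at.strftime("%Y%m%d%H%M%S")
    return f"{base_url}/{user}/zpa/{name}/{name}-{version}-{timestamp}.zapp"


url = zapp_object_url(
    "http://example.com", "larsbutler", "geomet", "0.1.9",
    datetime(2014, 7, 7, 0, 11, 22),
)
assert url == "http://example.com/larsbutler/zpa/geomet/geomet-0.1.9-20140707001122.zapp"
```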
This opinionated file naming convention was inspired by headaches I've experienced in publishing packages to Launchpad PPAs. For example:
Searching and listing packages
[...] <version>.
Backend logic
[...] zpa container. The reason is so that users don't just put random files into Swift. As far as I know, this cannot be done in Swift itself, so this will probably require some custom middleware to implement.
[...] zpa must trigger the "publishing zapp" (mentioned above). If the file uploaded is not a valid zapp (an archive containing a zapp.yaml), an error should be returned (probably a 4xx).
Some of this could be implemented by extending ZeroCloud, but it may be more appropriate to write this into a separate middleware application.
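The validity check described above ("an archive containing a zapp.yaml") might be sketched like this. It assumes the zapp is a tar archive (the discussion refers to a tarball) with zapp.yaml at the top level; the `validate_zapp` helper and the exact status codes are placeholders, not settled design.

```python
import io
import tarfile


def validate_zapp(data):
    """Return an (HTTP status, message) pair for uploaded bytes.

    A valid zapp is assumed to be a tar archive (optionally compressed)
    with a zapp.yaml member at its top level.
    """
    try:
        with tarfile.open(fileobj=io.BytesIO(data)) as archive:
            names = archive.getnames()
    except tarfile.TarError:
        return 400, "upload is not a readable tar archive"
    if "zapp.yaml" not in names and "./zapp.yaml" not in names:
        return 400, "archive does not contain zapp.yaml"
    return 201, "accepted"


def make_archive(member_name, payload=b"meta: {}\n"):
    """Build a small in-memory tar archive, used here only to try the check."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(member_name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


print(validate_zapp(make_archive("zapp.yaml")))
print(validate_zapp(b"random bytes"))
```

A check like this only confirms the archive's shape; validating the contents of zapp.yaml would be a further step for the publishing zapp.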
User interface
Client use cases
A client (a command-line client to start with) needs to be capable of the following actions:
get metadata for a given zapp