[Proposal] Put each nimble package in its own file #777

Open
jyapayne opened this issue Jun 27, 2018 · 19 comments

Comments

@jyapayne
Contributor

Proposal

Put each nimble package in its own file/directory, or let each user create a subdirectory for themselves and put each package into one file.

Benefits

  • User only has to add a file to the repo instead of editing one big file. Multiple users can add packages without conflicts
  • Parser won't eventually be overwhelmed by packages (since Nim and nimble will be super popular one day!)
  • Better git history
  • If each user has their own directory, packages can be found easily by the same user (not that it isn't easy already with nimble search, but just browsing the github repo could be a fun discovery)

Downsides

  • Lots of files. But git can handle it.
  • Higher code complexity. Must write recursive directory walker and gather packages.
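The "recursive directory walker" mentioned above can be quite small. Here is a minimal sketch in Python, assuming one JSON object per file with a `name` field and a hypothetical layout such as `packages/<user>/<pkg>.json`. The function name `gather_packages` is made up for illustration and is not part of nimble.

```python
import json
from pathlib import Path


def gather_packages(root):
    """Recursively collect one package entry per *.json file under root."""
    packages = []
    # rglob walks every subdirectory, so per-user folders are covered too.
    for path in sorted(Path(root).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            packages.append(json.load(handle))
    # Sort by name so the result does not depend on the directory layout.
    return sorted(packages, key=lambda pkg: pkg["name"].lower())
```

Calling `gather_packages("packages")` would return the same kind of list that is stored in the single file today, so the rest of a tool could stay unchanged.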

So what do you think? It's just an idea, but maybe it'll be of use?

@Araq
Member

Araq commented Jun 27, 2018

You are talking about packages.json, right?

@jyapayne
Contributor Author

@Araq yep!

@dom96
Contributor

dom96 commented Jun 27, 2018

Sure, we can do this :)

@FedericoCeratto
Member

Parser won't eventually be overwhelmed by packages
This is not correct. The most common searches are by package name, tag and words from the description.
Indexing the packages requires scanning the whole packages.json file. With separate files, this would require recursive walks and parsing a large number of manifest files, which is even slower.

The usual solution for large archives is to precompute index files centrally (essentially a keyword -> package name k/v map) and ship those. Yet, we are not going to need this for quite a while!
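To illustrate the keyword -> package name map, here is a rough Python sketch. It assumes each entry has `name`, `tags` and `description` fields. The function name `build_keyword_index` and the tokenization rule (split on anything that is not a letter or digit) are assumptions made for this example, not how nimble searches today.

```python
import re
from collections import defaultdict


def build_keyword_index(packages):
    """Map each lowercase keyword to the sorted names of matching packages."""
    index = defaultdict(set)
    for pkg in packages:
        name = pkg["name"]
        # Keywords come from the package name, its tags, and description words.
        words = [name, *pkg.get("tags", [])]
        words += re.findall(r"[A-Za-z0-9]+", pkg.get("description", ""))
        for word in words:
            index[word.lower()].add(name)
    return {keyword: sorted(names) for keyword, names in index.items()}
```

A search then becomes a single dictionary lookup instead of a scan over every package entry.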

@jyapayne
Contributor Author

jyapayne commented Jul 1, 2018

@FedericoCeratto You make a good point about the package scanner taking more time to do recursive walks. However, the main point (which I failed to emphasize) is the greater ease of use.

It's not happening so much now, but when lots of users start submitting lots of packages to Nim, there are potentially going to be many PRs for packages.json and many conflicts to resolve either by the user or by @dom96 and yourself, or whoever is maintaining it in the future.

Again, this is just an idea. If it makes things too difficult or my reasoning is faulty, maybe packages.json is fine the way it is. I haven't looked at the source, so maybe this is the most efficient way.

@dom96
Contributor

dom96 commented Jul 1, 2018

One potential drawback with this proposal is that you won't be able to as easily add a new package using just the GitHub web interface.

@jyapayne
Contributor Author

jyapayne commented Jul 1, 2018

@dom96 that's true as well. It's pretty easy to just add a package like that.

@euantorano
Contributor

You can still fairly easily add new packages using the GitHub UI by clicking Create new file, you just have to then copy the JSON (or whatever) structure into the editor.

@jyapayne
Contributor Author

jyapayne commented Aug 8, 2018

@euantorano that's a good point as well. I'm still a proponent of this change and I'll look into the effort when I have some time.

@FedericoCeratto
Member

@jyapayne I'll create a proposal on the Nimble issue tracker for a package publishing service.

@jyapayne
Contributor Author

jyapayne commented Aug 8, 2018

@FedericoCeratto that would be equally awesome!

@treeform
Contributor

treeform commented Oct 7, 2018

packages.json is getting big...

Maybe just a single big directory where each package is a file?

If we have sub directories package names can conflict without people knowing.
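If subdirectories were used anyway, a check along these lines could flag such conflicts in CI. This is only a sketch: it assumes the file name (without `.json`) is the package name, and treating names as case-insensitive is an assumption of this example. `find_name_conflicts` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import Path


def find_name_conflicts(root):
    """Return {package name: [paths]} for names found in more than one place."""
    seen = defaultdict(list)
    for path in sorted(Path(root).rglob("*.json")):
        # Assumption: the file stem is the package name, compared in lowercase.
        seen[path.stem.lower()].append(path.relative_to(root).as_posix())
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

An empty result means no two subdirectories define the same package name.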

@kaushalmodi
Contributor

you won't be able to as easily add a new package using just the GitHub web interface.

Do people do that? :)

I like the idea of one package per directory (or maybe even a file?) rather than having to edit a humongous .json.

@timotheecour
Member

timotheecour commented Dec 12, 2018

proposal:

  • write a tool that splits the existing json into its individual array elements (i.e. 1 per package), and saves each one as mypkg.json for a package called mypkg
  • the existing json could then be auto-generated from the individual mypkg.json files in that repo (i.e. either not checked in, or checked in but updated after each commit using a git commit hook), so it'll make migration easy (i.e. existing tools that depend on that single json file would continue to work)

note:

checking it in might make it simpler, eg so that nimble refresh only has to download 1 file

file organization

packages/mypkg1.json # all packages go here
blacklist.json # ignore packages specified here
autogenerated_list.json # large auto generated package list
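
Both steps of this proposal (split once, then regenerate the combined list) could look roughly like the Python sketch below. It assumes the combined file is a JSON array of objects with a `name` field, and that `blacklist.json` is a JSON array of package names. The blacklist format and the function names `split_packages` and `generate_list` are assumptions for illustration.

```python
import json
from pathlib import Path


def split_packages(combined_path, out_dir):
    """Write each entry of the combined JSON array to out_dir/<name>.json."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    entries = json.loads(Path(combined_path).read_text(encoding="utf-8"))
    for entry in entries:
        target = out / f"{entry['name']}.json"
        target.write_text(json.dumps(entry, indent=2) + "\n", encoding="utf-8")
    return len(entries)


def generate_list(pkg_dir, blacklist_path, out_path):
    """Rebuild the combined list from per-package files, skipping blacklisted names."""
    blacklist_file = Path(blacklist_path)
    blacklist = set()
    if blacklist_file.exists():
        # Assumption: the blacklist is a plain JSON array of package names.
        blacklist = set(json.loads(blacklist_file.read_text(encoding="utf-8")))
    entries = []
    # Sorted file order keeps the generated list stable between runs.
    for path in sorted(Path(pkg_dir).glob("*.json")):
        entry = json.loads(path.read_text(encoding="utf-8"))
        if entry["name"] not in blacklist:
            entries.append(entry)
    Path(out_path).write_text(json.dumps(entries, indent=2) + "\n", encoding="utf-8")
    return entries
```

`split_packages` would be run once for the migration, while `generate_list` could run in a commit hook or CI job so that tools depending on the single file keep working.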

@yglukhov
Member

One potential drawback with this proposal is that you won't be able to as easily add a new package using just the GitHub web interface.

Actually you can create files through web interface.

@genotrance
Contributor

You can also create a folder structure using tab, though it's not intuitive.

@FedericoCeratto
Member

FedericoCeratto commented Dec 7, 2019

Problems

  1. The packages.json file is getting bigger. It could become slow to parse. It is awkward to edit. It requires contributors to have a GitHub account. It requires PR approval to prevent package hijacking.
  2. Sometimes GH repos are deleted, some countries block GitHub, and GitHub refuses to serve some other countries.
  3. Implementing a Nim distribution: see "Developing Nim's stdlib and a Nim distribution" (nim-lang/RFCs#173).

Proposal
Precompute index files centrally: a <keyword> -> <package name> k/v map, to look up packages by keyword/tag, and a <package name> -> <package metadata> map. Ship the two indexes as binary/compressed files for fast transfer and fast lookup time.
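
As a rough illustration of the two maps and the compressed shipping format, the Python sketch below builds both indexes and round-trips them through gzip-compressed JSON. The choice of gzip plus JSON is an assumption of this example (the proposal only says "binary/compressed"), only tags are used as keywords to keep it short, and all function names are hypothetical.

```python
import gzip
import json


def build_indexes(packages):
    """Return (keyword -> names, name -> metadata) built from package entries."""
    by_keyword, by_name = {}, {}
    for pkg in packages:
        by_name[pkg["name"]] = pkg
        for tag in pkg.get("tags", []):
            by_keyword.setdefault(tag.lower(), []).append(pkg["name"])
    return by_keyword, by_name


def pack(index):
    """Serialize an index as gzip-compressed JSON bytes."""
    return gzip.compress(json.dumps(index, sort_keys=True).encode("utf-8"))


def unpack(blob):
    """Inverse of pack(): decompress and parse the index."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

A client would download both blobs, call `unpack` on each, look up names by keyword in the first map, and fetch metadata for those names from the second.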

Run a simple service similar to pypi.org to handle package creation/update and generate the indexes. Initially it could feed from GH and/or use GH as a backend to store the indexes.

Future goals
Store compressed tarballs of released packages. This is useful in case of dead repos, and in countries that block GitHub, and countries that GitHub refuses to serve.
Check URL / git repo existence before accepting a new package.

Moonshot goals
Let package owners sign metadata. Also use the signature to allow owners to update/delete packages without having to store logins and passwords.
Verify signed tarballs from GH (and other sources) against the owner pubkey.
A pool of "admin" pubkeys is allowed to update/delete other packages.
A pool of "contributor" pubkeys can vet trusted packages by adding a "vote +1" signature. Nimble can warn before installing unvetted packages.
This implements most of the building blocks for a Nim distribution described in nim-lang/RFCs#173

Update from a conversation with Araq:
A small database is preferred over a directory because it can be: downloaded easily over plain http, replaced atomically on disk, checksummed to verify integrity, signed.
SQLite is supported by the stdlib, has a stable format, works cross-platform, has really good lookup timing.
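
The properties listed above (atomic replacement, checksumming, fast lookup) can be sketched in Python with its standard `sqlite3` module. The table layout, file names and function names here are assumptions for illustration only. Signing is left out.

```python
import hashlib
import os
import sqlite3
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_database(packages, db_path):
    """Write packages into a fresh SQLite file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(db_path))
    # Build in a temp file in the same directory so the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(suffix=".sqlite", dir=directory)
    os.close(fd)
    conn = sqlite3.connect(tmp_path)
    try:
        conn.execute(
            "CREATE TABLE packages (name TEXT PRIMARY KEY, url TEXT, description TEXT)"
        )
        conn.executemany(
            "INSERT INTO packages VALUES (?, ?, ?)",
            [(p["name"], p.get("url", ""), p.get("description", "")) for p in packages],
        )
        conn.commit()
    finally:
        conn.close()
    # os.replace overwrites the target in one step on the same filesystem.
    os.replace(tmp_path, db_path)
    return sha256_of(db_path)


def lookup(db_path, name):
    """Return (name, url, description) for a package, or None if absent."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT name, url, description FROM packages WHERE name = ?", (name,)
        ).fetchone()
    finally:
        conn.close()
```

The digest returned by `build_database` could be published next to the file so that a client can verify its download with `sha256_of` before using it.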

@dom96
Contributor

dom96 commented Mar 26, 2020

A small database is preferred over a directory because it can be: downloaded easily over plain http, replaced atomically on disk, checksummed to verify integrity, signed.
SQLite is supported by the stdlib, has a stable format, works cross-platform, has really good lookup timing.

Nothing is simpler than a directory structure. SQLite is a dependency and I would not wish to make Nimble depend on it.

Everything else sounds good to me on the surface. It'll be the details of how you implement this that I may disagree with, but just go for it :)

@ZeroAurora

ZeroAurora commented Nov 29, 2021

Any progress here? I'm learning Nim but found the package list pretty ugly. Maybe we can separate it into single files in another branch (using a script for migration), and compile them into a single json file using CI for backward compatibility.
I would prefer using git or a tarball to fetch the package repo, though.
