The Racket Package Catalog comprises two pieces of software that work in tandem:

- pkg-index, a.k.a. the backend: manages the package database and user database, and periodically polls package sources to update checksums. Eventually, all Racket package clients can poll the package server instead of checking each package individually, but package clients do not access the backend's information directly.
- website, a.k.a. the frontend: formats the package database and renders it as a web site, and also publishes the current package database (as determined by the backend) to the site that package clients consult.

Both frontend and backend run in the same Racket process, but as separate server threads. Each also has additional threads for periodic and internal tasks. The frontend sends package-change requests to the backend, and it watches for periodic updates (especially new checksums) from the pkg-index backend. The servers were originally implemented separately, and there is some value to keeping them conceptually separate.
Although the implementation here is not necessarily tied to the main Racket deployment's configuration, various configuration options make the most sense in terms of the main deployment's structure:

- https://pkgs.racket-lang.org/ is an S3-hosted web site, which makes it as available as possible. The content of this site is generated and uploaded by the frontend server. (Also, pkg.racket-lang.org is set up to forward to pkgs.racket-lang.org, in case someone forgets the "s" in "pkgs".)

  The frontend server will tend to forward back into this static content, but the "login" button in the static view goes to the dynamic frontend server. After a user has logged in, the dynamic server tends to serve information directly, instead of forwarding. So, the server needs to be up and working well for logged-in use or gathering package updates, but not for querying the most recently published package updates.

- https://pkgd.racket-lang.org/ is the server frontend and backend implemented here. More precisely, it's an Apache instance that sends most URLs to the frontend server, but any path that starts with `api` or `jsonp` is sent to the backend server (so that the backend functionality remains accessible). But to the degree that the frontend server needs to talk to the backend server, it does so more directly.

In addition to Racket v8.14.0.3 or later, you will need to install the following Racket packages:
raco pkg install --skip-installed \
'https://github.com/racket/infrastructure-userdb.git#main' \
reloadable \
aws \
s3-sync \
plt-service-monitor
You can run the server in test mode with `make run` or `./run` or `racket src/main.rkt` with no further configuration, except that

- You'll need to have a certificate in place. Use `make keys` to produce a self-signed certificate. The certificate and key are dropped into `compiled/root` so that they're in the right place for a default configuration.
- Probably you'll want to seed the set of registered packages. See "Adding packages" below.

By default, all package state and generated files go into `compiled`
in the current directory. (The current directory when you run the
server doesn't have to be the top of this Git repo checkout.)
An advantage of using `make run` or `./run` is that it sets `PLTSTDERR` to turn on lots of logging. You can run just the frontend with `src/website/main.rkt` or just the backend with `src/pkg-index/main.rkt`. Running just the frontend requires running the backend at least once to generate its output for the frontend, though, or configuring the frontend to use another source via package-index-url as described below.
When you use `make run` or `./run`, it actually runs `configs/${CONFIG}.rkt`, so you can set `CONFIG` as an environment variable or makefile variable to pick a configuration there. If `CONFIG` is not defined, `testing` is used (which is an empty configuration). For example, to select `configs/live.rkt`, set `CONFIG` to `live`. A good place to do this is in the `run-prelude` file; see the description of `run-prelude` below.
Within a configuration file, configuration details are given as a hashtable to `main`. When using the `testing` configuration of `./run` or when using `racket src/main.rkt`, you can supply a `--config` argument to specify a module that exports a `config` hashtable.
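
As a rough illustration of the selection rule described above, the sketch below resolves the configuration module from `CONFIG`, falling back to `testing`. The helper name `config_module` is hypothetical, and treating an empty `CONFIG` the same as an undefined one is an assumption; the actual `run` script and makefile may implement this differently.

```shell
# Hypothetical helper: print the configuration module that would be
# selected for the current value of CONFIG.
# Assumption: an empty CONFIG is treated like an undefined one.
config_module() {
    printf 'configs/%s.rkt\n' "${CONFIG:-testing}"
}

# Typical invocations, per the description above (left as comments
# because they start the server):
#   CONFIG=live ./run        # environment variable
#   make run CONFIG=live     # makefile variable
```

Calling `config_module` before starting the server is a cheap way to confirm which file under `configs/` is expected to exist.
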
Keys useful for deployment:

- port: number; defaults to the value of the `SITE_PORT` environment variable, if defined; otherwise, 7443.
- pkg-index-port: number; defaults to the value of the `SITE_PKG_INDEX_PORT` environment variable, if defined; otherwise, 9004.
- ssl?: boolean; default is `#t`, unless `PKG_SERVER_HTTP` is defined.
- reloadable?: boolean; default is `#t` if the `SITE_RELOADABLE` environment variable is defined; otherwise, `#f`.
- recent-seconds: number, in seconds; default is 172800. Packages modified fewer than this many seconds ago are considered "recent", and displayed as such in the UI.
- static-output-type: either `'aws-s3` or `'file`; indicates where the frontend writes the static-site data:
  - When `'file` (the default),
    - static-content-target-directory: either `#f` or a string denoting a path to a folder to which the static content of the site will be copied for local serving.
  - When `'aws-s3`,
    - aws-s3-bucket+path: a string naming an S3 bucket and path. Must end with a forward slash, `.../`. AWS access keys are loaded per the documentation for the `aws` module; usually from a file `~/.aws-keys`.
- dynamic-urlprefix: string; absolute or relative URL, prepended to URLs targeting dynamic content on the site, i.e., for when the frontend wants to serve a link back to itself.
- static-urlprefix: string; absolute or relative URL, prepended to relative URLs referring to static HTML files placed in static-generated-directory, for when the dynamic server wants to refer to static content.
- pkg-index-generated-directory: a string pointing to where the backend places its rendered files; the main rendered file that the frontend cares about is `pkgs-all.json.gz`, although the backend writes a whole package catalog and web site there.
- user-directory: directory containing the user database.
- email-sender-address: string; defaults to [email protected]. Used as the "from" address when sending authentication emails on behalf of the server.
- beat-s3-bucket: string or `#f`; defaults to `#f`; a bucket name for registering heartbeats, or `#f` to disable heartbeats; the region is determined automatically from the bucket name.
- beat-publish-task-name: string; defaults to "pkgd-publish"; a task name for heartbeats after publishing information for all packages.

Keys useful for development:

- package-index-url: string; an alternative source that the frontend uses to get `pkgs-all.json.gz`, such as `http://pkgs.racket-lang.org/pkgs-all.json.gz` to pull from the live database instead of the running backend. The default is based on pkg-index-generated-directory unless the `PACKAGE_INDEX_URL` environment variable is defined.
- package-fetch-interval: number, in seconds; default is 300.
- session-lifetime: number, in seconds; default is 604800.
- static-generated-directory: string; names a directory within which generated static HTML files are to be placed. Must be writable by the user running the server.
- disable-cache?: boolean; default is `#f`; a `#t` value causes the frontend to always redirect to itself to serve the package dynamically, instead of redirecting to generated static files.
- backend-baseurl: string; defaults to `https://localhost:` followed by pkg-index-port; must point to the backend package server API root, such that (for example) `/jsonp/authenticate`, when appended to it, resolves to the authentication call.
- pkg-build-baseurl: string; defaults to `http://pkg-build.racket-lang.org/`. Used to build URLs relative to the package build host, such as for documentation links and build reports.
- pkg-index: `#f` or hash table; use `#f` to disable the backend server entirely, or provide a hash table to configure the backend specifically.

Backend keys for a pkg-index configuration table within the main configuration:

- port: number; defaults to pkg-index-port from the enclosing configuration or `9004`; port on which the backend site will be served.
- ssl?: boolean; defaults to ssl? from the enclosing configuration or to `#t`; a true value serves HTTPS and requires `server-cert.pem` and `private-key.pem` under root.
- src: path; defaults to `src/pkg-index` relative to here.
- static.src-path: path; defaults to `static` under src; the location of (non-generated) HTML/JS/CSS files to be copied to static-path (see below), although these files are mostly not used anymore.
- static-path: path; defaults to `static-gen` under src; staging area where all static resources, both generated and non-generated, are written.
- notice-path: path; defaults to `notice.json` under static-path; whenever the server has a message for site users, the message will be placed in this file.
- root: path; defaults to pkg-index-generated-directory from the outer configuration; determines several other defaults.
- users.new-path: path; defaults to user-directory from the outer configuration, which defaults to `users.new` under pkg-index-generated-directory; directory in which to hold user records, one file per user.
- cache-path: path; defaults to `cache` under root; names a directory where files `summary.rktd` and `summary.rktd.etag` will be created.
- pkgs-path: path; defaults to `pkgs` under root; names a directory where one file of package information for each package in the catalog will be stored.
- github-client_id (obsolete): string or `#f`; defaults to the contents of the file `client_id` under root, if it exists; should be a GitHub client ID string (hex; twenty characters long, i.e. 10 bytes of data, hex-encoded), used only if package downloading is forced to use the GitHub API by setting the `PLT_USE_GITHUB_API` environment variable.
- github-client_secret (obsolete): string or `#f`; defaults to the contents of the file `client_secret` under root, if it exists; should be a GitHub client secret string (hex; forty characters long, i.e. 20 bytes of data, hex-encoded), used only when github-client_id is used.
- s3-bucket: string or `#f`; defaults to the contents of the environment variable `S3_BUCKET`, if it is defined, `#f` otherwise; AWS credentials are found by the `s3` package, typically from `~/.aws-keys`; if set to `#f`, S3 synchronization will be disabled.
- s3-bucket-region: string; defaults to the contents of the environment variable `S3_BUCKET_REGION`, if it is defined; otherwise, to `#f`; needs to be non-`#f` if s3-bucket is.
- beat-s3-bucket: string or `#f`; defaults to beat-s3-bucket from the enclosing configuration table or `#f`; a bucket name for registering heartbeats, or `#f` to disable heartbeats; the region is determined automatically from the bucket name.
- beat-update-task-name: string; defaults to "pkgd-update"; a task name for heartbeats after updating information for all packages.
- beat-upload-task-name: string; defaults to "pkgd-upload"; a task name for heartbeats after uploading information for all packages.
- redirect-to-static-proc: function from HTTP request to HTTP response, which should issue a redirect pointing to a static resource; defaults to a function which replaces the scheme with the contents of the configuration variable redirect-to-static-scheme, the host with redirect-to-static-host, and the port with redirect-to-static-port. These, in turn, default to `"http"`, `"pkgs.racket-lang.org"` and `80`, respectively.
- atom-self-link: string; defaults to `https://pkg.racket-lang.org/rss`; used as the `rel=self` link in the header of the generated ATOM feed.
- atom-link: string; defaults to `https://pkg.racket-lang.org/`; used as the default site link in the header of the generated ATOM feed.
- atom-id: string; defaults to `https://pkg.racket-lang.org/`; used as the ATOM feed ID.
- atom-compute-package-url: function from package-name symbol to URL string; defaults to a function which calls `format` with the package name and a format template-string from atom-package-url-format-string, which in turn defaults to `http://pkg.racket-lang.org/#[~a]`.

Instead of manually adding packages to a fresh instance of the package
web server, use `raco pkg catalog-copy` to copy an existing catalog into
a directory tree. Then, move the `pkg` directory (no "s") in the catalog
copy to be root/pkgs (with "s"), where root is the server's root
directory; so, `compiled/root/pkgs` by default.
$ raco pkg catalog-copy https://pkgs.racket-lang.org compiled/pkgs-copy
$ mkdir -p compiled/root/pkgs
$ mv compiled/pkgs-copy/pkg/* compiled/root/pkgs/
Beware, however, that the backend will start by updating the checksum
of every package, so consider using a specific package name in place
of `*` or a glob that selects a small set of packages.
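
To seed only a handful of packages, something along the following lines may help. This is a minimal sketch, assuming the catalog copy keeps one file per package under `pkg/`; the helper name `seed_packages` and the package names in the example are placeholders, and the helper copies files rather than moving them so that the catalog copy stays intact.

```shell
# Hypothetical helper: copy only the named packages from a catalog copy
# (made with `raco pkg catalog-copy`) into the server's package directory.
#   $1 = catalog copy directory (contains "pkg", no "s")
#   $2 = server root directory (packages go in "$2/pkgs", with "s")
#   remaining arguments = package names
seed_packages() {
    copy_dir=$1
    root_dir=$2
    shift 2
    mkdir -p "$root_dir/pkgs" || return 1
    for name in "$@"; do
        if [ -f "$copy_dir/pkg/$name" ]; then
            cp "$copy_dir/pkg/$name" "$root_dir/pkgs/$name" || return 1
        else
            echo "seed_packages: no such package in copy: $name" >&2
            return 1
        fi
    done
}

# Example (package names are placeholders):
#   raco pkg catalog-copy https://pkgs.racket-lang.org compiled/pkgs-copy
#   seed_packages compiled/pkgs-copy compiled/root base racket-lib
```
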
When the server is started, the backend starts by looking for new checksums, while the frontend immediately checks for backend updates. Since those run concurrently, you may not immediately see updates via the frontend even when the backend has completed its scan (which you might infer from logging output). At that point, restarting is the fastest way to warm up the frontend.
You can use the website frontend to add a user. Email for a new user
is sent via `sendmail`, so if you don't have that configured, just
watch the logs to see the token that would have been sent.
If you would like to enable the automatic code-reloading feature, set
the environment variable `SITE_RELOADABLE` to a non-empty string or
set the `reloadable?` configuration variable to `#t`.

You must also delete any compiled code `.zo` files. Otherwise, the
system will not be able to correctly replace modules while running.
Therefore, when using automatic code reloading, use just `make run`
and make sure to run `make clean` beforehand, if you've run
`make compile` at all previously.
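
Before enabling reloading, it can be worth checking that no compiled `.zo` files were left behind. The sketch below only lists such files using `find`; the helper name `list_zo_files` is hypothetical, and `make clean` remains the documented way to remove them.

```shell
# Hypothetical helper: list compiled .zo files under a directory
# (defaults to the current directory). It only reports; it deletes nothing.
list_zo_files() {
    find "${1:-.}" -type f -name '*.zo'
}

# Example (left as comments because it starts the server):
#   make clean
#   if [ -n "$(list_zo_files src)" ]; then
#       echo "stale .zo files remain under src" >&2
#   fi
#   SITE_RELOADABLE=1 make run
```
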
The site can be set up to run either
- entirely dynamically, generating package pages on-the-fly for each request;
- both statically and dynamically, with HTML renderings of package pages stored on and served from disk like other static resources such as JavaScript and CSS; or
- both statically and dynamically, as the previous option, but additionally replicating both static and generated content to a local file-system directory and invoking an optional update hook that can be used to further replicate the content to S3 or a remote host.
The default is mixed static/dynamic, with no additional replication.
For a fully dynamic site, set configuration variable disable-cache?
to `#t`.

To enable replication, set configuration variable
static-content-target-directory to a non-`#f` value, and optionally
set static-content-update-hook to a string containing a shell
command to execute every time the static content is updated.
To set up an S3 bucket (let's call it `s3.example`) for use with
this site, follow these steps:

- Create the bucket ("`s3.example`").
- Optionally add a CNAME record to DNS mapping `s3.example` to `s3.example.s3-website-us-east-1.amazonaws.com`. If you do, static resources will be available at `http://s3.example/`; if not, at the longer URL.
- Enable "Static Website Hosting" for the bucket. Set the index document to `index.html` and the error document to `not-found`.

Then, under "Permissions", click "Add bucket policy", and add something like the following.
{
"Id": "RacketPackageWebsiteS3Policy",
"Version": "2012-10-17",
"Statement": [
{
"Sid": "RacketPackageWebsiteS3PolicyStmt1",
"Action": "s3:*",
"Effect": "Allow",
"Resource": ["arn:aws:s3:::s3.example",
"arn:aws:s3:::s3.example/*"],
"Principal": {
"AWS": ["<<<ARN OF THE USER TO WHOM ACCESS SHOULD BE GRANTED>>>"]
}
}
]
}
The user will need to be able to read and write objects and set CORS
policy. (CORS is configured automatically by code in
`src/static.rkt`.)
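
If you prefer to script the bucket setup, the policy above can be generated for a given bucket and user. This is only a sketch: the function name `bucket_policy` is hypothetical, and the AWS CLI commands in the trailing comments are an alternative to the console steps described above, not something this repository runs for you.

```shell
# Hypothetical helper: print the bucket policy shown above.
#   $1 = bucket name
#   $2 = ARN of the user to whom access should be granted
bucket_policy() {
    bucket=$1
    user_arn=$2
    cat <<EOF
{
  "Id": "RacketPackageWebsiteS3Policy",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RacketPackageWebsiteS3PolicyStmt1",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::$bucket",
                   "arn:aws:s3:::$bucket/*"],
      "Principal": {
        "AWS": ["$user_arn"]
      }
    }
  ]
}
EOF
}

# Possible use with the AWS CLI (assumes it is installed and configured;
# USER_ARN is a placeholder you must set yourself):
#   aws s3 mb s3://s3.example
#   aws s3 website s3://s3.example/ --index-document index.html --error-document not-found
#   bucket_policy s3.example "$USER_ARN" > policy.json
#   aws s3api put-bucket-policy --bucket s3.example --policy file://policy.json
```
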
The server can be started using djb's daemontools:
symlink this directory into your services directory and start it as
usual. The `run` script starts the program, and `log/run` sets up
logging of stdout/stderr.
If the file `run-prelude` exists in the current directory on startup,
it will be dotted in (sourced) before racket is invoked. A prelude is
useful to update `PATH` for a locally-built racket `bin` directory or
to select an appropriate `CONFIG` setting.
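
A `run-prelude` along these lines would cover both uses mentioned above. It is a sketch only: the Racket location `$HOME/racket/bin` is a placeholder, and exporting the variables is an assumption made so that child processes see them regardless of how `run` invokes racket.

```shell
# Example run-prelude (sourced before racket is invoked).
# The Racket location below is a placeholder; adjust it to your build.
PATH="$HOME/racket/bin:$PATH"
export PATH

# Select configs/live.rkt instead of the default testing configuration.
CONFIG=live
export CONFIG
```
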
On Debian, daemontools can be installed with `apt-get install daemontools daemontools-run`, and the services directory is `/etc/service/`.
You can send signals to the running service by creating files in
`/etc/service/webservice/signals/`. For example:

- creating `.pull-required` causes the server to shell out to `git pull` and then exit. Daemontools will restart it.
- creating `.restart-required` causes it to exit, to be restarted by daemontools.
- creating `.reload` causes an explicit code reload. Useful when automatic code reloading is disabled.
- creating `.fetchindex` causes an immediate refetch of the package index from the backend server.
- creating `.rerender` causes an immediate rerendering of all generated static HTML files.

See `src/signals.rkt` for details of the available signals.
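
Since a signal is just a file, a small wrapper can guard against typos in the file name. The helper below is a sketch; the name `send_signal` is hypothetical, and it knows only the five signals listed above.

```shell
# Hypothetical helper: raise a signal by creating the matching file.
#   $1 = signal name without the leading dot (e.g. restart-required)
#   $2 = signals directory (defaults to the path used in the text)
send_signal() {
    signal_dir=${2:-/etc/service/webservice/signals}
    case $1 in
        pull-required|restart-required|reload|fetchindex|rerender)
            touch "$signal_dir/.$1"
            ;;
        *)
            echo "send_signal: unknown signal: $1" >&2
            return 1
            ;;
    esac
}

# Example:
#   send_signal rerender
```
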
So long as the signals directory is writable by them (for example, after
`sudo chmod 0777 /etc/service/webservice/signals`), these signals
are useful for non-root administrators to control the running service.
In particular, a git `post-receive` hook can be used to create the
`.pull-required` signal in order to update the service on git push.
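
A `post-receive` hook for this purpose might look like the sketch below. The function name `request_pull_on_push` and the `SIGNALS_DIR` and `DEPLOY_BRANCH` variables are hypothetical, and restricting the signal to pushes on one branch (assumed here to be `main`) is a design choice of this example rather than something the service requires.

```shell
# Sketch of the body of a git post-receive hook.
# SIGNALS_DIR and DEPLOY_BRANCH are hypothetical overrides.
request_pull_on_push() {
    signals_dir=${SIGNALS_DIR:-/etc/service/webservice/signals}
    # git passes one "<old> <new> <ref>" line per updated ref on stdin.
    while read -r _old _new ref; do
        if [ "$ref" = "refs/heads/${DEPLOY_BRANCH:-main}" ]; then
            touch "$signals_dir/.pull-required"
        fi
    done
}

# In hooks/post-receive, the last line would be:
#   request_pull_on_push
```
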
Copyright © 2014 Tony Garnock-Jones
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.