Add documentation for HTTP endpoints (#16)

* add documentation for API endpoints
* add PREDICT_STRATEGY docs
* update allocation API docs for spec in parameters
spack · Mar 21, 2024 · 45c1502
1 parent 936c88e
commit 45c1502

Showing 4 changed files with 73 additions and 3 deletions.
diff --git a/docs/api.md b/docs/api.md
@@ -0,0 +1,68 @@
+# REST API Documentation
+
+This section documents the behavior of Gantry's REST API, which is hosted at `https://gantry.spack.io/v1/`.
+
+## Collection
+
+```
+POST /collect
+```
+
+This endpoint is intended to handle webhooks triggered by changes in Gitlab job status.
+
+If the `X-Gitlab-Token` header does not match the token set in the Gitlab interface, the API will respond with `401 Unauthorized`, which could lead to Gitlab disabling the webhook.
+
+If the `X-Gitlab-Event` header does not equal `Job Hook`, the response will be `200 OK`, but the client will be warned that this is an inappropriate use of the endpoint.
+
+The payload should include the following fields at a minimum (Gitlab will include more):
+
+```
+{
+    "build_status": str,
+    "build_name": str,
+    "build_id": int,
+    "build_started_at": str,
+    "build_finished_at": str,
+    "ref": str,
+    "runner": {
+        "description": str
+    }
+}
+```
+
+The API will respond with `400 Bad Request` if any of this information is missing. Barring any other immediate issues, a background job will be queued to process the job and `200 OK` will be sent. This behavior means that there is no immediate feedback about the success of the collection; any failure is visible in the application logs. This is done to ensure that the API responds to the webhook in time and to reflect that a collection failure is not considered fatal.
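Note on the collection rules above: they can be read as a short decision sequence. The following is a minimal sketch for illustration only. The function name `collect_status`, the `REQUIRED_FIELDS` tuple, and the order of the checks are assumptions drawn from this documentation, not Gantry's actual handler.

```python
# Top-level fields the documentation lists as the minimum payload.
REQUIRED_FIELDS = (
    "build_status",
    "build_name",
    "build_id",
    "build_started_at",
    "build_finished_at",
    "ref",
    "runner",
)


def collect_status(headers: dict, payload: dict, expected_token: str) -> int:
    """Return the HTTP status code documented for a POST /collect request."""
    # A token mismatch is rejected outright.
    if headers.get("X-Gitlab-Token") != expected_token:
        return 401
    # Other event types are acknowledged but not processed (the docs say the
    # client is warned in the response).
    if headers.get("X-Gitlab-Event") != "Job Hook":
        return 200
    # Every documented field must be present, including runner.description.
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    runner = payload.get("runner")
    if missing or not isinstance(runner, dict) or "description" not in runner:
        return 400
    # At this point a background job would be queued; 200 only means "accepted".
    return 200
```

Because `200 OK` is returned before the background job runs, a caller cannot tell from the status code whether collection succeeded.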
+
+## Allocation
+
+```
+GET /allocation?spec=
+```
+
+Given a spec, the API will calculate the optimal resource allocation for the job.
+
+The spec sent to the endpoint should have the following format:
+
+```
+pkg_name@pkg_version +variant1+variant2%compiler@compiler_version
+```
+
+**There must be a space after the package version in order to account for variant parsing.** 
+
+If the request does not contain a valid spec, the API will respond with `400 Bad Request`. The maximum allowed size of the `GET` request is 8190 bytes.
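Note on the spec format above: it contains a space, `+`, `@`, and `%`, all of which are significant in URLs, so a client will likely need to percent-encode the `spec` parameter. The sketch below uses only the Python standard library. The helper name `allocation_url` and the example spec are hypothetical, and it assumes the server decodes standard percent-encoding and that the documented size limit can be approximated by the URL length.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://gantry.spack.io/v1/"  # base URL from the documentation
MAX_GET_SIZE = 8190  # bytes, the limit stated in the documentation


def allocation_url(spec: str) -> str:
    """Build a GET /allocation URL with the spec percent-encoded."""
    # quote_via=quote encodes the space as %20 and '+' as %2B, so variant
    # markers are not mistaken for spaces when the query string is decoded.
    query = urlencode({"spec": spec}, quote_via=quote)
    url = f"{BASE_URL}allocation?{query}"
    # Conservative client-side check (assumption: the limit covers the URL).
    if len(url.encode("utf-8")) > MAX_GET_SIZE:
        raise ValueError("request would exceed the documented size limit")
    return url


# Hypothetical spec in the documented format (note the space after the version).
print(allocation_url("emacs@29.2 +json+native%gcc@12.3.0"))
```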
+
+Expected response:
+
+```
+200 OK
+
+{
+    "variables": {
+        "KUBERNETES_CPU_REQUEST": str,
+        "KUBERNETES_MEMORY_REQUEST": str
+    }
+}
+```
+
+All CPU variables will be sent in core format (e.g., "1" for 1 core), and all memory variables will be represented in megabytes (e.g., "2000M" for 2000 megabytes).
+
+The API may change in the future to expand the number of variables, so clients should apply all values within `variables` to the job's environment.
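Note on the response handling above: a client that copies every entry of `variables`, rather than the two keys listed today, stays compatible if more variables are added. A minimal sketch, assuming the response body has already been parsed into a dictionary; the helper names `apply_variables` and `memory_megabytes` are hypothetical.

```python
def apply_variables(response_body: dict, job_env: dict) -> dict:
    """Return a copy of job_env with every entry from `variables` applied."""
    updated = dict(job_env)
    # Copy everything, not just the two documented keys, so that variables
    # added by future API versions are picked up without client changes.
    updated.update(response_body["variables"])
    return updated


def memory_megabytes(value: str) -> int:
    """Parse a memory value such as "2000M" into a number of megabytes."""
    if not value.endswith("M"):
        raise ValueError(f"unexpected memory format: {value!r}")
    return int(value[:-1])
```

`apply_variables` returns a new dictionary instead of mutating `job_env`, which keeps the original environment available for comparison or logging.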
diff --git a/docs/deploy.md b/docs/deploy.md
@@ -31,3 +31,6 @@ The following variables should be exposed to the container. Those **bolded** are
 - `MAX_GET_SIZE` - the maximum `GET` request (in bytes), default is 8MB
 - `GANTRY_HOST` - web app hostname, default is `localhost`
 - `GANTRY_PORT` - web app port, default is `8080`
+- `PREDICT_STRATEGY` - optional mode for the prediction algorithm
+    - options: 
+        - `ensure_higher`: if the predicted resource usage is below current levels, Gantry disregards the prediction and keeps what would be allocated without its intervention
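Note on `ensure_higher`: one possible reading of the description is a per-resource comparison in which a prediction is only kept when it is at least as high as the current allocation. The sketch below illustrates that reading. The function signature, the dictionary keys, and the per-resource granularity are assumptions, not Gantry's implementation.

```python
def ensure_higher(predicted: dict, current: dict) -> dict:
    """Keep the current value for any resource whose prediction is lower."""
    result = {}
    for name, predicted_value in predicted.items():
        current_value = current.get(name)
        if current_value is not None and predicted_value < current_value:
            # Prediction is below the current level: disregard it.
            result[name] = current_value
        else:
            result[name] = predicted_value
    return result
```

For example, with a hypothetical prediction of `{"cpu": 1.0, "mem": 4000}` and current levels of `{"cpu": 2.0, "mem": 2000}`, the CPU value stays at the current level while the memory prediction is used.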
diff --git a/docs/prediction.md b/docs/prediction.md
@@ -23,7 +23,7 @@ We could also figure out the upper threshold by calculating
 
 = skimpiest predicted limit / maximum usage for that job
 
-However, when doing this, I've stumbled upon packages that have unpredictable usage patterns that can swing from 200-400% of each other (with no discernable differences).
+However, when doing this, I've stumbled upon packages that have unpredictable usage patterns that can swing from 200-400% of each other (with no discernible differences).
 
 More research and care will be needed when we finally decide to implement limit prediction.
 

diff --git a/docs/home.md → docs/readme.md b/docs/home.md → docs/readme.md
@@ -1,11 +1,10 @@
 # `spack-gantry`
 
-
-
 ## Table of Contents
 
 1. [Context](context.md)
 2. [Data Collection](data-collection.md)
 3. [Architecture](arch.md)
 4. [Prediction](prediction.md)
 5. [Deployment](deploy.md)
+6. [API Endpoints](api.md)