
Commit 2e2962a

Vishal Bollu authored and gitbook-bot committed
GitBook: [0.11] 38 pages modified
1 parent e6604eb commit 2e2962a

25 files changed: +127 −149 lines changed

README.md

+15 −33

@@ -2,32 +2,18 @@
 
 Cortex is an open source platform for deploying machine learning models—trained with nearly any framework—as production web services.
 
-<br>
-
-<!-- Set header Cache-Control=no-cache on the S3 object metadata (see https://help.github.com/en/articles/about-anonymized-image-urls) -->
 ![Demo](https://d1zqebknpdh033.cloudfront.net/demo/gif/v0.8.gif)
 
-<br>
-
 ## Key features
 
-- **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
-
-- **Multi framework:** Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
-
-- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
-
-- **Spot instances:** Cortex supports EC2 spot instances.
-
-- **Rolling updates:** Cortex updates deployed APIs without any downtime.
-
-- **Log streaming:** Cortex streams logs from deployed models to your CLI.
-
-- **Prediction monitoring:** Cortex monitors network metrics and tracks predictions.
-
-- **Minimal configuration:** Deployments are defined in a single `cortex.yaml` file.
-
-<br>
+* **Autoscaling:** Cortex automatically scales APIs to handle production workloads.
+* **Multi framework:** Cortex supports TensorFlow, PyTorch, scikit-learn, XGBoost, and more.
+* **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
+* **Spot instances:** Cortex supports EC2 spot instances.
+* **Rolling updates:** Cortex updates deployed APIs without any downtime.
+* **Log streaming:** Cortex streams logs from deployed models to your CLI.
+* **Prediction monitoring:** Cortex monitors network metrics and tracks predictions.
+* **Minimal configuration:** Deployments are defined in a single `cortex.yaml` file.
 
 ## Usage
 
@@ -92,19 +78,15 @@ positive 8
 negative 4
 ```
 
-<br>
-
 ## How it works
 
-The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing (ELB), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.
-
-<br>
+The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using Elastic Load Balancing \(ELB\), TensorFlow Serving, and ONNX Runtime. The containers are orchestrated on Elastic Kubernetes Service \(EKS\) while logs and metrics are streamed to CloudWatch.
 
 ## Examples
 
-<!-- CORTEX_VERSION_README_MINOR x5 -->
-- [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/sentiment-analyzer) in TensorFlow with BERT
-- [Image classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/image-classifier) in TensorFlow with Inception
-- [Text generation](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/text-generator) in PyTorch with DistilGPT2
-- [Reading comprehension](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/reading-comprehender) in PyTorch with ELMo-BiDAF
-- [Iris classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/sklearn/iris-classifier) in scikit-learn
+* [Sentiment analysis](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/sentiment-analyzer) in TensorFlow with BERT
+* [Image classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/tensorflow/image-classifier) in TensorFlow with Inception
+* [Text generation](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/text-generator) in PyTorch with DistilGPT2
+* [Reading comprehension](https://github.com/cortexlabs/cortex/tree/0.11/examples/pytorch/reading-comprehender) in PyTorch with ELMo-BiDAF
+* [Iris classification](https://github.com/cortexlabs/cortex/tree/0.11/examples/sklearn/iris-classifier) in scikit-learn
+
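For context on the "Minimal configuration" claim and the `cortex deploy` flow above, a small `cortex.yaml` might look like the following sketch. It is assembled from the deployment and api snippets that appear later in this commit (deployments.md, onnx.md, compute.md); the api name and S3 path are hypothetical.

```yaml
# cortex.yaml (sketch; assumes the 0.11 schema, with a hypothetical name and model path)
- kind: deployment
  name: sentiment

- kind: api
  name: classifier
  model: s3://my-bucket/bert/  # hypothetical location of an exported model
  compute:
    cpu: 1
    mem: 1G
```

Running `cortex deploy` against a file like this is what triggers the container build-and-expose sequence described in "How it works".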

docs/cluster/aws.md renamed to docs/cluster-management/aws.md

+2 −1

@@ -2,4 +2,5 @@
 
 As of now, Cortex only runs on AWS. We plan to support other cloud providers in the future. If you don't have an AWS account you can get started with one [here](https://portal.aws.amazon.com/billing/signup#/start).
 
-Follow this [tutorial](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key) to create an access key. Enable programmatic access for the IAM user, and attach the built-in `AdministratorAccess` policy to your IAM user (or see [security](security.md) for a minimal access configuration).
+Follow this [tutorial](https://aws.amazon.com/premiumsupport/knowledge-center/create-access-key) to create an access key. Enable programmatic access for the IAM user, and attach the built-in `AdministratorAccess` policy to your IAM user \(or see [security](security.md) for a minimal access configuration\).
+

docs/cluster/config.md renamed to docs/cluster-management/config.md

+2 −3

@@ -1,8 +1,6 @@
 # Cluster configuration
 
-The Cortex cluster may be configured by providing a configuration file to `cortex cluster up` or `cortex cluster update` via the `--config` flag (e.g. `cortex cluster up --config=cluster.yaml`). Below is the schema for the cluster configuration file, with default values shown (unless otherwise specified):
-
-<!-- CORTEX_VERSION_BRANCH_STABLE -->
+The Cortex cluster may be configured by providing a configuration file to `cortex cluster up` or `cortex cluster update` via the `--config` flag \(e.g. `cortex cluster up --config=cluster.yaml`\). Below is the schema for the cluster configuration file, with default values shown \(unless otherwise specified\):
 
 ```yaml
 # cluster.yaml
@@ -83,3 +81,4 @@ image_istio_pilot: cortexlabs/istio-pilot:0.11.0
 image_istio_citadel: cortexlabs/istio-citadel:0.11.0
 image_istio_galley: cortexlabs/istio-galley:0.11.0
 ```
+
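To illustrate how this file is consumed, a cluster configuration passed via `--config` would combine user-set keys with image overrides like the ones shown above. In the sketch below, only the `image_istio_*` lines come from this diff; the instance and scaling keys are assumed names based on the autoscaling page in this commit, not lines from the schema shown here.

```yaml
# cluster.yaml (sketch; instance_type, min_instances, and max_instances are
# assumed key names, and the values are illustrative)
instance_type: m5.large
min_instances: 2
max_instances: 5
image_istio_citadel: cortexlabs/istio-citadel:0.11.0
image_istio_galley: cortexlabs/istio-galley:0.11.0
```

Such a file would be applied with `cortex cluster up --config=cluster.yaml`.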

docs/cluster/security.md renamed to docs/cluster-management/security.md

+4 −3

@@ -8,7 +8,7 @@ If you are not using a sensitive AWS account and do not have a lot of experience
 
 The operator requires read permissions for any S3 bucket containing exported models, read and write permissions for the Cortex S3 bucket, read and write permissions for the Cortex CloudWatch log group, and read and write permissions for CloudWatch metrics. The policy below may be used to restrict the Operator's access:
 
-```json
+```javascript
 {
     "Version": "2012-10-17",
     "Statement": [
@@ -43,8 +43,9 @@ In order to connect to the operator via the CLI, you must provide valid AWS cred
 
 ## API access
 
-By default, your Cortex APIs will be accessible to all traffic. You can restrict access using AWS security groups. Specifically, you will need to edit the security group with the description: "Security group for Kubernetes ELB <ELB name> (istio-system/apis-ingressgateway)".
+By default, your Cortex APIs will be accessible to all traffic. You can restrict access using AWS security groups. Specifically, you will need to edit the security group with the description: "Security group for Kubernetes ELB \(istio-system/apis-ingressgateway\)".
 
 ## HTTPS
 
-All APIs are accessible via HTTPS. The certificate is autogenerated during installation using `localhost` as the Common Name (CN). Therefore, clients will need to skip certificate verification (e.g. `curl -k`) when using HTTPS.
+All APIs are accessible via HTTPS. The certificate is autogenerated during installation using `localhost` as the Common Name \(CN\). Therefore, clients will need to skip certificate verification \(e.g. `curl -k`\) when using HTTPS.
+

docs/cluster/uninstall.md renamed to docs/cluster-management/uninstall.md

+2 −1

@@ -4,7 +4,7 @@
 
 1. [AWS credentials](aws.md)
 2. [Docker](https://docs.docker.com/install)
-3. [Cortex CLI](install.md)
+3. [Cortex CLI](../install.md)
 4. [AWS CLI](https://aws.amazon.com/cli)
 
 ## Uninstalling Cortex
@@ -34,3 +34,4 @@ aws s3 rb --force s3://<bucket-name>
 # delete the log group
 aws logs describe-log-groups --log-group-name-prefix=<log_group_name> --query logGroups[*].[logGroupName] --output text | xargs -I {} aws logs delete-log-group --log-group-name {}
 ```
+

docs/cluster/update.md renamed to docs/cluster-management/update.md

+1 −2

@@ -15,8 +15,6 @@ cortex cluster update
 
 ## Upgrading to a newer version of Cortex
 
-<!-- CORTEX_VERSION_MINOR -->
-
 ```bash
 # spin down your cluster
 cortex cluster down
@@ -30,3 +28,4 @@ cortex version
 # spin up your cluster
 cortex cluster up
 ```
+

docs/development.md renamed to docs/contributing/development.md

+14 −13

@@ -1,11 +1,11 @@
-# Development Environment
+# Development
 
 ## Prerequisites
 
-1. Go (>=1.12.9)
-1. Docker
-1. eksctl
-1. kubectl
+1. Go \(&gt;=1.12.9\)
+2. Docker
+3. eksctl
+4. kubectl
 
 ## Cortex Dev Environment
 
@@ -135,23 +135,24 @@ path/to/cortex/bin/cortex deploy
 If you're making changes in the operator and want faster iterations, you can run an off-cluster operator.
 
 1. `make operator-stop` to stop the in-cluster operator
-1. `make devstart` to run the off-cluster operator (which rebuilds the CLI and restarts the Operator when files change)
-1. `path/to/cortex/bin/cortex configure` (on a separate terminal) to configure your cortex CLI to use the off-cluster operator. When prompted for operator URL, use `http://localhost:8888`
+2. `make devstart` to run the off-cluster operator \(which rebuilds the CLI and restarts the Operator when files change\)
+3. `path/to/cortex/bin/cortex configure` \(on a separate terminal\) to configure your cortex CLI to use the off-cluster operator. When prompted for operator URL, use `http://localhost:8888`
 
 Note: `make cortex-up-dev` will start Cortex without installing the operator.
 
 If you want to switch back to the in-cluster operator:
 
 1. `<ctrl+C>` to stop your off-cluster operator
-1. `make operator-start` to install the operator in your cluster
-1. `path/to/cortex/bin/cortex configure` to configure your cortex CLI to use the in-cluster operator. When prompted for operator URL, use the URL shown when running `make cortex-info`
+2. `make operator-start` to install the operator in your cluster
+3. `path/to/cortex/bin/cortex configure` to configure your cortex CLI to use the in-cluster operator. When prompted for operator URL, use the URL shown when running `make cortex-info`
 
 ## Dev Workflow
 
 1. `make cortex-up-dev`
-1. `make devstart`
-1. Make changes
-1. `make registry-dev`
-1. Test your changes with projects in `examples` or your own
+2. `make devstart`
+3. Make changes
+4. `make registry-dev`
+5. Test your changes with projects in `examples` or your own
 
 See `Makefile` for additional dev commands
+

docs/dependencies/python-packages.md renamed to docs/dependency-management/python-packages.md

+4 −3

@@ -2,7 +2,7 @@
 
 ## PyPI packages
 
-You can install your required PyPI packages and import them in your Python files. Cortex looks for a `requirements.txt` file in the top level Cortex project directory (i.e. the directory which contains `cortex.yaml`):
+You can install your required PyPI packages and import them in your Python files. Cortex looks for a `requirements.txt` file in the top level Cortex project directory \(i.e. the directory which contains `cortex.yaml`\):
 
 ```text
 ./iris-classifier/
@@ -12,7 +12,7 @@ You can install your required PyPI packages and import them in your Python files
 └── requirements.txt
 ```
 
-Note that some packages are pre-installed by default (see [predictor](../deployments/predictor.md) or [request handlers](../deployments/request-handlers.md) depending on which runtime you're using).
+Note that some packages are pre-installed by default \(see [predictor](../deployments/predictor.md) or [request handlers](../deployments/request-handlers.md) depending on which runtime you're using\).
 
 ## Private packages on GitHub
 
@@ -28,7 +28,7 @@ You can generate a personal access token by following [these steps](https://help
 
 ## Project files
 
-Cortex makes all files in the project directory (i.e. the directory which contains `cortex.yaml`) available to request handlers. Python bytecode files (`*.pyc`, `*.pyo`, `*.pyd`), files or folders that start with `.`, and `cortex.yaml` are excluded.
+Cortex makes all files in the project directory \(i.e. the directory which contains `cortex.yaml`\) available to request handlers. Python bytecode files \(`*.pyc`, `*.pyo`, `*.pyd`\), files or folders that start with `.`, and `cortex.yaml` are excluded.
 
 The contents of the project directory are available in `/mnt/project/` in the API containers. For example, if this is your project directory:
 
@@ -53,3 +53,4 @@ def pre_inference(sample, signature, metadata):
     print(config)
     ...
 ```
+

docs/dependencies/system-packages.md renamed to docs/dependency-management/system-packages.md

+4 −3

@@ -1,8 +1,8 @@
 # System packages
 
-Cortex uses Docker images to deploy your models. These images can be replaced with custom images that you can augment with your system packages and libraries. You will need to push your custom images to a container registry that your cluster has access to (e.g. [Docker Hub](https://hub.docker.com/) or [AWS ECR](https://aws.amazon.com/ecr/)).
+Cortex uses Docker images to deploy your models. These images can be replaced with custom images that you can augment with your system packages and libraries. You will need to push your custom images to a container registry that your cluster has access to \(e.g. [Docker Hub](https://hub.docker.com/) or [AWS ECR](https://aws.amazon.com/ecr/)\).
 
-See the `image paths` section in [cluster configuration](../cluster/config.md) for a complete list of customizable images.
+See the `image paths` section in [cluster configuration](../cluster-management/config.md) for a complete list of customizable images.
 
 ## Create a custom image
 
@@ -14,7 +14,7 @@ mkdir my-api && cd my-api && touch Dockerfile
 
 Specify the base image you want to override followed by your customizations. The sample Dockerfile below inherits from Cortex's Python serving image and installs the `tree` system package.
 
-```dockerfile
+```text
 # Dockerfile
 
 FROM cortexlabs/predictor-serve
@@ -79,3 +79,4 @@ def predict(sample, metadata):
     subprocess.run(["tree"])
     ...
 ```
+

docs/deployments/autoscaling.md

+2 −1

@@ -8,4 +8,5 @@ Cortex adjusts the number of replicas that are serving predictions by monitoring
 
 ## Autoscaling Nodes
 
-Cortex spins up and down nodes based on the aggregate resource requests of all APIs. The number of nodes will be at least `min_instances` and no more than `max_instances` (configured during installation and modifiable via `cortex cluster update` or the [AWS console](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-manual-scaling.html)).
+Cortex spins up and down nodes based on the aggregate resource requests of all APIs. The number of nodes will be at least `min_instances` and no more than `max_instances` \(configured during installation and modifiable via `cortex cluster update` or the [AWS console](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-manual-scaling.html)\).
+

docs/cluster/cli.md renamed to docs/deployments/cli.md

+1 −0

@@ -186,3 +186,4 @@ Usage:
 Flags:
   -h, --help   help for completion
 ```
+

docs/deployments/compute.md

+5 −5

@@ -11,22 +11,22 @@ For example:
   cpu: 1
   gpu: 1
   mem: 1G
-
 ```
 
-CPU, GPU, and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the API will only be scheduled once 1 CPU, 1 GPU, and 1G of memory are available on any instance, and the deployment will be guaranteed to have access to those resources throughout its execution. In some cases, resource requests can be (or may default to) `Null`.
+CPU, GPU, and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the API will only be scheduled once 1 CPU, 1 GPU, and 1G of memory are available on any instance, and the deployment will be guaranteed to have access to those resources throughout its execution. In some cases, resource requests can be \(or may default to\) `Null`.
 
 ## CPU
 
-One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix (`0.2` and `200m` are equivalent).
+One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix \(`0.2` and `200m` are equivalent\).
 
 ## Memory
 
-One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` (or their power-of-two counterparts: `Ki`, `Mi`, `Gi`, `Ti`). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.
+One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` \(or their power-of-two counterparts: `Ki`, `Mi`, `Gi`, `Ti`\). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.
 
 ## GPU
 
 1. Make sure your AWS account is subscribed to the [EKS-optimized AMI with GPU Support](https://aws.amazon.com/marketplace/pp/B07GRHFXGM).
 2. You may need to [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type.
-3. Set instance type to an AWS GPU instance (e.g. p2.xlarge) when installing Cortex.
+3. Set instance type to an AWS GPU instance \(e.g. p2.xlarge\) when installing Cortex.
 4. Note that one unit of GPU corresponds to one virtual GPU on AWS. Fractional requests are not allowed.
+
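As a worked example of the units described on this page, the following compute block (illustrative values only) requests a fifth of a virtual CPU using the "m" suffix and roughly 129 MB of memory using a power-of-two suffix:

```yaml
# sketch: fractional CPU and power-of-two memory suffixes
compute:
  cpu: 200m   # equivalent to 0.2 virtual CPUs
  mem: 123Mi  # 123 * 1024 * 1024 = 128974848 bytes, roughly 129M
```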

docs/deployments/deployments.md

+1 −0

@@ -15,3 +15,4 @@ Deployments are used to group a set of APIs that are deployed together. It must
 - kind: deployment
   name: my_deployment
 ```
+

docs/deployments/onnx.md

+4 −3

@@ -26,7 +26,7 @@ Deploy ONNX models as web services.
   mem: <string>  # memory request per replica (default: Null)
 ```
 
-See [packaging ONNX models](../packaging/onnx.md) for information about exporting ONNX models.
+See [packaging ONNX models](../packaging-models/onnx.md) for information about exporting ONNX models.
 
 ## Example
 
@@ -45,6 +45,7 @@ See [packaging ONNX models](../packaging/onnx.md) for information about exportin
 You can log information about each request by adding a `?debug=true` parameter to your requests. This will print:
 
 1. The raw sample
-2. The value after running the `pre_inference` function (if provided)
+2. The value after running the `pre_inference` function \(if provided\)
 3. The value after running inference
-4. The value after running the `post_inference` function (if provided)
+4. The value after running the `post_inference` function \(if provided\)
+
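For reference, an api entry following the ONNX schema excerpted above might look like the sketch below; only the `mem` line appears in this diff, while the `model` key is an assumption based on the 0.11 docs and the name and S3 path are hypothetical.

```yaml
# sketch of an ONNX api entry (assumed 0.11 schema; hypothetical name and model path)
- kind: api
  name: my-onnx-api
  model: s3://my-bucket/model.onnx
  compute:
    mem: 1G  # memory request per replica (default: Null)
```

Appending `?debug=true` to a request against such an API would print the four values enumerated above.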

docs/deployments/prediction-monitoring.md

+1 −0

@@ -24,3 +24,4 @@ For classification models, the tracker should be configured with `model_type: cl
 tracker:
   model_type: classification
 ```
+
