# model id specified in dockerfile #23

Open: wants to merge 4 commits into `main`.
## Dockerfile (4 additions, 0 deletions)

```diff
@@ -11,6 +11,10 @@ RUN pip3 install --upgrade pip
 ADD requirements.txt requirements.txt
 RUN pip3 install -r requirements.txt
 
+# In this example, we can define the Hugging Face model as an ENV variable
+# and from here pass it to download.py & app.py
+ENV HF_MODEL_NAME bert-base-uncased
+
 # We add the banana boilerplate here
 ADD server.py .
```
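`ENV` values are visible both to later build steps (so a `RUN python3 download.py` step further down this Dockerfile, assuming one exists in the collapsed portion, bakes the matching weights into the image) and to the running container, where app.py reads the same variable. Outside Docker nothing sets the variable and `os.getenv` returns `None`, so a fallback default is a cheap safeguard. A minimal sketch, not part of this diff:

```python
import os

# Hypothetical hardening (not in this PR): fall back to the checkpoint the
# Dockerfile defaults to when the variable is missing, e.g. when the
# scripts are run outside the container without an `export`.
hf_model_name = os.getenv("HF_MODEL_NAME", "bert-base-uncased")
```

Overriding at run time with `docker run -e HF_MODEL_NAME=...` also works, but the new model's weights are then downloaded at startup instead of being baked into the image at build time.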
## README.md (33 additions, 0 deletions)

````diff
@@ -17,4 +17,37 @@ Generalize this framework to [deploy anything on Banana](https://docs.banana.dev
 
 <br>
 
+# Local testing
+
+## With Docker
+
+To test the Serverless Framework locally with Docker, you need to build the Docker image and then run it.
+From the root of this repository, run:
+```
+docker build . -t serverless-template
+```
+Then you can run the container. Here we also publish the port, so the server is reachable outside the
+container, and enable GPU acceleration.
+```
+docker run -p 8000:8000 --gpus=all serverless-template
+```
+
+## Without Docker
+
+Testing your code without Docker is straightforward. Remember to pass the Hugging Face model name as
+an environment variable. In this case:
+```
+export HF_MODEL_NAME=bert-base-uncased
+```
+Make sure you have the required dependencies:
+```
+pip3 install -r requirements.txt
+```
+Then simply run server.py:
+```
+python3 server.py
+```
+
+<br>
+
 ## Use Banana for scale.
````
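Once the server is up, by either route above, it helps to smoke-test it. This diff does not show server.py, so the request shape below is an assumption: the template convention of a POST to `/` with a JSON body carrying a `prompt` field. A fill-mask prompt must also contain the model's mask token, `[MASK]` for bert-base-uncased.

```python
import requests

# Hypothetical smoke test; assumes server.py accepts POST / with a JSON
# body like {"prompt": ...} and answers with fill-mask predictions.
payload = {"prompt": "The capital of France is [MASK]."}
resp = requests.post("http://localhost:8000/", json=payload)
resp.raise_for_status()
print(resp.json())
```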
## app.py (5 additions, 1 deletion)

```diff
@@ -1,13 +1,17 @@
 from transformers import pipeline
 import torch
+import os
 
 # Init is run on server startup
 # Load your model to GPU as a global variable here using the variable name "model"
 def init():
     global model
 
+    # In this example, we get the model name as an ENV variable defined in the Dockerfile
+    hf_model_name = os.getenv("HF_MODEL_NAME")
+
     device = 0 if torch.cuda.is_available() else -1
-    model = pipeline('fill-mask', model='bert-base-uncased', device=device)
+    model = pipeline('fill-mask', model=hf_model_name, device=device)
 
 # Inference is run for every server call
 # Reference your preloaded global model variable here.
```
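The inference half of app.py is collapsed here, and the PR leaves it untouched: the change only affects which checkpoint is loaded, not how the global `model` is called. For orientation, a sketch of the shape such an inference function usually has in this template; the names and the return format are assumptions, not taken from the diff:

```python
# Assumed shape of the collapsed inference(); not shown in this diff.
def inference(model_inputs: dict) -> dict:
    global model

    prompt = model_inputs.get("prompt")
    if prompt is None:
        return {"message": "No prompt provided"}

    # A fill-mask pipeline requires the tokenizer's mask token in the
    # input, e.g. "[MASK]" for bert-base-uncased.
    return {"result": model(prompt)}
```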
## download.py (6 additions, 1 deletion)

```diff
@@ -4,10 +4,15 @@
 # In this example: A Hugging Face BERT model
 
 from transformers import pipeline
+import os
 
 def download_model():
+
+    # In this example, we get the model name as an ENV variable defined in the Dockerfile
+    hf_model_name = os.getenv("HF_MODEL_NAME")
+
     # do a dry run of loading the Hugging Face model, which will download weights
-    pipeline('fill-mask', model='bert-base-uncased')
+    pipeline('fill-mask', model=hf_model_name)
 
 if __name__ == "__main__":
     download_model()
```
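One consequence of reading the name from the environment: if `HF_MODEL_NAME` is unset, `os.getenv` returns `None`, and `pipeline('fill-mask', model=None)` quietly falls back to the task's default checkpoint, so the image would bake in weights nobody asked for. A fail-fast variant, sketched here rather than taken from the PR:

```python
import os

from transformers import pipeline

def download_model():
    hf_model_name = os.getenv("HF_MODEL_NAME")
    # Fail loudly instead of letting pipeline() fall back to a default
    # checkpoint when the variable is missing (hypothetical hardening).
    if not hf_model_name:
        raise RuntimeError("HF_MODEL_NAME is not set; define it in the Dockerfile or export it")
    pipeline('fill-mask', model=hf_model_name)

if __name__ == "__main__":
    download_model()
```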