Feat/al2 updates #421

Open
wants to merge 48 commits into
base: master
Changes from 17 commits
Commits
6f6feb5
(PPS-710): use new branches of gen3datamodel and related repos
george42-ctds Aug 6, 2024
1e8c563
(PPS-710): replace 'gdcdatamodel' with 'gen3datamodel'
george42-ctds Aug 6, 2024
4d291f9
(PPS-710): replace 'gdcdatamodel' with 'gen3datamodel'
george42-ctds Aug 6, 2024
5d36356
(PPS-710): replace 'gdcdatamodel' with 'gen3datamodel'
george42-ctds Aug 6, 2024
0504877
(PPS-710): replace 'gdcdatamodel' with 'gen3datamodel'
george42-ctds Aug 6, 2024
643b424
adding nginx and gunicorn to sheepdog image
EliseCastle23 Aug 12, 2024
b608fa7
Update settings.py
jawadqur Aug 12, 2024
33e328f
adding nginx and gunicorn to sheepdog image
jawadqur Aug 13, 2024
31ae1de
Update nginx.conf
jawadqur Aug 27, 2024
eb0c9d8
(PPS-710): re-lock to pick up changes in datamodelutils
george42-ctds Aug 28, 2024
691bd70
Stagger start of gunicorn workers, flip nginx/gunicorn starts
george42-ctds Sep 12, 2024
8a7ef28
Set preload_app = False
george42-ctds Sep 12, 2024
b5df998
Merge branch 'chore/update-gen3datamodel' into feat/al2-updates
george42-ctds Sep 12, 2024
dcc5856
adding updates to pyproject.toml file
EliseCastle23 Oct 21, 2024
9f34a8d
updating case in dockerfile
EliseCastle23 Oct 22, 2024
e23f689
Merge branch 'master' into feat/al2-updates
EliseCastle23 Oct 22, 2024
138e2b9
re-lock
george42-ctds Oct 22, 2024
0da46be
updating base image to python nginx
EliseCastle23 Nov 1, 2024
52d331d
Merge branch 'feat/al2-updates' of https://github.com/uc-cdis/sheepdo…
EliseCastle23 Nov 1, 2024
a68808c
updating lock file
EliseCastle23 Nov 1, 2024
077de4d
modifying settings.py
EliseCastle23 Nov 7, 2024
fb71416
updating lock file
EliseCastle23 Nov 7, 2024
769e392
fixing build error
EliseCastle23 Nov 7, 2024
fcb68f9
updating poetry lock
EliseCastle23 Nov 7, 2024
da2df89
testing installing packages to fix build error
EliseCastle23 Nov 8, 2024
62055f0
moving command
EliseCastle23 Nov 8, 2024
fef6157
moving command again
EliseCastle23 Nov 8, 2024
bea25df
test
EliseCastle23 Nov 8, 2024
43a1e6b
test adding gcc only
EliseCastle23 Nov 8, 2024
fc144e4
testing
EliseCastle23 Nov 8, 2024
cbd124a
testing python devel and gcc
EliseCastle23 Nov 8, 2024
a618457
python3-devel postgresql-devel gcc
EliseCastle23 Nov 8, 2024
4e534bd
sheepdog app init updates
EliseCastle23 Nov 8, 2024
3e779c0
updating settings
EliseCastle23 Nov 8, 2024
d6c3203
adding extra import
EliseCastle23 Nov 8, 2024
dda2860
pinning versions and updating variable syntax
EliseCastle23 Nov 12, 2024
3d2bf6a
adding "postgresql-libs" to ensure the "libpq.so.5" library is added
EliseCastle23 Nov 13, 2024
e8bb8de
making suggested updates
EliseCastle23 Nov 15, 2024
b8085df
removing commented command
EliseCastle23 Nov 15, 2024
5fc125f
editing nginx command
EliseCastle23 Nov 15, 2024
375eee9
adding back sleep
EliseCastle23 Nov 15, 2024
2e562ba
fixing syntax error
EliseCastle23 Nov 15, 2024
b52a6d9
adding comment
EliseCastle23 Nov 15, 2024
7308e9e
Update dockerrun.bash
EliseCastle23 Nov 18, 2024
ed548c4
Necessary for flask apps internally. Kept per Alex VanTol's recommend…
piotrsenkow Dec 20, 2024
b7ccea1
Merge branch 'master' into feat/al2-updates
piotrsenkow Dec 20, 2024
cfa0e33
adding newly generate poetry.lock file after modifying pyproject.toml…
piotrsenkow Dec 20, 2024
53bf9d5
Merge branch 'feat/al2-updates' of github.com:uc-cdis/sheepdog into f…
piotrsenkow Dec 20, 2024
4 changes: 2 additions & 2 deletions .secrets.baseline
Original file line number Diff line number Diff line change
@@ -136,7 +136,7 @@
"filename": "bin/settings.py",
"hashed_secret": "347cd9c53ff77d41a7b22aa56c7b4efaf54658e3",
"is_verified": false,
"line_number": 43
"line_number": 54
}
],
"docs/local_dev_environment.md": [
@@ -354,5 +354,5 @@
}
]
},
"generated_at": "2024-04-22T20:07:28Z"
"generated_at": "2024-08-12T21:26:18Z"
}
128 changes: 75 additions & 53 deletions Dockerfile
@@ -1,65 +1,87 @@
# To run:
# - Create and fill out `creds.json`:
# {
# "fence_host": "",
# "fence_username": "",
# "fence_password": "",
# "fence_database": "",
# "db_host": "",
# "db_username": "",
# "db_password": "",
# "db_database": "",
# "gdcapi_secret_key": "",
# "indexd_password": "",
# "hostname": ""
# }
# - Build the image: `docker build . -t sheepdog -f Dockerfile`
# - Run: `docker run -v /full/path/to/creds.json:/var/www/sheepdog/creds.json -p 81:80 sheepdog`
# To check running container: `docker exec -it sheepdog /bin/bash`

FROM quay.io/cdis/python:python3.9-buster-2.0.0
ARG AZLINUX_BASE_VERSION=master

# Base stage with python-build-base
FROM quay.io/cdis/python-build-base:${AZLINUX_BASE_VERSION} AS base

# Comment this in, and comment out the line above, if quay is down
# FROM 707767160287.dkr.ecr.us-east-1.amazonaws.com/gen3/python-build-base:${AZLINUX_BASE_VERSION} as base

ENV appname=sheepdog
ENV POETRY_NO_INTERACTION=1 \
POETRY_VIRTUALENVS_IN_PROJECT=1 \
POETRY_VIRTUALENVS_CREATE=1

WORKDIR /${appname}

# create gen3 user
# Create a group 'gen3' with GID 1000 and a user 'gen3' with UID 1000
RUN groupadd -g 1000 gen3 && \
useradd -m -s /bin/bash -u 1000 -g gen3 gen3 && \
chown -R gen3:gen3 /$appname && \
chown -R gen3:gen3 /venv


# Builder stage
FROM base AS builder


RUN yum install -y gcc gcc-c++ make postgresql-devel

RUN dnf install python3-devel -y

USER gen3


RUN python -m venv /venv

COPY poetry.lock pyproject.toml /${appname}/

RUN pip install poetry && \
poetry install -vv --only main --no-interaction

COPY --chown=gen3:gen3 . /$appname

# Run poetry again so this app itself gets installed too
RUN poetry install --without dev --no-interaction

RUN git config --global --add safe.directory /${appname} && COMMIT=`git rev-parse HEAD` && echo "COMMIT=\"${COMMIT}\"" > /$appname/version_data.py \
&& VERSION=`git describe --always --tags` && echo "VERSION=\"${VERSION}\"" >> /$appname/version_data.py

# Final stage
FROM base

COPY --from=builder /venv /venv
COPY --from=builder /$appname /$appname

# install nginx
RUN yum install nginx postgresql-devel -y

# allow nginx to bind to port 80
RUN setcap 'cap_net_bind_service=+ep' /usr/sbin/nginx

RUN pip install --upgrade pip poetry
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential libffi-dev musl-dev gcc libxml2-dev libxslt-dev \
curl bash git vim
# chown nginx directories
RUN chown -R gen3:gen3 /var/log/nginx

RUN mkdir -p /var/www/$appname \
&& mkdir -p /var/www/.cache/Python-Eggs/ \
&& mkdir /run/nginx/ \
&& ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log \
&& chown nginx -R /var/www/.cache/Python-Eggs/ \
&& chown nginx /var/www/$appname
# pipe nginx logs to stdout and stderr
RUN ln -sf /dev/stdout /var/log/nginx/access.log && ln -sf /dev/stderr /var/log/nginx/error.log

EXPOSE 80
# create /var/lib/nginx/tmp/client_body to allow nginx to write to fence
RUN mkdir -p /var/lib/nginx/tmp/client_body
RUN chown -R gen3:gen3 /var/lib/nginx/

WORKDIR /$appname
# copy nginx config
COPY ./deployment/nginx/nginx.conf /etc/nginx/nginx.conf

# copy ONLY poetry artifact, install the dependencies but not indexd
# this will make sure than the dependencies is cached
COPY poetry.lock pyproject.toml /$appname/
RUN poetry config virtualenvs.create false \
&& poetry install -vv --no-root --no-dev --no-interaction \
&& poetry show -v

# copy source code ONLY after installing dependencies
COPY . /$appname
COPY ./deployment/uwsgi/uwsgi.ini /etc/uwsgi/uwsgi.ini
COPY ./bin/settings.py /var/www/$appname/settings.py
COPY ./bin/confighelper.py /var/www/$appname/confighelper.py
# Switch to non-root user 'gen3' for the serving process
USER gen3

# install sheepdog
RUN poetry config virtualenvs.create false \
&& poetry install -vv --no-dev --no-interaction \
&& poetry show -v
RUN source /venv/bin/activate

RUN COMMIT=`git rev-parse HEAD` && echo "COMMIT=\"${COMMIT}\"" >$appname/version_data.py \
&& VERSION=`git describe --always --tags` && echo "VERSION=\"${VERSION}\"" >>$appname/version_data.py
ENV PYTHONUNBUFFERED=1 \
PYTHONIOENCODING=UTF-8

WORKDIR /var/www/$appname
WORKDIR /var/www/${appname}

RUN ls
CMD /dockerrun.sh
CMD ["/sheepdog/dockerrun.bash"]
# CMD ["gunicorn", "-c", "/sheepdog/deployment/wsgi/gunicorn.conf.py"]
2 changes: 1 addition & 1 deletion README.md
@@ -22,7 +22,7 @@ import sheepdog
import datamodelutils
from dictionaryutils import dictionary
from gdcdictionary import gdcdictionary
from gdcdatamodel import models, validators
from gen3datamodel import models, validators

dictionary.init(gdcdictionary)
datamodelutils.validators.init(validators)
2 changes: 1 addition & 1 deletion bin/confighelper.py
@@ -49,6 +49,6 @@ def load_json(file_name, app_name, search_folders=None):
"""
actual_files = find_paths(file_name, app_name, search_folders)
if not actual_files:
return None
return {}
with open(actual_files[0], "r") as reader:
return json.load(reader)
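
The switch from `return None` to `return {}` matters because `bin/settings.py` in this PR reads every value with `conf_data.get(...)`. A minimal sketch of the difference, assuming a hypothetical `load_json_stub` in place of the real `load_json` (the stub, its `found` flag, and the example host name are illustrative only and not part of the PR):

```python
def load_json_stub(found, empty_value):
    """Hypothetical stand-in for confighelper.load_json.

    Returns a parsed config when the file is found, otherwise `empty_value`
    (None before this change, {} after it).
    """
    if not found:
        return empty_value
    return {"db_host": "db.example.internal"}


def db_host(conf_data):
    # Mirrors the lookup style used in bin/settings.py after this PR.
    return conf_data.get("db_host", "localhost")


# New behaviour: a missing creds.json yields {}, so .get() falls back cleanly.
new_style = db_host(load_json_stub(found=False, empty_value={}))

# Old behaviour: a missing creds.json yields None, and .get() raises.
try:
    db_host(load_json_stub(found=False, empty_value=None))
    old_style_error = None
except AttributeError as exc:
    old_style_error = type(exc).__name__
```

With the empty dict, a container started without a mounted `creds.json` can still build its configuration from environment variables and defaults.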
45 changes: 29 additions & 16 deletions bin/settings.py
@@ -1,6 +1,6 @@
from sheepdog.api import app, app_init
from os import environ
import confighelper
import os
from . import confighelper

APP_NAME = "sheepdog"

@@ -22,39 +22,52 @@ def load_json(file_name):

# Signpost: deprecated, replaced by index client.
config["SIGNPOST"] = {
"host": environ.get("SIGNPOST_HOST") or "http://indexd-service",
"host": os.environ.get("SIGNPOST_HOST", "http://indexd-service"),
"version": "v0",
"auth": ("gdcapi", conf_data.get("indexd_password", "{{indexd_password}}")),
}
config["INDEX_CLIENT"] = {
"host": environ.get("INDEX_CLIENT_HOST") or "http://indexd-service",
"host": os.environ.get("INDEX_CLIENT_HOST") or "http://indexd-service",
"version": "v0",
"auth": ("gdcapi", conf_data.get("indexd_password", "{{indexd_password}}")),
}
config["FAKE_AUTH"] = False
config["PSQLGRAPH"] = {
"host": conf_data["db_host"],
"user": conf_data["db_username"],
"password": conf_data["db_password"],
"database": conf_data["db_database"],
"host": conf_data.get("db_host", os.environ.get("PGHOST", "localhost")),
"user": conf_data.get("db_username", os.environ.get("PGUSER", "sheepdog")),
"password": conf_data.get("db_password", os.environ.get("PGPASSWORD", "sheepdog")),
"database": conf_data.get("db_database", os.environ.get("PGDB", "sheepdog")),
}

config["FLASK_SECRET_KEY"] = conf_data.get("gdcapi_secret_key", "{{gdcapi_secret_key}}")
config["PSQL_USER_DB_CONNECTION"] = "postgresql://%s:%s@%s:5432/%s" % tuple(
[
conf_data.get(key, key)
for key in ["fence_username", "fence_password", "fence_host", "fence_database"]
]

fence_username = conf_data.get(
"fence_username", os.environ.get("FENCE_DB_USER", "fence")
)
fence_password = conf_data.get(
"fence_password", os.environ.get("FENCE_DB_PASS", "fence")
)
fence_host = conf_data.get("fence_host", os.environ.get("FENCE_DB_HOST", "localhost"))
fence_database = conf_data.get(
"fence_database", os.environ.get("FENCE_DB_DATABASE", "fence")
)
config["PSQL_USER_DB_CONNECTION"] = "postgresql://%s:%s@%s:5432/%s" % (
fence_username,
fence_password,
fence_host,
fence_database,
)

config["USER_API"] = "https://%s/user" % conf_data["hostname"] # for use by authutils
config["USER_API"] = "https://%s/user" % conf_data.get(
"hostname", os.environ.get("CONF_HOSTNAME", "localhost")
) # for use by authutils
# use the USER_API URL instead of the public issuer URL to acquire JWT keys
config["FORCE_ISSUER"] = True
config["DICTIONARY_URL"] = environ.get(
config["DICTIONARY_URL"] = os.environ.get(
"DICTIONARY_URL",
"https://s3.amazonaws.com/dictionary-artifacts/datadictionary/develop/schema.json",
)

app_init(app)
application = app
application.debug = environ.get("GEN3_DEBUG") == "True"
application.debug = os.environ.get("GEN3_DEBUG") == "True"
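
The lookups in this file follow one precedence order: the value from `creds.json` wins, then the environment variable, then a hard-coded default. A small sketch of that order, assuming a hypothetical helper `resolve` (the helper and the example values are not in the PR). It also shows that the two host lookups in the diff, `os.environ.get(NAME, default)` for `SIGNPOST` and `os.environ.get(NAME) or default` for `INDEX_CLIENT`, behave differently when the variable is set but empty:

```python
import os


def resolve(conf_data, key, env_var, default):
    """Return conf_data[key] if present, else $env_var, else default."""
    return conf_data.get(key, os.environ.get(env_var, default))


# Hypothetical environment, set up for the demonstration only.
os.environ["PGHOST"] = "pg.example.internal"
os.environ["SIGNPOST_HOST"] = ""  # set, but empty
os.environ.pop("SHEEPDOG_DEMO_UNSET_VAR", None)

from_file = resolve({"db_host": "from-creds"}, "db_host", "PGHOST", "localhost")
from_env = resolve({}, "db_host", "PGHOST", "localhost")
from_default = resolve({}, "db_username", "SHEEPDOG_DEMO_UNSET_VAR", "sheepdog")

# An empty variable is returned as-is by the default-argument form,
# while the `or` form treats it as missing and uses the fallback.
with_default_arg = os.environ.get("SIGNPOST_HOST", "http://indexd-service")
with_or = os.environ.get("SIGNPOST_HOST") or "http://indexd-service"
```

If an empty `SIGNPOST_HOST` should fall back to `http://indexd-service`, the `or` form is the one that does so.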
2 changes: 1 addition & 1 deletion bin/setup_psqlgraph.py
@@ -3,7 +3,7 @@
import logging
from sqlalchemy import create_engine

from gdcdatamodel.models import *
from gen3datamodel.models import *
from psqlgraph import create_all, Node, Edge


2 changes: 1 addition & 1 deletion bin/setup_transactionlogs.py
@@ -5,7 +5,7 @@

import argparse
from sqlalchemy import create_engine
from gdcdatamodel.models.submission import Base
from gen3datamodel.models.submission import Base


def setup(host, port, user, password, database, use_ssl=False):
44 changes: 44 additions & 0 deletions deployment/nginx/nginx.conf
@@ -0,0 +1,44 @@
user gen3;
worker_processes auto;
error_log /var/log/nginx/error.log notice;
pid /var/lib/nginx/nginx.pid;

# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;

events {
worker_connections 1024;
}

http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';

access_log /var/log/nginx/access.log main;

sendfile on;
tcp_nopush on;
keepalive_timeout 65;
types_hash_max_size 4096;

include /etc/nginx/mime.types;
default_type application/octet-stream;

# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/*.conf;

server {

listen 80;
server_name localhost;

location / {
proxy_pass http://127.0.0.1:8000;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
}
}
32 changes: 0 additions & 32 deletions deployment/uwsgi/uwsgi.ini

This file was deleted.

17 changes: 17 additions & 0 deletions deployment/wsgi/gunicorn.conf.py
@@ -0,0 +1,17 @@
# /sheepdog/bin/settings.py
wsgi_app = "bin.settings:application"
bind = "0.0.0.0:8000"
workers = 4
preload_app = False
user = "gen3"
group = "gen3"
timeout = 300
keepalive = 2
keepalive_timeout = 5

import random
import time


def pre_fork(server, worker):
time.sleep(random.uniform(0, 2))

Can we add some comments about the reasoning for this? And is there a reason we do this here but not in other services like indexd?
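
For context on the question above: the commit message "Stagger start of gunicorn workers" suggests the hook exists to spread worker start-up over time, though the PR does not say why staggering is needed. Gunicorn calls `pre_fork` in the master process just before each worker is forked, so each sleep delays the next fork and the delays add up. A rough simulation of the resulting start offsets, assuming a hypothetical function `simulated_start_offsets` (no gunicorn involved):

```python
import random


def simulated_start_offsets(workers=4, max_delay=2.0, seed=None):
    """Return the cumulative delay (seconds) before each worker is forked.

    Models a master that sleeps random.uniform(0, max_delay) before every
    fork, as the pre_fork hook in gunicorn.conf.py does.
    """
    rng = random.Random(seed)
    offsets = []
    elapsed = 0.0
    for _ in range(workers):
        elapsed += rng.uniform(0, max_delay)
        offsets.append(elapsed)
    return offsets


offsets = simulated_start_offsets(workers=4, max_delay=2.0, seed=0)
```

With `workers = 4` and a maximum delay of 2 seconds, the last worker is forked at most 8 seconds after start-up, which fits inside the `sleep 30` in `dockerrun.bash`.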

1 change: 1 addition & 0 deletions deployment/wsgi/wsgi.py
@@ -0,0 +1 @@
from sheepdog.api import app as application
5 changes: 5 additions & 0 deletions dockerrun.bash
@@ -0,0 +1,5 @@
#!/bin/bash

gunicorn -c "/sheepdog/deployment/wsgi/gunicorn.conf.py" &
sleep 30
nginx -g 'daemon off;'
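
The fixed `sleep 30` assumes gunicorn is accepting connections on port 8000 within 30 seconds. As a point of comparison only, and not something this PR does, a readiness check could poll the port instead of waiting a fixed time. A minimal sketch, assuming a hypothetical helper `wait_for_port`:

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds or timeout elapses.

    Returns True once a connection is accepted, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("127.0.0.1", 8000)` before starting nginx would return as soon as the bind address from `gunicorn.conf.py` accepts a connection. Note that an accepted connection only shows the socket is listening, not that the Flask app has finished initialising.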
2 changes: 1 addition & 1 deletion docs/local_dev_environment.md
@@ -195,7 +195,7 @@ For convenience, the minimal usage looks like the following:
import datamodelutils
from dictionaryutils import dictionary
from gdcdictionary import gdcdictionary
from gdcdatamodel import models, validators
from gen3datamodel import models, validators
from flask import Flask
import sheepdog
