Commit 950f88a

Adding a Dockerfile for devcontainer (run-llama#43)
* add Dockerfile for devcontainer
* add nodejs installation
* more deps needed for node.js installation
* fix
* fix
* fix
* fix
* fix
* update README + add Codespace origin if running in codespaces
* use docker compose yaml
* rename file
* fix allow_origins set to include codespace origin
* use features + remove workspace compose yaml
* apt-get update
* remove npm & poetry install
* add nodejs feature
* address variety of more issues with running on codespaces + issues with make migrate
* add one more step to backend readme
* remove loose console log
1 parent cf5338b commit 950f88a

9 files changed (+80 -8 lines changed)

.devcontainer/Dockerfile

+13
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,13 @@
1+
# https://hub.docker.com/_/python
2+
FROM python:3.11.3-slim-bullseye
3+
4+
ENV PYTHONUNBUFFERED True
5+
# Install other backend deps
6+
RUN apt-get update
7+
RUN apt-get install libpq-dev gcc build-essential wkhtmltopdf s3fs -y
8+
RUN pip install poetry==1.6.1
9+
# Install frontend node modules
10+
ENV APP_HOME /app
11+
COPY . $APP_HOME
12+
13+
CMD ["/bin/bash"]

.devcontainer/devcontainer.json

+7-2
@@ -1,10 +1,15 @@
11
{
2-
"image": "mcr.microsoft.com/devcontainers/universal:2",
2+
"name": "sec_insights",
3+
"build": {
4+
"dockerfile": "./Dockerfile",
5+
"context": ".."
6+
},
37
"features": {
48
"ghcr.io/devcontainers-contrib/features/pipx-package:1": {},
59
"ghcr.io/devcontainers-contrib/features/poetry:2": {},
610
"ghcr.io/warrenbuckley/codespace-features/sqlite:1": {},
711
"ghcr.io/devcontainers/features/docker-in-docker:2": {},
8-
"ghcr.io/devcontainers/features/aws-cli:1": {}
12+
"ghcr.io/devcontainers/features/aws-cli:1": {},
13+
"ghcr.io/devcontainers/features/node:1": {}
914
}
1015
}

backend/Makefile

+2
@@ -13,6 +13,8 @@ migrate:
1313
docker compose create db
1414
docker compose start db
1515
poetry run python -m alembic upgrade head
16+
# workaround for having PGVector create its tables
17+
poetry run python -m scripts.build_vector_tables
1618

1719
refresh_db:
1820
# First ask for confirmation.

backend/README.md

+19-5
@@ -3,7 +3,9 @@ Live at https://secinsights.ai/
33
## Setup Dev Workspace
44
1. Install [pyenv](https://github.com/pyenv/pyenv#automatic-installer) and then use it to install the Python version in `.python-version`.
55
1. install pyenv with `curl https://pyenv.run | bash`
6+
* This step can be skipped if you're running from the devcontainer image in GitHub Codespaces
67
1. [Install docker](https://docs.docker.com/engine/install/)
8+
* This step can be skipped if you're running from the devcontainer image in GitHub Codespaces
79
1. Run `poetry shell`
810
1. Run `poetry install` to install dependencies for the project
911
1. Create the `.env` file and source it. The `.env.development` file is a good template.
@@ -17,6 +19,13 @@ Live at https://secinsights.ai/
1719
- This spins up the Postgres 15 DB & Localstack in their own docker containers.
1820
- The server will not run in a container but will instead run directly on your OS.
1921
- This is to allow for use of debugging tools like `pdb`
22+
1. Lastly, you will likely want to populate your local database with some sample SEC filings
23+
- We have a script for this! But first, open your `.env` file and replace the placeholder value for `OPENAI_API_KEY` with your own OpenAI API key
24+
- At some point you will want to do the same for the other secret keys in here like `POLYGON_IO_API_KEY`, `AWS_KEY`, & `AWS_SECRET`
25+
- Source the file again with `set -a` then `source .env`
26+
- Run `make seed_db_local`
27+
- If this step fails, you may find it helpful to run `make refresh_db` to wipe your local database and restart with empty tables.
28+
- Done 🏁! You can run `make run` again and you should see some documents loaded at http://localhost:8000/api/document
2029

2130
## Scripts
2231
The `scripts/` folder contains several scripts that are useful for both operations and development.
@@ -71,23 +80,28 @@ These steps assume you've already followed the steps above for setting up your d
7180

7281
1. Setup AWS CLI
7382
1. Install AWS CLI
74-
- `curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"`
75-
- `unzip awscliv2.zip`
76-
- `sudo ./aws/install`
83+
- This step can be skipped if you're running from the devcontainer image in GitHub Codespaces
84+
- Steps:
85+
- `curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"`
86+
- `unzip awscliv2.zip`
87+
- `sudo ./aws/install`
7788
1. Configure AWS CLI
7889
- This is mainly to set the AWS credentials that will later be used by s3fs
7990
- Run `aws configure` and enter the access key & secret key for an AWS IAM user that has access to the PDFs where you want to store the SEC files.
8091
- set the default AWS region to `us-east-1` (what we're primarily using).
8192
1. Setup [`s3fs`](https://github.com/s3fs-fuse/s3fs-fuse)
8293
1. Install s3fs
94+
- This step can be skipped if you're running from the devcontainer image in GitHub Codespaces
8395
- `sudo apt install s3fs`
8496
1. Setup a s3fs mounted folder
8597
- Create the mounted folder locally `mkdir ~/mounted_folder`
8698
- `s3fs llama-app-web-assets-preview ~/mounted_folder`
8799
- You can replace `llama-app-web-assets-preview` with the name of the S3 bucket you want to upload the files to.
88100
1. Install [`wkhtmltopdf`](https://wkhtmltopdf.org/)
89-
- `sudo apt-get update`
90-
- `sudo apt-get install wkhtmltopdf`
101+
- This step can be skipped if you're running from the devcontainer image in GitHub Codespaces
102+
- Steps:
103+
- `sudo apt-get update`
104+
- `sudo apt-get install wkhtmltopdf`
91105
1. Get into your poetry shell with `poetry shell` from the project's root directory.
92106
1. Run the script! `python scripts/download_sec_pdf.py -o ~/mounted_folder --file-types="['10-Q','10-K']"`
93107
- Take a 🚽 break while it's running, it'll take a while!

backend/app/core/config.py

+2
@@ -70,6 +70,8 @@ class Settings(PreviewPrefixedSettings):
7070
LOG_LEVEL: str = "DEBUG"
7171
IS_PULL_REQUEST: bool = False
7272
RENDER: bool = False
73+
CODESPACES: bool = False
74+
CODESPACE_NAME: Optional[str] = None
7375
S3_BUCKET_NAME: str
7476
S3_ASSET_BUCKET_NAME: str
7577
CDN_BASE_URL: str

backend/app/main.py

+6-1
@@ -97,10 +97,15 @@ async def lifespan(app: FastAPI):
9797

9898

9999
if settings.BACKEND_CORS_ORIGINS:
100+
origins = settings.BACKEND_CORS_ORIGINS.copy()
101+
if settings.CODESPACES and settings.CODESPACE_NAME and \
102+
settings.ENVIRONMENT == AppEnvironment.LOCAL:
103+
# add the Codespace origin when running in GitHub Codespaces
104+
origins.append(f"https://{settings.CODESPACE_NAME}-3000.app.github.dev")
100105
# allow all origins
101106
app.add_middleware(
102107
CORSMiddleware,
103-
allow_origins=settings.BACKEND_CORS_ORIGINS,
108+
allow_origins=origins,
104109
allow_origin_regex=r"https://llama-app-frontend.*\.vercel\.app",
105110
allow_credentials=True,
106111
allow_methods=["*"],
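
The origin handling in this hunk can be tried in isolation. The following is a minimal sketch, not the project's code: `build_cors_origins` and its parameter names are hypothetical, and only the `https://{CODESPACE_NAME}-3000.app.github.dev` pattern is taken from the diff. It copies the configured list before appending, as the hunk does with `.copy()`, so the settings value is not mutated.

```python
from typing import List, Optional


def build_cors_origins(
    base_origins: List[str],
    codespaces: bool,
    codespace_name: Optional[str],
    is_local: bool,
) -> List[str]:
    """Return a copy of base_origins, plus the Codespaces frontend origin if applicable.

    All names here are illustrative; the real code reads these values from `settings`.
    """
    origins = list(base_origins)  # copy so the caller's list is left untouched
    if codespaces and codespace_name and is_local:
        # Port 3000 is the frontend port used in the diff's URL pattern.
        origins.append(f"https://{codespace_name}-3000.app.github.dev")
    return origins


if __name__ == "__main__":
    print(build_cors_origins(["http://localhost:3000"], True, "my-space", True))
```

Outside a Codespace, or with no codespace name set, the function returns the configured origins unchanged.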
+19
@@ -0,0 +1,19 @@
1+
from fire import Fire
2+
from app.schema import Document
3+
from app.db.session import SessionLocal
4+
from app.chat.pg_vector import get_vector_store_singleton
5+
import asyncio
6+
7+
async def build_vector_tables():
8+
vector_store = await get_vector_store_singleton()
9+
await vector_store.run_setup()
10+
11+
12+
def main_build_vector_tables():
13+
"""
14+
Script to build the PGVector tables if they don't already exist
15+
"""
16+
asyncio.run(build_vector_tables())
17+
18+
if __name__ == "__main__":
19+
Fire(main_build_vector_tables)
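
The script above wraps an async setup call in a synchronous entry point via `asyncio.run`. The sketch below shows that pattern with a stub, so it runs without a database. `FakeVectorStore` is a hypothetical stand-in, and its "create only if missing" behavior is an assumption based on the docstring, not a statement about the real vector store.

```python
import asyncio


class FakeVectorStore:
    """Hypothetical stand-in for the real vector store (assumption, not the project's class)."""

    def __init__(self) -> None:
        self.setup_calls = 0
        self.tables_created = False

    async def run_setup(self) -> None:
        # Assumed behavior: create the tables only when they are missing.
        self.setup_calls += 1
        if not self.tables_created:
            self.tables_created = True


async def build_vector_tables(store: FakeVectorStore) -> None:
    """Async step: ask the store to set up its tables."""
    await store.run_setup()


def main(store: FakeVectorStore) -> None:
    """Synchronous entry point, suitable for a CLI wrapper or a Makefile target."""
    asyncio.run(build_vector_tables(store))


if __name__ == "__main__":
    demo_store = FakeVectorStore()
    main(demo_store)
    main(demo_store)  # under the stated assumption, a second run is harmless
    print(demo_store.tables_created, demo_store.setup_calls)
```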

frontend/src/config.js

+8
@@ -1,3 +1,11 @@
11
import { env } from "~/env.mjs";
2+
3+
if (env.NEXT_PUBLIC_CODESPACES === 'true' && env.NEXT_PUBLIC_CODESPACE_NAME) {
4+
const suggestedUrl = `https://${env.NEXT_PUBLIC_CODESPACE_NAME}-8000.app.github.dev/`;
5+
if (!env.NEXT_PUBLIC_BACKEND_URL.startsWith(suggestedUrl)) {
6+
console.warn(`It looks like you're running on a GitHub Codespace. You may want to set the NEXT_PUBLIC_BACKEND_URL environment variable to ${suggestedUrl}`);
7+
}
8+
}
9+
210
export const backendUrl = env.NEXT_PUBLIC_BACKEND_URL;
311

frontend/src/env.mjs

+4
@@ -17,6 +17,8 @@ export const env = createEnv({
1717
*/
1818
client: {
1919
NEXT_PUBLIC_BACKEND_URL: z.string().min(1),
20+
NEXT_PUBLIC_CODESPACES: z.string().default("false").optional(),
21+
NEXT_PUBLIC_CODESPACE_NAME: z.string().optional(),
2022
},
2123

2224
/**
@@ -26,6 +28,8 @@ export const env = createEnv({
2628
runtimeEnv: {
2729
NODE_ENV: process.env.NODE_ENV,
2830
NEXT_PUBLIC_BACKEND_URL: process.env.NEXT_PUBLIC_BACKEND_URL,
31+
NEXT_PUBLIC_CODESPACES: process.env.CODESPACES,
32+
NEXT_PUBLIC_CODESPACE_NAME: process.env.CODESPACE_NAME,
2933
},
3034
/**
3135
* Run `build` or `dev` with `SKIP_ENV_VALIDATION` to skip env validation.
