Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logging #1

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

Logging #1

wants to merge 11 commits into from

Conversation

tekrajchhetri
Copy link
Collaborator

@tekrajchhetri tekrajchhetri commented Apr 16, 2024

This is a FastAPI skeleton that currently implements logging features.

@djarecka @aaronkanzer

@aaronkanzer
Copy link

@tekrajchhetri Thanks for tagging me -- let me know if/when you'd like any feedback! (saw you were still pushing commits so didn't want to review anything yet)

@tekrajchhetri
Copy link
Collaborator Author

@aaronkanzer There are no more changes to be made for the logging. So, I would say you can now review and give me feedback so that I could improve if any! And thanks in advance for your time to review

@aaronkanzer
Copy link

aaronkanzer commented Apr 16, 2024

@aaronkanzer There are no more changes to be made for the logging. So, I would say you can now review and give me feedback so that I could improve if any! And thanks in advance for your time to review

Sounds good -- I'll review today -- just so I have context and can review best, what is your overall goal for your PR here? Is this PR a starting point for a larger Python-based API you are building out? Or something else? (I noticed there are also some schema definitions here via SQLAlchemy, so just wanted to ask)

I see some context in your README.md, but are you eventually setting up this API to handle requests regarding LLM-related functionality?

@tekrajchhetri
Copy link
Collaborator Author

@aaronkanzer Yes, the goal is to build larger Python based API (or microservices application) for the work that I am doing. And since there will be multiple microservices components, I wanted to setup a skeleton so that I can re-use it.

Yes, you understood correctly. There will be both ML and Non-ML components. For ML, e.g., LLM, I would re-use this skeleton to create new one making changes as required. Therefore, I would say that this is mostly for non-ml, although most of the things shouldn't change, in my view.

Here's a very-high level initial architecture, which should provide you more context about the work.

image

@tekrajchhetri tekrajchhetri added the enhancement New feature or request label Apr 16, 2024
@aaronkanzer
Copy link

aaronkanzer commented Apr 16, 2024

@aaronkanzer Yes, the goal is to build larger Python based API (or microservices application) for the work that I am doing. And since there will be multiple microservices components, I wanted to setup a skeleton so that I can re-use it.

Yes, you understood correctly. There will be both ML and Non-ML components. For ML, e.g., LLM, I would re-use this skeleton to create new one making changes as required. Therefore, I would say that this is mostly for non-ml, although most of the things shouldn't change, in my view.

Here's a very-high level initial architecture, which should provide you more context about the work.

image

Thanks for this architecture diagram -- a few of thoughts/questions after reading your code:

• I'd re-evaluate how you go about authentication/permissions -- managed services like AWS Cognito will allow you to focus more on the actual science you are doing here (Knowledge Graph exploration with DANDI, etc.) instead of things like authentication which doesn't need to be custom-coded unless there is something special about your application. I think going the managed route will save you a lot of time and headache.

• I'd strongly suggest evaluating an architecture that leverages serverless technologies (look into AWS API Gateway + Lambdas + Cognito) -- LLM-related endpoints require a lot of bug-free code to not time out/obstruct other operations of your server if you are not in a serverless framework. AWS Lamdbas have even implemented streaming support so you can provide quicker feedback in your UI for LLM-related activities that may take more time behind the scenes

• Do you have plans for a vectorized DB to best handle context for your LLM-related operations? (something like Pinecone is what I'm thinking). The reason I ask is so that you can disregard SQLAlchemy (and, by association, Alembic for your downstream ORM operations) and use something more simplistic and faster for anything that wouldn't fit into a Pinecone instance (such as DynamoDB)

Let me know if you have any further questions/thoughts -- happy to talk further.

(Cc @satra @hvgazula -- just tagging you here for context of some of the challenges that currently present themselves with packaging web apps around LLMs)

requirements.txt Outdated
@@ -0,0 +1,22 @@
fastapi==0.110.1
uvicorn[standard]==0.29.0
sqlalchemy

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You pinned other versions, but not this -- any reason why?

# @File : index.py
# @Software: PyCharm
from fastapi import APIRouter
from fastapi import Header

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you are using this import here -- can probably remove

core/main.py Outdated
logger = logging.getLogger(__name__)

app = FastAPI()
configure_logging()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can throw this call in one of the FastAPI lifecycle events upon startup

https://fastapi.tiangolo.com/advanced/events/#startup-event

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aaronkanzer. I will look into your comments and update the code accordingly.

@tekrajchhetri
Copy link
Collaborator Author

@aaronkanzer I have updated the code as per your comments.

  • configure_logging() will be triggered on startup
  • removed unused import - Header
  • sqlalchemy and other JWT related library have been removed as they're there for JWT based authentication.

Comment on lines +19 to +43
import datetime

import sqlalchemy as sa

# https://docs.sqlalchemy.org/en/20/core/metadata.html
metadata = sa.MetaData()

# JWT User Table
jwt_user_table = sa.Table(
"jwt_user",
metadata,
sa.Column("id", sa.Integer, primary_key=True),
sa.Column("email", sa.String, unique=True, nullable=False),
sa.Column("password", sa.String, nullable=False),
sa.Column(
"created_at", sa.DateTime, nullable=False, default=datetime.datetime.utcnow
),
sa.Column(
"updated_at",
sa.DateTime,
nullable=False,
default=datetime.datetime.utcnow,
onupdate=datetime.datetime.utcnow,
),
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed SQLAlchemy -- so I don't think this file is necessary at this time? Unless you had other intentions for it?

requirements.txt Outdated
@@ -0,0 +1,15 @@
fastapi==0.110.1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What commands are you running locally to install your requirements? Are you doing this via a local virtual environment? Docker container?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will containerize the application but later will create a separate pull request. For now I am running in a local virtual environment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems you are using poetry as well (via the pyproject.toml file present in your PR) -- it might be useful to just use poetry to manage your dependencies as well here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I was using pyproject.toml for pycharm config but now I have moved all the dependencies to the poetry.

@aaronkanzer aaronkanzer mentioned this pull request Apr 22, 2024
@djarecka
Copy link

djarecka commented Apr 24, 2024

@tekrajchhetri - I don't really have experience with it, so @aaronkanzer is much better as the code reviewer. However I would be happy to test something if I get some instructions. This might also give me better idea, what you are implementing here

@aaronkanzer
Copy link

@tekrajchhetri I think this PR and #2 can be merged! We can continue to iterate here as more of your application code is written

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants