Interview Transcriber #7

Open · wants to merge 38 commits into base: main

Changes from all commits (38 commits)
32fb872
Created db structure in models and schemas
colettebas Mar 15, 2022
382c815
created crug and main methods, no testing
colettebas Mar 15, 2022
bb4c981
New workflow file created
sdub18 Mar 23, 2022
445d1cd
Added docker config files
sdub18 Mar 23, 2022
3abc828
Resolved error with the format
sdub18 Mar 23, 2022
bcad4a4
removed testing line
sdub18 Mar 23, 2022
7f562c8
fixed syntax error
sdub18 Mar 23, 2022
14bf558
remove artifict code . not sure what it does
sdub18 Mar 23, 2022
3b21294
Create docker-image.yml
sdub18 Mar 23, 2022
8fe0ad3
made repo name all lowercase
sdub18 Mar 23, 2022
6125077
Merge pull request #9 from UMass-Rescue/packaging
sdub18 Mar 23, 2022
5a458c1
Some small upgrades to the docker deploy workflow
sdub18 Mar 23, 2022
0d06c7d
renamed repo to try and create package
sdub18 Mar 23, 2022
705f876
Update README.md
sdub18 Mar 23, 2022
336a50b
Think i might have fixed it
sdub18 Mar 23, 2022
8cbe0ff
Merge branch 'main' of https://github.com/UMass-Rescue/596-S22-Backend
sdub18 Mar 23, 2022
711bf40
repo name change
sdub18 Mar 23, 2022
5b81ec0
This one should definitely work
sdub18 Mar 23, 2022
05a3df6
Added package name to the repo
sdub18 Mar 23, 2022
2885ed4
added to the docker file
sdub18 Mar 24, 2022
26db7a3
Added depends on clause to the docker compose
sdub18 Mar 24, 2022
ab3ba31
Merge pull request #10 from UMass-Rescue/docker-alembic
sdub18 Mar 24, 2022
5e37e76
added alembic file to Dockerfile
sdub18 Mar 24, 2022
d5a4ca6
Merge pull request #11 from UMass-Rescue/docker-alembic
sdub18 Mar 24, 2022
5111c48
Added alembic files to the requirements.txt and also changed location…
sdub18 Mar 24, 2022
e7bf663
Trying to get this startup file to run and check alembic. Not really …
sdub18 Mar 24, 2022
e5a1571
Merge pull request #12 from UMass-Rescue/docker-alembic
sdub18 Mar 24, 2022
c03f644
Trying to change tag, may break
sdub18 Mar 24, 2022
f9cedb0
Merge pull request #13 from UMass-Rescue/docker-alembic
sdub18 Mar 24, 2022
bf92622
changed tag to tagfs
sdub18 Mar 24, 2022
a13eddf
Update README.md
sdub18 Mar 25, 2022
e47ef69
nearly got it working but startup.sh is failing on the mysqladmin part
sdub18 Mar 25, 2022
37eaf5b
commented out the alembic stuff for right now. realized it was using …
sdub18 Mar 25, 2022
fbce755
trying to use pg_isready to check the server
sdub18 Mar 25, 2022
eef82ef
Got it workinggit add .
sdub18 Mar 25, 2022
6e03850
removed unwanted code
sdub18 Mar 25, 2022
1e029dc
Merge pull request #14 from UMass-Rescue/auto-alembic
sdub18 Mar 25, 2022
8f05c79
resolved issues with merge conflicts. looking over file additions now
sdub18 Mar 27, 2022
25 changes: 25 additions & 0 deletions .github/workflows/docker-image-deploy.yml
@@ -0,0 +1,25 @@

name: Docker Image CD

on:
push:
branches: [ main ]
pull_request:
branches: [ main ]

jobs:
Build-and-Push-Image:
runs-on: ubuntu-latest
name: Docker Build, Tag, Push

steps:
- name: Checkout
uses: actions/checkout@v1
- name: Build container image
uses: docker/build-push-action@v1
with:
username: ${{github.actor}}
password: ${{secrets.GITHUB_TOKEN}}
registry: docker.pkg.github.com
repository: umass-rescue/596-s22-backend/backend
tags: latest
18 changes: 18 additions & 0 deletions .github/workflows/docker-image.yml
@@ -0,0 +1,18 @@
name: Docker Image CI

on:
push:
branches: [ main ]
pull_request:
branches: [ main ]

jobs:

build:

runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- name: Build the Docker image
run: docker build . --file Dockerfile --tag my-image-name:$(date +%s)
11 changes: 10 additions & 1 deletion Dockerfile
@@ -4,6 +4,15 @@ WORKDIR /rescue

COPY requirements.txt requirements.txt

RUN apt-get update && apt-get install -f -y postgresql-client

RUN pip install --no-cache-dir --upgrade -r /rescue/requirements.txt

COPY ./app /rescue/app
COPY alembic/ ./alembic/
COPY alembic.ini .
COPY startup.sh .

COPY ./app /rescue/app

# Run alembic configuration
CMD ["./startup.sh"]
14 changes: 12 additions & 2 deletions README.md
@@ -21,10 +21,20 @@ Backend Repo for 596RL Spring 2022
- SQLAlchemy (Locally for database manipulation)
```

### 🤝 How to Run the Package in Another Repo
To learn how to run the backend in YOUR repo, follow this [guide](https://www.notion.so/Setting-Up-Backend-Docker-Container-dd2ce1e805e84b44a245991c96d46591).


### 👀 Configuring Secrets
To connect to the database successfully, we use a `.env` file, which you will need to create in the backend directory with `touch .env`. This file is untracked in our repo. Paste the backend secrets into it; DM Sam DuBois and he will send you the file.

```
# PostgreSQL Container Secrets
POSTGRES_USER=
POSTGRES_PASSWORD=
POSTGRES_DB=
```
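For reference, a minimal sketch of how these variables could be turned into a SQLAlchemy connection URL inside the app; the repo's actual database module may do this differently. The host name `db` matches the service name in `docker-compose.yml`, and 5432 is the Postgres default port.

```
# Sketch only: build the connection URL from the .env-provided secrets.
import os

from sqlalchemy import create_engine

user = os.environ["POSTGRES_USER"]
password = os.environ["POSTGRES_PASSWORD"]
database = os.environ["POSTGRES_DB"]

DATABASE_URL = f"postgresql://{user}:{password}@db:5432/{database}"
engine = create_engine(DATABASE_URL)
```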

### 🚀 How to Run the Container

@@ -34,8 +44,8 @@ Make sure Docker and the PostgreSQL library are working.

From there, all you need to do to run the container is:
```
docker-compose up -d --build
docker-compose up
docker compose build
docker compose up
```
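Once the containers are up, a quick smoke test from the host can confirm the API is reachable; this assumes FastAPI's default `/docs` route has not been disabled and that port 8000 is published as in `docker-compose.yml`.

```
# Sketch only: check that the server container answers on localhost:8000.
import requests

resp = requests.get("http://localhost:8000/docs")
print(resp.status_code)  # expect 200 once the server is healthy
```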

### 🧪 Useful Tools for Debugging and Testing
2 changes: 1 addition & 1 deletion alembic.ini
@@ -52,7 +52,7 @@ version_path_separator = os # Use os.pathsep. Default configuration used for ne
# are written from script.py.mako
# output_encoding = utf-8

sqlalchemy.url = postgresql://username:password@localhost:5432/default_database
sqlalchemy.url = postgresql://username:password@db:5432/default_database


[post_write_hooks]
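This change points Alembic at the `db` service name from `docker-compose.yml` instead of `localhost`, so migrations run inside the server container can reach Postgres. As an alternative to keeping credentials in `alembic.ini`, the URL could be injected from the environment in `alembic/env.py`; this is only a sketch, and `DATABASE_URL` is a hypothetical variable name, not something this PR defines.

```
# alembic/env.py (excerpt, sketch only): override sqlalchemy.url at runtime.
import os

from alembic import context

config = context.config
if os.environ.get("DATABASE_URL"):
    config.set_main_option("sqlalchemy.url", os.environ["DATABASE_URL"])
```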
76 changes: 75 additions & 1 deletion app/crud.py
@@ -119,4 +119,78 @@ def get_footage_for_plate_id(license_plate_number: str, db: Session, skip: int =
if plate.footage_id not in data:
data[plate.footage_id] = db.query(models.LicenseFootage).filter(models.LicenseFootage.id == plate.footage_id).offset(skip).limit(limit).all()[0]

return list(data.values())
return list(data.values())

# # Create Interview object with audio file name
# ##THIS IS ASSUMING AUDIO FILES HAS ALREADY BEEN SAVED TO DATABASE
# def create_interview(audio_filename: str, db: Session):

# # Insert code to retrieve audio file from db

# ####################
# ## YOUR CODE HERE ##
# ####################

# # return type - audio file of interview
# # NEED TO FIGURE OUT IF AUDIO FILE OR PATH OF FILE SHOULD BE RETURNED

# #following this post: https://stackoverflow.com/questions/64558200/python-requests-in-docker-compose-containers
# path = "the path to the server" + "/sendTranscription"
# response = session.post(path, json={"audio_filename": audio_filename})

# #get the full text transcription from the response
# reponse_json = json.loads(response.text)
# full_text = reponse_json['transcription']

# # Add Interview object
# db_message = models.Interview(filename=filename, full_name = interview.full_name,
# created_at = interview.created_at,
# address = interview.address,
# case = interview.case, full_text = full_text)
# db.add(db_message)
# db.commit()
# db.refresh(db_message)

# return db_message

# # Create Interview object with audio file name
# ##THIS IS ASSUMING AUDIO FILES HAS ALREADY BEEN SAVED TO DATABASE
# def analyze_interview(case: int, audio_filename: str, db: Session):

# #get full text for this interview
# full_text = db.query(models.Interview).filter(models.Interview.filename == audio_filename).offset(skip).limit(limit).all()

# #get list of questions for this interview
# questions = db.query(models.Question).filter(models.Question.case == case).offset(skip).limit(limit).all()
# add_questions = db.query(models.Additional_Question).filter(models.Additional_Question.interview_id == question_answer_pair.interview_id).offset(skip).limit(limit).all()
# questions = questions + add_questions

# #following this post: https://stackoverflow.com/questions/64558200/python-requests-in-docker-compose-containers
# path = "the path to the server" + "/analyzeText"
# response = session.post(path, json={"full_text": full_text, "questions": questions})

# #get the json from the response
# reponse_json = json.loads(response.text)
# all_pairs = reponse_json['transcription']

# # Add each pair and NER
# # CAN I ADD MULTIPLE OBJECTS TO THE DB IN ONE COMMIT?
# for pair in all_pairs:
# db_pair_message = models.Question_Answer_Pair(interview_id=interview.id, question=pair['question'], answer=pair['answer'])
# db.add(db_pair_message)
# db.flush() #allows for the primary key of the pair to be generated
# for ner in pair['ner']:
# db_ner_message = models.Answer_NER(question_answer_pair_id=db_pair_message.id, label=ner['label'], answer=pair['answer'])
# db.add(db_ner_message)
# db.commit()
# db.refresh(db_pair_message)

# return all_pairs

# # Get all question and answer pairs for an interview
# def get_question_answer_pairs(interview_id: int, db: Session, skip: int = 0, limit: int = 100):
# return db.query(models.Question_Answer_Pair).filter(models.Question_Answer_Pair.interview_id == interview_id).offset(skip).limit(limit).all()

# # Get all ner objects for an answer
# def get_ners_for_answer(question_answer_pair_id: int, db: Session, skip: int = 0, limit: int = 100):
# return db.query(models.Answer_NER).filter(models.Answer_NER.question_answer_pair_id == question_answer_pair_id).offset(skip).limit(limit).all()
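For reference, a hedged sketch of what `create_interview` might look like once enabled. The transcription service address, its `/sendTranscription` route, and the shape of its JSON response are taken from the comments above and are not confirmed anywhere in this PR; `requests` stands in for the undefined `session` object, and the metadata fields the commented draft reads from a nonexistent `interview` argument are left out.

```
# Sketch only, not part of the PR's enabled code.
from datetime import datetime

import requests
from sqlalchemy.orm import Session

from . import models

TRANSCRIBER_URL = "http://transcriber:8001"  # placeholder address


def create_interview(audio_filename: str, db: Session) -> models.Interview:
    # Ask the (assumed) transcription service for the full text of the audio.
    response = requests.post(
        f"{TRANSCRIBER_URL}/sendTranscription",
        json={"audio_filename": audio_filename},
    )
    response.raise_for_status()
    full_text = response.json()["transcription"]

    # Persist the Interview row and return it with its generated id.
    db_interview = models.Interview(
        filename=audio_filename,
        created_at=datetime.now(),
        full_text=full_text,
    )
    db.add(db_interview)
    db.commit()
    db.refresh(db_interview)
    return db_interview
```

On the question raised in the comments: yes, several `db.add()` calls can share a single `db.commit()`; `db.flush()` is only needed when a generated primary key has to be read before the commit, which is exactly how the draft uses it.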
26 changes: 25 additions & 1 deletion app/main.py
@@ -88,7 +88,31 @@ def get_plates_for_footage_id(footage_id: int, skip: int = 0, limit: int= 100, d
plates = crud.get_license_plates_for_filename(footage_id=footage_id, skip=skip, limit=limit, db=db)
return plates

# Route - GET - get all footage for a specific plate number
@app.get("/licenses/plates/{license_plate_number}", response_model=List[schemas.LicenseFootage])
def get_footage_for_plate_id(license_plate_number: str, skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
footage = crud.get_footage_for_plate_id(license_plate_number=license_plate_number, skip=skip, limit=limit, db=db)
return footage
return footage

# # Route - POST - Create a interview object and add transcribed full text
# @app.post("/transciber/{audio_filename}", response_model=schemas.LicenseFootage)
# def create_interview(license_footage: schemas.CreateLicenseFootageObj, db:Session = Depends(get_db)):
# return crud.create_interview(audio_filename=audio_filename, db=db)

# # Route - POST - Create question_answer_pairs and ners from analyzed text
# #HOW ARE WE GOING TO DEAL WITH CASE NUMBERS? INCLUDE IN POST REQUEST?
# @app.post("/transciber/{audio_filename}/analyze", response_model=schemas.LicenseFootage)
# def create_interview(license_footage: schemas.CreateLicenseFootageObj, db:Session = Depends(get_db)):
# return crud.analyze_interview(case=case, audio_filename=audio_filename, db=db)

# # Route - GET - get all question answer pairs for an interview_id
# @app.get("/transcriber/{interview_id}/question_answer_pairs", response_model=List[schemas.Question_Answer_Pair])
# def get_question_answer_pair_for_interview_id(interview_id: int, skip: int = 0, limit: int= 100, db: Session = Depends(get_db)):
# pairs = crud.get_question_answer_pairs(interview_id=interview_id, skip=skip, limit=limit, db=db)
# return pairs

# # Route - GET - get all ners for an question_answer_pair_id
# @app.get("/transcriber/{question_answer_pair_id}/ners", response_model=List[schemas.Question_Answer_Pair])
# def get_ner_for_question_answer_pair_id(question_answer_pair_id: int, skip: int = 0, limit: int= 100, db: Session = Depends(get_db)):
# ners = crud.get_ners_for_answer(question_answer_pair_id=question_answer_pair_id, skip=skip, limit=limit, db=db)
# return ners
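A hedged sketch of the first commented-out route with its copy-paste issues resolved: the path parameter appears in the function signature, the path is spelled `transcriber` rather than `transciber`, and the `response_model` refers to an interview schema (assumed to be `schemas.Interview`) rather than `schemas.LicenseFootage`.

```
# Sketch only, not part of the PR's enabled code.
@app.post("/transcriber/{audio_filename}", response_model=schemas.Interview)
def create_interview(audio_filename: str, db: Session = Depends(get_db)):
    return crud.create_interview(audio_filename=audio_filename, db=db)
```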
51 changes: 50 additions & 1 deletion app/models.py
@@ -53,4 +53,53 @@ class RecognizedPlate(Base):
time = Column(TIMESTAMP(timezone=False))
footage_id = Column(Integer, ForeignKey("license_footage.id"))

footage = relationship("LicenseFootage", back_populates="recognized_plates")
footage = relationship("LicenseFootage", back_populates="recognized_plates")

class Interview(Base):
__tablename__ = "interview"

id = Column(Integer, primary_key=True, index=True)
filename = Column(String, index=True)
full_name = Column(String, index=True)
created_at = Column(TIMESTAMP(timezone=False))
address = Column(String, index=True)
case = Column(Integer, ForeignKey("license_footage.id"))
full_text = Column(String, index=True)

interview = relationship("Interview", back_populates="interview")

class Question_Answer_Pair(Base):
__tablename__ = "question_answer_pair"

id = Column(Integer, primary_key=True, index=True)
interview_id = Column(Integer, ForeignKey("interview.id")) ##Please check if this is correct
question = Column(String, index=True)
answer = Column(String, index=True)

question_answer_pair = relationship("Question_Answer_Pair", back_populates="question_answer_pair")

class Question(Base):
__tablename__ = "question"

case = Column(Integer, index=True)
question = Column(String, index=True)

question = relationship("Question", back_populates="question")

class Additional_Question(Base):
__tablename__ = "additional_question"

interview_id = Column(Integer, ForeignKey("interview.id")) ##Please check if this is correct
question = Column(String, index=True)

additional_question = relationship("Additional_Question", back_populates="additional_question")

class Answer_NER(Base):
__tablename__ = "answer_ner"

question_answer_pair_id = Column(Integer, ForeignKey("question_answer_pair.id")) ##Please check if this is correct
ner_label = Column(String, index=True)
start_index = Column(Integer, index=True)
end_index = Column(Integer, index=True)

answer_ner = relationship("Answer_NER", back_populates="answer_ner")
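A few things in these new models will trip SQLAlchemy up as written: `Question`, `Additional_Question`, and `Answer_NER` declare no primary key (every mapped class needs one), `Question.question` is defined as a column and then immediately shadowed by a relationship of the same name, and each `relationship(..., back_populates=...)` points a class back at itself instead of at its partner. A hedged sketch of how the Interview / Question_Answer_Pair / Answer_NER chain might be wired instead; column names are kept from the diff, the added `id` columns and relationship names are assumptions, and `Interview.case` is left as a plain integer on the guess that its foreign key to `license_footage.id` was a copy-paste leftover.

```
# Sketch only, reusing the imports already present in app/models.py.
class Interview(Base):
    __tablename__ = "interview"

    id = Column(Integer, primary_key=True, index=True)
    filename = Column(String, index=True)
    full_name = Column(String, index=True)
    created_at = Column(TIMESTAMP(timezone=False))
    address = Column(String, index=True)
    case = Column(Integer, index=True)
    full_text = Column(String, index=True)

    question_answer_pairs = relationship(
        "Question_Answer_Pair", back_populates="interview"
    )

class Question_Answer_Pair(Base):
    __tablename__ = "question_answer_pair"

    id = Column(Integer, primary_key=True, index=True)
    interview_id = Column(Integer, ForeignKey("interview.id"))
    question = Column(String, index=True)
    answer = Column(String, index=True)

    interview = relationship("Interview", back_populates="question_answer_pairs")
    answer_ners = relationship("Answer_NER", back_populates="question_answer_pair")

class Answer_NER(Base):
    __tablename__ = "answer_ner"

    id = Column(Integer, primary_key=True, index=True)
    question_answer_pair_id = Column(Integer, ForeignKey("question_answer_pair.id"))
    ner_label = Column(String, index=True)
    start_index = Column(Integer, index=True)
    end_index = Column(Integer, index=True)

    question_answer_pair = relationship(
        "Question_Answer_Pair", back_populates="answer_ners"
    )
```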
73 changes: 73 additions & 0 deletions app/schemas.py
@@ -85,5 +85,78 @@ class LicenseFootage(LicenseFootageBase):
date_uploaded = datetime.now()
recognized_plates: List[RecognizedPlate] = []

class Config:
orm_mode = True


############ Transcribed Interview Data #############

class InterviewCreate(BaseModel):
filename: str
full_name: str
created_at: datetime = datetime
address: str
case: int

class Interview(BaseModel):
id: int
filename: str
full_name: str
created_at: datetime = datetime
address: str
case: int
full_text: str

class Config:
orm_mode = True

class Question_Answer_PairCreate(BaseModel):
interview_id: int ##I don't know how to link this to another table
question: str
answer: str

class Question_Answer_Pair(BaseModel):
id: int
interview_id: int ##I don't know how to link this to another table
question: str
answer: str

class Config:
orm_mode = True

class QuestionCreate(BaseModel):
case: int
question: str

class Question(BaseModel):
case: int
question: str

class Config:
orm_mode = True

class Additional_QuestionCreate(BaseModel):
interview_id: int ##I don't know how to link this to another table
question: str

class Additional_Question(BaseModel):
interview_id: int ##I don't know how to link this to another table
question: str

class Config:
orm_mode = True

class Answer_NERCreate(BaseModel):
question_answer_pair_id: int ##I don't know how to link this to another table
ner_label: str
start_index: int
end_index: int

class Answer_NER(BaseModel):
question_answer_pair_id: int ##I don't know how to link this to another table
ner_label: str
start_index: int
end_index: int

class Config:
orm_mode = True
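One detail worth flagging in the new schemas: `created_at: datetime = datetime` sets the default to the `datetime` class itself rather than to a timestamp. A hedged sketch of the Interview pair with that fixed; it assumes pydantic v1 (consistent with the `orm_mode` Config used throughout this file), and having `Interview` inherit from `InterviewCreate` is just one way to avoid repeating the shared fields.

```
# Sketch only, not the PR's code.
from datetime import datetime

from pydantic import BaseModel, Field


class InterviewCreate(BaseModel):
    filename: str
    full_name: str
    created_at: datetime = Field(default_factory=datetime.now)
    address: str
    case: int


class Interview(InterviewCreate):
    id: int
    full_text: str

    class Config:
        orm_mode = True
```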
23 changes: 12 additions & 11 deletions docker-compose.yml
@@ -2,24 +2,25 @@ version: '3.8'

# Setting up the PostgreSQL DB Container
services:
db:
image: 'postgres:latest'
restart: always
env_file: # The location we use to share all of our secrets
- .env
volumes:
- ./db-data/:/var/lib/postgresql/data/
ports:
- 5432:5432

server:
build:
context: ./
dockerfile: Dockerfile
depends_on:
- db
volumes:
- ./be-data/:/backend/
command: uvicorn app.main:app --host 0.0.0.0 --port 8000
env_file:
- .env
ports:
- 8000:8000

db:
image: 'postgres:latest'
restart: always
env_file: # The location we use to share all of our secrets
- .env
volumes:
- ./db-data/:/var/lib/postgresql/data/
ports:
- 5432:5432