Commit fb3bfdd

Merge pull request #55 from DigitalProductInnovationAndDevelopment/test/llm
add test for api and llm
2 parents 1c69ac9 + 253ecbe commit fb3bfdd

21 files changed, +685 -147 lines changed

.github/workflows/license-compliance.yml

+2
@@ -25,6 +25,8 @@ jobs:
         run: |
           python -m venv venv
           . venv/bin/activate
+          pip install pipenv
+          pipenv requirements > requirements.txt
           pip install -r requirements.txt

       - name: Check licenses

README.md

+4
@@ -151,6 +151,10 @@ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file
 This Project was created in the context of TUMs practical course [Digital Product Innovation and Development](https://www.fortiss.org/karriere/digital-product-innovation-and-development) by fortiss in the summer semester 2024.
 The task was suggested by Siemens, who also supervised the project.

+## FAQ
+
+Please refer to [FAQ](documentation/FAQ.md)
+
 ## Contact

 To contact fortiss or Siemens, please refer to their official websites.

documentation/00 - introduction.md

+3-1
@@ -67,7 +67,9 @@ For more detailed information about the API routes, refer to the [API Routes](ap

 - **[Prerequisites](01%20-%20prerequisites):** Environment setup and dependencies.
 - **[Installation](02%20-%20installation):** Step-by-step installation guide.
-- **[Usage](04%20-%20usage):** Instructions on how to use the system.
+- **[Usage](03%20-%20usage):** Instructions on how to use the system.
+
+- **[Testing](04-testing):** Provides an explanation of the testing strategy and rules.

 ---

documentation/04-testing.md

+68
@@ -0,0 +1,68 @@
+# Testing
+
+We have a basic testing setup in place that covers database insertion and API retrieval.
+
+The tests are located inside the `src/tests` directory and can be executed using the following command:
+
+```bash
+pipenv run pytest
+# or
+pytest
+```
+
+We also have a GitHub test workflow set up in `test.yml` that runs on every pull request and merge on the main branch to ensure that the tests are passing.
+
+## Adding new Tests
+
+New tests can also be added to the folder `src/tests`.
+
+The files must be prefixed with `test_` for pytest to recognize them.
+
+A testing function should also be created with the prefix `test_`.
+
+e.g.:
+
+```python
+def test_create_get_task_integration():
+    assert 1 + 1 == 2
+```
+
+We use dependency overriding of Repositories with different databases to ensure that they do not interfere with your actual database. The dependencies can be overridden as shown below.
+
+```python
+
+from app import app
+
+
+SQLALCHEMY_DATABASE_URL = "sqlite://"
+
+engine = create_engine(
+    SQLALCHEMY_DATABASE_URL,
+    connect_args={"check_same_thread": False},
+    poolclass=StaticPool,
+)
+TestingSessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
+
+
+db_models.BaseModel.metadata.create_all(bind=engine)
+
+
+def override_get_db():
+    try:
+        db = TestingSessionLocal()
+        yield db
+    finally:
+        db.close()
+
+
+def override_get_task_repository(session: Session = Depends(override_get_db)):
+    return TaskRepository(session)
+
+app.dependency_overrides[get_task_repository] = override_get_task_repository
+
+```
+
+For a comprehensive guide on testing with FastAPI, please refer to the [FastAPI Testing](https://fastapi.tiangolo.com/tutorial/testing/) documentation.
+
+Note:
+It is always challenging to perform integration testing, especially when dealing with LLMs and queues. However, the API endpoints have been thoroughly tested to ensure accurate responses.
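
Review note: the isolation idea described in the added `04-testing.md` can be illustrated with the standard library alone. The sketch below is not the project's code. `InMemoryTaskRepository`, its `tasks` table, and `make_test_repository` are hypothetical stand-ins for the SQLAlchemy-backed `TaskRepository`; the only point is that each test gets a private in-memory SQLite database and never touches a real one.

```python
import sqlite3


class InMemoryTaskRepository:
    """Hypothetical stand-in for the project's TaskRepository."""

    def __init__(self, connection: sqlite3.Connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS tasks (id INTEGER PRIMARY KEY, status TEXT)"
        )

    def create_task(self, status: str) -> int:
        cursor = self.connection.execute(
            "INSERT INTO tasks (status) VALUES (?)", (status,)
        )
        self.connection.commit()
        return cursor.lastrowid

    def get_task_by_id(self, task_id: int):
        return self.connection.execute(
            "SELECT id, status FROM tasks WHERE id = ?", (task_id,)
        ).fetchone()


def make_test_repository() -> InMemoryTaskRepository:
    # ":memory:" creates a private database that disappears when the
    # connection is closed, so a test cannot touch a real database.
    return InMemoryTaskRepository(sqlite3.connect(":memory:"))


def test_create_get_task_integration():
    repository = make_test_repository()
    task_id = repository.create_task("pending")
    assert repository.get_task_by_id(task_id) == (task_id, "pending")
    assert repository.get_task_by_id(task_id + 1) is None
```

Because the function name starts with `test_`, pytest would collect it if the file itself were also named with the `test_` prefix.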

documentation/FAQ.md

+15-7
@@ -1,25 +1,29 @@
 # Frequently Asked Questions

-A compiled list of FAQs that may come in handy.
+A compiled list of frequently asked questions that may come in handy.

-## Import ot found?
+## Import not found?

-If you are getting an "import not found" error, it is likely because the base folder is always `src`. Always run the program from the `src` folder or use the commands inside the Makefile.
+If you are encountering an "import not found" error, it is likely because the base folder is always `src`. Make sure to run the program from the `src` folder or use the commands inside the Makefile.

-If you have something in `src/lib` and want to use it, import it as follows:
+If you have something in `src/lib` and want to use it, import it as shown below:

 ```python
 import lib # or
 from lib import ...
 ```

+# Why use POST call to retrieve recommendations and aggregated recommendations?
+
+For these calls, we require a JSON type body with at least `{}`. This allows us to handle nested filters such as `task_id` and `severity` inside the filters. Using query parameters and GET calls would be less suitable for this purpose. However, one can modify the pathname, like changing `get-`, to make it more convenient.
+
 # Why are env.docker and .env different?

-If you are only running the program using Docker, then you only need to worry about `.env.docker`.
+If you are running the program exclusively using Docker, then you only need to concern yourself with `.env.docker`.

-As the addresses tend to be different in a Docker environment compared to a local environment, you need different values to resolve the addresses.
+Since the addresses can differ between a Docker environment and a local environment, you need different values to resolve the addresses.

-For example, if you have your program outside Docker (locally) and want to access a database, you may use:
+For example, if your program is outside Docker (locally) and you want to access a database, you may use:

 ```
 POSTGRES_SERVER=localhost
@@ -50,3 +54,7 @@ We have a predefined structure that input must adhere to called `Content`. You c

 Inside the db model called `Findings`, there is a method `from_data` which can be modified to adapt the changes.
 `VulnerablityReport` also has `create_from_flama_json` that must be adjusted accordingly to make sure the Generation side also works.
+
+# Does it make sense to Mock LLM?
+
+While we don't strive for accuracy, it would still make sense to mock LLM methods to ensure that methods for finding and interacting properly with LLM class methods work correctly. Nevertheless, it is still difficult to extract meaningful test outputs based on only prompts as input.
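
Review note: the new FAQ entry on mocking the LLM could be made concrete along the following lines. This is a minimal sketch using `unittest.mock` from the standard library. `build_recommendation` and the `generate` method are hypothetical names chosen for illustration, not the project's actual LLM interface. The test checks the plumbing around the LLM call and deliberately makes no claim about the quality of generated text.

```python
from unittest.mock import Mock


def build_recommendation(llm_service, finding_title: str) -> dict:
    """Hypothetical caller: asks the LLM for a solution and wraps the answer."""
    prompt = f"Suggest a fix for: {finding_title}"
    answer = llm_service.generate(prompt)
    return {"finding": finding_title, "solution": answer.strip()}


def test_build_recommendation_uses_llm():
    # The mock replaces the real LLM, so the test is fast and deterministic.
    llm_service = Mock()
    llm_service.generate.return_value = "  Upgrade the dependency.  "

    result = build_recommendation(llm_service, "Outdated OpenSSL")

    # Verify the prompt that was sent and how the answer was post-processed.
    llm_service.generate.assert_called_once_with(
        "Suggest a fix for: Outdated OpenSSL"
    )
    assert result == {
        "finding": "Outdated OpenSSL",
        "solution": "Upgrade the dependency.",
    }
```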

requirements.txt

-25
This file was deleted.

run

-4
This file was deleted.

run.sh

-4
This file was deleted.

src/data/VulnerabilityReport.py

+1-1
@@ -30,7 +30,7 @@ def set_llm_service(self, llm_service: "LLMServiceStrategy"):
             finding.llm_service = llm_service
         return self

-    def add_finding(self, finding):
+    def add_finding(self, finding: Finding):
         self.findings.append(finding)

     def get_findings(self):

src/data/repository/finding.py

-2
This file was deleted.

src/db/models.py

+8-6
@@ -102,23 +102,25 @@ class Finding(BaseModel):

     def from_data(self, data: Content):
         self.cve_id_list = (
-            [x.dict() for x in data.cve_id_list] if data.cve_id_list else []
+            [x.model_dump() for x in data.cve_id_list] if data.cve_id_list else []
         )
         self.description_list = (
-            [x.dict() for x in data.description_list] if data.description_list else []
+            [x.model_dump() for x in data.description_list]
+            if data.description_list
+            else []
         )
-        self.title_list = [x.dict() for x in data.title_list]
+        self.title_list = [x.model_dump() for x in data.title_list]
         self.locations_list = (
-            [x.dict() for x in data.location_list] if data.location_list else []
+            [x.model_dump() for x in data.location_list] if data.location_list else []
         )
-        self.raw_data = data.dict()
+        self.raw_data = data.model_dump()
         self.severity = data.severity
         self.priority = data.priority
         self.report_amount = data.report_amount
         return self

     def __repr__(self):
-        return f"<Finding {self.finding}>"
+        return f"<Finding {self.title_list}>"


 class TaskStatus(PyEnum):

src/repository/finding.py

+4-5
@@ -76,9 +76,9 @@ def get_findings_by_task_id_and_filter(
         )

         total = query.count()
-        findings = query.all()
         if pagination:
             query = query.offset(pagination.offset).limit(pagination.limit)
+        findings = query.all()

         return findings, total

@@ -95,10 +95,9 @@ def get_findings_count_by_task_id(self, task_id: int) -> int:

         return count

-    def create_findings(
-        self, findings: list[db_models.Finding]
-    ) -> list[db_models.Finding]:
-        self.session.bulk_save_objects(findings)
+    def create_findings(self, findings: list[db_models.Finding]):
+        self.session.add_all(findings)
+
         self.session.commit()

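
Review note: the first hunk moves `findings = query.all()` below the pagination step. Before the change the rows were fetched before `offset` and `limit` were applied, so pagination had no effect on the returned list. The ordering (count first, paginate, then fetch) can be sketched with the standard library's `sqlite3`. The `findings` table and `get_findings_page` below are illustrative assumptions, not the project's SQLAlchemy repository.

```python
import sqlite3


def get_findings_page(connection, limit=None, offset=0):
    """Return (rows, total): total ignores pagination, rows respect it."""
    # 1. Count before paginating so the total reflects every matching row.
    total = connection.execute("SELECT COUNT(*) FROM findings").fetchone()[0]

    # 2. Apply LIMIT/OFFSET only when pagination was requested.
    sql = "SELECT id FROM findings ORDER BY id"
    params = ()
    if limit is not None:
        sql += " LIMIT ? OFFSET ?"
        params = (limit, offset)

    # 3. Fetch last, so the rows come from the paginated query.
    rows = [row[0] for row in connection.execute(sql, params)]
    return rows, total


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE findings (id INTEGER PRIMARY KEY)")
connection.executemany(
    "INSERT INTO findings (id) VALUES (?)", [(i,) for i in range(1, 6)]
)

page, total = get_findings_page(connection, limit=2, offset=2)
# page holds ids 3 and 4, while total is still 5
```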

src/repository/recommendation.py

+1-1
@@ -63,7 +63,7 @@ def create_recommendations(
                     else "No long description available"
                 ),
                 meta=f.solution.metadata if f.solution.metadata else {},
-                search_terms=f.solution.search_terms if f.solution.search_terms else [],
+                search_terms=f.solution.search_terms if f.solution.search_terms else "",
                 finding_id=finding_id,
                 recommendation_task_id=recommendation_task_id,
                 category=(f.category.model_dump_json() if f.category else None),

src/routes/v1/recommendations.py

+7-7
Original file line numberDiff line numberDiff line change
@@ -1,19 +1,19 @@
11
import datetime
22
from typing import Annotated, Optional
33

4-
from fastapi import Body, Depends, HTTPException, Response
4+
from fastapi import Body, Depends, HTTPException
55
from fastapi.routing import APIRouter
6-
from sqlalchemy import Date, cast
76
from sqlalchemy.orm import Session
87

98
import data.apischema as apischema
109
import db.models as db_models
11-
from data.AggregatedSolution import AggregatedSolution
1210
from db.my_db import get_db
1311
from dto.finding import db_finding_to_response_item
1412
from repository.finding import get_finding_repository
15-
from repository.recommendation import (RecommendationRepository,
16-
get_recommendation_repository)
13+
from repository.recommendation import (
14+
RecommendationRepository,
15+
get_recommendation_repository,
16+
)
1717
from repository.task import TaskRepository, get_task_repository
1818
from repository.types import GetFindingsByFilterInput
1919

@@ -102,8 +102,8 @@ def aggregated_solutions(
102102
task = None
103103
if request.filter and request.filter.task_id:
104104
task = task_repository.get_task_by_id(request.filter.task_id)
105-
task = task_repository.get_task_by_date(today)
106-
105+
else:
106+
task = task_repository.get_task_by_date(today)
107107
if not task:
108108
raise HTTPException(
109109
status_code=404,
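
Review note: the second hunk fixes a lookup that was silently overwritten. Before the change, the by-date lookup ran unconditionally, so a `task_id` supplied in the filter was ignored. The two behaviours are contrasted below with plain dictionaries; `resolve_task_before` and `resolve_task_after` are hypothetical helpers that stand in for the repository calls.

```python
def resolve_task_before(tasks_by_id, todays_task, task_id=None):
    """Old behaviour: the by-date result always wins."""
    task = None
    if task_id:
        task = tasks_by_id.get(task_id)
    task = todays_task  # runs unconditionally and overwrites the lookup above
    return task


def resolve_task_after(tasks_by_id, todays_task, task_id=None):
    """New behaviour: fall back to today's task only without a task_id."""
    if task_id:
        return tasks_by_id.get(task_id)
    return todays_task


tasks = {7: "task-7"}
old_result = resolve_task_before(tasks, "task-today", task_id=7)
new_result = resolve_task_after(tasks, "task-today", task_id=7)
```

With the new logic an unknown `task_id` yields `None`, which the route then turns into the 404 response instead of quietly returning today's task.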

src/routes/v1/upload.py

+8-6
@@ -13,6 +13,7 @@
 from db.my_db import get_db
 from repository.finding import get_finding_repository
 from repository.task import TaskRepository, get_task_repository
+from worker.types import GenerateReportInput
 from worker.worker import worker

 router = APIRouter(prefix="/upload")
@@ -57,15 +58,16 @@ async def upload(
         find.recommendation_task_id = recommendation_task.id
         findings.append(find)
     finding_repository.create_findings(findings)
+    worker_input = GenerateReportInput(
+        recommendation_task_id=recommendation_task.id,
+        generate_long_solution=data.preferences.long_description or True,
+        generate_search_terms=data.preferences.search_terms or True,
+        generate_aggregate_solutions=data.preferences.aggregated_solutions or True,
+    )

     celery_result = worker.send_task(
         "worker.generate_report",
-        args=[
-            recommendation_task.id,
-            data.preferences.long_description,
-            data.preferences.search_terms,
-            data.preferences.aggregated_solutions,
-        ],
+        args=[worker_input.model_dump()],
     )

     # update the task with the celery task id
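
Review note: in Python, `x or True` evaluates to `True` for every falsy `x`, including an explicit `False`, so the three `generate_*` flags built in this hunk can never be switched off. If the preference fields are optional booleans (an assumption, since their types are not shown in this diff), a `None` check would preserve an explicit `False`. `with_default` below is a hypothetical helper, shown only to illustrate the difference.

```python
from typing import Optional


def with_default(value: Optional[bool], default: bool = True) -> bool:
    """Use the default only when the caller did not supply a value."""
    return default if value is None else value


# `or` discards an explicit False because False is falsy.
flag_with_or = False or True

# The None check keeps the caller's explicit choice.
flag_with_none_check = with_default(False)
flag_when_missing = with_default(None)
```

Here `flag_with_or` ends up `True`, while `flag_with_none_check` stays `False` and `flag_when_missing` falls back to `True`.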
