Commit

init flask integration
leonlolly committed Nov 8, 2023
1 parent 34c1728 commit ad7f1d5
Showing 8 changed files with 57 additions and 46 deletions.
12 changes: 5 additions & 7 deletions Dockerfile
@@ -3,11 +3,12 @@ FROM python:3.9
USER root
RUN mkdir /home/wannadb
WORKDIR /home/wannadb
COPY requirements.txt requirements.txt

# Install dependencies
# install torch
RUN pip install --use-pep517 torch==1.10.0

# Install dependencies
COPY requirements.txt requirements.txt
RUN pip install --use-pep517 -r requirements.txt
##################################
## do not change above ##
@@ -22,11 +23,8 @@ RUN pip install --use-pep517 pytest
#copy the rest
COPY . .

COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
RUN chmod +x entrypoint.sh

EXPOSE 8080
EXPOSE 5000

# Define the entrypoint.sh
CMD ["/entrypoint.sh"]
ENTRYPOINT "/home/wannadb/entrypoint.sh"
35 changes: 28 additions & 7 deletions README.md
@@ -1,3 +1,26 @@
# Start the Docker container

The first time:

```
docker compose build
```

To continue working:

```
docker compose up
```

After that, the backend should be running.

You can attach to the container with `code` and then work inside the container.

Git only works once you have installed gh and run gh auth.
After that, you can work as usual.

A Docker rebuild is only necessary when dependencies have changed.

# WannaDB: Ad-hoc SQL Queries over Text Collections

![Document collection and corresponding table.](header_image.svg)
@@ -114,33 +137,31 @@ series = {SIGMOD '22}

WannaDB is dually licensed under both AGPLv3 for the free usage by end users or the embedding in Open Source projects, and a commercial license for the integration in industrial projects and closed-source tool chains. More details can be found in [our licence agreement](LICENSE.md).


## Availability of Code & Datasets

We publish the source code for our system as discussed in the papers here. Additionally, we publish code to reproduce our experiments in a separate repository (coming soon).

Unfortunately, we cannot publish the datasets online due to copyright issues. We will send them via email on request to everyone interested and hope they can be of benefit for other research, too.


## Implementation details

The core of WannaDB (extraction and matching) was previously developed by us under the name [ASET (Ad-hoc Structured Exploration of Text Collections)](https://link.tuda.systems/aset). To better reflect the whole application cycle vision we present with this paper, we switched the name to WannaDB.
The core of WannaDB (extraction and matching) was previously developed by us under the name [ASET (Ad-hoc Structured Exploration of Text Collections)](https://link.tuda.systems/aset). To better reflect the whole application cycle vision we present with this paper, we switched the name to WannaDB.

### Repository structure

This repository is structured as follows:

* `wannadb`, `wannadb_parsql`, and `wannadb_ui` contain the implementation of ASET and the GUI.
* `scripts` contains helpers, like a stand-alone preprocessing script.
* `tests` contains pytest tests.
- `wannadb`, `wannadb_parsql`, and `wannadb_ui` contain the implementation of ASET and the GUI.
- `scripts` contains helpers, like a stand-alone preprocessing script.
- `tests` contains pytest tests.

### Architecture: Core

The core implementation of WannaDB is in the `wannadb` package and implemented as a library. The implementation allows you to construct pipelines of different data processors that work with the data model and may involve user feedback.
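
To make the pipeline idea concrete, here is a minimal sketch of chaining processors, one of which asks for user feedback. Everything in it is an assumption made for illustration: `run_pipeline`, `lowercase_documents`, `make_feedback_step`, and the dict-based document base are hypothetical names and are not taken from the `wannadb` package.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: the "document base" is a plain dict here, and a
# processor is any function that takes such a dict and returns a new one.
DocumentBaseLike = Dict[str, List[str]]
Processor = Callable[[DocumentBaseLike], DocumentBaseLike]


def run_pipeline(processors: List[Processor], base: DocumentBaseLike) -> DocumentBaseLike:
    """Apply each processor in order, passing the result to the next one."""
    for process in processors:
        base = process(base)
    return base


def lowercase_documents(base: DocumentBaseLike) -> DocumentBaseLike:
    """Example processor without user interaction: normalize the text."""
    return {**base, "documents": [d.lower() for d in base["documents"]]}


def make_feedback_step(ask: Callable[[str], bool]) -> Processor:
    """Build a processor that keeps only the documents the user confirms.

    `ask` is the feedback hook; in a GUI it might open a dialog, in a test
    it can be a plain function.
    """
    def step(base: DocumentBaseLike) -> DocumentBaseLike:
        return {**base, "documents": [d for d in base["documents"] if ask(d)]}
    return step
```

Passing the feedback hook in as a function keeps the pipeline itself independent of how the user is asked.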

**Data model**

`data` contains WannaDB's data model. The entities are `InformationNugget`s, `Attribute`s, `Document`s, and the `DocumentBase`.
`data` contains WannaDB's data model. The entities are `InformationNugget`s, `Attribute`s, `Document`s, and the `DocumentBase`.

A nugget is an information piece obtained from a document. An attribute is a table column that gets
populated with information from the documents. A document is a textual document, and the document base is a collection of documents and provides facilities for `BSON` serialization, consistency checks, and data access.
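
The relationship between the four entities can be sketched with plain dataclasses. This is only an illustration of the description above: the field names, `nugget_text`, and `check_consistency` are assumptions and do not mirror the actual classes in `wannadb`, and BSON serialization is left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InformationNugget:
    """A piece of information obtained from a document (hypothetical fields)."""
    document_name: str
    start: int  # character offset where the nugget begins
    end: int    # character offset just past the nugget


@dataclass
class Attribute:
    """A table column that gets populated with information from the documents."""
    name: str


@dataclass
class Document:
    """A textual document together with the nuggets found in it."""
    name: str
    text: str
    nuggets: List[InformationNugget] = field(default_factory=list)

    def nugget_text(self, nugget: InformationNugget) -> str:
        return self.text[nugget.start:nugget.end]


@dataclass
class DocumentBase:
    """A collection of documents plus the attributes (columns) to fill."""
    documents: List[Document]
    attributes: List[Attribute]

    def check_consistency(self) -> bool:
        """Every nugget must belong to its document and lie within its text."""
        for doc in self.documents:
            for nugget in doc.nuggets:
                in_bounds = 0 <= nugget.start < nugget.end <= len(doc.text)
                if nugget.document_name != doc.name or not in_bounds:
                    return False
        return True
```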
11 changes: 11 additions & 0 deletions app.py
@@ -0,0 +1,11 @@
from flask import Flask
from flask_cors import CORS


app = Flask(__name__)
CORS(app)


@app.route('/')
def hello_world():  # put application's code here
    return 'Hello World!'
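
Once the container is up, the route can be checked from the host. The following is a rough sketch, assuming the backend is reachable at `http://localhost:8000` (the port published in `docker-compose.yaml` and bound by gunicorn in this commit). `fetch_root` and its `opener` parameter are hypothetical helpers, not part of the repository; `opener` exists only so the function can be exercised without a running server.

```python
import urllib.request


def fetch_root(base_url="http://localhost:8000", timeout=5.0,
               opener=urllib.request.urlopen):
    """Send GET <base_url>/ and return (status_code, body_text)."""
    url = base_url.rstrip("/") + "/"
    with opener(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

With the container running, `fetch_root()` should return status 200 and the `Hello World!` body from the route above, provided the port mapping matches.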
7 changes: 0 additions & 7 deletions backend/app.py

This file was deleted.

12 changes: 0 additions & 12 deletions backend/routes.py

This file was deleted.

18 changes: 9 additions & 9 deletions docker-compose.yaml
@@ -1,10 +1,10 @@
version: '3.6'
version: "3.6"
services:
  wannadb:
    build:
      context: .
      dockerfile: Dockerfile
    restart: always
    tty: true
    ports:
      - 8080:8080
  wannadb:
    build:
      context: .
      dockerfile: Dockerfile
    restart: always
    tty: true
    ports:
      - "8000:8000"
4 changes: 1 addition & 3 deletions entrypoint.sh
@@ -7,6 +7,4 @@ export PYTHONPATH="."

pytest

flask --app backend/app.py run

sleep infinity
gunicorn -w 4 --bind 0.0.0.0:8000 app:app
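
The `app:app` argument tells gunicorn to import the module `app` and serve the WSGI callable named `app` inside it. As a rough illustration of what a WSGI server does with such a callable, the sketch below uses a hand-written stand-in instead of the Flask object from `app.py`; the `app` function and the `call_wsgi` helper here are hypothetical and only use the standard library.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical stand-in for the Flask `app` object; not part of this commit."""
    body = b"Hello World!"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_wsgi(application, path="/"):
    """Invoke a WSGI callable once, roughly the way a server worker would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

A Flask application object implements the same calling convention, which is why gunicorn can serve it without any adapter.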
4 changes: 3 additions & 1 deletion requirements.txt
@@ -232,4 +232,6 @@ wasabi==0.10.1
# The following packages are considered to be unsafe in a requirements file:
# setuptools

flask==3.0.0
flask==3.0.0
Flask_Cors==4.0.0
gunicorn==21.2.0
