Commit

Update README
XanderVertegaal committed Dec 10, 2024
1 parent fd18629 commit 9b17b5d
Showing 1 changed file with 47 additions and 23 deletions.
70 changes: 47 additions & 23 deletions README.md
@@ -2,52 +2,76 @@

[![Actions Status](https://github.com/UUDigitalHumanitieslab/parseport/workflows/Unit%20tests/badge.svg)](https://github.com/UUDigitalHumanitieslab/parseport/actions)

ParsePort is an interface for the [Spindle](https://github.com/konstantinosKokos/spindle) parser using the [Æthel](https://github.com/konstantinosKokos/aethel) library, both developed by dr. Konstantinos Kogkalidis as part of a research project conducted with prof. dr. Michaël Moortgat at Utrecht University. Other parsers may be added in the future.
ParsePort is a web interface for two natural language processing (NLP) parsers, both developed at Utrecht University, and their two associated pre-parsed text corpora.

1. The [Spindle](https://github.com/konstantinosKokos/spindle) parser is used to produce type-logical parses of Dutch sentences. It comes with a pre-parsed corpus of around 65,000 sentences (based on [Lassy Small](https://taalmaterialen.ivdnt.org/download/lassy-klein-corpus6/)) called [Æthel](https://github.com/konstantinosKokos/aethel). These tools have been developed by dr. Konstantinos Kogkalidis as part of a research project conducted with prof. dr. Michaël Moortgat at Utrecht University.

2. The Minimalist Parser produces syntax trees for English sentences entered by the user, in the style of [Chomskyan Minimalist Grammar](https://en.wikipedia.org/wiki/Minimalist_program). The parser has been developed by dr. Meaghan Fowlie at Utrecht University and comes with a pre-parsed corpus of 100 sentences taken from the Wall Street Journal. These syntax trees are visualized interactively with Vulcan, a tool developed by dr. Jonas Groschwitz, also at Utrecht University.

## Running this application in Docker

In order to run this application you need a working installation of Docker and an internet connection. You will also need the source code from two other repositories, `spindle-server` and `latex-service` to be present in the same directory as the `parseport` source code.
In order to run this application you need a working installation of Docker and an internet connection. You will also need the source code from four other repositories. These must be located in the same directory as the `parseport` source code.

1. [`spindle-server`](https://github.com/CentreForDigitalHumanities/spindle-server) hosts the source code for a server with the Spindle parser;
2. [`latex-service`](https://github.com/CentreForDigitalHumanities/latex-service) contains a LaTeX compiler that is used to export the Spindle parse results in PDF format;
3. [`mg-parser-server`](https://github.com/CentreForDigitalHumanities/mg-parser-server) has the source code for the Minimalist Grammar parser;
4. [`vulcan-parseport`](https://github.com/CentreForDigitalHumanities/vulcan-parseport) is needed for the websocket-based webserver that hosts Vulcan, the visualization tool for MGParser parse results.

See the instructions in the README files of these repositories for more information on these codebases.

In addition, you need to add a configuration file named `.env` to the root directory of this project with at least the following setting.

```
```conf
DJANGO_SECRET_KEY=...
```
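
The value of `DJANGO_SECRET_KEY` is left open above. One possible way to produce a suitable line is sketched below. It relies only on Python's standard `secrets` module; the helper name `make_env_line` and the 50-byte length are assumptions made for illustration, not something this repository prescribes.

```python
import secrets


def make_env_line(name: str = "DJANGO_SECRET_KEY", nbytes: int = 50) -> str:
    """Return a `NAME=value` line with a random URL-safe value.

    `make_env_line` is a hypothetical helper; it is not part of ParsePort.
    """
    # token_urlsafe only emits letters, digits, '-' and '_', so the value
    # needs no quoting in a .env file.
    return f"{name}={secrets.token_urlsafe(nbytes)}"


if __name__ == "__main__":
    # Print the line so it can be pasted into the .env file.
    print(make_env_line())
```

Running the script prints a single line that can be copied into `.env`.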

In overview, your file structure should be as follows.

```
┌── parseport (this project)
|   ├── compose.yaml
|   ├── .env
|   ├── frontend
|   |   └── Dockerfile
|   └── backend
|       ├── Dockerfile
|       └── aethel_db
|           └── data
|               └── aethel.pickle
|
├── spindle-server
|   ── Dockerfile
|   ── Dockerfile
|   └── model_weights.pt
|
├── latex-service
|   └── Dockerfile
|
└── parseport (this project)
    ├── compose.yaml
    ├── .env
    ├── frontend
    |   └── Dockerfile
    └── backend
        ├── Dockerfile
        └── aethel.pickle
├── mg-parser-server
|   └── Dockerfile
|
└── vulcan-parseport
    ├── Dockerfile
    └── app
        └── standard.pickle
```

Note that you will need two data files in order to run this project.
Note that you will need three data files in order to run this project.

- `model_weights.pt` should be put in the root directory of the `spindle-server` project. It can be downloaded from _Yoda-link here_.
- `aethel.pickle` should live at `parseport/backend/`. You can find it in the zip archive [here](https://github.com/konstantinosKokos/aethel/tree/stable/data).
- `aethel.pickle` contains the pre-parsed data for Æthel and should live at `parseport/backend/aethel_db/data`. You can find it in the zip archive [here](https://github.com/konstantinosKokos/aethel/tree/stable/data).
- `standard.pickle` contains the pre-parsed corpus for the Minimalist Parser. It should be placed in the `vulcan-parseport/app` directory. You can download it from _Yoda-link here_.
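
Before building the containers, it can help to confirm that the data files are where the list above expects them. The following is a minimal sketch, assuming all repositories sit side by side in one directory and using the `aethel_db/data` location for `aethel.pickle`. The names `REQUIRED_DATA_FILES` and `missing_data_files` are hypothetical and not part of this project.

```python
from pathlib import Path

# Locations taken from the list above, relative to the directory that
# holds the sibling repositories.
REQUIRED_DATA_FILES = (
    Path("spindle-server/model_weights.pt"),
    Path("parseport/backend/aethel_db/data/aethel.pickle"),
    Path("vulcan-parseport/app/standard.pickle"),
)


def missing_data_files(workspace: Path) -> list[Path]:
    """Return the required data files that are absent under `workspace`."""
    return [rel for rel in REQUIRED_DATA_FILES if not (workspace / rel).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the directory containing all repositories.
    for rel in missing_data_files(Path.cwd()):
        print(f"missing: {rel}")
```

The script prints nothing when all three files are present.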

This application can be run in both `production` and `development` mode. Either mode will start a network of five containers.
This application can be run in both `production` and `development` mode. Either mode will start a network of seven containers.

| Name | Description |
|--------------|---------------------------------------------------|
| `nginx` | Entry point and reverse proxy, exposes port 5000. |
| `pp-ng` | The frontend server (Angular). |
| `pp-dj` | The backend/API server (Django). |
| `pp-spindle` | The server hosting the Spindle parser. |
| `pp-latex` | The server hosting a LaTeX compiler. |
| Name | Description |
|-------------------|---------------------------------------------------|
| `nginx` | Entry point and reverse proxy, exposes port 5000. |
| `pp-ng` | The frontend server (Angular). |
| `pp-dj` | The backend/API server (Django). |
| `pp-spindle` | The server hosting the Spindle parser. |
| `pp-latex` | The server hosting a LaTeX compiler. |
| `pp-mg-parser` | The server hosting the Minimalist Grammar parser. |
| `pp-vulcan` | The server hosting the Vulcan visualization tool. |

Start the Docker network in **development mode** by running the following command in your terminal.

@@ -61,7 +85,7 @@ For **production mode**, run the following instead.
docker compose --profile prod up --build -d
```

The Spindle server needs to download several files before the parser is ready to receive. You should wait a few minutes until the message *App is ready!* appears in the Spindle container logs.
The Spindle server needs to download several files before the parser is ready to receive input. You should wait a few minutes until the message *App is ready!* appears in the Spindle container logs.
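
Instead of watching the logs by hand, the wait can be automated. The sketch below polls `docker logs` until the readiness message shows up. It assumes that the container is actually named `pp-spindle` (the name used in the table above) and that the message appears verbatim in the log output; `docker_logs` and `wait_for_marker` are hypothetical helper names, and the timeout and polling interval are arbitrary choices.

```python
import subprocess
import time
from typing import Callable


def docker_logs(container: str) -> str:
    """Return everything `docker logs` prints for `container`."""
    result = subprocess.run(
        ["docker", "logs", container],
        capture_output=True,
        text=True,
        check=False,
    )
    # Container stdout and stderr arrive on separate streams; search both.
    return result.stdout + result.stderr


def wait_for_marker(
    read_logs: Callable[[], str],
    marker: str = "App is ready!",
    timeout: float = 600.0,
    interval: float = 5.0,
) -> bool:
    """Poll `read_logs` until `marker` appears or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if marker in read_logs():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # "pp-spindle" is assumed to be the container name; adjust if it differs.
    ready = wait_for_marker(lambda: docker_logs("pp-spindle"))
    print("Spindle is ready." if ready else "Timed out waiting for Spindle.")
```

Passing the log reader in as a function keeps the polling loop independent of Docker, so it can be tried out with a stand-in reader first.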

Open your browser and visit your project at http://localhost:5000 to view the application.
