Enriching and processing job ad data from Common Crawl and other sources.

This repository explores methods of aggregating information from job ad data. See the introduction to fetching job data from Common Crawl and the related articles for more information about the techniques used.

## Setup

Requires Python 3.6+. Install the dependencies from `requirements.txt` in a virtual environment:

```bash
# Set up a new virtual environment
python -m venv --prompt job-advert-analysis .venv
source .venv/bin/activate
# Install requirements
python -m pip install -r requirements.txt
# Download the spaCy model
python -m spacy download en_core_web_lg
```

To download the Kaggle data you will need Kaggle API credentials set up and to have accepted the competition rules. Alternatively, you can manually download and unzip the data from Kaggle directly. If you do not wish to use the Kaggle data sources, remove them from `DATASOURCES` in `01_fetch_data.py`.
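As a minimal sketch of the credential setup (assuming the standard Kaggle API convention of a `kaggle.json` token file downloaded from your Kaggle account page):

```bash
# Place your Kaggle API token where the Kaggle client looks for it
mkdir -p ~/.kaggle
cp /path/to/kaggle.json ~/.kaggle/kaggle.json
# The Kaggle client requires the token file to be private
chmod 600 ~/.kaggle/kaggle.json
```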

## Running

You can run the whole pipeline with `python -m job_pipeline build`.

You need a Placeholder server running on port 3000 of localhost for location normalisation. Follow these instructions for a simple way to do this using Docker.
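As a rough sketch of the Docker approach (assuming the `pelias/placeholder` image; depending on the image version you may also need to download the Placeholder data first, per the Pelias documentation):

```bash
# Start a Placeholder container exposed on localhost:3000 (image name assumed)
docker run -d -p 3000:3000 pelias/placeholder
# Sanity-check that the service answers (endpoint assumed from the Placeholder API)
curl 'http://localhost:3000/parser/search?text=London'
```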