Thoughtworks Pairing Interview
Codebase URL: https://github.com/techops-recsys-lateral-hiring/datascientist-politicalparties-python
As a data scientist, you are required to analyse the political landscape of Europe using the Chapel Hill Expert Survey dataset. The dataset provides insights into the positioning of 277 political parties in Europe based on 55 different attributes. The dataset can be downloaded here and the codebook provides further information on the survey attributes.
This repository contains the necessary setup and code base to help guide you in performing an analysis using different statistical methods.
Please make sure you have the following software installed
- Poetry
- Python (3.7 or 3.8)
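The prerequisites above can be checked with a small script before running any Poetry commands. This is only a sketch and is not part of the repository: the function name check_environment and its parameters are hypothetical, and it assumes Poetry is installed as an executable named poetry on your PATH.

```python
import shutil
import sys

# Python versions listed as supported in this README.
SUPPORTED_VERSIONS = {(3, 7), (3, 8)}


def check_environment(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    Both parameters are hypothetical hooks so the check can be exercised
    without changing the real interpreter or PATH.
    """
    problems = []

    major_minor = tuple(version_info[:2])
    if major_minor not in SUPPORTED_VERSIONS:
        problems.append(
            "Python {}.{} found, but 3.7 or 3.8 is expected".format(*major_minor)
        )

    # Assumption: the Poetry executable is called "poetry".
    if which("poetry") is None:
        problems.append("Poetry was not found on PATH")

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks ready.")
```

Run the script with the interpreter you intend to use for the project. If it prints a warning, fix that item before running the install command.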
Poetry is used for Python dependency management. To install the necessary Python dependencies, run the following command.
poetry install
Alternatively, the make command defined in the makefile can also be used.
make install
Similarly, to add or install new Python packages in your Poetry virtual environment, use
poetry add <python-package-name>
The unit tests, linting checks and type checks can be run either with the make commands (defined in the makefile) or with the commands from the respective packages. For example, unit tests can be executed using
make test
or
poetry run pytest tests
For running linting checks using flake8, use
poetry run flake8 src tests
or
make lint-check
Please be sure to complete the tasks below before the pairing session.
- Get a high-level understanding of the dataset by looking into the codebook and if necessary downloading the dataset.
- Have your coding environment ready by installing Python and Poetry.
- Ensure that you are able to run all commands mentioned in this README (errors reported by pytest are acceptable).
Please note that you DO NOT have to complete the code/tasks inside the src/ folder. It is meant to be done together during the pairing session.