
Installation

Dylan Hall edited this page Apr 19, 2023 · 2 revisions

Dependency Overview

These tools were created and tested on Python 3.9.12. The tools rely on two libraries: SQLAlchemy and anonlink.
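Before installing, it can help to confirm which interpreter will be used. The following is a minimal sketch, not part of the tools themselves. The name `describe_python` is hypothetical, and the decision to compare only the major and minor version is an assumption.

```python
import sys

# Version the tools were created and tested on, per the text above.
TESTED_VERSION = (3, 9, 12)


def describe_python(version=None):
    """Return a short message comparing an interpreter version to the tested one."""
    if version is None:
        version = tuple(sys.version_info[:3])
    found = ".".join(str(part) for part in version)
    # Assumption: matching the major and minor version is treated as "close enough".
    if tuple(version[:2]) == TESTED_VERSION[:2]:
        return f"Python {found}: same minor release as the tested version"
    return f"Python {found}: differs from the tested 3.9.x series"


if __name__ == "__main__":
    print(describe_python())
```

A different version does not necessarily mean the tools will fail; it only indicates that the combination was not the one the tools were tested on.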

SQLAlchemy is a library that allows the tools to connect to a database in a vendor-independent fashion. As a result, the tools can connect to any database that conforms to the CODI Identity Data Model, whether it is implemented in PostgreSQL, Microsoft SQL Server, or one of a number of other database systems.
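SQLAlchemy generally selects the database backend from a connection URL of the form `dialect+driver://username:password@host:port/database`, which is what makes switching vendors largely a matter of changing one string. The sketch below builds such URLs using only the standard library. The helper `build_db_url`, the host, the credentials, and the database name are all hypothetical, and the `pyodbc` driver for SQL Server is an assumption (some drivers need additional query parameters that are not shown here).

```python
from urllib.parse import quote_plus


def build_db_url(dialect, user, password, host, port, database, driver=None):
    """Assemble a SQLAlchemy-style URL: dialect+driver://user:password@host:port/database."""
    scheme = f"{dialect}+{driver}" if driver else dialect
    # Percent-encode credentials so characters such as '@' or '/' do not break the URL.
    return f"{scheme}://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


# Hypothetical connection details, for illustration only.
postgres_url = build_db_url("postgresql", "codi", "p@ss", "localhost", 5432, "codi", driver="psycopg2")
mssql_url = build_db_url("mssql", "codi", "p@ss", "localhost", 1433, "codi", driver="pyodbc")

print(postgres_url)
print(mssql_url)
# With SQLAlchemy installed, either string could be passed to sqlalchemy.create_engine(...).
```

Only the URL differs between the two vendors; the code that consumes the resulting engine would stay the same.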

anonlink is responsible for garbling the PII so that it is de-identified prior to transmission to the linkage agent.

Installing with an existing Python install

Cloning the Repository

Clone the project locally as a Git repository:

git clone https://github.com/mitre/data-owner-tools.git

Or download as a zip file:

  1. Download the project as a zip file using the "Clone or download" button on the GitHub repository page.
  2. Unzip the file.

Set up a virtual environment (Optional, but recommended)

It can be helpful to set up a virtual environment to isolate project dependencies from system dependencies. There are a few libraries that can do this, but this documentation will stick with venv since that is included in the Python Standard Library.

# Navigate to the project folder
cd data-owner-tools/
# Create a virtual environment in a `venv/` folder
python -m venv venv/
# Activate the virtual environment (macOS/Linux; on Windows, run venv\Scripts\activate instead)
source venv/bin/activate
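To confirm that the activation took effect, one option is to compare the interpreter's prefix with that of its base installation, which is the check described in the `venv` documentation. This is a minimal sketch; the function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter prefix differs from the base installation's."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # Inside a venv, sys.prefix points at the venv folder and sys.base_prefix
    # points at the Python installation the venv was created from.
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same `python` command used to create the environment, after activation.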

Installing dependencies

pip install --upgrade pip
pip install -r requirements.txt

N.B. If the installation fails while building psycopg2 because of a clang error, running the following command may resolve it:

env LDFLAGS='-L/usr/local/lib -L/usr/local/opt/openssl/lib -L/usr/local/opt/readline/lib' pip install psycopg2==2.8.4
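Once the install finishes, a quick way to check the result is to ask Python whether the two libraries from the Dependency Overview can be located. The sketch below assumes their import names are `sqlalchemy` and `anonlink`; the function `missing_modules` is hypothetical, and the list may need adjusting to match requirements.txt.

```python
import importlib.util

# Assumed import names of the two libraries named in the Dependency Overview.
REQUIRED_MODULES = ("sqlalchemy", "anonlink")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks that the modules can be found; it does not verify their versions.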

Installing with Anaconda

  1. Install Anaconda by following the install instructions.
    1. Depending on user account permissions, Anaconda may not install the latest version or may not be available to all users. If that is the case, try running: conda update -n base -c defaults conda
  2. Download the tools as a zip file using the "Clone or download" button on GitHub.
  3. Unzip the file.
  4. Open an Anaconda PowerShell Prompt.
  5. Go to the unzipped directory.
  6. Run the following commands:
    1. conda create --name codi
    2. conda activate codi
    3. conda install pip
    4. pip install -r requirements.txt
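After the commands above, it may be worth confirming that the `codi` environment is the active one before running the tools. The sketch below reads the `CONDA_DEFAULT_ENV` environment variable, which `conda activate` sets to the name of the active environment. The function name `active_conda_env` is hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if no environment is active."""
    environ = os.environ if environ is None else environ
    # `conda activate` sets CONDA_DEFAULT_ENV to the environment's name.
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    name = active_conda_env()
    print("active conda environment:", name if name else "none detected")
```

If the printed name is not `codi`, running `conda activate codi` again in the same prompt should correct it.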