The DHIS2 Analytics Pipeline automates the extraction, transformation, and analysis of health data from a DHIS2 instance. This project focuses on retrieving facility-level weekly data, enriching it with organisational unit details, and exporting the processed data in a structured format for further reporting and visualisation.
- Automated Data Extraction: Queries the DHIS2 API to fetch datasets, data elements, and organisational units.
- Facility-Level Aggregation: Aggregates health data by week and groups it by facility, district, and province.
- Enrichment with Organisational Details: Merges facility data with district and province information so each record carries its full place in the hierarchy.
- CSV Export: Outputs processed data to CSV files for easy access and analysis.
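
The enrichment feature can be illustrated with a minimal sketch. It assumes each organisation unit record carries an `id`, `name`, `level`, and `parent` id, and that provinces sit at level 2 and districts at level 3. The actual levels depend on the DHIS2 instance, and the function and field names here are hypothetical rather than taken from the notebook.

```python
# Hypothetical hierarchy levels; these vary between DHIS2 instances.
PROVINCE_LEVEL = 2
DISTRICT_LEVEL = 3


def find_ancestor(unit_id, units_by_id, target_level):
    """Walk up the parent chain until a unit at target_level is found."""
    current = units_by_id.get(unit_id)
    while current is not None:
        if current["level"] == target_level:
            return current
        current = units_by_id.get(current.get("parent"))
    return None


def enrich_facility_rows(rows, org_units):
    """Attach district and province names to each facility-level data row."""
    units_by_id = {unit["id"]: unit for unit in org_units}
    enriched = []
    for row in rows:
        district = find_ancestor(row["facility_id"], units_by_id, DISTRICT_LEVEL)
        province = find_ancestor(row["facility_id"], units_by_id, PROVINCE_LEVEL)
        enriched.append({
            **row,
            "district": district["name"] if district else None,
            "province": province["name"] if province else None,
        })
    return enriched


# Illustrative records only, not real DHIS2 data.
org_units = [
    {"id": "p1", "name": "Province A", "level": 2, "parent": None},
    {"id": "d1", "name": "District A", "level": 3, "parent": "p1"},
    {"id": "f1", "name": "Clinic A", "level": 4, "parent": "d1"},
]
rows = [{"facility_id": "f1", "period": "2024W1", "value": 12}]
enriched_rows = enrich_facility_rows(rows, org_units)
```

Walking the parent chain, rather than assuming a fixed depth, keeps the sketch usable when facilities sit at different levels of the hierarchy.
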
- Clone the repository:
git clone https://github.com/malambomutila/DHIS2-Pipeline.git
- Navigate to the project directory:
cd DHIS2-Pipeline
- Install required dependencies:
pip install -r requirements.txt
- Configure DHIS2 credentials in the 00_Local/01_Configs/credentials.txt file:
DHIS2_URL=your_dhis2_instance_url
USERNAME=your_username
PASSWORD=your_password
- Run the Jupyter Notebook to fetch and process data:
querydata.ipynb
- Processed data will be saved in the 00_Local/02_Data/ directory as CSV files.
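
As a rough sketch of how the credentials file might be consumed, the snippet below parses `KEY=VALUE` settings and builds a DHIS2 Web API URL for a metadata resource. It assumes one setting per line; the helper names (`parse_credentials`, `build_metadata_url`, `fetch_json`) and the example host are hypothetical, and the notebook may load credentials differently.

```python
import base64
import json
import urllib.request
from urllib.parse import urlencode


def parse_credentials(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def build_metadata_url(base_url, resource, fields="id,name"):
    """Build a DHIS2 Web API URL for a metadata resource such as dataSets."""
    query = urlencode({"fields": fields, "paging": "false"})
    return f"{base_url.rstrip('/')}/api/{resource}.json?{query}"


def fetch_json(url, username, password):
    """Send a GET request with HTTP Basic authentication and decode the JSON body."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values in the same shape as credentials.txt (assumed layout).
sample = "DHIS2_URL=https://dhis2.example.org/\nUSERNAME=demo\nPASSWORD=secret"
settings = parse_credentials(sample)
datasets_url = build_metadata_url(settings["DHIS2_URL"], "dataSets")
```

With real credentials, `fetch_json(datasets_url, settings["USERNAME"], settings["PASSWORD"])` would request the dataset list. Keep credentials.txt out of version control, since it holds a plain-text password.
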
The following CSV files are generated by the pipeline:
- datasets.csv: List of available datasets from DHIS2.
- data_elements.csv: Metadata for available data elements.
- organisation_units.csv: Details of all organisation units.
- districts.csv: District-level organisation units.
- provinces.csv: Province-level organisation units.
- facility_january_2024_with_org_details.csv: Processed health data by facility.
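
A quick way to confirm that a run produced every file is sketched below. The `missing_outputs` helper is hypothetical and not part of the repository; it only checks that the files listed above exist in the output directory and does not validate their contents.

```python
from pathlib import Path

# File names as listed in this README.
EXPECTED_FILES = [
    "datasets.csv",
    "data_elements.csv",
    "organisation_units.csv",
    "districts.csv",
    "provinces.csv",
    "facility_january_2024_with_org_details.csv",
]


def missing_outputs(data_dir):
    """Return the expected CSV file names that are absent from data_dir."""
    folder = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


# Run from the project root; an empty list means every file was found.
missing = missing_outputs("00_Local/02_Data")
```
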
Contributions are welcome! Feel free to submit issues, feature requests, or pull requests to improve the project.