Add files via upload
Holly-Transport authored Sep 7, 2024
1 parent cbdc7e3 commit 5bd13be
Showing 5 changed files with 81 additions and 0 deletions.
35 changes: 35 additions & 0 deletions docs/1-intro-to-data-goods.md
@@ -0,0 +1,35 @@
# Introduction to Data Goods

**Data Goods** comprise data, reproducible methods (code), sample insights, and training guidance. Unlike a traditional data analysis, which results in a single-use report or visualization, Data Goods are designed to be customized, reused, and updated, thereby building the capacity of the World Bank and partner organizations to quickly and effectively deliver complex data science solutions to pressing global challenges.

Data Goods packages may include:

1. **Data**. Data Goods provide guidance on how to access the data underpinning all data products, indicators, and insights. This transparency in data sources supports reproducibility and, critically, re-use in new countries and contexts, over time. The Datasets section includes three parts:



> <u>Existing Data</u>. Each Data Good may include a curated list of datasets, public and private, that support project objectives. The team prepares this list as a table, which includes data type, update frequency, access links, and contact information.
>
> <u>Digitized Government Data</u>. Where needed, a Data Good may also include guidance on digitizing and/or managing government data, leveraging AI methods to make disaggregated government records readily searchable and usable.
>
> <u>New Data Collection</u>. A Data Good may also include a field data collection plan (and implementation of that plan, as needed) that includes some combination of household surveys, remote sensing (including drones), and crowdsourcing. Data Goods may also include guidance (and, again, implementation of that guidance) on processing, storing, and cataloguing all collected data.
>
> All Bank-produced datasets that are part of the Data Good can be hosted as a special collection on the World Bank's Data Catalogue, managed by the Development Economics Data Group (DECDG). The Catalogue receives more than 14 million unique users per month, which multiplies the value of the investment in data collection.


2. **Reusable Data Products**. These are analytical products derived from the Datasets, which can be further used to generate indicators and insights. All data products include original code, documentation, links to original data sources (and/or information on how to access them), and a description of their limitations. Reference resources are also cited, where relevant.

3. **Insights and Indicators**. Each Data Goods package may also include additional analytical work, such as dynamic maps, data visualizations, and/or sample indicators. Indicators can be derived from a combination of **Datasets** and **Reusable Data Products**. By combining these two inputs, teams can develop a large array of indicators to meet their project needs.

4. **Training and Dissemination**. Each Data Good is packaged as a web book that can be readily translated and distributed, and may include guidance for further training and capacity building, as needed.

5. **Data Lab Team**. For each project, the [World Bank Data Lab](https://wbdatalab.org/) recruits colleagues from throughout the World Bank, pooling our collective data talents in support of our lending and technical assistance operations. Data Goods packages include names and contact information for the unique teams that prepared the Goods.

## How Data Goods are Managed

1. **Dynamic, Web-Hosted Documentation**. Unless specified otherwise, all code and documentation used to produce the Data Goods are hosted in a project GitHub repository to facilitate reuse for future updates and projects, as well as to support collaboration and capacity building activities.

2. **Data Catalogue**. Where possible, all datasets used in the production of Data Goods are added as entries to the World Bank’s [Development Data Hub](https://datacatalog.worldbank.org/home), where they are tagged with metadata, license attributes, and access information.

3. **Internal Project Management and File Sharing System**. To facilitate project management across teams, the Lab creates a Project SharePoint, which includes project management information (work plan, milestones, check-in slides, log of hours charged, final report), related literature, data files, indicator tables, and links to resources, such as this documentation. The advantage of SharePoint for World Bank usage is that all contents are automatically encrypted and tagged as Official Use Only. The project SharePoint is accessible to project team members and, with permission, can be replicated as a basis for future project updates or for similar projects.
13 changes: 13 additions & 0 deletions docs/2-data.md
@@ -0,0 +1,13 @@
# Data

The following section provides a convenient list of all datasets used in the data science products and analytics prepared for this project. The datasets table includes a description of the data and their update frequency, as well as access links and contact information for questions about use and access. Users should not require any datasets not included in this table to complete the analytical work for the Data Good.

```{note}
**Project SharePoint** links are only accessible to the project team. For permission to access these data, please write to the contact provided. The **Development Data Hub** is the World Bank's central data catalogue and includes metadata and license information.
```

| ID | Name | License | Description | Update Frequency | Access | Contact |
| --- | ------------------- | ----------- | ------------------------------------------------------------------------------------------------------ | ---------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------- |
| 1 | IMF PortWatch | Open | Open platform for monitoring and simulating disruptions to maritime trade flows | Weekly | [IMF PortWatch](https://portwatch.imf.org/) and [Project SharePoint](https://worldbankgroup.sharepoint.com/:f:/r/teams/DevelopmentDataPartnershipCommunity-WBGroup/Shared%20Documents/Projects/Data%20Lab/Red%20Sea%20Maritime%20Monitoring?csf=1&web=1&e=AHvobA) | [Andres Chamorro](mailto:[email protected]), GOST |
| 2 | ACLED Conflict Data | Proprietary | Timestamped, geolocated conflict events, compiled from news and crowdsourced data | Daily | [Development Data Hub](https://datacatalog.worldbank.org/int/search/dataset/0061835/acled---middle-east) and [Project SharePoint](https://worldbankgroup.sharepoint.com/:f:/r/teams/DevelopmentDataPartnershipCommunity-WBGroup/Shared%20Documents/Projects/Data%20Lab/Red%20Sea%20Maritime%20Monitoring?csf=1&web=1&e=AHvobA) | [Sahiti Sarva](mailto:[email protected]), Data Lab |
| | | | | | | |
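
A table like the one above could be kept as structured records and rendered to Markdown, so that new datasets are added in one place. The following is only a minimal sketch under stated assumptions: the `DatasetEntry` class, its field names, and the `to_markdown_table` helper are hypothetical and are not part of this repository.

```python
from dataclasses import dataclass


@dataclass
class DatasetEntry:
    """One row of the datasets table (hypothetical field names)."""
    id: int
    name: str
    license: str
    description: str
    update_frequency: str
    access: str   # Markdown link(s) to the data
    contact: str  # contact person and team


# Column headers copied from the datasets table in this section.
COLUMNS = ["ID", "Name", "License", "Description",
           "Update Frequency", "Access", "Contact"]


def to_markdown_table(entries):
    """Render entries as a Markdown table with the columns listed above."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "| " + " | ".join("---" for _ in COLUMNS) + " |",
    ]
    for e in entries:
        cells = [str(e.id), e.name, e.license, e.description,
                 e.update_frequency, e.access, e.contact]
        # Escape pipes so a cell value cannot break the table layout.
        cells = [c.replace("|", "\\|") for c in cells]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


entries = [
    DatasetEntry(
        1, "IMF PortWatch", "Open",
        "Open platform for monitoring and simulating disruptions "
        "to maritime trade flows",
        "Weekly",
        "[IMF PortWatch](https://portwatch.imf.org/)",
        "Andres Chamorro, GOST",
    ),
]
table = to_markdown_table(entries)
print(table)
```

Keeping the records separate from the rendering would make it easier to check that every entry has a license, an update frequency, and a contact before the web book is built.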
19 changes: 19 additions & 0 deletions docs/3-reusable-data-products.md
@@ -0,0 +1,19 @@
# Data Products

**Reusable Data Products** can be used to generate indicators and insights. All Data Products include documentation, references to original data sources (and/or information on how to access them), and a description of their limitations.

Each Data Product is presented in the web book according to the following outline:

1. Data Product Overview

2. Data Description
   Include everything a user would need to access and use the data that supports the data product. Include description, license, access instructions, etc.

3. Methodology
   Include step-by-step directions, code snippets, links to complete code, and notes on any critical dependencies.

4. Findings

5. Limitations

6. References and Works Cited
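
To start a new Data Product page from this outline, a small script could generate the section headings. This is a sketch only: the `page_skeleton` function, the use of second-level headings, and the HTML comments carrying the guidance text are assumptions for illustration, not an established template of the Data Lab.

```python
# Section headings and guidance taken from the outline above.
OUTLINE = [
    ("Data Product Overview", ""),
    ("Data Description",
     "Include everything a user would need to access and use the data "
     "that supports the data product."),
    ("Methodology",
     "Include step-by-step directions, code snippets, links to complete "
     "code, and notes on any critical dependencies."),
    ("Findings", ""),
    ("Limitations", ""),
    ("References and Works Cited", ""),
]


def page_skeleton(title):
    """Return a Markdown skeleton with one heading per outline section."""
    parts = [f"# {title}", ""]
    for heading, guidance in OUTLINE:
        parts.append(f"## {heading}")
        parts.append("")
        if guidance:
            # Guidance goes in an HTML comment so it is hidden when rendered.
            parts.append(f"<!-- {guidance} -->")
            parts.append("")
    return "\n".join(parts)


skeleton = page_skeleton("Example Data Product")
print(skeleton)
```

The page title passed in here is a placeholder; each team would replace it with the name of its own Data Product.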
3 changes: 3 additions & 0 deletions docs/4-sample-insights-and-ind.md
@@ -0,0 +1,3 @@
# Insights and Indicators

This section shows how datasets and reusable data products can be combined to generate new insights and indicators.
11 changes: 11 additions & 0 deletions docs/6-team.md
@@ -0,0 +1,11 @@
# Data Goods Team and Acknowledgements

The Data Lab would like to express our sincere gratitude and appreciation to the colleagues who worked together to prepare this Data Goods package:

<sample table -- input actual team>

| **Name** | **Role** | **Team** |
| ---------------------------------------------------------- | ---------------------------------------------- | ------------------ |
| [Holly Krambeck](mailto:hkrambeck%40worldbank.org) | Project Lead | WB Data Lab, DECDG |
| [Andres Chamorro](mailto:achamorroelizond%40worldbank.org) | Geographer - Maritime Analytics | GOST, DECDG |
| [Sahiti Sarva](mailto:ssarva%40worldbank.org) | Data Scientist - ACLED and Aviation Statistics | WB Data Lab, DECDG |
