
Some scripts to scrape and clean Berlin Marathon data.


stappit/berlin-marathon


Berlin Marathon

This repository contains the official results from the Berlin Marathon. I use the API offered by the Berlin Marathon website to collect all available results. Although data exists for every year since 1974, the API only offers data from 2005 onwards.

Data

  • The raw JSON files returned by the API are named <year>-<page>.json, where <year> is the year of the race and <page> is the page number assigned by the API. Each page should contain 100 records, except the last page of each year.
  • The dirty CSV file is also created during scraping; see the scrape file.
  • The clean CSV file is generated from the dirty CSV file using the clean notebook.
  • The cleaning process makes use of country abbreviations. The abbreviation RKS has not yet been identified.
  • The MD5 checksums of the CSV files can be found in hashes.
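
The naming scheme and checksums described above lend themselves to a quick sanity check. The following is a minimal sketch in Python, not code from this repository: the helper names (`parse_page_name`, `incomplete_pages`, `md5_of_file`) are hypothetical, and the internal layout of the JSON files and of the hashes file is not assumed. The caller supplies the number of records per page and compares the digest against the published value by hand.

```python
import hashlib
import re

# Matches the '<year>-<page>.json' naming scheme described above.
PAGE_NAME = re.compile(r"(\d{4})-(\d+)\.json")
PAGE_SIZE = 100  # expected number of records on every page except the last


def parse_page_name(filename):
    """Split '<year>-<page>.json' into a (year, page) tuple of integers."""
    match = PAGE_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return int(match.group(1)), int(match.group(2))


def incomplete_pages(record_counts):
    """Return the (year, page) keys of non-final pages that are not full.

    record_counts maps a raw file name to the number of records it holds.
    How that number is obtained depends on the JSON layout, which this
    sketch deliberately does not assume.
    """
    counts = {parse_page_name(name): n for name, n in record_counts.items()}

    # The highest page number seen for a year is treated as its last page.
    last_page = {}
    for year, page in counts:
        last_page[year] = max(page, last_page.get(year, page))

    return sorted(
        key for key, n in counts.items()
        if n != PAGE_SIZE and key[1] != last_page[key[0]]
    )


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `incomplete_pages({"2005-1.json": 100, "2005-2.json": 99, "2005-3.json": 42})` flags only `(2005, 2)`, because page 3 is the last page of that year and is allowed to be short.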
