Running out of memory on large (planetary) files #25

Open
dziegler991 opened this issue Jan 5, 2023 · 3 comments


@dziegler991

Hi all,

Does anyone have thoughts on how to programmatically break up large files so they can be run through earth-osm? I am constrained by memory. I am attempting the (almost) impossible: running a planetary pbf for lines, generators, and substations. Ideally the planet.pbf file could be read in chunks, but I am not sure that's possible.

Thoughts?

@pz-max
Member

pz-max commented Jan 10, 2023

Hi @dziegler991, sorry for the late response. You are the first outside of the PyPSA bubble using this package 🥇
Earth-osm should have no memory issues for single countries, so you could iterate through the country list to build an extract for the whole Earth.

You can list the available regions with eo.view_regions(). Any other ideas, @mnm-matin?
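
A minimal sketch of that per-country loop, assuming each region can be extracted and written to disk on its own. `extract_region` is a hypothetical stand-in for whatever earth-osm call does the work for one region (it is not a function of the package), and the region names in the usage note are only placeholders.

```python
import gc
from typing import Callable, Iterable, List


def extract_all(regions: Iterable[str],
                extract_region: Callable[[str], None]) -> List[str]:
    """Run one extraction per region so only one region is held in memory.

    `extract_region` is a hypothetical callable that extracts a single
    region and writes its results to disk. Returns the regions that failed.
    """
    failed: List[str] = []
    for region in regions:
        try:
            extract_region(region)
        except Exception as exc:  # keep going if a single region fails
            print(f"{region}: failed ({exc})")
            failed.append(region)
        finally:
            # Encourage Python to release the previous region's data
            # before the next region is loaded.
            gc.collect()
    return failed
```

Calling `extract_all(["benin", "monaco"], my_extract)` would process the two regions one after the other and return the names of any that raised an error, so they can be retried separately.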

But yeah, we never tried a planetary .pbf; it would be great if that works. I think these steps are necessary:

  1. Figure out how to chunk .pbf files
  2. Read the chunks and extract the information
  3. Save the results to disk, e.g. by appending to a CSV file
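
The three steps above could be wired together roughly as follows. This is only a sketch under assumptions: it does not read .pbf files at all, and `records` stands for any iterator that yields one dictionary per OSM element (how to produce that from a planet file is the open question). The names `batched`, `append_batch`, and `stream_to_csv` are made up for illustration.

```python
import csv
from itertools import islice
from pathlib import Path
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]


def batched(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Step 1: cut a (possibly huge) stream of records into fixed-size batches."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def append_batch(path: Path, batch: List[Record], fieldnames: List[str]) -> None:
    """Step 3: append one batch to the CSV, writing the header only once."""
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        writer.writerows(batch)


def stream_to_csv(records: Iterable[Record],
                  keep: Callable[[Record], bool],
                  path: Path,
                  fieldnames: List[str],
                  batch_size: int = 10_000) -> int:
    """Step 2 and 3: filter each batch, then append the kept rows to disk.

    Only one batch is held in memory at a time. Returns the rows written.
    """
    written = 0
    for batch in batched(records, batch_size):
        kept = [r for r in batch if keep(r)]
        if kept:
            append_batch(path, kept, fieldnames)
            written += len(kept)
    return written
```

Because the file is opened in append mode, rerunning the function on an existing output file adds rows to it rather than replacing it, so the output should be removed before a fresh run.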

@Mousa-Zerai

@dziegler991 I am also trying to use this for a very large file, but there is no way to run a custom pbf file through the filters. The largest file from Geofabrik is for Europe (26.2 GB), and the tool works for it. Pinging @mnm-matin

@mnm-matin
Member

mnm-matin commented Jan 18, 2023

I think the problem is not with the extraction but with the writing of the csv and geojson files, as those are kept in memory before being written. There is now a way to write these files incrementally as well, but it is not yet implemented correctly enough to be used. The idea is that most use cases will be satisfied by passing in a list of all the continents. Keeping this open for now, for potential use cases...
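
For illustration, incremental writing of a geojson file might look something like the sketch below. It is not how earth-osm implements it; `write_feature_collection` is a made-up name, and it assumes the features arrive as an iterator of plain GeoJSON feature dictionaries. Each feature is serialized and written as soon as it is produced, so the full collection never has to sit in memory.

```python
import json
from typing import IO, Any, Dict, Iterable


def write_feature_collection(out: IO[str],
                             features: Iterable[Dict[str, Any]]) -> int:
    """Stream features into a single valid GeoJSON FeatureCollection.

    The opening and closing of the collection are written by hand, and
    each feature is dumped individually between them. Returns the number
    of features written.
    """
    out.write('{"type": "FeatureCollection", "features": [')
    count = 0
    for feature in features:
        if count:
            out.write(",")  # separator between consecutive features
        out.write("\n")
        json.dump(feature, out)
        count += 1
    out.write("\n]}\n")
    return count
```

Used with `open("lines.geojson", "w", encoding="utf-8")` as `out`, the result is an ordinary GeoJSON file with one feature per line, which also makes it easy to inspect partial output.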
