MemoryError on doomrnn #19

Open
xiaoschannel opened this issue Jan 20, 2019 · 3 comments

@xiaoschannel
Contributor

I am running into what appears to be an out-of-memory error.

When I load 500 episodes, the program runs fine: the VAE trains and the loss decreases.
When I load 2000 episodes, I get the following:

Traceback (most recent call last):
  File "vae_train.py", line 77, in <module>
    dataset = create_dataset(dataset)
  File "vae_train.py", line 60, in create_dataset
    data = np.zeros((M, 64, 64, 3), dtype=np.uint8)
MemoryError

The repo uses 10k episodes, but I cannot load even 2k on my 16GB machine. Am I missing something? If memory really is the issue here, how much memory is needed to replicate the paper with the code here?
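
For a rough sense of scale, the size of the array in the traceback can be estimated from its shape alone: each 64x64x3 uint8 frame takes 12288 bytes, so the total is 12288 times M. The sketch below does that arithmetic. The number of frames per episode is not stated in this thread, so `FRAMES_PER_EPISODE` is a hypothetical placeholder, and the function names are illustrative only.

```python
def dataset_bytes(num_frames, height=64, width=64, channels=3, bytes_per_value=1):
    """Bytes needed by np.zeros((num_frames, height, width, channels), dtype=np.uint8)."""
    return num_frames * height * width * channels * bytes_per_value


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


# ASSUMPTION: average episode length; not given in the thread, adjust to your data.
FRAMES_PER_EPISODE = 1000

if __name__ == "__main__":
    print("bytes per frame:", dataset_bytes(1))
    for episodes in (500, 2000, 10000):
        total = dataset_bytes(episodes * FRAMES_PER_EPISODE)
        print(f"{episodes} episodes: about {to_gib(total):.1f} GiB")
```

Under that assumed episode length, 2000 episodes already need more than a 16GB machine has, which would be consistent with the MemoryError above. Any later conversion of the frames to float32 would multiply the footprint by four.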

@hardmaru
Owner

hardmaru commented Jan 21, 2019

Hi @zuoanqh

Thanks for the issue. It's due more to laziness on my part, rather than actual requirements to train the VAE.

When I ran the experiments, I used virtual cloud instances that had GPUs, 64-core CPUs and a few hundred GB of RAM, so I was lazy and just loaded the entire dataset into a single numpy array (the line you pointed out: data = np.zeros((M, 64, 64, 3), dtype=np.uint8)), which puts hundreds of GB of data directly into RAM.

If you want to train the VAE with very little RAM, feel free to refactor the code to use the more modern tf.data, which loads data from disk incrementally to construct mini-batches and handles the training iteration queues.

Here are a few tutorials on how to use tf.data:

https://towardsdatascience.com/how-to-use-dataset-in-tensorflow-c758ef9e4428

https://www.tensorflow.org/guide/datasets
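
As a rough illustration of the streaming idea, here is a minimal sketch of a plain Python generator that reads one episode file at a time and yields mini-batches, so only a single episode plus a small remainder is held in RAM. It is not tf.data itself and not the repo's code; a generator like this could be wrapped with tf.data.Dataset.from_generator. The function name `iter_minibatches`, the `.npz` file layout, and the `obs` key are assumptions made for the example.

```python
import os

import numpy as np


def iter_minibatches(data_dir, batch_size=100, key="obs", seed=0):
    """Yield float32 batches scaled to [0, 1], loading one episode file at a time.

    ASSUMPTION: each episode is a .npz file in data_dir holding a uint8 array
    of shape (T, 64, 64, 3) under `key`.
    """
    rng = np.random.default_rng(seed)
    filenames = sorted(f for f in os.listdir(data_dir) if f.endswith(".npz"))
    # Visit episodes in a random order.
    filenames = [filenames[i] for i in rng.permutation(len(filenames))]

    leftover = None  # frames carried over from the previous episode
    for name in filenames:
        with np.load(os.path.join(data_dir, name)) as episode:
            frames = episode[key]
        # Shuffle frames within the episode, then prepend the carried-over frames.
        frames = frames[rng.permutation(len(frames))]
        if leftover is not None and len(leftover) > 0:
            frames = np.concatenate([leftover, frames], axis=0)

        num_full = len(frames) // batch_size
        for i in range(num_full):
            batch = frames[i * batch_size:(i + 1) * batch_size]
            yield batch.astype(np.float32) / 255.0
        # Frames that did not fill a batch wait for the next episode;
        # whatever remains after the last episode is dropped.
        leftover = frames[num_full * batch_size:]
```

One training epoch would then be a loop such as `for batch in iter_minibatches("record"): ...`, where `"record"` is a placeholder directory name. Shuffling here only mixes frames within an episode and across adjacent episodes, which is weaker than shuffling the whole dataset in memory; that is the trade-off for the lower RAM footprint.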

Best.

@xiaoschannel
Contributor Author

xiaoschannel commented Jan 21, 2019 via email

@hardmaru
Owner

hardmaru commented Jan 21, 2019 via email

asolano added a commit to asolano/WorldModelsExperiments that referenced this issue Sep 11, 2019