
Write tar files to a stream instead of to the filesystem first #22

Open · wants to merge 3 commits into hex7c0:1.6 from spmiller:tar-streams
Conversation

spmiller commented Jun 8, 2017

This improves performance when writing to a stream (i.e. there is no need to wait for the contents to be written to disk first), and can help solve EMFILE errors. This is achieved by using tar-stream instead of regular tar.

This commit introduces a small shim in front of the file system and tar-stream to allow both direct disk writes and tar file stream writes. It then uses the appropriate shim (file system or tar-stream) throughout the codebase to create directories and store files.

Potentially fixes #20.

coveralls commented Jun 8, 2017

Coverage decreased (-71.4%) to 26.316% when pulling 1f85b75 on spmiller:tar-streams into 20768db on hex7c0:1.6.

hex7c0 (Owner) commented Jun 8, 2017

Hi spmiller, thanks for your PR.

Could you change only index.js, to make code review easier?

coveralls commented Jun 8, 2017

Coverage decreased (-72.7%) to 25.0% when pulling 82b80c6 on spmiller:tar-streams into 20768db on hex7c0:1.6.

1 similar comment

spmiller (Author) commented Jun 8, 2017

@hex7c0 done. I'm not sure why coveralls is reporting such a dramatic drop in coverage; when I ran the tests locally, coverage stayed at 97.74%.

Edit: coverage decreased because the tests can't run on a PR since the URI is defined with a secure variable.

spmiller (Author)

I realised the way I was injecting the document store made it difficult to see what was actually changing in the diff. I have now changed it to be stored as a global variable rather than injecting it into functions, which has made the diff much easier to digest.

coveralls commented Jun 13, 2017

Coverage decreased (-72.7%) to 25.0% when pulling 36b035e on spmiller:tar-streams into 20768db on hex7c0:1.6.

spmiller (Author)

Hi @hex7c0, is there anything I can do to help get this merged?

hex7c0 (Owner) commented Jul 9, 2017

I'll work on it soon. I'm so sorry for the delay.

mledwards commented Nov 7, 2017

Any update on this? I'm getting the same issue with streams and fs. This would be much less of an issue if the plugin gzipped its output, as that is tiny; my tar is coming out about ten times bigger than the data.

Unfortunately for me this renders the module useless, as databases grow and grow and will inevitably be bigger than 100 MB.

Otherwise, it's a great module, I'm using it for hourly backups.

spmiller (Author) commented Apr 9, 2018

Hey @hex7c0, is there anything more you need me to do before merging this? Our project is starting to have issues backing up large-ish databases, so I may need to start using my fork :(

I am also planning a similar patch to the mongodb-restore library to support reading directly from the .tar.gz stream, but I wanted to get this one in first.

hex7c0 (Owner) commented Apr 11, 2018

Hey @spmiller, you are absolutely right!

I'll try to allocate some time to do it.

cortezcristian commented Feb 11, 2019

I patched it to work with Mongo 4.x: spmiller#5. Here's the package: https://www.npmjs.com/package/mongodb-backup-stream-4x

Phoscur commented May 12, 2019

What's the holdup on this, @hex7c0?

Phoscur mentioned this pull request May 12, 2019 (closed)
tennox commented May 21, 2019

Note that this seems to have a problem when the root directory does not exist (as it tries to create the tar file first):

mongoBackup({
  uri: Meteor.settings['mongoUri'],
  parser: 'json',
  root: 'dump/',
  logger: 'mongo.log',
  tar: 'dump.tar',
});

(STDERR) (node:11311) DeprecationWarning: current URL string parser is deprecated, and will be removed in a future version. To use the new parser, pass option { useNewUrlParser: true } to MongoClient.connect.
(STDERR) events.js:183
(STDERR) throw er; // Unhandled 'error' event
(STDERR) ^
(STDERR)
(STDERR) Error: ENOENT: no such file or directory, open '/home/manu/dev/my-app/.meteor/local/build/programs/server/dump/dump.tar'

Successfully merging this pull request may close these issues.

Error while backup large collection
7 participants