Estimating the cost of data upload to AWS S3 #480
Hi m, I believe it is the number of PUT requests (which should be roughly 1:1 with the number of files, barring failed uploads; I'm not sure whether they charge for error responses). Remember to also account for the storage cost, the number of writes, and any inter-region egress or egress-to-internet charges later on. These cloud providers usually offer free ingress. No warranty on this advice though! It's always possible I've missed something.
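As a back-of-the-envelope illustration of the point above (roughly one PUT request per file, plus storage), here is a minimal Python sketch. The prices are placeholder assumptions, not current S3 rates; substitute the numbers from the pricing page before relying on the result.

```python
# Rough S3 upload cost estimate: request charges + one month of storage.
# The prices below are placeholders -- check https://aws.amazon.com/s3/pricing/
# for the current rates in your region.

PUT_PRICE_PER_1000 = 0.005          # assumed USD per 1,000 PUT/POST requests
STORAGE_PRICE_PER_GB_MONTH = 0.023  # assumed USD per GB-month (S3 Standard)

def estimate_upload_cost(num_files: int, total_gigabytes: float) -> float:
    """Roughly one PUT request per uploaded file, plus one month of storage."""
    request_cost = (num_files / 1000.0) * PUT_PRICE_PER_1000
    storage_cost = total_gigabytes * STORAGE_PRICE_PER_GB_MONTH
    return request_cost + storage_cost

# Example: 2,000,000 chunk files totalling 500 GB
print(f"${estimate_upload_cost(2_000_000, 500):.2f}")  # ~$21.50 at the assumed rates
```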
Here are some example calculations for running a process on Google Cloud. The AWS calculations are pretty similar. https://github.com/seung-lab/kimimaro/wiki/The-Economics:-Skeletons-for-the-People/c2d4e28645e96d3e963f7338a46d15dc3890c553
Thank you for the input, @william-silversmith! Wow, the data really does need to be carefully processed into production-ready chunks before uploading. What if the scenario is to compute on a local machine (rather than writing directly to the cloud)? Which costs should I be looking at for the upload-ready files? Let's say the cloud provider supplies a CLI to upload the dataset.
neurodata.io seems to use S3 cloud storage. Do you know if they use CloudVolume to upload?
To my knowledge they've been using S3, though at one point they were considering Azure. I believe they use CloudVolume to upload, but I can't be sure. They recommend using CloudVolume to download on their site.
You should check whether those are the current prices for S3 yourself, but the first line refers to reading the entire segmentation and the second line refers to writing all the skeleton fragments. If you're just reading and writing images using a reasonable chunk size, you should be okay. What kind of job are you running and what size is it (approximately, if need be)? That would be helpful for giving better insight.
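For the local-machine scenario discussed above, here is a minimal sketch of what an upload through CloudVolume might look like. The bucket path, resolution, chunk size, and volume dimensions are hypothetical placeholders, and roughly one PUT request is issued per chunk written, which is what drives the request charges.

```python
import numpy as np
from cloudvolume import CloudVolume

# Hypothetical layer description; adjust dtype, resolution, chunking and
# volume size to match your actual dataset.
info = CloudVolume.create_new_info(
    num_channels=1,
    layer_type='image',
    data_type='uint8',
    encoding='raw',
    resolution=[4, 4, 40],          # nm per voxel (placeholder)
    voxel_offset=[0, 0, 0],
    chunk_size=[128, 128, 64],      # a reasonable chunk size keeps request counts sane
    volume_size=[1024, 1024, 512],  # placeholder dimensions
)

# 's3://my-bucket/my-dataset/image' is a hypothetical destination path;
# AWS credentials must be configured for the upload to succeed.
vol = CloudVolume('s3://my-bucket/my-dataset/image', info=info)
vol.commit_info()  # writes the info file (one small PUT)

# Each underlying chunk becomes roughly one PUT request when written.
image = np.zeros(vol.shape[:3], dtype=np.uint8)  # stand-in for real data
vol[:, :, :] = image
```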
Hi @william-silversmith, yes, I will definitely need to revisit the current prices for S3. To better understand the illustrated example, what was the original file size of the raw data? Had the file been downsampled?
In the example given, the file was downsampled to mip 3 (hence 15.6 TVx). If you're concerned about the generation of meshes/skeletons, there's some good news: skeleton generation has gotten a lot better since that old article was written, thanks to the sharded format. Meshes are still under development. Here's the updated article: https://github.com/seung-lab/kimimaro/wiki/The-Economics:-Skeletons-for-the-People
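For a sense of what mip 3 implies for voxel counts: assuming each mip level downsamples by 2x2x1 (a common default for anisotropic EM data; the actual factors for this dataset aren't stated here), a quick helper with mip-0 dimensions chosen purely to reproduce the 15.6 TVx figure under that assumption:

```python
def voxels_at_mip(x, y, z, mip, factor=(2, 2, 1)):
    """Voxel count after `mip` rounds of downsampling by `factor` (assumed 2x2x1)."""
    fx, fy, fz = factor
    return (x // fx**mip) * (y // fy**mip) * (z // fz**mip)

# Hypothetical mip-0 dimensions, chosen only so that mip 3 lands at ~15.6 TVx.
x0, y0, z0 = 250_000, 250_000, 16_000
print(f"mip 0: {voxels_at_mip(x0, y0, z0, 0) / 1e12:.1f} TVx")  # ~1000 TVx
print(f"mip 3: {voxels_at_mip(x0, y0, z0, 3) / 1e12:.1f} TVx")  # ~15.6 TVx
```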
Hi there,
Uploading data to AWS S3 comes with a cost. If I'm not mistaken, CloudVolume supports direct upload to S3 storage. When estimating the cost, should I be calculating based on the number of files being uploaded or the number of POST/PUT requests made to S3?
This is what AWS provides for pricing estimation:
https://aws.amazon.com/s3/pricing/
Thank you,
-m