Ft/dropbox chunked uploads #11
base: dropbox-chunked-uploads
Commits on Mar 7, 2018
Ignore invalid range headers instead of erroring
* According to the RFC[1], a server may ignore a Range header and should ignore a Range header containing units it doesn't understand. WB was erroring under these conditions, but it seems more appropriate to ignore the field. Testing against external providers showed no consistent practice to follow. The parse_request_range docs have been updated to reflect the new behavior. Big thanks to @birdbrained for doing the legwork on researching this! [1] https://tools.ietf.org/html/rfc7233#section-3.1
Commit fbc8399
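The lenient behavior described in this commit can be sketched as follows. This is a minimal illustration of the RFC 7233 idea, not WB's actual `parse_request_range` implementation (which handles more of the RFC's grammar):

```python
import re
from typing import Optional, Tuple

def parse_request_range(range_header: str) -> Optional[Tuple[int, Optional[int]]]:
    """Parse a `Range` header, returning (start, end) or None.

    Per RFC 7233 section 3.1, a server MAY ignore a Range header and
    SHOULD ignore one whose unit it does not understand, so any header
    that cannot be parsed yields None instead of raising an error.
    """
    match = re.match(r'^bytes=(\d+)-(\d*)$', range_header or '')
    if match is None:
        return None  # unknown unit or malformed header: ignore, don't error
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else None
    if end is not None and end < start:
        return None  # unsatisfiable range: ignore as well
    return (start, end)
```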
Commits on Mar 8, 2018
Commit 265b47c
Add GoogleCloudStorage Provider
This is a rebased and squashed commit of a fully working implementation for the provider based on Google Cloud Storage's JSON API and OAuth 2.0 protocol. For detailed commits and messages, please see this PR: CenterForOpenScience#317
Commit ae2524e
- Removed functionalities that are not used by OSFStorage and that do not have documented support
- Removed JSON API related docstrings/comments
- Added TODOs on what needs to be refactored
- Updated tests
Commit d05ec48
GCS XML API refactor - part 1: metadata
- Updated settings to use the XML API
- Updated metadata.py to parse response headers and init the metadata object
- Added a helper function in utils.py to convert GCS's base64-encoded hash to a hex digest
- Refactored the structure of fixtures and updated them with real responses from Postman tests
Commit 6be26bc
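The hash-conversion helper mentioned above can be as small as this; the function name is illustrative, not necessarily the one added to utils.py:

```python
import base64

def base64_to_hex(encoded_digest: str) -> str:
    """Convert a base64-encoded digest, as found in GCS's `x-goog-hash`
    response header, into a lowercase hex digest."""
    return base64.b64decode(encoded_digest).hex()
```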
GCS XML API refactor - part 2: a minimal provider
- Fully refactored all provider actions to use the XML API and signed requests, and implemented a minimal version:
  - Upload
  - Download
  - Metadata for file
  - Delete file
  - Intra-copy file
- Added TODO comments for Phase 1, 1.5 and 2:
  - Create folder
  - Metadata for folder
  - Delete folder
  - Intra-copy folder
- Replaced the provider's `.build_url` with `.build_and_sign_req_url`, which takes care of URL building and request signing together
Commit 7d681ab
Commits on Mar 9, 2018
GCS XML API refactor - part 3: a working provider
Discovered and fixed a few issues during OSF integration testing; updated comments and docstrings.
- Main issue: aiohttp parses `x-goog-hash` correctly and returns an `aiohttp.MultiDict` containing two entries with the same key, one for crc32c and one for md5. Fix: updated header parsing and tests.
- Minor fixes/updates:
  - Updated the HTTP method for delete and intra-copy
  - Added the bucket to "x-goog-copy-source" and no longer convert the value to lower case for request signing
  - Strip '"' from "ETag" when verifying the upload checksum
  - Removed "Content-Length" and only use "x-goog-stored-content-length" for file size
  - Prefixed `build_and_sign_req_url()` with `_`
Commit 4c70e1c
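The duplicate-key behavior at the heart of the main issue can be handled with multidict's `getall()`. This sketch assumes the `multidict` package (a dependency of aiohttp) is installed; the helper name and header values are illustrative:

```python
from multidict import CIMultiDict

def parse_goog_hashes(headers: CIMultiDict) -> dict:
    """Collect every `x-goog-hash` entry (GCS sends one for crc32c and
    one for md5 under the same key) into a {algorithm: digest} dict."""
    hashes = {}
    for value in headers.getall('x-goog-hash', []):
        algorithm, _, digest = value.partition('=')
        hashes[algorithm] = digest
    return hashes

# A plain dict would have silently dropped one of these two entries.
headers = CIMultiDict([
    ('x-goog-hash', 'crc32c=n03x6A=='),
    ('x-goog-hash', 'md5=XrY7u+Ae7tCTyyK7j1rNww=='),
])
```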
Commits on Mar 12, 2018
Commit 75add86
Commit b00dfb2
Commits on Mar 13, 2018
Commit a3cc17f
Commit cb07ea6
Explicitly parse SIGNATURE_EXPIRATION to integer
Note: environment variables are passed in as strings
Commit 14388eb
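The fix is a one-line cast at settings load time; the default value here is a hypothetical placeholder, not WB's real default:

```python
import os

# Environment variables always arrive as strings, so numeric settings
# must be cast explicitly. '60' is a placeholder default.
SIGNATURE_EXPIRATION = int(os.environ.get('SIGNATURE_EXPIRATION', '60'))
```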
Merge branch 'feature/gcloud-provider' into develop
Some tests and refactors to come, but provider is ready for testing. [SVCS-617] Closes: CenterForOpenScience#322
Commit 4863533
Commits on Mar 23, 2018
Remove region from Google Cloud
- WB only needs to know the bucket name; the OSF handles the region and selects the bucket with the expected region for WB.
Commit 625ebc4
Merge branch 'cslzchen-fix/remove-region-from-gc' into develop
[SVCS-617] Closes: CenterForOpenScience#329
Commit 4b5f5d7
Commits on Mar 25, 2018
Commit 9e551d1
Further improve tests for GoogleCloud
- Removed the deprecated `fixture.py`, which is now replaced by `providers.py`, `folders.py` and `files.py` in the `fixtures/` directory
- Updated fixtures for CRUD operations and added back CRUD tests that were accidentally removed
- Fixed the expiration check in utility tests, which now uses `settings.SIGNATURE_EXPIRATION` to calculate the expected expiration time
- Removed redundant type casting for expiration; type casting is now done in settings.py. This piece of code was presumably left behind accidentally during the last merge.
Commit c270be0
Commit 36fbf3f
- Fix import order; `typing` is a standard lib
- Use `+=` for request segment concatenation
- Replace `return True if <condition> else False` with `return bool(<condition>)` where the condition is `None` or `not None`
Commit 376969e
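The third item above is the familiar truthiness refactor; a before/after sketch with an illustrative function:

```python
from typing import Optional

def has_expiration(expiration: Optional[int]) -> bool:
    # Before: return True if expiration is not None else False
    # After:  the ternary is redundant; the condition already is the answer.
    return bool(expiration is not None)
```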
Use a strict regex for crc32c and md5 matching
- Updated both the metadata method and the utility function to use strict regex matching based on the RFC specification for Base64-encoded crc32c and md5 hashes
- Google Cloud uses the standard alphabet for Base64 encoding: [A-Za-z0-9+/=]
- RFC reference: http://www.rfc-editor.org/rfc/rfc4648.txt
Commit 800a7ea
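Under RFC 4648's standard alphabet the digest lengths can be pinned exactly: a 16-byte md5 encodes to 22 characters plus `==`, and a 4-byte crc32c to 6 characters plus `==`. The pattern names below are illustrative, not necessarily WB's:

```python
import re

# RFC 4648 standard base64 alphabet: A-Z, a-z, 0-9, '+', '/', '=' padding.
BASE64_MD5_PATTERN = re.compile(r'^[A-Za-z0-9+/]{22}==$')     # 16-byte digest
BASE64_CRC32C_PATTERN = re.compile(r'^[A-Za-z0-9+/]{6}==$')   # 4-byte digest
```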
BaseGCMetadata now handles resp headers in init
- Added an alternative constructor to the base metadata class, which takes either a standard Python dict or a pair of object name and multi-value dict during initialization
- Updated its usage in the provider
- Changed @staticmethod to @classmethod for `get_metadata_from_resp_headers()`
- Added metadata tests for both successful and failed initialization
Commit 49ab45f
Add a helper for parsing headers and update tests
- Added `get_multi_dict_from_json()` (and a test for it) to utils so that all tests now use this helper method to build response headers
- Refactored the metadata test structure to test the three classes separately
- Removed import aliases such as `core_exception`, `pd_settings` and `pd_utils` since there are no shadowing issues any more
Commit 58abe70
Add .new_from_resp_headers() to init GC metadata
- Both GC file and folder metadata now use this dedicated method to initialize with an object name and aiohttp's "MultiDict" response headers
- Removed the alternative constructor for GC base metadata and updated related code and tests
Commit 43153d3
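The shape of a dedicated constructor like this, and why `@classmethod` beats `@staticmethod` here, can be sketched as follows (class and attribute names are simplified, not WB's exact API):

```python
class BaseGCMetadata:
    def __init__(self, raw: dict):
        self.raw = raw

    @classmethod
    def new_from_resp_headers(cls, obj_name: str, resp_headers: dict) -> 'BaseGCMetadata':
        # Because this is a classmethod, `cls` is the *actual* subclass,
        # so file and folder metadata both get the right type back.
        raw = {key.lower(): value for key, value in resp_headers.items()}
        raw['object-name'] = obj_name
        return cls(raw)


class GCFileMetadata(BaseGCMetadata):
    pass
```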
Update DocStr and add PyDoc for metadata
- Moved quirks from comments into DocStr so that they are available in the WB Docs
- Use double backticks for code in DocStr
- Fixed Sphinx warnings
Commit b0d385d
Update DocStr and PyDoc for utils
Side effect: modified the function signature for get_multi_dict_from_python_dict() to expect dict instead of json; updated all related tests.
Commit 2550678
Add DocStr and PyDoc for the provider
- GC's private members are now available in the WB Docs via :private-members:
- Removed :inherited-members: and :undoc-members:
- Updated the .rst files to include GC, metadata and utils
Commit 6ebaa51
- mypy doesn't handle class inheritance well
- mypy has a problem with `**{}` arguments
- mypy doesn't handle multiple return-type options well
Commit 2982f54
Fix aiohttp's MultiDict and MultiDictProxy issue
- Upload failed because `CIMultiDictProxy` inherits from `MultiDictProxy` but not from `MultiDict`; aiohttpretty returns `CIMultiDict` while aiohttp returns `CIMultiDictProxy`, and the type check was strict, recognizing only `CIMultiDict`
- Updated the check to accept either `MultiDict` or `MultiDictProxy`:
  - WB code uses `CIMultiDictProxy` (a subclass of `MultiDictProxy`) since aiohttp has already parsed the hash headers
  - WB test code uses `MultiDict` to modify the dictionary in `get_multi_dict_from_python_dict()`, which returns `MultiDictProxy`
Commit 6031e56
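The widened check can be sketched like this; it assumes the `multidict` package (a dependency of aiohttp) is available, and the helper name is illustrative:

```python
from multidict import CIMultiDict, CIMultiDictProxy, MultiDict, MultiDictProxy

def is_multi_dict(value) -> bool:
    """aiohttp hands back `CIMultiDictProxy` while test doubles may use
    `CIMultiDict`; since the proxy and plain hierarchies are siblings,
    accept either base type instead of strictly checking `MultiDict`."""
    return isinstance(value, (MultiDict, MultiDictProxy))
```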
Commits on Mar 28, 2018
code style & docstring updates for metadata.py
* Minor updates to formatting of method signatures, import order, and judicious application of De Morgan's laws to make conditionals more readable.
* Update formatting of docstrings to make Sphinx docs more cross-linked and browsable.
Commit ca26ed8
code style & docstring updates for utils.py
* Minor style fixes.
* Update formatting of docstrings to make Sphinx docs more cross-linked and browsable.
Commit 2f6a4cc
code style & docstring updates for provider.py
* Minor updates to formatting of method signatures.
* Update formatting of docstrings to make Sphinx docs more cross-linked and browsable.
Commit ee6d669
style fixes for tests; remove unneeded export
* Minor style fixes for import order and signature formatting.
* Remove unused fixtures and imports from tests.
* __init__.py doesn't need to export the Metadata classes. Remove that and update the test files that were using it.
Commit c57b537
Merge branch 'feature/gcloud-updates' into develop
Code improvements, tests, minor fixes for the limited Google Cloud Storage provider. [SVCS-617] Closes: CenterForOpenScience#327
Commit 691cc97
Commit 7f80253
Commit 998f857
Commit 9abd9f6
Commits on Apr 5, 2018
Commit 39454ea
Fix typing and update import for box
- The type fix also uncovered a bug in our code where `._intra_move_copy_metadata()` calls a buggy `._get_folder_meta()` that in turn calls `._serialize_item()` with invalid arguments.
Commit e183bff
Commit f5c4267
Commit 20756c1
Commit 6672fb8
Commit 8a3218e
Commits on Apr 6, 2018
Commit 95fa944
Commit 3eee8db
Merge branch 'bug-fix/type-annotation' into develop
[SVCS-595] Closes: CenterForOpenScience#332
Commit 154806f
switch back to non-conda based rtd config
* ReadTheDocs has updated their base Python image, so the anaconda-based config is no longer needed. The conda config was out of date and had been failing to build anyway since the setuptools dependency version bump.
Commit 23aa355
Commit 0df8fd6
Commits on Apr 10, 2018
don't send logging callbacks for partial requests
* Stop sending download callbacks for 206 Partial responses. These should not be counted as full downloads. WB does not directly support Range requests on direct-from-provider downloads (signed urls), but at least curl and Postman appear to propagate Range headers from the original request to the follow-up redirection request. For now, log 302 responses with Range headers, but continue to send download callbacks as normal. The logs will be used to determine the correct behavior in the future.
Commit df75986
Commit 539672c
Commit 75ba817
Commit 11b6f54
Commit e6a82e8
Commits on Apr 12, 2018
Commit e72a69e
Commit 1a44c49
Commits on Apr 13, 2018
don't log revisions metadata requests to callback
* Turn off revisions metadata logging. When file download logging was added, regular metadata requests were excluded, but revisions were overlooked.
Commit 3c5638a
return metadata about request in logging callback
* Update WB to return the request method, url, user agent, and referrer url in the logging callback payload. Intended to help the callback listener provide more specific download metrics.
Commit 666217d
Merge branch 'feature/more-callback-metadata' into develop
[SVCS-673] Closes: CenterForOpenScience#324
Commit 81bec1a
Commits on Apr 20, 2018
release metadata update response in osfstorage tasks
* Otherwise an "unclosed response" error will appear when the next celery task is scheduled.
Commit 0169a89
Commits on Apr 23, 2018
add post-task cleanup for osfstorage tasks
* The osfstorage provider kicks off two tasks after upload: one to back up the file to Amazon Glacier and one to generate parity files that are sent to a bucket on the storage backend provider. Since both tasks run in parallel and need a copy of the uploaded file to work, neither could be responsible for deleting it when done; instead, that cleanup had to be done periodically by an admin to keep the disk from filling up. Both tasks are now included in a Celery chord, which runs a further task once all of its tasks have finished. In this case, the chord runs a cleanup task after the other two tasks complete. To make this simpler, each upload is moved to a temporary directory where it and its generated parity files live; this temporary directory is removed by the cleanup task. The parity and archive task tests have been commented out rather than updated, since a simpler approach may be implemented soon.
Commit c67aa5c
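The "cleanup only after both parallel tasks finish" shape of the Celery chord can be sketched with asyncio; the task bodies are stand-ins, not WB's real tasks:

```python
import asyncio
import os
import shutil
import tempfile

async def backup_to_glacier(temp_dir: str) -> None:
    await asyncio.sleep(0)  # stand-in for the Glacier upload

async def generate_parity_files(temp_dir: str) -> None:
    await asyncio.sleep(0)  # stand-in for parity generation

async def post_upload(temp_dir: str) -> None:
    # Celery-chord shape: both tasks run in parallel, and cleanup runs
    # exactly once, only after *both* have finished, so neither parallel
    # task has to own deletion of the shared temporary directory.
    await asyncio.gather(
        backup_to_glacier(temp_dir),
        generate_parity_files(temp_dir),
    )
    shutil.rmtree(temp_dir)

temp_dir = tempfile.mkdtemp()
asyncio.run(post_upload(temp_dir))
```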
Commit 7bdc4aa
Delay URL build/sign for GoogleCloud
- Use `functools.partial()` to delay building and signing the URL until the request is actually made
- Now `make_request()` gets a brand new URL every time it retries a failed request
Commit d552440
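A minimal sketch of the `functools.partial()` trick, with a hypothetical signing function standing in for GoogleCloud's real signer:

```python
import functools
import time

def build_and_sign_url(path: str) -> str:
    # Stand-in for URL signing: embeds an expiry timestamp, so each
    # call yields a freshly signed URL.
    expires = int(time.time()) + 60
    return f'https://storage.example.com{path}?Expires={expires}'

# Instead of a pre-built (and possibly expired) string, hand the request
# machinery a zero-argument callable; it is evaluated per attempt.
url_factory = functools.partial(build_and_sign_url, '/bucket/file.txt')

def make_request(url) -> str:
    # Mirrors the described behavior: accept either a URL or a factory.
    return url() if callable(url) else url
```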
Merge branch 'hotfix/delay-gcloud-sigining'
[SVCS-697] Closes: CenterForOpenScience#339
Commit e4e2aca
Commit 43235a3
Commit 19c1749
Commit 2395939
Commits on Apr 24, 2018
Commit a28b0cd
Commit 8e5e9f4
release metadata update response in osfstorage tasks
* Otherwise an "unclosed response" error will appear when the next celery task is scheduled.
Commit 97d508a
add post-task cleanup for osfstorage tasks
* The osfstorage provider kicks off two tasks after upload: one to back up the file to Amazon Glacier and one to generate parity files that are sent to a bucket on the storage backend provider. Since both tasks run in parallel and need a copy of the uploaded file to work, neither could be responsible for deleting it when done; instead, that cleanup had to be done periodically by an admin to keep the disk from filling up. Both tasks are now included in a Celery chord, which runs a further task once all of its tasks have finished. In this case, the chord runs a cleanup task after the other two tasks complete. To make this simpler, each upload is moved to a temporary directory where it and its generated parity files live; this temporary directory is removed by the cleanup task. The parity and archive task tests have been commented out rather than updated, since a simpler approach may be implemented soon.
Commit 391bb6d
Commit 2886a77
Merge branch 'hotfix/cleanup-after-osfstorage-tasks'
* These commits were originally merged to develop, but are being hotfixed into master to solve issues with unbounded storage consumption.
Commit 882d3d8
Commit 5ba3767
Commit a7eb2f7
Commit ce2514c
move url invocation inside the retry loop
* In 0.38.2, the googlecloud provider was updated to provide a function as the url parameter to `BaseProvider.make_request`. Since googlecloud urls are signed and can expire, url generation must be delayed until right before the request is issued; otherwise, if the first request fails, the retry may not be issued until after the signature has expired. Unfortunately, `.make_request` was invoking the url function outside of the retry loop, so the same signed url was used for each retry. Moving the invocation inside the retry loop causes a new url to be generated for each retry request.
Commit 5959ff7
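The fix can be illustrated with a toy retry loop: the url argument may be a callable, and it must be invoked *inside* the loop so each attempt gets a fresh signature (names are illustrative, not WB's actual `make_request`):

```python
def make_request_with_retries(url, send, retries: int = 2):
    """Evaluate a callable `url` inside the retry loop so every attempt
    gets a freshly signed URL (illustrative sketch)."""
    last_error = None
    for _ in range(retries + 1):
        current_url = url() if callable(url) else url  # re-sign per attempt
        try:
            return send(current_url)
        except Exception as exc:  # real code would narrow this
            last_error = exc
    raise last_error
```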
Commit 25507ec
Commit d61d6bc
Commit ac6f192
Commit e9fdd80
Commit 3d9d429
Commit cb61a52
Commit d4b0f3e
Commit cff38c6
Commit 33baec0
Commits on Apr 25, 2018
Commit 31af967
Commit d968683
Commit 8765321
Commit 8849baa
Commit e2e6266
Commits on May 1, 2018
Send along user ID when asking for children metadata from the osf
[SVCS-689]
* osfstorage is being updated to include a flag in file metadata that will indicate whether the requesting user has seen the most recent version of the file. To help the OSF properly determine this, update WB to send along the requesting user's id when asking for the metadata of all files in a given directory. The requesting user is not the same as the authorizing user: if Barbara asks for the contents of an osfstorage directory created by Alice, Barbara is the *requesting* user, while Alice is the *authorizing* user. WB first verifies that Barbara has the necessary access to the file, but uses a shared secret to retrieve metadata about the file. This change is necessary to inform the OSF who is behind the request.
* The osfstorage folder children response is updated to include a new flag, `latestVersionSeen`. If this flag is `null`, the requesting user has never seen *any* version of the file. If it is `true`, the user has seen the latest version of the file. If it is `false`, the user has seen a previous version of the file, but not the most recent one. This flag will be exposed through the `extra.latestVersionSeen` flag in OsfStorageFileMetadata.
* Due to a quirk in WB and the OSF, the latestVersionSeen flag will only be correctly set on the responses from folder metadata list requests. Neither service correctly handles requests for previous-version metadata for individual files.
* Update tests to include latestVersionSeen in children response.
Commit 6843bf5
Commit bfbd106
Merge branch 'feature/osfstorage-last-version-seen' into develop
[SVCS-689] Closes: CenterForOpenScience#337
Commit 9b40f14
Commit 0aa3e1f
Commit 07cb5b1
Skip parsing response body for HEAD requests
- This only applies to `exception_from_response()`
- Side effect: also fixes a not-released response
Commit 56c99ad
Commit ee86ba3
Merge branch 'feature/silence-mime-type-exceptions' into develop
[SVCS-693] Closes: CenterForOpenScience#338
Commit 98b057c
Commits on May 8, 2018
Commit 2be4a0e
Commit 001c495
Commit 1d6435d
Commits on May 23, 2018
Remove extra parens in core exception tests
They do nothing syntactically and are confusing.
Commit aa1fdbd
Merge branch 'fix/remove-parens' into develop
[No Ticket] Closes: CenterForOpenScience#345
Commit 140ceed
Commits on May 24, 2018
Commit 5ba0d35
Fix referrer domain calculation
- Build the referrer domain from scheme, host and port
- Only build the referrer domain if one exists
Commit 5052e63
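A minimal sketch of the corrected calculation using the standard library (the function name is illustrative):

```python
from typing import Optional
from urllib.parse import urlsplit

def referrer_domain(referer_header: str) -> Optional[str]:
    """Build 'scheme://host[:port]' from a Referer header, or return
    None when no host is present."""
    parts = urlsplit(referer_header or '')
    if not parts.netloc:
        return None  # no host: don't fabricate a domain
    return f'{parts.scheme}://{parts.netloc}'  # netloc keeps the port
```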
Merge branch 'feature/catch-mfr-referrer' into develop
[SVCS-818] Closes: CenterForOpenScience#344
Commit 2f45e0b
Commits on Jun 1, 2018
Commit eb5a8ac
OSFStorage: intra move/copy only for same region
* The googlecloud backend does not support intra move/copy if the buckets are in different regions. Add the relevant check to can_intra_* and update tests.
Commit dd9b536
Merge branch 'feature/no-crossregion-intramove' into develop
[SVCS-701] Closes: CenterForOpenScience#343
Commit 00949cf
Add size-cast-as-int property to file metadata.
* Owncloud has the unfortunate habit of returning file size as a string instead of an int. WB never enforced a cast to int on the property, so there may be clients in the wild that expect it to be a string. To avoid breaking these, add a new property, `sizeInt`, to WB metadata responses. This is guaranteed to be either an `int` or `None` if the size is unknown.
* The JSON-API-style responses for folder metadata include a `size` field that is always `None`. Add a similar `sizeInt` field for parity with files.
* Update explicit metadata tests for all providers.
* Update the type annotation for `BaseFileMetadata.size` to reflect its regrettable potential for stringiness.
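A minimal sketch of the idea (the Python-side property name and class shape are assumptions; only the serialized `sizeInt` key comes from the commit):

```python
class BaseFileMetadata:
    """`size` keeps whatever the provider returned (possibly a string);
    the new property guarantees an int, or None when unknown."""

    def __init__(self, raw_size):
        self.size = raw_size  # may be str, int, or None

    @property
    def size_int(self):
        return None if self.size is None else int(self.size)

    def serialized(self):
        # Existing clients keep seeing the raw `size`; new clients can
        # rely on `sizeInt` always being an int or null.
        return {'size': self.size, 'sizeInt': self.size_int}
```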
Commit f45e48a
Commit 2f2f8bc
Merge branch 'feature/int-size-in-metadata' into develop
[SVCS-499] Closes: CenterForOpenScience#312
Commit 641ccae
Commits on Jun 4, 2018
-
fill out tests for v1 server API
* Count the number of times a mock coroutine has been awaited.
* Expand handler tests, port them to pytest, and reorganize fixtures.
Commit 15e7457
Merge branch 'feature/test-v1-api-server' into develop
[SVCS-377] Closes: CenterForOpenScience#239
Commit 5c637a2
Commits on Jun 5, 2018
-
pin Dockerfile to use jessie-based python
* The python:3.5-slim docker tag was recently repointed to a Debian stretch-based image. Until WB has been verified to work on stretch, pin to the jessie-based image it has been using.
Commit 61da346
depend on gpg; try other keyservers
* Following the OSF's lead, explicitly depend on gnupg2 and specify fallback gpg keyservers.
Commit 1f42af3
Commit 8be25cf
Commit f46d301
Commit 2732383
Commit 08ddacd
Commit 43dc14e
Commit 38760a7
Commit 6766721
Commits on Jun 7, 2018
-
Disable intra move/copy region check for filesystem
- Region only applies when googlecloud is the storage provider
Commit c545615
Commit 01b0733
Commits on Jun 12, 2018
-
signal MFR render/export requests to the OSF
* MFR now includes a header when requesting metadata from WB. This header indicates if the MFR request is a render or export action. If WB sees this header, it should relay it to the OSF by changing the action from 'metadata' to either 'render' or 'export'. The OSF will be updated to treat these actions as metadata requests and to use them to keep metrics on MFR usage.
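The relay could be sketched as follows (the header name and exact mapping are assumptions for illustration, not the real header MFR sends):

```python
# Hypothetical header name used by MFR to declare its intent.
MFR_ACTION_HEADER = 'X-Cos-Mfr-Render-Request'


def resolve_action(headers, default_action='metadata'):
    """If MFR says this metadata request backs a render or export,
    report that action to the OSF instead of plain 'metadata'."""
    mfr_action = headers.get(MFR_ACTION_HEADER)
    if default_action == 'metadata' and mfr_action in ('render', 'export'):
        return mfr_action
    return default_action
```

Non-metadata actions pass through unchanged, so only MFR's metadata lookups are relabeled for metrics.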
Commit ce8ddb4
Merge branch 'feature/log-renders-on-auth' into develop
[SVCS-831] Closes: CenterForOpenScience#350
Commit 0e53579
Commits on Jun 22, 2018
Commit c507c36
Commit 19c9461
Commit 782231f
Commit 6e5935b
Commit 6dc3b0d
Commit 2197194
Commits on Jul 5, 2018
-
Add CutoffStream class to read subset of existing stream
* Most providers limit how big a file can be uploaded in a single request. Some providers support uploading larger files by "chunking" uploads: breaking a file into multiple pieces, uploading them individually, then reassembling them on the provider's side. Each provider sets its own limit on the maximum size of a single chunk, but they are usually multi-megabyte chunks.
* WB receives a single stream during an upload. To chunk this without downloading and manually partitioning the file requires a stream-reader class that can read up to `n` bytes, then stop without closing the original stream. The WB stream classes inherit from `asyncio.StreamReader`, whose `readexactly(n)` and `read(n)` methods appear to support this use case. They do, sort of: they attempt to read all `n` bytes into a chunk in memory before sending it off to the provider. This means that (1) uploading a 10 MB chunk requires 10 MB of memory, and (2) all 10 MB must be fetched from the uploader before being sent. (1) could quickly lead to memory exhaustion in WB if multiple uploads happen at the same time. (2) can cause uploads to fail: if the uploader is slow to send data to WB and fill the chunk, the receiving provider may close the connection as inactive.
* The solution is to continuously send smaller subchunks of data to the provider, terminating after the overall chunk size is reached. This is actually how the `asyncio.StreamReader.read()` method is intended to function, but confusion between what `read()` calls a chunk size and what the provider calls a chunk size led to failures in cross-provider moves/copies into Figshare, the only provider at the time of this commit that supports chunked uploads.
* The new CutoffStream class takes an existing stream object and the provider-given chunk size (the superchunk) and continuously reads and feeds subchunks. After each subchunk read, a bytes-read-so-far counter is updated; the CutoffStream stops reading when that counter equals the superchunk size. Only a subchunk's worth of data is stored in memory at a time.
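The behavior described above can be sketched roughly as follows (a minimal illustration, not WB's actual CutoffStream implementation; class, attribute, and parameter names are assumptions):

```python
import asyncio


class CutoffStream:
    """Read at most `cutoff` bytes from an inner stream, in small
    subchunks, without closing the inner stream."""

    def __init__(self, inner, cutoff, subchunk_size=10):
        self.inner = inner
        self.cutoff = cutoff
        self.subchunk_size = subchunk_size
        self._read_so_far = 0

    async def read(self, n=-1):
        # Stop once `cutoff` bytes have been handed out in total;
        # the inner stream stays open for the next superchunk.
        remaining = self.cutoff - self._read_so_far
        if remaining <= 0:
            return b''
        chunk = await self.inner.read(min(self.subchunk_size, remaining))
        self._read_so_far += len(chunk)
        return chunk


async def demo():
    # Feed 100 bytes, but cap this "superchunk" at 30 bytes.
    inner = asyncio.StreamReader()
    inner.feed_data(b'x' * 100)
    inner.feed_eof()
    cut = CutoffStream(inner, cutoff=30, subchunk_size=10)
    out = b''
    while True:
        piece = await cut.read()
        if not piece:
            break
        out += piece
    return out
```

Because `read()` never hands back more than one subchunk, a caller streaming the result to a provider holds at most `subchunk_size` bytes in memory while still stopping exactly at the superchunk boundary.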
Commit cc27c32
update Figshare to use CutoffStream for multipart uploads
* Fix cross-provider uploads to Figshare by using the new CutoffStream class. The previous approach was buffering the entire chunk into memory before sending. If the source provider was slow, Figshare would close the connection as inactive while waiting. * Update Figshare tests to no longer fake the stream md5sum. This was incorrectly diagnosed as an issue with aiohttpretty, instead of the issue above. Now that CutoffStream is being used, the Figshare tests can calculate an actual hash.
Commit 195d47c
gdrive: cast size to int when building ResponseStreamReader
* When creating a download stream for Google Drive files, make sure to pass the size as an integer, to meet the expectations of other consumers of the stream. Google Drive reports file size as a string instead of an integer. When a file is copied from GDrive to Figshare, WB initiates the upload by telling Figshare how big a file to expect; Figshare will throw a 400 if WB passes the size as a string instead of an integer.
Commit faab3cb
Merge branch 'feature/figshare-upload-fix' into develop
[SVCS-424] Closes: CenterForOpenScience#306, CenterForOpenScience#352
Commit 58674d9
- Chunked upload consists primarily of three methods, which (1) create a session, (2) upload parts of the stream, and (3) close the session.
- Add/update tests
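The three-step flow can be sketched like so (the `ChunkedUploader` and `FakeApi` names are illustrative stand-ins, not the provider's real methods; the real code hits Dropbox's upload-session endpoints):

```python
class ChunkedUploader:
    def __init__(self, api, chunk_size):
        self.api = api
        self.chunk_size = chunk_size

    def upload(self, data, path):
        session_id = self.api.start()                     # (1) create a session
        offset = 0
        while offset < len(data):
            part = data[offset:offset + self.chunk_size]
            self.api.append(session_id, part, offset)     # (2) upload parts
            offset += len(part)
        return self.api.finish(session_id, offset, path)  # (3) close the session


class FakeApi:
    """Stand-in transport that records what a real client would send."""

    def __init__(self):
        self.parts = []

    def start(self):
        return 'sess-1'

    def append(self, session_id, part, offset):
        self.parts.append((offset, part))

    def finish(self, session_id, offset, path):
        return {'path': path, 'size': offset}
```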
Commit f1d9b90
Commit 8a567a3
Commit e9994a7
Commit 59a1630
Commit 6f326c3
Commit 967da8b
Commit f3fcc30
- Fix one S3 test by disabling server encryption
- Use a sync loop instead of async for parts uploading
[skip ci]
Commit 42c7dd1
Commits on Jul 10, 2018
-
Several minor updates/reversions
- Use a trailing comma in the tuple for `expects=` in `make_request()`
- Use `functools.partial()` instead of lambda expressions to build requests
- Reorder the multi-part upload methods
- Update docstrings for the multi-part upload methods
- Remove inconsistent typing, which will be added later
Commit 738bde3
Pass upload id instead of full session to methods
- Main change: the session upload id is the only info each method needs to make requests to S3. Pass the string to the methods instead of the full object/dictionary.
- Side effects: (1) improve return values, (2) fix docstrings, (3) use `CONTIGUOUS_UPLOAD_SIZE_LIMIT`
Commit 53514e3
Rewrite multi-part upload abort action:
- Add a max-retries cap for the while loop
- Use list length == 0 as the break condition
- Instead of raising exceptions, return True if successful and False otherwise; add debug and error logs respectively.
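A rough sketch of that abort loop, with a fake client standing in for the real S3 requests (all names here are illustrative assumptions):

```python
MAX_ABORT_RETRIES = 5


def abort_multipart_upload(client, upload_id):
    """Return True when the upload was aborted cleanly, False otherwise.
    Success means the provider reports no remaining uploaded parts."""
    for _attempt in range(MAX_ABORT_RETRIES):
        client.send_abort(upload_id)
        remaining = client.list_parts(upload_id)
        if len(remaining) == 0:
            return True
    return False


class FakeClient:
    """Stand-in S3 client: the abort takes effect on the second attempt."""

    def __init__(self):
        self.aborts = 0

    def send_abort(self, upload_id):
        self.aborts += 1

    def list_parts(self, upload_id):
        return [] if self.aborts >= 2 else [{'PartNumber': 1}]
```

Returning a boolean instead of raising lets the caller decide how loudly to log the failure, per the commit's intent.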
Commit 5025ea8
Improve upload logic and fix encryption header
- Abort the upload if (1) uploading parts or (2) completing the upload fails
- Add abort status to the errors and logs, for both users and devs
- Fix encryption headers
Commit f98e63d
- Make `CHUNK_SIZE` and `CONTIGUOUS_UPLOAD_SIZE_LIMIT` class properties set from settings. This allows unit tests to have their own settings (with small size limits).
- Use a default empty list for the remaining-uploaded-parts list, to avoid using try/catch to check for list length == 0.
- Remove server-side encryption from `_upload_parts`, since (1) the action does not support the header and (2) it is set in `_create_upload_session`, where the multi-part upload is initiated.
- Fix style for building the XML payload when completing the upload.
Commit 293ba0b
Fix the issue where successful ABORT deletes the session
- If the ABORT request is successful, the multi-part upload session may have already been deleted by the time the LIST PARTS request is made.
- Update the criteria for a successful abort: either the LIST PARTS request returns 404, or it returns 200 with an empty parts list.
Commit da60939
Commit 9f78ec6
make _create_upload_session return only session id
* Return only the specific data needed instead of a structure. * Update tests to match.
Commit ab23271
release response before error handling
* Avoid triggering an unclosed response warning by closing before testing for the error case. The headers are still readable after the response has been closed.
Commit fd426be
use CutoffStream to segment stream into parts
* Avoid reading CHUNK_SIZE bytes into memory by wrapping the upload stream with a CutoffStream. CutoffStream allows continuous reading and sending of small subchunks (~10k) until CHUNK_SIZE bytes have been read in total. * Add a test for the case where the final superchunk is less than CHUNK_SIZE bytes.
Commit 200632b
Commits on Jul 11, 2018
-
Merge branch 'feature/s3-chunked-uploading' into develop
[SVCS-547] Closes: CenterForOpenScience#325
Commit 9ec4aab
Commits on Jul 18, 2018
-
Improve the `invoke test` command
- Add an option `--provider=` to test a specific provider only
- Add an option `--path=` to test a specific file or folder only
- Add an option `--nocov=` to disable coverage
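A sketch of how those options might shape the underlying pytest invocation (the flag spellings, paths, and helper name are assumptions, not the actual `tasks.py`):

```python
def build_test_command(provider=None, path=None, nocov=False):
    """Assemble the py.test command line from the invoke options."""
    cmd = ['py.test']
    if not nocov:
        # Coverage is on by default; --nocov turns it off.
        cmd += ['--cov-report', 'term-missing', '--cov', 'waterbutler']
    if provider and not path:
        # --provider narrows the run to one provider's test directory.
        path = 'tests/providers/{}/'.format(provider)
    cmd.append(path or 'tests/')
    return ' '.join(cmd)
```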
Commit 25fac97
Merge branch 'feature/improve-inv-test-cmd' into develop
[SVCS-NO-TICKET] Improve The Command `invoke test` Closes: CenterForOpenScience#353
Commit 3984900
Commits on Jul 25, 2018
Commit d1b8505
Commit b0d717b
Commit eaf1729
remove never-implemented geolocation code
* Analytics fields will be left in and hardcoded to `None` to avoid changing the schema.
Commit 4a33463
Commit 9d9ef9b
Commit b329c30
Commit 4ac4668
Commit 3627806
Commits on Jul 26, 2018
Commit 785a25f
Commit 4033bd3
Commit f9cb9e2
Use separate methods for cleaner and simpler code
- For chunked upload, add `upload_part()` to handle a single chunk upload; the chunked-upload method now calls `upload_part()` for each chunk.
- For normal upload, move the code into `contiguous_upload()`
Commit c2af470
Commit 9da695e
Obtain chunked upload sizes from Dropbox provider settings
- According to the Dropbox API docs, files larger than 150 MB must use chunked upload. Chunks can be any size up to 150 MB; a typical size is 4 MB, which is what WB uses. The maximum file size that can be uploaded is 350 GB.
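Those limits could be expressed as settings like the following sketch (the constant names, and the use of decimal megabytes, are assumptions):

```python
# Files over this size must use chunked upload per the Dropbox docs.
CONTIGUOUS_UPLOAD_SIZE_LIMIT = 150 * 1000 * 1000   # 150 MB
# WB's per-chunk size; Dropbox allows anything up to 150 MB.
CHUNK_SIZE = 4 * 1000 * 1000                       # 4 MB
# Dropbox's overall per-file ceiling.
MAX_UPLOAD_SIZE = 350 * 1000 * 1000 * 1000         # 350 GB


def choose_upload_strategy(stream_size):
    if stream_size > MAX_UPLOAD_SIZE:
        raise ValueError('file exceeds the Dropbox 350 GB limit')
    if stream_size > CONTIGUOUS_UPLOAD_SIZE_LIMIT:
        return 'chunked'
    return 'contiguous'
```

Keeping the sizes in provider settings (rather than hardcoded) is what lets tests shrink them, as a later commit in this log describes.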
Commit ddcc5a3
Commit 5d74f99
Commit e89a750
Commit 370ffb3
permit error if no session_id is available
* The dropbox provider should error if it can't get a session identifier. A `KeyError` should suffice until we see what an actual error condition looks like.
Commit edd2208