Commit 0453099

Download dataset (#413)
* fixed atom finding notebook
* fixed atom position finding notebook
* Repo visualizer: updated diagram
* Adding Zeolite OSDB example
* Adding solvation energy example
* updated zeolite notebook
* Repo visualizer: updated diagram
* found the .list() return in the code. just need to use a debugger to see actual data object
* added DOI to list() method
* added DOI to the list() description
* added DOI to _repr_html_ need to preview it but pushing the addition for now
* changed formatting a bit by adding a label for DOI
* made DOI a <p> element for better spacing
* replace spaces with tabs in notebooks
* Repo visualizer: updated diagram
* establish run() function again
* add dlhub_sdk==0.10.0 to reqs, and update foundry version to 0.1.2
* delete extraneous old requirements.txt
* bound globus-sdk to <=2.0.3, since Foundry isn't compatible with Globus SDK 3 yet
* add dlhub to list of service used in testing
* fix syntax typo in service listing
* add funcx to list of services
* Updating auth for tests and client creation
* Repo visualizer: updated diagram
* deleting dupe of python-publish.yml
* add auth handling to init if authorizers not provided
* update local tests to work with auth for search
* remove passing decorators for tests that are local
* add 'mdf' as the default index
* update tests to use prod dataset
* update reqs for Globus SDK 3, and version to 0.2.0
* Repo visualizer: updated diagram
* add option to pass in funcx_endpoint in run()
* add pip dependency caching
* test caching functionality
* Add keep_hdf5 functionality to Foundry.load_data
* Add Convert to Pytorch Dataset Functionality
* Add Testing for toTorch function
* Remove unnecessary imports
* Replace keep_hdf5 with as_hdf5
* Update foundry/foundry.py
* Update foundry/foundry.py
* Update foundry/foundry.py
* Replace keep_hdf5 with as_hdf5
* Repo visualizer: updated diagram
* Update Testing and Rename Files
* Fix Testing Errors
* Fix a singular typo
* Replace PNG with SVG
* Rebase with dev
* Rebase with dev
* Fix first set of TODOs
* Finish TODO's for Load
* Fix Dataset Loading
* Apply Logan's Changes
* Fix Logging and Remove unnecessary code
* Fix Logging and Remove unnecessary code
* Add Ari's Requests.
* fix unused imports, code style, and update syntax (#229)
* Replace keep_hdf5 with as_hdf5
* Update foundry/foundry.py
* Update foundry/foundry.py
* Update foundry/foundry.py
* Repo visualizer: updated diagram
* fix unused imports, code style, and update syntax
Co-authored-by: Aadit Ambadkar <[email protected]>
Co-authored-by: Ben Blaiszik <[email protected]>
Co-authored-by: repo-visualizer <[email protected]>
* re-add improperly removed warnings import
* add Python 3.10 tests and flake8 error checking
* add setup.cfg to change flake8 parameters
* add comments so flake8 ignores 'unused import' err
* merge tests into single file (#233)
* merge tests into single file
* change name of test file to match new commit
* convert is_gha to a boolean for pytest skip
* remove premature optimization
* Reflect Changes
* Fix Try Catch and Logging.log
* Fix Logging
* Make Logger Reflect Module Name
* add dl.easy_publish wrapper function
* Imports
* Revert "add dl.easy_publish wrapper function"
This reverts commit 7f83611.
* add dl.easy_publish wrapper as f.publish_model
* remove commented code and link to dlhub docs in docstring
* Update testing-work.yml
* Repo visualizer: updated diagram
* Update README.md
* Repo visualizer: updated diagram
* Fix Logging
* update test name
* Rename testing-work.yml to tests.yml
* Rapid removing of XTract (#242)
* Rapid removing of XTract
* Fixing as_object
* Repo visualizer: updated diagram
* Update setup.py
* Repo visualizer: updated diagram
* Update setup.py
* Repo visualizer: updated diagram
* end the file with a newline
* remove redundant flake8 checking
* fix code style (without changing functionality)
* add more style fixes
* fix last style error, others are covered in #231
* To tf dataset (#201) with rebase
* Add Custom Dataset and Implement
* Clean up Branch
* Clean up Branch
* Resolve Some of Logan's Changes
* Resolve Testing Issues?
* Resolve Testing Issues?
* Resolve Testing Issues?
* Resolve Testing Issues?
* Resolve Testing Issues?
* Reflect Logan's Requests
* Fix Import Issues
* Simplify Imports
* Fix Imports
* Apply Logan's Changes
* Comments
* Refactor Common Logic Into New Function
* Add Documentation
* Add Documentation
* Add Custom Dataset and Implement
* Clean up Branch
* Replace keep_hdf5 with as_hdf5
* Resolve Some of Logan's Changes
* Resolve Testing Issues?
* Resolve Testing Issues?
* Resolve Testing Issues?
* Resolve Testing Issues?
* Resolve Testing Issues?
* Reflect Logan's Requests
* Fix Import Issues
* Simplify Imports
* Fix Imports
* Apply Logan's Changes
* Comments
* Refactor Common Logic Into New Function
* Add Documentation
* Add Documentation
* fix reference to _get_inputs_to_targets(); also, whitespace
* remove unused * import
* fix test_foundry.py to have the proper tests from the dev branch
* remove outdated test_to_pytorch() test
* fix passing of self for _get_inputs_targets()
Co-authored-by: Aristana Scourtas <[email protected]>
* delete deprecated build() function and remove 'fail' language from tabular dataset reading
* remove redundant path checking code for loading datasets
* fix logic error in data path verification
* break out path joining logic to be in scope for all dataset types
* add new easy_publish parameters
* set defaults for new parameters
* update to version 0.3.0, add reqs for dlhub 1.0.0, update PyPI info
* Repo visualizer: updated diagram
* Delete bubble-vis.yml
* Update README.md
* Update README.md
* Add files via upload
* Update README.md
* Add files via upload
* Update README.md
* address flake8 concerns in foundry.py
* fix flake8 concerns in torch_wrapper
* address flake8 concerns for tf_wrapper
* replace xtract module name with https one in __init__.py
* transfer https methods to https_download.py
* reorder private method
* final flake8 changes
* fix import of https_download
* Add search functionality to make it easier to find datasets
* Flake8 fixes
* Data packages --> datasets
* Dev (#255)
* Fix Logging and Remove unnecessary code
* Fix Logging and Remove unnecessary code
* add Python 3.10 tests and flake8 error checking
* add setup.cfg to change flake8 parameters
* add comments so flake8 ignores 'unused import' err
* Reflect Changes
* Fix Try Catch and Logging.log
* Fix Logging
* Make Logger Reflect Module Name
* Imports
* Fix Logging
* update test name
* Rename testing-work.yml to tests.yml
* end the file with a newline
* remove redundant flake8 checking
* fix code style (without changing functionality)
* add more style fixes
* fix last style error, others are covered in #231
* delete deprecated build() function and remove 'fail' language from tabular dataset reading
* remove redundant path checking code for loading datasets
* fix logic error in data path verification
* break out path joining logic to be in scope for all dataset types
* address flake8 concerns in foundry.py
* fix flake8 concerns in torch_wrapper
* address flake8 concerns for tf_wrapper
* replace xtract module name with https one in __init__.py
* transfer https methods to https_download.py
* reorder private method
* final flake8 changes
* fix import of https_download
* Add search functionality to make it easier to find datasets
* Flake8 fixes
* Data packages --> datasets
Co-authored-by: Aadit-Ambadkar <[email protected]>
Co-authored-by: Isaac Darling <[email protected]>
Co-authored-by: Aadit Ambadkar <[email protected]>
Co-authored-by: Ben Blaiszik <[email protected]>
* update version to 0.4.0
also, add Braeden as contributor
* update version
* Fix test badge
* Create README.md
* Open in Colab Buttons (#253)
* Repo visualizer: updated diagram
* Add Badges
* Update Positioning
Co-authored-by: ascourtas <[email protected]>
Co-authored-by: repo-visualizer <[email protected]>
* Moving some functions to utils (#262)
* Moving some functions to utils
* flake8 fixes
* Update README.md
added more to the examples readme to make it more inviting/give better context as to what this page is
* Update README.md (#270)
added screenshots to readme with text explanations
* Update read logic (#271)
* initial example for QMC ML
* simplifying read logic
* swap testing dataset for a smaller dataset
* Added new search test. Removed some stray commented-out code. (#272)
* Parallelize HTTPS downloads. Remove joblib and six requirements (#273)
* Get citation function (#274)
* Parallelize HTTPS downloads. Remove joblib and six requirements
* Add initial bibtex citation output function
* update version to 0.5.0
* Improve HTTPS downloads (#277)
* Cleaned up the keyword arguments, docstring
- Only one keyword argument is used, so having a large (and
undocumented) flexibility with **kwargs is unneeded
- Docstring style mixed NumPy and Google
* Add a test requirements file, simplify test YAML
Sorry, a little house maintenance while I'm at it
* Fix how parallel downloads are implemented
Previous version was using an executor in a way that would
produce an unlimited number of threads, which can cause problems
for large datasets
* Make a progress bar, error checking
* Removed deprecated code
It will live in git and our hearts forever
* Flake8 fixes
* Update README.md (#275)
* Add NSF badge to Foundry
* updated model pub notebook (#284)
* Set Header Images to an absolute URL via raw.githubusercontent (#283)
* Update links to absolute URL for pypi visibility
* Redirect URL to MLMI2-CSSI
* Update README.md (#299)
* Add new logo
* Add updated logos
* Update README [no ci]
* Updating example logos
* Remove stray print
_read_json was printing a debug data frame. Removed.
* Add web (#301)
* Increment version for PyPI deploy
* Update issue templates
* Add https upload (#281)
* add initial directory-making functionality
* add acl permission setting
* add PUT request logic and ACL setting, plus TODOs
* add logic to delete acl rule after creation
* add try/except handling to acl creation
* add prepare query param so we don't need to make dirs; fix bug when rule_id is not set
* clean up path joining logic, as well as comments
* add capability to upload all files in a folder, instead of one individual file
* update endpoint destination to use a UUID as the folder name
* break out acl rule adding to its own function, tidy up
* break out PUT request functionality
* break out upload_folder() into upload_file() and integrate https functions into publish(), with proper params
* change endpoint to NCSA, make usage more modular; small os.path bug fixes
* reorder functions to be easier to read
* add upload capability for single file, with error handling
* fix logic bugs with destination path setting s.t. all subfolders are written to destination
* cleanup var names in upload_folder() logic; making endpoint_dest path more robust
* code cleanup and breakout helper functions to reduce size of publish()
* add parameter checks to publish() and reduce param complexity
* add docstrings, plus add test param to publish()
* appease flake8
* add one more flake8 fix
* fix auths in tests, add system test for HTTPS publication, small comments
* add system test for HTTPS upload
* break out https publishing into more unit-testable method
* refactor function defs to work better for testing; add https upload unit test
* fix bug where artifact was written to uploaded dataset
* update os.walk block comparison to be more robust
* update publish() docstring and add type hints
* clean up imports, fix type hint for Response, add some context for Xtract file
* WIP to separate helpers into submodule -- need to fix test and method design
* fix typing discrepancy for requests.Response
* update modification date
* Temporarily remove ACL rule creation for https upload
* Fix flake8 comment error
* Fix flake8 once more
* Fixing local tests, flake8, kwargs
* Adding test data
* Debug result on GHA
* Debug result on GHA
* Debug result on GHA
* Debug result on GHA
* add Ben's patch to submodule
* generalize the included functions and
move make_globus_link here from foundry object
* move make_globus_link function to submodule
* update tests to generalized input format
* properly pass 'auths' object between functions
* update modification date
* prepend underscore to private function
* correct call to upload_to_endpoint() in foundry.py
* re-add ACL rule logic
* update auth passing to be more user-friendly; includes test changes
* Introduce a collection to hold authorizers
It uses a dataclass so that we can annotate the types of the authorizers
that the tuple holds, then document them
I put it in a new module, `foundry.auth` so that it can be used
by both the foundry module and the https_upload module (avoiding
circular dependencies)
* alter args such that it's not possible for the user to have endpoint_id and gcs_auth_client misalign
* change language to endpoint_auth_clients for clarity of purpose
* docstring updates
---------
Co-authored-by: Ben Blaiszik <[email protected]>
Co-authored-by: isaac-darling <[email protected]>
Co-authored-by: Logan Ward <[email protected]>
* https upload bugfix (#322)
* add initial directory-making functionality
* add acl permission setting
* add PUT request logic and ACL setting, plus TODOs
* add logic to delete acl rule after creation
* add try/except handling to acl creation
* add prepare query param so we don't need to make dirs; fix bug when rule_id is not set
* clean up path joining logic, as well as comments
* add capability to upload all files in a folder, instead of one individual file
* update endpoint destination to use a UUID as the folder name
* break out acl rule adding to its own function, tidy up
* break out PUT request functionality
* break out upload_folder() into upload_file() and integrate https functions into publish(), with proper params
* change endpoint to NCSA, make usage more modular; small os.path bug fixes
* reorder functions to be easier to read
* add upload capability for single file, with error handling
* fix logic bugs with destination path setting s.t. all subfolders are written to destination
* cleanup var names in upload_folder() logic; making endpoint_dest path more robust
* code cleanup and breakout helper functions to reduce size of publish()
* add parameter checks to publish() and reduce param complexity
* add docstrings, plus add test param to publish()
* appease flake8
* add one more flake8 fix
* fix auths in tests, add system test for HTTPS publication, small comments
* add system test for HTTPS upload
* break out https publishing into more unit-testable method
* refactor function defs to work better for testing; add https upload unit test
* fix bug where artifact was written to uploaded dataset
* update os.walk block comparison to be more robust
* update publish() docstring and add type hints
* clean up imports, fix type hint for Response, add some context for Xtract file
* WIP to separate helpers into submodule -- need to fix test and method design
* fix typing discrepancy for requests.Response
* update modification date
* Temporarily remove ACL rule creation for https upload
* Fix flake8 comment error
* Fix flake8 once more
* Fixing local tests, flake8, kwargs
* Adding test data
* Debug result on GHA
* Debug result on GHA
* Debug result on GHA
* Debug result on GHA
* add Ben's patch to submodule
* generalize the included functions and
move make_globus_link here from foundry object
* move make_globus_link function to submodule
* update tests to generalized input format
* properly pass 'auths' object between functions
* update modification date
* prepend underscore to private function
* correct call to upload_to_endpoint() in foundry.py
* re-add ACL rule logic
* update auth passing to be more user-friendly; includes test changes
* Introduce a collection to hold authorizers
It uses a dataclass so that we can annotate the types of the authorizers
that the tuple holds, then document them
I put it in a new module, `foundry.auth` so that it can be used
by both the foundry module and the https_upload module (avoiding
circular dependencies)
* alter args such that it's not possible for the user to have endpoint_id and gcs_auth_client misalign
* change language to endpoint_auth_clients for clarity of purpose
* docstring updates
* fix bug from last round of review edits
---------
Co-authored-by: Ben Blaiszik <[email protected]>
Co-authored-by: isaac-darling <[email protected]>
Co-authored-by: Logan Ward <[email protected]>
* add static badge with link to gitbook (#333)
* Update publishing notebook and minor bugfixes (#336)
* update publishing notebook example to use HTTPS upload primarily, along with minor fixes
* add https upload methods and data
* fix function call to publish_dataset
* remove ACL rule code to fix error issue
* update globus images in notebook
* remove commented code
* add missing scopes
* appease flake overlords
* Add search lambda authorizer (sl_authorizer) to dlhub_client
instantiation
* removed unnecessary scopes
* update curation info in notebook
---------
Co-authored-by: Ben Blaiszik <[email protected]>
Co-authored-by: isaac-darling <[email protected]>
Co-authored-by: Logan Ward <[email protected]>
Co-authored-by: Eric Blau <[email protected]>
* delete due to unresolvable merge conflict
-- adding back in dev
* add back publishing notebook
* Split specification (#344)
* adding ability to specify splits for loading
* refining test
* Update splits_to_load --> splits
---------
Co-authored-by: blaiszik <[email protected]>
* Implementing new search() function (and refactor) (#408)
* update to 0.6.0 for HTTPS pub
* Upload Foundry class load() function to default download using https (#340)
* Update setup.py
Fix version number for pyPI deploy
* Update setup.py version for pyPI
* Update requirements.txt to latest DLHub SDK
This is needed to require upgrade of DLHub SDK for Foundry users when they upgrade Foundry.
* Update version to 0.6.3
* incorporate load() to foundry.__init__()
* automating api documentation using github action (#342)
* CI: Automated documentation build
* Removing remnants of XTract
* CI: Automated documentation build
* CI: Automated documentation build
* Update README.md with contributing instructions (#357)
* Update README.md with contributing instructions
* Update PR language
* merging in split specification
* flake fixes
* add jingrui examples (#363)
* removed blank line
* Load on init (#358)
* incorporate load() to foundry.__init__()
* merging in split specification
* flake fixes
* removed blank line
* Adds note for quickstart globus set to false
* Validating metadata before publishing
* remove arguments from Foundry object that are duplicated with base class
* Update setup.py version to 0.7.0
* refactor foundry to separate foundry instance from dataset objects
* fine tuning search functionality
* removing redefinition of FoundryDataset
* address comments in PR review
* remove unused import
* upgrade setup-python from v2 to v4
* upgrade other setup-python from v2 to v4
* modify limit test
---------
Co-authored-by: ascourtas <[email protected]>
Co-authored-by: Ben Blaiszik <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marshall McDonnell <[email protected]>
* CI: Automated documentation build
* getting cache sorted
* getting cache sorted
* updated docstrings and examples, added extended pandas dataframe class for display and access of datasets
* updated docstrings and examples, added extended pandas dataframe class for display and access of datasets
* cleaning up flake8 issues
* cleaning up flake8 issues
* cleaning up a few loose ends
* cleaning up a few loose ends
* check if download is causing test in GHA to fail
* check if download is causing test in GHA to fail
* disabling all individual tests
* disabling all individual tests
* testing a few tests
* testing a few tests
* testing a few more tests
* testing a few more tests
* testing w/ globus=False
* testing w/ globus=False
* WIP
* WIP
* refactoring tests
* refactoring tests
* wrapping up testing of foundry_cache.py
* wrapping up testing of foundry_cache.py
* refining tests
* refining tests
* tighten up testing, remove downloads from GHA, hold off on https download until next issue
* updating workflow
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* removing test for GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* skipping problematic test in GHA
* testing bump of pytest version
* reverting to skip test that uses pytest.raises in GHA
* addressing feedback from PR
---------
Co-authored-by: KJ <[email protected]>
Co-authored-by: ascourtas <[email protected]>
Co-authored-by: repo-visualizer <[email protected]>
Co-authored-by: Ben Blaiszik <[email protected]>
Co-authored-by: BraedenCu <[email protected]>
Co-authored-by: Aadit-Ambadkar <[email protected]>
Co-authored-by: Aadit Ambadkar <[email protected]>
Co-authored-by: Isaac Darling <[email protected]>
Co-authored-by: Logan Ward <[email protected]>
Co-authored-by: C. Y. Schneck <[email protected]>
Co-authored-by: Sterling G. Baird <[email protected]>
Co-authored-by: Logan Ward <[email protected]>
Co-authored-by: Eric Blau <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marshall McDonnell <[email protected]>
1 parent 8065b6f commit 0453099
File tree
34 files changed
+4471
-3154
lines changed
- .github/workflows
- data/https_test
- examples
- DefectTrack
- PACBEDCNN-thickness-mistilt
- atom-position-finding
- .ipynb_checkpoints
- bandgap
- dendrite-segmentation
- g4mp2-solvation
- oqmd
- ptychography-airpi
- publishing-guides
- qmc_ml
- zeolite
- .ipynb_checkpoints
- foundry
- tests
- test_data
- elwood_md_v1.2
- test_dataset
.github/workflows/tests.yml
Lines changed: 3 additions & 3 deletions (lines 23, 25, and 49 modified)
Lines changed: 2 additions & 1 deletion (line 5 modified, line 6 added)
data/https_test/test_data.json
Lines changed: 0 additions & 1 deletion
This file was deleted.
examples/DefectTrack/000001.png
-451 KB
Binary file not shown.
examples/DefectTrack/000002.png
-451 KB
Binary file not shown.
examples/DefectTrack/000003.png
-451 KB
Binary file not shown.
examples/DefectTrack/000004.png
-451 KB
Binary file not shown.