update some additional old references to bionemo.testing.data.load
pstjohn committed Nov 8, 2024
1 parent 6498ba3 commit 8612d7d
Showing 4 changed files with 8 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -85,7 +85,7 @@ AWS_ENDPOINT_URL="https://pbss.s8k.io"
Running tests downloads the test data to a cache location when first invoked.

For more information on adding new test artifacts, see the documentation in
-[`bionemo.testing.data.load`](sub-packages/bionemo-testing/src/bionemo/testing/data/README.md).
+[`bionemo.core.data.load`](sub-packages/bionemo-testing/src/bionemo/testing/data/README.md).

## Updating pinned versions of NeMo / Megatron-LM

4 changes: 2 additions & 2 deletions docs/docs/datasets/uniprot.md
@@ -22,11 +22,11 @@ randomly chosen UniRef90 sequence from each.
## Data Availability

Two versions of the dataset are distributed, a full training dataset (~80Gb) and a 10,000 UniRef50 cluster random slice
-(~150Mb). To load and use the sanity dataset, the [bionemo.testing.data.load][bionemo.testing.data.load.load] function
+(~150Mb). To load and use the sanity dataset, the [bionemo.core.data.load][bionemo.core.data.load.load] function
can be used to materialize the sanity dataset in the BioNeMo2 cache directory:

```python
-from bionemo.testing.data.load import load
+from bionemo.core.data.load import load

sanity_data_dir = load("esm2/testdata_esm2_pretrain:2.0")
```
@@ -217,7 +217,7 @@
"metadata": {},
"outputs": [],
"source": [
-"from bionemo.testing.data.load import load\n",
+"from bionemo.core.data.load import load\n",
"# 106m checkpoint\n",
"geneformer_106m = load(\"geneformer/106M_240530:2.0\")\n",
"# 10m checkpoint\n",
8 changes: 4 additions & 4 deletions sub-packages/bionemo-core/src/bionemo/core/data/README.md
@@ -36,7 +36,7 @@ example, in `esm2.yaml`:
See https://ngc.nvidia.com/catalog/models/nvidia:clara:esm2nv650m.
```
-To load these model weights during a test, use the [load][bionemo.testing.data.load.load] function with the filename and
+To load these model weights during a test, use the [load][bionemo.core.data.load.load] function with the filename and
tag of the desired asset, which returns a path to the specified file:
```python
@@ -46,19 +46,19 @@ config = ESM2Config(nemo1_ckpt_path=path_to_my_checkpoint)

If this function is called without the data available on the local machine, it will be fetched from the default source
(currently `pbss`.) Otherwise, it will return the cached directory. To download with NGC, pass `source="ngc"` to
-[load][bionemo.testing.data.load.load].
+[load][bionemo.core.data.load.load].

## File unpacking and/or decompression

All test artifacts are individual files. If a zip or tar archive is specified, it will be unpacked automatically, and
-the path to the directory will be returned via [load][bionemo.testing.data.load.load]. Compressed files ('gzip', 'bz2',
+the path to the directory will be returned via [load][bionemo.core.data.load.load]. Compressed files ('gzip', 'bz2',
or 'xz') are automatically decompressed before they are returned. The file's compression and/or archive format is
determined based on the filename specified in the `pbss` URL.

!!! note "Files in NGC resources"

NGC resources are folders, i.e., they may contain multiple files per resource.
-[load][bionemo.testing.data.load.load] will _only_ download the filename matching the stem of the `pbss` url. The
+[load][bionemo.core.data.load.load] will _only_ download the filename matching the stem of the `pbss` url. The
same NGC resource can therefore be used to host multiple test assets that are used independently.

