From 79f3a031894a762afa4a88c0aada03cda43ab7dc Mon Sep 17 00:00:00 2001
From: <>
Date: Wed, 1 May 2024 02:40:33 +0000
Subject: [PATCH] Deployed 4cf19e1 with MkDocs version: 1.5.3

---
 nextflow-create-docker/index.html       |  9 +++--
 nextflow-getting-started/index.html     | 16 ++++++++
 nextflow-overview-containers/index.html | 16 ++++----
 nextflow-upload-docker/index.html       |  5 ++-
 search/search_index.json                |  2 +-
 sitemap.xml                             | 48 ++++++++++++------------
 sitemap.xml.gz                          | Bin 505 -> 505 bytes
 7 files changed, 57 insertions(+), 39 deletions(-)

diff --git a/nextflow-create-docker/index.html b/nextflow-create-docker/index.html
index 1cfd500..9021bc2 100644
--- a/nextflow-create-docker/index.html
+++ b/nextflow-create-docker/index.html
@@ -1201,16 +1201,16 @@

Start with a security-compliant base image

How to choose your base image

GPU vs. CPU

Not sure what these are? Here's a nice overview.

-

If your workflow requires GPU (e.g., deep learning or other AI/ML models), please use the GPU instance; otherwise, use CPU.

+

In our BRH Workspace, we offer workspace images for CPU and GPU tools. You can read more about this on our Getting Started page. Choose the appropriate workspace image (CPU or GPU) for your Docker image and tools.

GPU images

-

We have 3 images in our current selection that offer CUDA support for running on GPUs -- these have "cuda" in the image name, followed by the CUDA version. When possible, please choose the latest version of CUDA compatible with your tools.

+

We have 3 base images in our current selection that offer CUDA support for running on GPUs -- these have "cuda" in the image name, followed by the CUDA version. When possible, please choose the latest version of CUDA compatible with your tools.

gen3-cuda-12.3-ubuntu22.04-openssl (preferred)

gen3-cuda-12.3-torch2.2-ubuntu22.04-openssl (also preferred)

gen3-cuda-11.8-ubuntu22.04-openssl (only use if your tools require a lower version of CUDA)

CPU images

-

We have one image that is available for running workflows on CPUs.

+

We have one base image that is available for running workflows on CPUs.

amazonlinux-base
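Once you've chosen a base image, customization is a standard Dockerfile build. The sketch below is illustrative only: <base image URL> stands in for the full line copied from the file of security-validated base images, and the installed package and local tag (my-nextflow-tool:v1) are hypothetical examples.

cat > Dockerfile <<'EOF'
# Start from a security-validated Gen3 base image (placeholder below)
FROM <base image URL>

# Layer your own tools on top; adjust to your workflow and to the
# package manager your chosen base image provides
RUN pip install --no-cache-dir pandas
EOF

# Build and tag the customized image locally
docker build -t my-nextflow-tool:v1 .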

@@ -1220,7 +1220,8 @@

Test pulling the Docker image

Next, open your terminal. Run docker pull <image URL>, where the image URL is the full line as displayed in the file of security-validated base images. If it's working, you will see output indicating that the image layers are being pulled (see below). When the pull completes successfully, there will be a line that says Status: Downloaded <image> (see yellow highlight below). If you see this, you know that all the steps necessary to pull your image work. If you don't see this, reach out to us on Slack.

Test docker pull command in terminal
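As a minimal sketch of the test, with <image URL> standing in for the full line copied from the file of security-validated base images:

docker pull <image URL>

# On success, the output ends with a status line similar to:
#   Status: Downloaded newer image for <image URL>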

Test using Docker Scout to evaluate image vulnerabilities

-

At the end of your test fetch, Docker offers a suggestion to use Docker Scout to examine your image for vulnerabilities (see red box above). We have already evaluated the security compliance for our image, so it's not necessary here. However, since you will want to use Docker Scout to evaluate your custom build later, now is a convenient time to test this tool and make sure you are fully set up to run Docker Scout.

+

At the end of your test fetch, Docker offers a suggestion to use Docker Scout to examine your image for vulnerabilities (see red box above). We have already evaluated the security compliance of our image, so running Scout on it is not strictly necessary. However, since you will want to use Docker Scout to evaluate your custom build later, now is a convenient time to test this tool and make sure you are fully set up to run Docker Scout.

+

Note: If you don't seem to have access to Docker Scout, check whether you're using the latest Docker version.

Run Docker Scout

To run Docker Scout, you must:
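Once you are set up, a quick smoke test looks like the following sketch; <image URL> is the base image you just pulled, quickview prints a vulnerability summary, and cves lists the individual findings:

docker scout quickview <image URL>
docker scout cves <image URL>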

diff --git a/nextflow-getting-started/index.html b/nextflow-getting-started/index.html
index d850271..978da3c 100644
--- a/nextflow-getting-started/index.html
+++ b/nextflow-getting-started/index.html
@@ -866,6 +866,13 @@ Automated shutdown for idle workspaces
• GPU vs CPU Nextflow workspace images
@@ -1180,6 +1187,13 @@ Automated shutdown for idle workspaces
• GPU vs CPU Nextflow workspace images

@@ -1262,6 +1276,8 @@

Store all data in the persistent directory (/pd)

    Screenshot of /pd and data folders

    Automated shutdown for idle workspaces

    Workspaces will automatically be shut down (and all workflows terminated) after 90 minutes of idle time.

    +

    GPU vs CPU Nextflow workspace images

    +

As you can see in the screenshot above, there are 2 Nextflow workspace images: a CPU image and a GPU image. If your workflow requires a GPU (e.g., for deep learning or other AI/ML models), please use the GPU image; otherwise, use the CPU image. You can read more about CPU and GPU options in Gen3 Nextflow here.

    Continue to Overview of Containers in Gen3

diff --git a/nextflow-overview-containers/index.html b/nextflow-overview-containers/index.html
index b44d9e5..428f463 100644
--- a/nextflow-overview-containers/index.html
+++ b/nextflow-overview-containers/index.html
@@ -945,15 +945,15 @@

    Nextflow logo

    Overview: Developing and Deploying Containers in Gen3

    Overview of steps in developing a container and making it available for use in workflows

    -

    Locally build and test container: -Gen3 provides several FedRAMP security-compliant base images that users can pull and customize.

    -

    Request credentials and push container to Gen3 staging: -Users can email Gen3 to request short-term credentials that permit them to authenticate Docker in their terminal to upload the local Docker image to a Gen3 staging repo for security review.

    -

    Container is security-scanned; Gen3 sends approved container URI: -Gen3 completes the security scan within minutes. If it is compliant, the image is moved to an ECR repo ("approved") from where the container can be run, and Gen3 staff will send a container URI to the user.

    +

    Locally build and test container:

    +

    Gen3 provides several FedRAMP security-compliant base images that users can pull and customize.

    +

    Request credentials and push container to Gen3 staging:

    +

    Users can email Gen3 to request short-term credentials that permit them to authenticate Docker in their terminal to upload the local Docker image to a Gen3 staging repo for security review.

    +

    Container is security-scanned; Gen3 sends approved container URI:

    +

Gen3 completes the security scan of the container. Typically, the scanning completes within a couple of hours; however, it can take longer for larger images with more layers. If the image is security-compliant, it is moved to an ECR repo ("approved") from where the container can be run, and Gen3 staff will send a container URI to the user for use in Nextflow workflows.

    If there are problems that make the image non-compliant with security requirements, a report of the vulnerabilities is provided to the user for remediation and resubmission. Users are responsible for resolving image vulnerabilities and resubmitting for scanning.

    -

    Run workflow using approved container URI: -In the BRH workspace, use a Nextflow Jupyter notebook to run Nextflow workflows in the approved container using the approved container URI. Some example notebooks can be found here, and specific examples that use an approved image URI can be found here and here

    +

    Run workflow using approved container URI:

    +

In the BRH workspace, use a Nextflow Jupyter notebook to run Nextflow workflows in the approved container using the approved container URI. Some example notebooks can be found here, and specific examples that use an approved image URI can be found here and here.


    Continue to Create Dockerfile

diff --git a/nextflow-upload-docker/index.html b/nextflow-upload-docker/index.html
index 4d7340e..aeda08b 100644
--- a/nextflow-upload-docker/index.html
+++ b/nextflow-upload-docker/index.html
@@ -1280,9 +1280,10 @@

Push the Docker image to the ECR repository
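As a sketch of the authentication and push flow: the temporary credentials and the exact staging repo URI come from User Services, so <staging repo URI> and <image-tag> below are placeholders, and the local tag my-nextflow-tool:v1 is a hypothetical example.

# Authenticate Docker against the Gen3 ECR registry using the temporary credentials
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 143731057154.dkr.ecr.us-east-1.amazonaws.com

# Tag the locally built image with the staging repo URI, then push it for scanning
docker tag my-nextflow-tool:v1 <staging repo URI>:<image-tag>
docker push <staging repo URI>:<image-tag>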

    Completion

Once the push completes, your Docker image will be available in the ECR repository (although you will not be able to see it). It will be scanned, and if it passes the security scanning, CTDS will move it to the nextflow-approved repo. When it's available in nextflow-approved, User Services will share a Docker URI that looks something like this:
143731057154.dkr.ecr.us-east-1.amazonaws.com/nextflow-approved/<your username>:<image-tag>
    -You can then use this new URI to run Nextflow workflows with your container in the BRH workspace.

+You can then use this new URI to run Nextflow workflows with your container in the BRH workspace. (Note that you need to copy the whole URI into the container field of the Nextflow notebook, as described in the next section.)

    How to use an approved Docker URI

    -

    Once you have your Docker URI, you are ready to run your Nextflow workflow! You can take the Docker URI and make it the value for the "container" field(s) in your Nextflow notebook. For example, in the torch_cuda_batch Nextflow notebook, you would go to the nextflow.config section and replace the placeholder value for container with the approved Docker URI.

    +

    Once you have your Docker URI, you are ready to run your Nextflow workflow! You can take the Docker URI (copy the entire line) and make it the value for the "container" field(s) in your Nextflow notebook. For example, in the torch_cuda_batch Nextflow notebook, you would go to the nextflow.config section and replace the placeholder value for container with the approved Docker URI.

    +

    Please note that you will need to replace all placeholder values in the nextflow.config with values specific to your workspace. Please see the section "Get and replace placeholder values from the Nextflow config" on the Tutorials page for more information.
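As a sketch, the finished container field holds the full approved URI; and if your config lives in a file named nextflow.config (an assumption -- in the notebook it may be a cell), a quick grep can confirm that no angle-bracket placeholders remain:

# What the finished line should look like (username and tag filled in):
#   container = '143731057154.dkr.ecr.us-east-1.amazonaws.com/nextflow-approved/<your username>:<image-tag>'

# Check for any remaining <placeholder> values in the config file
grep -n '<[^>]*>' nextflow.config && echo 'placeholders remain' || echo 'all placeholders replaced'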

    Screenshot of nextflow.config, showing where you put the Docker URI

    Support

If you encounter any issues or require assistance, please reach out to the User Services team that provided you with the temporary credentials, email brhsupport@datacommons.io, or reach out on Slack. (Slack will result in the quickest reply.)

diff --git a/search/search_index.json b/search/search_index.json
index 68e40df..07a0044 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Biomedical Research Hub Documentation","text":""},{"location":"#biomedical-research-hub-documentation","title":"Biomedical Research Hub Documentation","text":"

    The Biomedical Research Hub (BRH) is a cloud-based and multifunctional web interface that provides a secure environment for discovery and analysis of scientific results and data. It is designed to serve users with a variety of objectives, backgrounds, and specialties.

    The BRH represents a dynamic Data Ecosystem that aggregates and hosts metadata from multiple resources to make data discovery and access easy for users.

    The platform provides a way to search and query over study metadata and diverse data types, generated by different projects and organizations, and stored across multiple secure repositories.

    The BRH also offers a secure and cost-effective cloud-computing environment for data analysis, empowering collaborative research and development of new analytical tools. New workflows and results of analyses can be shared with the community.

    The BRH is powered by the open-source software \u201cGen3\u201d.

    Gen3 was created by and is actively developed at the University of Chicago\u2019s Center for Translational Data Science (CTDS) with the aim of creating interoperable cloud-based data resources for the scientific research community.

    "},{"location":"01-home/","title":"Home","text":""},{"location":"01-home/#biomedical-research-hub-documentation","title":"Biomedical Research Hub Documentation","text":"

    The Biomedical Research Hub (BRH) is a cloud-based and multifunctional web interface that provides a secure environment for discovery and analysis of scientific results and data. It is designed to serve users with a variety of objectives, backgrounds, and specialties.

    The BRH represents a dynamic Data Ecosystem that aggregates and hosts metadata from multiple resources to make data discovery and access easy for users.

    The platform provides a way to search and query over study metadata and diverse data types, generated by different projects and organizations, and stored across multiple secure repositories.

    The BRH also offers a secure and cost-effective cloud-computing environment for data analysis, empowering collaborative research and development of new analytical tools. New workflows and results of analyses can be shared with the community.

The BRH is powered by the open-source software \u201cGen3\u201d.

    Gen3 was created by and is actively developed at the University of Chicago\u2019s Center for Translational Data Science (CTDS) with the aim of creating interoperable cloud-based data resources for the scientific research community.

    "},{"location":"02-types_of_shared_data/","title":"FAIR Data","text":""},{"location":"02-types_of_shared_data/#types-of-shared-data","title":"Types of Shared Data","text":"

    The BRH provides secure access to study metadata from multiple resources (Data Commons) and will be the driving engine for new discovery. The types of data represented are diverse and include scientific research across multiple disciplines.

    The BRH aims to make data more accessible by following the \"FAIR\" principles:

Findable: Researchers are provided an intuitive interface to search over metadata for all studies and related datasets. Each study and dataset will be assigned a unique, persistent identifier.

Accessible: Authenticated users can request and receive access to controlled-access data from data providers. Metadata can be accessed via an open API.

Interoperable: Data can be easily exported to various workspaces for analysis using a variety of software tools.

Reusable: Data can be easily reused to facilitate reproducibility of results, development and sharing of new tools, and collaboration between investigators.

    "},{"location":"03-data_and_repos/","title":"Data Management and Repositories","text":""},{"location":"03-data_and_repos/#data-and-repositories","title":"Data and Repositories","text":"

The BRH securely exposes study metadata and data files stored on multiple FAIR repositories and Data Commons, i.e., data libraries or archives, providing an easy way to connect different repositories in one single location.

FAIR data repositories are traditionally part of a larger institution/working group established for research, data archiving, and serving the data users of that organization.

    As of March 2023, the list of currently shared resources/Data Commons on BRH includes:

    • BioData Catalyst
    • CRDC Cancer Imaging Data Commons
    • CRDC Genomic Data Commons
    • CRDC Integrated Canine Data Commons
    • CRDC Proteomic Data Commons
    • IBD Commons
    • JCOIN
    • MIDRC
    • NIAID ClinicalData
    "},{"location":"04-BRH_overview/","title":"Quickstart - BRH Overview","text":""},{"location":"04-BRH_overview/#brh-overview","title":"BRH Overview","text":"

    You can get started with the Biomedical Research Hub by exploring the features described below.

    "},{"location":"04-BRH_overview/#register-for-workspaces","title":"Register for Workspaces","text":"

    Get a temporary free trial to BRH workspaces and simultaneously register for extended workspace access with NIH STRIDES

    "},{"location":"04-BRH_overview/#login-page","title":"Login Page","text":"

    Log in here to unlock controlled-access data and workspace access with your credentials

    "},{"location":"04-BRH_overview/#check-study-access-and-authorize-external-data-resources","title":"Check Study Access and Authorize External Data Resources","text":"

    Check study access and connect your account to other resources to access all studies for which you are authorized.

    "},{"location":"04-BRH_overview/#discovery-page","title":"Discovery Page","text":"

    Discover datasets across multiple resources and export selected data files to the analysis workspace.

    "},{"location":"04-BRH_overview/#workspaces-page","title":"Workspaces Page","text":"

    Access data across multiple resources and perform analyses in a secure, cloud-based environment

    "},{"location":"04-BRH_overview/#profile-page","title":"Profile Page","text":"

    Review data access permissions and generate API credentials files used for programmatic access.

    "},{"location":"05-workspace_registration/","title":"Workspace Page (Registration)","text":""},{"location":"05-workspace_registration/#register-for-brh-workspace","title":"Register for BRH Workspace","text":"

    To start exploring BRH Workspace right away, users can apply for a Temporary Trial Access. Extended access to BRH Workspace is granted using a persistent pay model workspace account (e.g., STRIDES or Direct Pay), which can be requested after trial access is provisioned. Please see below for more details.

    "},{"location":"05-workspace_registration/#requesting-temporary-trial-access-to-brh-workspace","title":"Requesting Temporary Trial Access to BRH Workspace","text":"

For new users without workspace access, please follow these steps:

1. Log in to BRH
2. Click on the Workspace tab. That opens the Workspace Access Request form.
3. Fill in the details and submit the form shown below.

    4. The form should be completed only once. Following submission, users will see a success message and a link back to the Discovery page.

    5. Users will receive an email notifying them that the request has been received.

    6. Users will receive another email notifying them that the temporary trial access request has been approved. They should then be able to access workspaces on BRH. Please note that the timeline for this approval can be a few business days.

    "},{"location":"05-workspace_registration/#requesting-extended-access-to-brh-workspace-using-a-persistent-pay-model-eg-strides-direct-pay","title":"Requesting Extended Access to BRH Workspace using a Persistent Pay Model (e.g., STRIDES, Direct Pay)","text":"

    Please Note: The process for granting access for a workspace account can take 2-4 weeks for NIH STRIDES, and a month for Direct Pay.

    Find instructions for funding workspace accounts with any of the persistent pay models on the Workspace Accounts page.

    "},{"location":"06-loginoverview/","title":"Login Page","text":""},{"location":"06-loginoverview/#login-access-overview","title":"Login Access Overview","text":"

    All users are able to browse the study metadata on the Discovery Page without logging in.

    Users will need to log in and obtain authorization (access) in order to:

    • Access studies with controlled data
    • Perform analyses in Workspaces
    • Download data files and file manifests
    • Run interactive tutorial notebooks in the Workspaces

    Start by visiting the login page (https://brh.data-commons.org/login).

• Login from Google: You may log in using any Google account credentials, or a G Suite-enabled institutional email. This option may or may not be available depending on the institution or organization the user is associated with.
• Login via InCommon --> NIH eRA: When selecting the NIH/eRA (electronic Research Administration) login using InCommon, you will need access permissions through an eRA Commons account.

    After successfully logging in, your username will appear in the upper right-hand corner of the page.

    "},{"location":"07-how_to_check_request_access/","title":"Check Study Access and Authorize Resources","text":""},{"location":"07-how_to_check_request_access/#how-to-check-and-request-access","title":"How To Check and Request Access","text":"

Users can find out which projects they have access to by navigating to the Discovery Page and using the column filters at the top of the table.

    "},{"location":"07-how_to_check_request_access/#access-to-individual-studies","title":"Access to individual Studies","text":"

    You can check access by clicking on a study in the Discovery Page, as shown below:

    The Study Page will display access permissions in the top right corner. Click the \u201cPermalink\u201d button in the upper right to copy the link to the clipboard.

    If you have access, a green box will show \u201cYou have access to this study\u201d.

    Access is displayed as a green box on top of each Study Page.

Note: If you have access but cannot select the study to export to the workspace, it is because the manifest is not yet available. Please use the API in these cases.

    "},{"location":"07-how_to_check_request_access/#authorize-to-gain-access-to-fair-enabled-repositoriesresources","title":"Authorize to Gain Access to FAIR-enabled Repositories/Resources","text":"

    BRH securely provides access to data stored on multiple FAIR repositories, resources, and Data Commons.

    Users must authorize these resources on their account in order to:

    1. run Jupyter Notebooks that utilize data stored in various FAIR repositories.
    2. export data that is stored in FAIR repositories from the Discovery Page to the Workspaces.
    3. download data that is stored in FAIR repositories from the Discovery Page.

In order to authorize access to these repositories and data commons, navigate to the Profile Page. Authorize each relevant commons by clicking its buttons (e.g., the Refresh or Authenticate buttons in the image shown below).

    Authorization needs to be renewed after 30 days, as indicated after \"Status: expires in [..] days\".

    "},{"location":"08-discovery_page/","title":"Discovery Page","text":""},{"location":"08-discovery_page/#discovery-page","title":"Discovery Page","text":"

    The Discovery Page provides users a venue to search and find studies and datasets displayed on the Biomedical Research Hub. Users can browse through the publicly accessible study-level metadata without requiring authorization.

    Use text-based search, faceted search, and tags to rapidly and efficiently find relevant studies, discover new datasets across multiple resources, and easily export selected data files to the analysis workspace. Browse through datasets and study-level metadata and find studies using tags, advanced search, or the free text search field.

    "},{"location":"08-discovery_page/#search-features","title":"Search Features","text":"

    On the Discovery page, several features help you navigate and refine your search.

    1. Total number of studies: shows the number of studies the BRH is currently displaying.
    2. Total number of subjects: shows the number of subjects the BRH is currently displaying.
    3. Free Text Search: Use keywords or tags in the free-text-based search bar to find studies. The free-text search bar can be used to search for study name, ID number, Data Commons, or any keyword that is mentioned in the metadata of the study.
    4. Data Resources/Data Commons Tags: view these by selecting \"Study Characteristics\". Click on a tag to filter by a Data Resource/Data Commons. Selecting multiple tags works in an \"OR\" logic (e.g., \"find AnVIL OR BioData Catalyst studies\").
    5. Export Options: Login first to leverage the export options. Select one or multiple studies and download a file manifest or export the data files to a secure cloud environment \"Workspace\" to start your custom data analysis in Python or R.
    6. Data Availability: Filter on available, pending, and not-yet-available datasets.
    7. Studies: This table feature presents all current studies on BRH. Click on any study to show useful information about the study (metadata).
    "},{"location":"08-discovery_page/#find-available-study-level-metadata","title":"Find available Study-level Metadata","text":"

    Clicking on any study will display the available study-level and dataset metadata.

    "},{"location":"08-discovery_page/#find-accessible-datasets","title":"Find accessible Datasets","text":"

Users can select and filter studies from multiple resources and conduct analyses on the selected datasets in a workspace. Users can search but not interact with data they do not have access to. Selecting the data access button in the top right corner of the study page displays the user's access. The Discovery Page will automatically update the list of studies that are accessible.

    "},{"location":"09-workspace_page/","title":"Getting Started in Workspace","text":""},{"location":"09-workspace_page/#workspaces","title":"Workspaces","text":"

To use the workspaces, users must register for a workspace account, as described on the Workspace Registration page.

BRH workspaces are secure data analysis environments in the cloud that can access data from one or more data resources. By default, Workspaces include Jupyter notebooks, Python, and R, but can be configured to host virtually any application, including analysis workflows, data processing pipelines, or data visualization apps.

    New to Jupyter? Learn more about the popular tool for data scientists on Jupyter.org (disclaimer: CTDS is not responsible for the content).

    "},{"location":"09-workspace_page/#guideline-to-get-started-in-workspaces","title":"Guideline to get started in Workspaces","text":"

    Once users have access to workspaces, use this guide below to get started with analysis work in workspaces.

    1. Users need to log in via https://brh.data-commons.org/login to access workspaces.

    2. After navigating to https://brh.data-commons.org/workspace, users will discover a list of pre-configured virtual machine (VM) images, as shown below.

      • (Generic) Jupyter Notebook with R kernel: Choose this VM if you are familiar with setting up Python- or R-based Notebooks, or if you just exported one or multiple studies from the Discovery Page and want to start your custom analysis.
      • Tutorial Notebooks: Explore our Jupyter Notebook tutorials written in Python or R, which pull data from various sources of the Biomedical Research Hub to leverage statistical programs and data analysis tools. These are excellent resources for code to pull and analyze data from BRH, and examples that illustrate the variety of data and analyses available through BRH.
    3. Click \u201cLaunch\u201d on any of the workspace options to spin up a copy of that VM. The status of launching the workspace is displayed after clicking on \u201cLaunch\u201d. Note: Launching the VM may take several minutes.

    4. After launching, the home folders are displayed. One of these folders is the user's persistent drive (\"/pd\").

Select the /pd folder. New files or licenses should be saved in the /pd directory if users need to access them after restarting the workspaces. Only files saved in the /pd directory will remain available after termination of a workspace session.


      • Attention: Any personal files in the folder \u201cdata\u201d will be lost. Personal files in the directory /pd will persist.
      • Do not save files in the \"data\" or \u201cdata/brh.data-commons.org\u201d folders.
      • The folder \u201cbrh.data-commons.org\u201d in the \u201cdata\u201d folder will host the data files you have exported from the Discovery Page. Move these files to the /pd directory if you do not want to have to export them again.
      • /pd has a capacity limit of 10GB.
6. Start a new notebook under \u201cNotebook\u201d in the Launcher tab. Click the tiles in the launcher and choose between Python 3 and R Studio as the base programmatic language. Note: You can open and run multiple notebooks in your workspace. However, the generic, tutorial, and Nextflow workspace images are currently separate Docker images, so there is no functionality to combine them or run Nextflow in the tutorial or generic images. This may become available in the future, after further testing and development activities.

    7. Experiment away! Code blocks are entered in cells, which can be executed individually or all at once. Code documentation and comments can also be entered in cells, and the cell type can be set to support Markdown.

      Results, including plots, tables, and graphics, can be generated in the workspace and downloaded as files.

8. Do not forget to terminate your workspace once your work is finished. Unterminated workspaces continue to accrue computational costs. Note that workspaces automatically shut down after 90 minutes of idle time.

    Further reading: read more about how to download data files into the Workspaces here.

    "},{"location":"09-workspace_page/#upload-save-and-download-filesnotebooks","title":"Upload, save, and download Files/Notebooks","text":"

    Users can upload data files or Notebooks from the local machine to the home directory by clicking on \u201cUpload\u201d in the top left corner. Access the uploaded content in the Notebook (see below).

    Then run in the cells, for example:

import os
import pandas as pd

# Change into the directory containing the uploaded file (here assumed to be /data)
os.chdir('/data')

# Read the uploaded tab-separated file; the path is relative to /data
demo_df = pd.read_csv('this_is_a_demo.txt', sep='\\t')

# Preview the first rows
demo_df.head()

    Users can save the notebook by clicking \"File\" - \"Save as\", as shown below.

    Users can download notebooks by clicking \"File\" - \"Download\", as shown below. Download the notebook, for example, as \".ipynb\".

    "},{"location":"09-workspace_page/#environments-languages-and-tools","title":"Environments, Languages, and Tools","text":"

    The following environments are available in the workspaces:

    • Jupyter Lab

    The following programmatic languages are available in Jupyter Notebooks:

    • R
    • Python 3

    The following tools are available in Jupyter Notebooks:

    • GitHub (read GitHub documentation)
    "},{"location":"09-workspace_page/#python-3-and-r-in-jupyter","title":"Python 3 and R in Jupyter","text":"

    Both Python 3 and R are available in Jupyter Notebooks.

Users can expect to be able to use typical Python or R packages, such as those from PyPI or CRAN. For Python and R, users can start a new notebook with a tile under \"Notebook\", as shown below.

    "},{"location":"09-workspace_page/#automatic-workspace-shutdown","title":"Automatic Workspace Shutdown","text":"

    Warning: When a BRH Workspace reaches the STRIDES Credits limit for STRIDES Credits Workspaces, or reaches the Hard Limit for STRIDES Grant Workspaces, the Workspace will be automatically terminated. Please be sure to save any work before reaching the STRIDES Credit or Hard Limit.

    Warning: Workspaces will also automatically shut down after 90 minutes of idle time. A pop-up window will remind users to navigate back to the workspaces page in order to save the data.

    "},{"location":"10-profile_page/","title":"Profile Page","text":""},{"location":"10-profile_page/#profile-page","title":"Profile Page","text":"

On the Profile Page, users will find information regarding their access to projects, access to Gen3-specific tools (e.g., access to the Workspace), and the function to create API keys for credential downloads. API keys are necessary for the download of files using the Gen3 Python SDK.

Users can view their study access on the Profile Page, where API keys can also be viewed, created, and downloaded.
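As an illustrative sketch of what the downloaded credentials file is for: a Gen3 API key can be exchanged for a short-lived access token. This assumes the standard Gen3 token endpoint and that the key file from the Profile Page was saved as credentials.json:

# Exchange the API key in credentials.json for an access token
curl -s -X POST https://brh.data-commons.org/user/credentials/cdis/access_token \
  -H 'Content-Type: application/json' \
  -d @credentials.json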

    "},{"location":"11-downloading_data_files/","title":"Downloading Data Files","text":""},{"location":"11-downloading_data_files/#downloading-data-files","title":"Downloading Data Files","text":"

Users can download data files for work in the provided Workspace. Utilizing workspaces leverages the CTDS-owned Python software development kit (SDK) as well as a cloud-based computing platform.

Note: accessing data files requires linked access to all FAIR-enabled repositories, as described here.

    "},{"location":"11-downloading_data_files/#download-data-files-into-a-workspace-with-the-python-sdk","title":"Download Data Files into a Workspace with the Python SDK","text":"

    Users can load data files from a manifest created on the Discovery Page directly into a Workspace. Below are the steps to do so.

    1. Navigate to the Discovery Page. Link your accounts to FAIR repositories as described here.

    2. Find the study or studies of interest by using the search features or the list of accessible studies.

    3. Select the clickable boxes next to the studies. Click on \"Open in Workspace\", which will initiate the Workspace Launcher.

4. The Workspace will be prepared and the selected data will be made available via a manifest placed in a time/date-stamped directory in the following path: pd/data/brh.data-commons.org/exported-manifest-(time/date stamp). Please do not navigate away from this page until the download is complete. The created directory may take several minutes to load.

    5. Once loaded, users can navigate into the directory and access either the manifest or an automatically generated notebook (e.g., data.ipynb) with instructions to download the data. Users should note that the gen3-sdk is utilized in this notebook and directory to download data.

    "},{"location":"12-contact/","title":"Contact","text":""},{"location":"12-contact/#contact-brh-support","title":"Contact BRH Support","text":"

    Need help? Please contact our help desk - brhsupport@datacommons.io

    "},{"location":"13-workspace_accounts/","title":"Workspace Accounts - Trial and Persistent Pay Models (STRIDES, Direct Pay)","text":""},{"location":"13-workspace_accounts/#workspace-accounts-trial-and-persistent-pay-models-strides-direct-pay","title":"Workspace Accounts - Trial and Persistent Pay Models (STRIDES, Direct Pay)","text":""},{"location":"13-workspace_accounts/#four-different-pay-models-for-brh-workspace-accounts","title":"Four Different Pay Models for BRH Workspace Accounts","text":"

    We have 4 different pay models for workspace accounts:

    • Trial Access (free for user, limited to 2 months)
    • OCC Direct Pay (persistent pay model paid by credit card through OCC Payment Portal)
    • STRIDES Grant/Award Funded (persistent pay model paid by organizations with NIH grant funds)
    • STRIDES Credits (persistent pay model paid directly by NIH)

    Instructions for requesting funding for each pay model are provided below.

    Please Note:

    • The process for granting access for a workspace account can take 2-4 weeks for NIH STRIDES, and a month for Direct Pay, although it may also be faster.
• The account from each different pay model will have its own workspace storage directory (/pd; read about /pd here); data is not shared between accounts with different funding types. However, you can import and export data among accounts as long as they are active and have funding.
    "},{"location":"13-workspace_accounts/#request-trial-access","title":"Request Trial Access","text":"

Trial Access is granted for 2 months when you request access to the BRH Workspace page. The instructions for requesting workspace access are on the Workspace Registration page. Note that your trial access will become inactive as soon as you have a workspace account funded through a persistent pay model.

    "},{"location":"13-workspace_accounts/#requesting-funding-for-workspace-accounts-through-any-persistent-pay-model","title":"Requesting Funding for Workspace Accounts Through Any Persistent Pay Model","text":"
1. Once they have access to the Workspace page, users can request a workspace account by first logging in to BRH (#1), going to the Workspace page (#2), then clicking the Workspace Account Manager link (#3). Click \"Log in\" and \"Yes, I authorize\" to open the portal in a new tab.

      Some STRIDES users may receive an invitation via email to register for an NIH STRIDES workspace account. These users can also click the link in the invitation email to get to the BRH Workspace Account Manager.

    2. In the BRH Workspace Account Manager, users can see their persistent workspace accounts and any available credits or funds in the accounts. If you only have trial access, you will not see any active accounts.

To request a workspace account with a persistent pay model, click the \"Request New Workspace\" button.

    3. Choose from any of the 3 persistent pay model funding options: a) STRIDES Grant/Award Funded; b) STRIDES Credits; or c) OCC Direct Pay to request a funded workspace account.

    "},{"location":"13-workspace_accounts/#occ-direct-pay-funded-workspace-account","title":"OCC Direct Pay Funded Workspace Account","text":"

    The OCC Direct Pay form can be selected if a user wants to pay with a personal or organizational credit card. OCC Direct Pay only requires a valid credit card.

    Funding a workspace account with OCC Direct Pay has 2 major parts:

    1. Request BillingID
    2. Use BillingID to provision Direct Pay funds for the Workspace Account
    "},{"location":"13-workspace_accounts/#request-billingid","title":"Request BillingID","text":"

    Note: It can take up to a month to receive a BillingID if you act promptly to complete each step.

    1. Go to https://payments.occ-data.org or click on the Payment Portal link (red arrow below) on the OCC Direct Pay tab for the workspace account request form.

    2. Create an account for the OCC Payment Portal: Under Create Account, enter your email address and click \u201cRequest Token\u201d. Note: This should be the email address you use to log into BRH; Direct Pay is not currently compatible with ORCID login. If you already logged in with ORCID to request a workspace account, please log out and authenticate with either the InCommon or Google option. Please monitor this address, as relevant alerts will be sent here.

3. You will receive an email with a 6-digit token within a couple of minutes of your request. It may go to spam, so watch the Spam folder, as well.
    4. Copy your token from your email. Paste it into the Enter Token field on the OCC Payment Portal. Click Sign In. (Note: You will be asked to enter a token at each log-in.)
    5. Successful sign-in will open a Profile page for your account on the OCC Payment Portal. When you first create your account on the payment portal, you will not have any access requests.

    6. Click the \u201crequest access\u201d button. The form shown below will open. Complete the form and click Submit.

For Role, indicate your role within your organization/institution. If you don't have an institutional affiliation, you can put \"independent data analyst\".

    7. Once the form is submitted, a message will appear indicating successful submission. You will also receive an email (again, check spam).

    8. If you return to the Profile page in the OCC Payment Portal now, you\u2019ll see there is an active request in the table at the bottom. Click \u201cCheck Status\u201d to view progress on the steps toward final approval and provisioning of your request. You can view what happens at each stage of processing here: https://payments.occ-data.org/processing-stages/.
    9. When you click Check Status, you can see the progress of your request. At first, you will see that they are processing your request (indicated by an orange color). Once OCC finishes processing your access request, you will receive 2 emails, and the progress tracker at the bottom will show that Submit Access Request is completed (green). Complete E-Doc is now colored orange. The first email indicates that your access request status has progressed, and the second email has a link to an electronic document.

      Important: Review the Agreement carefully to understand the terms. PLEASE READ THIS DOCUMENT VERY CAREFULLY BEFORE SIGNING! This document presents the terms governing how your Direct Pay funds will be allocated, among other things. Be sure you understand all the terms before you sign and submit. If you have any questions or concerns about the terms and conditions, please email billing@occ-data.org before you sign.

      Once you submit the signed document, it could take up to 5 days to finish processing receipt of the signed document and update your progress tracker. You will receive an email when processing is complete.

However, you will quickly receive an email confirming that the document has been signed and providing a link to download the signed document for your records. If you do not receive that within 5 minutes (be sure to check your spam folder), please return to the document and verify that you fully signed and submitted the document. Please save the document so you can reference it as needed.

    10. When your request has been fully approved, the \u201cReceived Approval\u201d step will be green, you\u2019ll receive an email, and your BillingID field will have been populated on the Profile page of the OCC Payment Portal.

    You may now use your BillingID to provision a Direct Pay workspace account in BRH.

    "},{"location":"13-workspace_accounts/#use-billingid-to-provision-direct-pay-funds-for-the-workspace-account","title":"Use BillingID to provision Direct Pay funds for the Workspace Account","text":"

Note 1: Before you request a workspace account through any persistent pay model (e.g., Direct Pay, STRIDES), be sure to back up all data in your /pd for your workspace. Once your persistent pay model is funded, you will no longer have access to the /pd used during trial access. (What\u2019s a /pd?)

    Note 2: It can take up to 12 business days to provision funds from a BillingID.

    1. Copy your BillingID from your User Profile page in the OCC Payment Portal.
    2. Return to the Workspace Account Manager, and click Request New Workspace to open the Workspace Account Request Form.
    3. Click the OCC Direct Pay tab.
4. Paste your BillingID in the field, and enter the first 3 characters of the email address associated with your BillingID (for example, if your email were john.smith@gmail.com, you would enter \u201cjoh\u201d).
    5. Click Confirm BillingID. Once your BillingID is confirmed, the bottom part of the form will open to allow you to enter the details for provisioning your account.
    6. Be sure to check the box that says that you agree to be invoiced. The amount of the invoice is taken from the value you entered when you made your access request on the Payment Portal.

      • Enter a title for your project and a brief summary. This is to be used to help you keep track of your requests in case you have multiple accounts for different projects.
      • Identify whether your workspace use is personal or organizational.
• Indicate whether you have a credit card you are allowed to use to pay for provisioning the workspace account. If your workspace is personal and you have any credit card, the answer will be yes. If your workspace is organizational, make sure you are not using a departmental card or similar without permission.
      • Indicate what role you have as a researcher on this project.

    7. Once you submit this form, you will receive an email with the invoice. (It can take up to 5 business days to be sent.) There will be a secure link in the invoice to submit your credit card information and pay the invoice. When you pay the invoice, OCC will apply the funds, create an AWS account for this project\u2019s workspace, and send that information to BRH to provision your account. This can take up to 7 business days after you have signed the form. You will receive an email when your account is set up and ready to be used in your workspace.

      When you submit this form, you will also see a new entry in the OCC Direct Pay Accounts table in the Workspace Account Manager. The request status for your request will be Pending until the invoice is paid and the account is finalized.

    8. Once your Direct Pay request is funded, your workspace will be shown as Active on the Request Status column in the Workspace Account Manager.

    "},{"location":"13-workspace_accounts/#strides-grantaward-funded-workspace-account","title":"STRIDES Grant/Award Funded Workspace Account","text":"

    The STRIDES Grant/Award Funded form can be selected if researchers have received NIH funding (e.g. a grant, contract, cooperative agreement, or other transaction agreement) and intend to use these funds for the BRH workspace account. With this option, the researchers' organization will be responsible for payment.

    Submit the request form. Note that the process of granting access for a workspace account can take 2-4 weeks and users will be notified by email. Following approval, users will see the provisioned workspace account in the BRH Workspace Accounts Manager.

    "},{"location":"13-workspace_accounts/#strides-credits-funded-workspace-account","title":"STRIDES Credits Funded Workspace Account","text":"

Select the STRIDES Credits form to request credits from the NIH STRIDES Initiative for the BRH Workspace account. With this option, once the request is approved, a new account with a set spending limit will be provisioned by NIH directly for usage.

    Submit the request form. Note that the process of granting access for a workspace account can take 2-4 weeks and users will be notified by email. Following approval, users will see the provisioned workspace account in the BRH Workspace Accounts Manager.

    "},{"location":"13-workspace_accounts/#what-is-the-nih-strides-initiative","title":"What is the NIH STRIDES Initiative?","text":"

    The NIH STRIDES initiative (NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability) can provide funding for BRH Workspace Accounts. The NIH STRIDES Initiative enables researchers with NIH grants to cost-effectively leverage the use of cloud environments by partnering with commercial providers, such as Amazon Web Services.

    By leveraging the STRIDES Initiative, NIH and NIH-funded institutions can begin to create a robust, interconnected ecosystem that breaks down silos related to generating, analyzing, and sharing research data.

    NIH-funded researchers with an active NIH award may take advantage of the STRIDES Initiative for their NIH-funded research projects. Eligible researchers include NIH intramural researchers and awardees of NIH contracts, other transaction agreements, grants, cooperative agreements, and other agreements. More information on NIH STRIDES options and how to gain access can be found here.

    "},{"location":"14-lauch_with_persistent_paymodel/","title":"Launch a Workspace with a Persistent Pay Model","text":""},{"location":"14-lauch_with_persistent_paymodel/#launch-a-workspace-with-a-persistent-pay-model","title":"Launch a Workspace with a Persistent Pay Model","text":"

Important note: regardless of your account type, remember to terminate your workspace session when you are done using it to avoid accruing additional compute charges (i.e., \"usage\"). You can read more about terminating your workspace on our Getting Started in Workspace page.

    You can select which workspace account pay model to use from the Workspace page.

    • Log in and click on Workspace to open the workspace page.

    • Click on Account Information to expand that tab.

    • Under \"Account\" and \"Apply for an account\", you can click the dropdown to select which workspace account you want to use to launch the VM.

      The dropdown will only show workspace accounts that are active.

      When you only have trial access, launching a new VM will always use your Trial Access workspace account.

However, once you have a workspace account funded through a persistent pay model, you will lose the trial access option, and you will only be able to launch a new VM using a funded workspace account.

      Reminder: Each workspace account will have its own /pd for storing data beyond each workspace session. You can download data from one and upload it to another. But note: once you have a funded workspace account, you will no longer have access to the /pd for your trial access workspace account. Download your data from your trial access account /pd before you receive funding in another workspace account.

    Looking for information on how to use the Workspace? Check out our Getting Started page.

    "},{"location":"15-monitor_compute_usage/","title":"Monitor Compute Usage in Persistent Pay Models","text":""},{"location":"15-monitor_compute_usage/#monitor-compute-usage-with-a-persistent-pay-model","title":"Monitor Compute Usage with a Persistent Pay Model","text":""},{"location":"15-monitor_compute_usage/#where-can-you-see-your-usage","title":"Where can you see your usage?","text":"

    You can see and monitor your compute usage in two places:

    1. Workspace Page
    2. Workspace Account Manager

    An important note is that the usage reported in either of these places generally does not immediately update as you use the workspace. AWS reports updated usage about three times per day.

    "},{"location":"15-monitor_compute_usage/#1-monitor-usage-on-the-workspace-page","title":"1. Monitor usage on the Workspace page","text":"

    You can see your total usage for any active paid accounts by logging in to the BRH platform, clicking on Workspace, then clicking on Account Information (in the upper left). You can see your total usage and your spending limit.

    If you have more than one funded workspace account, you can use the dropdown to select different active pay models and view their usage and spending limits.

    "},{"location":"15-monitor_compute_usage/#2-monitor-usage-on-the-workspace-account-manager","title":"2. Monitor usage on the Workspace Account Manager","text":"

    You can see your usage details on the Workspace Account Manager Accounts page.

    After logging in, you will be on the Accounts page. There are different tables for each of the three persistent pay models. Find the table for the persistent pay model account you want to monitor, then look for the active account line in that table.

    The table includes 4 columns relevant for monitoring usage:

    • Total Usage: This column reports on your compute usage for this account as of the most recent AWS report.
    • Compute Purchased (or STRIDES Credits): This is the total purchase amount you have made for this account. For STRIDES Credits accounts, this is the total amount of credit you were awarded.

    The soft and hard limit columns deserve their own section; see below.

    "},{"location":"15-monitor_compute_usage/#hard-and-soft-limits","title":"Hard and Soft Limits","text":"

    You can see your hard and soft limits in the tables on the Workspace Account Manager Accounts page. They each have a column:

    • Soft Limit: This is the limit at which an alert will be triggered for you that you are approaching the spending limit for the account. For STRIDES accounts, users can edit this column to set the alert threshold. For OCC Direct Pay, the soft limit is set in accordance with your Direct Pay User Agreement (signed during Direct Pay BillingID setup).

    • Hard Limit: This is the limit at which a workspace VM will be immediately terminated and the workspace account will be shut down such that no VMs can be launched with it. For STRIDES accounts, users can edit this column to set the threshold at which the account becomes inactive. For OCC Direct Pay, the hard limit is set in accordance with your Direct Pay User Agreement (signed during Direct Pay BillingID setup).

    What happens when you exceed your funding for a workspace account?

    "},{"location":"16-usage_exceeds_funding/","title":"If Compute Usage Exceeds Funding","text":""},{"location":"16-usage_exceeds_funding/#if-compute-usage-exceeds-funding","title":"If Compute Usage Exceeds Funding","text":"

    Coming Soon! We will describe:

• Data protections if your workspace is shut down
    • A terminology guide for shutdown vs terminate
    "},{"location":"17-workspace_faq/","title":"Workspace FAQ","text":""},{"location":"17-workspace_faq/#workspace-faqs","title":"Workspace FAQs","text":""},{"location":"17-workspace_faq/#workspace-questions-any-funding-source","title":"Workspace Questions (any funding source)","text":""},{"location":"17-workspace_faq/#what-happens-if-my-trial-access-or-funding-runs-out","title":"What happens if my trial access or funding runs out?","text":"

    When you have used all of your allotted time (trial access) or funding (persistent pay models), you will no longer be able to launch a workspace. You will not have access to your /pd folder. You will still be able to open the BRH Workspace page, and log in to the Workspace Account Manager. You can request new account funding from the Workspace Account Manager.

    "},{"location":"17-workspace_faq/#is-the-persistent-directory-pd-shared-among-all-the-pay-model-options-will-my-data-be-available-in-all-my-workspace-accounts","title":"Is the persistent directory (/pd) shared among all the pay model options? Will my data be available in all my workspace accounts?","text":"

    No. Currently, the /pd folder is separate for each workspace pay model account. So, the data from your /pd for your trial access account will not automatically be available in the /pd for your Direct Pay workspace account or your STRIDES workspace account.

    "},{"location":"17-workspace_faq/#how-should-i-cite-research-done-in-brh-workspace","title":"How should I cite research done in BRH workspace?","text":"

    Make sure to cite the author of the data used in your research, the repository housing the data, and the BRH platform enabling your access to the data. See details here.

    "},{"location":"17-workspace_faq/#direct-pay-workspace-payment-portal-billingid-questions","title":"Direct Pay workspace, Payment Portal, & BillingID questions","text":""},{"location":"17-workspace_faq/#why-does-my-hard-limit-not-match-the-amount-of-funding-i-purchased","title":"Why does my Hard Limit not match the amount of funding I purchased?","text":"

As described in your Direct Pay User Agreement, some amount of your purchase is set aside for account operational expenses beyond compute costs. If you have further questions about this, please contact billing@occ-data.org.

    "},{"location":"17-workspace_faq/#can-i-get-a-refund-on-any-unused-compute-funding-i-purchased","title":"Can I get a refund on any unused compute funding I purchased?","text":"

    Unfortunately, as described in your Direct Pay User Agreement, OCC cannot offer refunds for unused compute time already purchased.

    "},{"location":"17-workspace_faq/#ive-been-using-my-workspace-all-day-but-my-total-usage-number-hasnt-changed-at-all","title":"I've been using my workspace all day, but my total usage number hasn't changed at all.","text":"

Compute usage is monitored by AWS, and AWS posts updates several times a day. This means that you can possibly use the workspace for a number of hours before the reported usage is updated. Also, the update is sometimes delayed -- that is, a usage update from AWS may still omit usage from the hours immediately preceding it.

    "},{"location":"17-workspace_faq/#what-if-i-dont-pay-the-renewal-invoice-before-my-usage-exceeds-the-hard-limit","title":"What if I don't pay the renewal invoice before my usage exceeds the Hard Limit?","text":"

    If you do not pay the renewal invoice before you reach the Hard Limit, you will lose access to the Direct Pay account /pd, and you will be unable to launch any workspaces from the Direct Pay account. If you still have not renewed funding 2 months (60 days) after reaching your Hard Limit, we may delete the contents of the /pd for that account.

    "},{"location":"17-workspace_faq/#why-is-the-occ-payment-portal-a-separate-account","title":"Why is the OCC Payment Portal a separate account?","text":"

    The OCC payment portal is for our third-party AWS reseller, the Open Commons Consortium (OCC). The site is payments.occ-data.org. This is a separate site and account because we want to protect your financial information; the OCC payment portal site follows more rigorous financial best practices. Please note that neither this site nor OCC as an organization will hold any of your payment information; when you pay an invoice, the actual payment information will go through a separate secure payment processor and is not stored anywhere after the charge is processed.

    "},{"location":"17-workspace_faq/#can-i-change-my-email-address-associated-with-my-occ-payment-portal-account","title":"Can I change my email address associated with my OCC Payment Portal account?","text":"

    No. Unfortunately, you cannot change the email address associated with your BillingID or OCC Payment Portal account. You can create a new OCC Payment Portal account with another email address, but you will need to request a new BillingID for that account.

    "},{"location":"17-workspace_faq/#i-didnt-receive-my-token-for-the-payment-portal-login","title":"I didn\u2019t receive my token for the Payment Portal login.","text":"

    The token may take a couple of minutes to arrive, but not longer than that. Check your spam folder. If you don't see it there, you can request another token by clicking the Request Token button again.

    "},{"location":"17-workspace_faq/#my-token-for-the-occ-payment-portal-expired","title":"My token for the OCC Payment Portal expired","text":"

    You can request another token by clicking Request Token again.

    "},{"location":"17-workspace_faq/#how-can-i-find-the-terms-for-the-direct-pay-billing-agreement","title":"How can I find the terms for the Direct Pay billing agreement?","text":"

    When that document is available, it will be posted on the OCC Direct Pay site, and we will link to it here.

    "},{"location":"17-workspace_faq/#at-what-point-am-i-actually-paying-for-workspace-funds-what-step-does-the-actual-charge-take-place","title":"At what point am I actually paying for workspace funds - what step does the actual charge take place?","text":"

    You designate how much you want to purchase when you initially request your BillingID on the OCC Payment Portal. However, the invoice for this amount is not actually sent until after you have done both of the following:

    1. received a BillingID on the OCC Payment Portal
    2. used this BillingID to request a Direct-Pay-funded workspace account in the Workspace Account Manager

    Once you have completed both of these steps, you will be sent an invoice. The actual charge takes place when you pay this invoice and submit your credit card information.

    "},{"location":"17-workspace_faq/#what-credit-cards-can-be-used-to-pay-for-direct-pay","title":"What credit cards can be used to pay for Direct Pay?","text":"

    We accept all major credit cards, including Visa, MasterCard, Discover, and American Express.

    We also accept ACH electronic payments from a bank account, Apple Pay, PayPal, and Venmo.

    "},{"location":"17-workspace_faq/#what-are-the-different-websites-involved-in-purchasing-and-using-direct-pay-for-workspaces","title":"What are the different websites involved in purchasing and using Direct Pay for workspaces?","text":"

    BRH Platform: This is the main BRH site, with all the BRH data and the Workspace page. The site is brh.data-commons.org, and you can reach the Workspace page by clicking on the Workspace button at the top of the portal.

    BRH Workspace Account Manager: This is where you can request or view details of your Workspace accounts funded with persistent pay models. The site is brh-portal.org, and you can also reach this site from the BRH Workspace page by clicking on the Workspace Account Manager link at the top left corner, in the Account Information section. This page will be built into the BRH Platform Workspace page in a future update (i.e., it won't be a separate page).

    OCC Payment Portal: This is the payment portal site for our third-party AWS reseller, the Open Commons Consortium (OCC). Here, you can request a BillingID, specify the amount of Direct Pay funding you will want on your Direct Pay account, and track the status of your BillingID request. The site is payments.occ-data.org. Note: Neither this site nor OCC will hold any of your payment information; when you pay an invoice, the actual payment information will go through a separate secure payment processor and is not stored anywhere after the charge is processed.

    "},{"location":"17-workspace_faq/#strides-questions","title":"STRIDES Questions","text":""},{"location":"17-workspace_faq/#what-is-the-strides-program","title":"What is the STRIDES Program?","text":"

    The NIH STRIDES Initiative is a program to help NIH-funded researchers accelerate biomedical research by reducing barriers to utilizing for-fee cloud services, like the BRH Workspace.

    The STRIDES program gives cost discounts and a host of other benefits to researchers with NIH grants and contracts.

    "},{"location":"17-workspace_faq/#what-are-the-benefits-for-using-the-strides-program","title":"What are the benefits for using the STRIDES program?","text":"

    STRIDES program benefits include:

    • Cost discounts on AWS services (e.g., compute, storage, and egress fees)
    • AWS Enterprise Support
    • Training and education programs
    • and more! See the STRIDES program benefits page for more information
    "},{"location":"17-workspace_faq/#who-is-eligible-for-using-the-strides-program","title":"Who is eligible for using the STRIDES program?","text":"

    Anyone with any NIH funding, for any NIH award type, is eligible for the benefits of the STRIDES Initiative.

    "},{"location":"17-workspace_faq/#what-is-the-difference-between-strides-credit-and-strides-grantaward-funded","title":"What is the difference between STRIDES Credit and STRIDES Grant/Award Funded?","text":"

    The main difference is who is responsible for paying to provision the Workspace account.

    STRIDES Grant/Award Funded: This is the most common STRIDES account model. Here, the Workspace account is provisioned by the organization receiving the NIH grant funds (generally a PI's institution). Researchers who have received NIH funding (e.g., a grant, contract, cooperative agreement, or other transaction agreement) can use these funds for the BRH workspace account. With this option, the researchers' organization will be responsible for payment.

    STRIDES Credits: This is less common. Here, the payment for provisioning the Workspace account is made directly by NIH. Generally, this is discussed with NIH ahead of submitting the request. With this option, once the request is approved, a new account with a set spending limit will be provisioned by NIH directly for usage.

    "},{"location":"nextflow-create-docker/","title":"Nextflow - Create Dockerfile","text":""},{"location":"nextflow-create-docker/#create-a-dockerfile","title":"Create a Dockerfile","text":""},{"location":"nextflow-create-docker/#overview","title":"Overview","text":"

    This guide is for users who want to build Docker containers for use in Gen3 workspaces.

    "},{"location":"nextflow-create-docker/#prerequisites","title":"Prerequisites","text":"
    • Docker installed on your local machine
    • Clone or download the bio-nextflow repo
    "},{"location":"nextflow-create-docker/#start-with-a-security-validated-base-image","title":"Start with a security-validated base image","text":"

    Gen3 offers a collection of FedRAMP security-compliant base images. We re-assess these base images regularly for security compliance. Building on these base images makes it easier for your customized Docker image to pass the security scanning.

    You can access the URLs to pull these images using Docker here:

    https://github.com/uc-cdis/containers/blob/eec9789a57c5bb196a91f035e4cb069cfaa5abcd/nextflow-base-images/allowed_base_images.txt

    "},{"location":"nextflow-create-docker/#how-to-choose-your-base-image","title":"How to choose your base image","text":"

    GPU vs. CPU

    Not sure what these are? Here's a nice overview.

    If your workflow requires GPU (e.g., deep learning or other AI/ML models), please use the GPU instance; otherwise, use CPU.

    GPU images

    We have 3 base images in our current selection that offer CUDA support for running on GPUs -- these have \"cuda\" in the image name, followed by the CUDA version. When possible, please choose the latest version of CUDA compatible with your tools.

    gen3-cuda-12.3-ubuntu22.04-openssl (preferred)

    gen3-cuda-12.3-torch2.2-ubuntu22.04-openssl (also preferred)

    gen3-cuda-11.8-ubuntu22.04-openssl (only use if your tools require a lower version of CUDA)

    CPU images

    We have one base image that is available for running workflows on CPUs.

    amazonlinux-base

    "},{"location":"nextflow-create-docker/#test-pulling-the-docker-image","title":"Test pulling the Docker image","text":"

    Before you proceed with using this URL in your Dockerfile, you want to make sure you can pull the image. You can verify this by running the docker pull command in your terminal while Docker is running.

    First, open your Docker Desktop application (just to be sure Docker is running).

    Next, open your terminal. Run docker pull <image URL>, where the image URL is the full line as displayed in the file of security-validated base images. If it's working, you will see output indicating that it is pulling (see below). When it completes successfully, there will be a line that says Status: Downloaded <image> (see yellow highlight below). If you see this, you know that all the steps necessary to pull your image work. If you don't see this, reach out to us on Slack.
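
    For illustration, an abridged successful test pull looks like the sketch below, where the image URL stands in for a full line copied from the file of security-validated base images:

     docker pull <image URL>\n...\nStatus: Downloaded newer image for <image URL>\n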

    "},{"location":"nextflow-create-docker/#test-using-docker-scout-to-evaluate-image-vulnerabilities","title":"Test using Docker Scout to evaluate image vulnerabilities","text":"

    At the end of your test fetch, Docker offers a suggestion to use Docker Scout to examine your image for vulnerabilities (see red box above). We have already evaluated the security compliance for our image, so it's not necessary here. However, since you will want to use Docker Scout to evaluate your custom build later, now is a convenient time to test this tool and make sure you are fully set up to run Docker Scout.

    "},{"location":"nextflow-create-docker/#run-docker-scout","title":"Run Docker Scout","text":"

    To run Docker Scout, you must:

    • have Docker running (for example, the desktop application open)
    • be signed in to Docker (in the desktop application, there is a Sign In button in the upper right corner)
    • have created a Docker account (when you sign in for the first time, you will be asked to create an account).

    Once you are signed in to Docker, you can run the command they suggest after pulling an image (for example, see the command in blue text in the red box above, docker scout quickview <image URL>). If the command runs successfully, you should see output similar to the screenshot below. This is a summary of the vulnerabilities in your image.

    You can run the next suggested command (shown in red box above, docker scout cves...) to see the full list of vulnerabilities.

    Images should be able to pass Gen3 security scanning if there are no Critical vulnerabilities.

    Want to know more about Docker Scout? Check out the documentation.

    "},{"location":"nextflow-create-docker/#build-your-image-locally-on-top-of-the-base-image","title":"Build your image locally on top of the base image","text":"

    To build your own image, you need to create a Dockerfile. To build your image on a base image, the first line of the Dockerfile should reference the base image tag. The Dockerfile you create typically lives in the Git repository where you keep your code, which makes it easier to copy your code into the container. A minimal sketch is shown below.
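
    Here is a minimal, hypothetical Dockerfile sketch: the base image URL is a placeholder for a line from the security-validated list, the file names are illustrative, and whether pip is available depends on the base image you choose:

     FROM <security-validated base image URL>\nCOPY requirements.txt .\nRUN pip install -r requirements.txt\nCOPY my_script.py .\n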

    "},{"location":"nextflow-create-docker/#unfamiliar-with-creating-dockerfiles","title":"Unfamiliar with creating Dockerfiles?","text":"

    If you are unfamiliar with creating Dockerfiles, we encourage you to explore the excellent tutorial here, as well as review the Dockerfile documentation here, before you proceed. We have identified only key highlights below.

    One reminder: Dockerfiles are typically named just \"Dockerfile\" - with a capital D and no file extension.

    "},{"location":"nextflow-create-docker/#example-build-an-image-with-a-dockerfile-and-a-requirementstxt","title":"Example: Build an image with a Dockerfile and a requirements.txt","text":"

    In our example here, we will have you build your image using a requirements.txt to identify the software tools you want to add to the base image, as well as a Dockerfile that pulls in the base image, adds the software tools specified in the requirements file, copies relevant code files, and establishes some setup parameters.

    Our example will use the files in the torch_cuda_test directory of the bio-nextflow repository. You can review the readme file in this directory for more information. It is a simple example that will build up from our base image by adding PyTorch. The Nextflow script will ultimately use a python script that checks the version of CUDA in the GPU instance and checks whether it is compatible with the version of PyTorch and CUDA available in the container.

    First, in the terminal, navigate to the directory where you cloned the bio-nextflow repository (see Prerequisites section). Next, navigate to where the downloaded Dockerfile and requirements.txt are located:

    cd bio-nextflow/nextflow_notebooks/containerized_gpu_workflows/torch_cuda_test

    If you open the Dockerfile, note that the first line of the Dockerfile references the URL for one of our GPU base images. This is always how you will reference a base image -- with FROM and the URL.

    Then, run the Docker build command. For example:

    docker build . -t my_docker

    will build a Docker image from the Dockerfile in the current directory and tag the resulting image my_docker.

    "},{"location":"nextflow-create-docker/#example-examine-built-image-for-vulnerabilities","title":"Example: Examine built image for vulnerabilities","text":"

    You now have a new Docker image built upon our security-compliant base image. To more rapidly identify and address any security concerns in your customized image, we encourage all users to locally scan their image for vulnerabilities using Docker Scout, as described in our test above. Here, we have tagged our new image with my_docker. So, we would run the Docker Scout quickview command on the image using this command:

    docker scout quickview my_docker

    And to identify the specific vulnerabilities and recommendations, you would run:

    docker scout cves my_docker

    "},{"location":"nextflow-create-docker/#my-image-passes-the-local-security-scanning","title":"My image passes the local security scanning","text":"

    Once your custom image is security-compliant based on the analysis from Docker Scout, you are ready to request credentials to submit your Docker image for Gen3 security scanning.

    Continue to Request Credentials

    "},{"location":"nextflow-getting-started/","title":"Nextflow - Getting Started","text":""},{"location":"nextflow-getting-started/#getting-started-with-workflows-on-gen3","title":"Getting started with workflows on Gen3","text":"

    Please note: Nextflow features are only available to users with a Direct Pay workspace account. See our documentation for persistent paymodels to learn more about getting a Direct Pay workspace account.

    "},{"location":"nextflow-getting-started/#background","title":"Background","text":""},{"location":"nextflow-getting-started/#what-is-gen3","title":"What is Gen3?","text":"

    The Gen3 platform consists of open-source software services that make up data commons and data ecosystems (also called meshes or fabrics). A data commons is a platform that co-locates both data and compute resources so researchers can bring algorithms to the data. Data ecosystems or meshes are systems that researchers can use to search and query across multiple data commons in one location.

    More information about Gen3 can be found here. A list of data platforms using the Gen3 technology can be found here.

    "},{"location":"nextflow-getting-started/#what-are-workflows","title":"What are workflows?","text":"

    A workflow is a computational pipeline that consists of a series of steps to be executed. It typically runs using a software container: a standalone, self-contained unit that bundles all the executables needed for the workflow.

    Many workflow languages have been developed in recent years. Common examples include Common Workflow Language (CWL), Workflow Description Language (WDL), and Nextflow. We will be using Nextflow for our exercises.

    "},{"location":"nextflow-getting-started/#workflow-execution-in-gen3","title":"Workflow execution in Gen3","text":"

    Gen3 is based on Kubernetes and is container-based. A container is a standalone, self-contained collection of software that contains specific software you may need for your application (e.g., Pydicom/DICOM, Numpy, SciPy). We are testing a new workflow execution system in Gen3 that researchers can use to run containers on the cloud for various applications in a secure and isolated manner. We developed an isolation process so that each user\u2019s workflow is separate from other users\u2019 workflows, from the Gen3 core system, and from Gen3 data, except when approved and required for the specific task. The testing and development of workflows is currently underway in the Biomedical Research Hub (BRH), one of the first data ecosystems (or meshes) built at CTDS.

    "},{"location":"nextflow-getting-started/#what-is-nextflow-what-is-aws-batch","title":"What is Nextflow? What is AWS Batch?","text":"

    The workflow execution in Gen3 is powered by Nextflow, a framework for writing data-driven computational pipelines using software containers. It is a very popular and convenient framework for specifying containers, inputs and outputs, and running jobs on the cloud. Researchers have used Nextflow for several years, and 2023 has continued to see a rapid gain in its popularity per a recent survey. The scalability of workflows in Gen3 comes from AWS Batch, an AWS service capable of running compute jobs over large datasets on the Cloud.
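
    To give a flavor of the syntax, below is a minimal, hypothetical Nextflow process that runs a single shell command inside a container; the container URI is a placeholder:

     process sayHello {\n    container '<approved container URI>'\n    script:\n    \"\"\"\n    echo 'Hello from inside the container'\n    \"\"\"\n}\n\nworkflow {\n    sayHello()\n}\n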

    "},{"location":"nextflow-getting-started/#steps-to-run-workflows-in-gen3","title":"Steps to run workflows in Gen3","text":"

    To run workflows in Gen3, you will need the following:

    • Access to the BRH workspace (covered on this page)
    • A funded workspace account (covered on this page)
    • A Docker image uploaded to an ECR created for you (start with Create Dockerfile)

    Depending on your specific workflows, you may also need additional tools, resources, or access.

    "},{"location":"nextflow-getting-started/#get-access-to-the-brh-workspace-and-set-up-a-funded-account","title":"Get access to the BRH workspace and set up a funded account","text":""},{"location":"nextflow-getting-started/#1-request-access-to-the-brh-workspace","title":"1) Request access to the BRH workspace","text":"

    The BRH exposes a computational workspace that researchers can use to run simple Jupyter notebooks and submit workflows. To submit workflow jobs, you need access to the BRH workspace.

    Follow these instructions to request trial access to the BRH workspace. After you have submitted your request, please ping @Sara Volk de Garcia in Slack to alert her to look for your request and approve it.

    "},{"location":"nextflow-getting-started/#2-establish-a-workspace-account-with-a-persistent-pay-model-in-brh","title":"2) Establish a workspace account with a persistent pay model in BRH","text":"

    When you initially are granted workspace access in BRH, it is a trial access that is free for the user (paid by CTDS). However, the trial access paymodel does not permit access to the Nextflow image. To gain access to the Nextflow image needed for testing, you must request a workspace account with a persistent paymodel, so that the cost of compute jobs in your project can accrue to the right account. BRH currently supports several persistent pay models such as NIH STRIDES (payment through grant funds) and Direct Pay (credit card payment). If you're curious, see here for more information about pay models.

    For MIDRC, we have already established a Direct-Pay-type* of workspace account for testing. When you receive workspace access, Sara will work with the Nextflow team to add a Direct Pay account to your workspace.

    * Note about this Direct-Pay-type of account: It is not an ACTUAL Direct Pay account, and it does not go through the normal Direct Pay account route, nor through OCC, at all. It is funded with MIDRC contract funds, but will be labeled Direct Pay in your workspace.

    "},{"location":"nextflow-getting-started/#3-launch-a-workspace-with-the-persistent-paymodel","title":"3) Launch a workspace with the persistent paymodel","text":"

    Once you have been notified that you have a workspace account provisioned with persistent paymodel funds, you can proceed.

    • Log in to BRH and open the workspace page.
    • In the dropdown under \"Account\" in top left, select \"Direct Pay\" as your paymodel (#1 in screenshot below).
    • Once you select the Direct Pay workspace account, you should see a new option for workspace image: \"(Beta) Nextflow with CPU instances\"
    • Click the Launch button for this Nextflow workspace image (#2).
    • When you click the button, the workspace will begin to launch. This can take 5-10 minutes. You will know you successfully started the launch because you will see 3 animated dots rippling in the Launch button (see yellow highlight).

      If it takes longer than 10 minutes, try refreshing the screen and re-trying the launch. If it seems to stall out (longer than 10 min again), or if you get an error, reach out to CTDS staff through the Slack channel (but don't close the tab with the launch).

    "},{"location":"nextflow-getting-started/#quick-orientation-to-the-the-workspace","title":"Quick orientation to the the workspace","text":"

    Before using the workspace, we strongly encourage you to review the BRH Workspace documentation.

    There are several key points we want you to be aware of:

    "},{"location":"nextflow-getting-started/#store-all-data-in-the-persistent-directory-pd","title":"Store all data in the persistent directory (/pd)","text":"

    Store all files you want to keep after the workspace closes in the /pd directory; only files saved in the /pd directory will persist. Any personal files in the folder data will be lost.
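
    For example, from a terminal in the workspace, you can copy results into the persistent directory before closing (the file name here is illustrative):

     cp /data/my_results.csv /pd/\n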

    "},{"location":"nextflow-getting-started/#automated-shutdown-for-idle-workspaces","title":"Automated shutdown for idle workspaces","text":"

    Workspaces will automatically be shut down (and all workflows terminated) after 90 minutes of idle time.

    Continue to Overview of Containers in Gen3

    "},{"location":"nextflow-overview-containers/","title":"Nextflow - Containers in Gen3","text":""},{"location":"nextflow-overview-containers/#overview-developing-and-deploying-containers-in-gen3","title":"Overview: Developing and Deploying Containers in Gen3","text":"

    Locally build and test container: Gen3 provides several FedRAMP security-compliant base images that users can pull and customize.

    Request credentials and push container to Gen3 staging: Users can email Gen3 to request short-term credentials that permit them to authenticate Docker in their terminal to upload the local Docker image to a Gen3 staging repo for security review.

    Container is security-scanned; Gen3 sends approved container URI: Gen3 completes the security scan within minutes. If it is compliant, the image is moved to an ECR repo (\"approved\") from where the container can be run, and Gen3 staff will send a container URI to the user.

    If there are problems that make the image non-compliant with security requirements, a report of the vulnerabilities is provided to the user for remediation and resubmission. Users are responsible for resolving image vulnerabilities and resubmitting for scanning.

    Run workflow using approved container URI: In the BRH workspace, use a Nextflow Jupyter notebook to run Nextflow workflows in the approved container using the approved container URI. Some example notebooks can be found here, and specific examples that use an approved image URI can be found here and here

    Continue to Create Dockerfile

    "},{"location":"nextflow-request-creds/","title":"Nextflow - Request Credentials","text":""},{"location":"nextflow-request-creds/#request-credentials-for-uploading-a-docker-container","title":"Request Credentials for Uploading a Docker Container","text":"

    Please copy and paste the email template below into a new email and send to brhsupport@datacommons.io. Please be sure to add the relevant information to the bolded fields.

    Hello, User Services,

    Please create new temporary AWS credentials to permit me to upload a Nextflow container.

    The email address or ORCID I use to log in to BRH is: [BRH login email here]

    I understand that these credentials will last for 1 hour, once created. If I continue to need access to upload after they expire, I will request new credentials.

    Since the credentials will ONLY last 1 hour after creation, you may prefer we send them at a certain time of day. Please delete whichever of these do NOT apply:

    • Please generate and send my credentials tomorrow morning
    • Please generate and send my credentials in the afternoon
    • Please generate and send my credentials ASAP

    To ensure prompt attention, I will also ping @Sara Volk de Garcia on the Slack channel after I have sent my email.

    Thanks!

    [your name]

    Please note: If you receive credentials but you are not able to successfully upload an image before they expire, please ping @Sara on Slack to let her know she does not need to monitor your submitted image.

    Continue to Upload Docker Image

    "},{"location":"nextflow-tutorial-workflows/","title":"Nextflow - Tutorials Workflows","text":""},{"location":"nextflow-tutorial-workflows/#tutorial-nextflow-workflows","title":"Tutorial Nextflow Workflows","text":"

    We have a collection of notebooks using Nextflow in Gen3 here: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks

    Before you start executing any tutorial notebooks, review the information in the Get Set section.

    "},{"location":"nextflow-tutorial-workflows/#get-set-download-necessary-credentials-and-software","title":"Get set: Download necessary credentials and software","text":"

    Be ready to execute the tutorial workflows below by gathering credentials and installing necessary software.

    "},{"location":"nextflow-tutorial-workflows/#get-and-replace-placeholder-values-from-the-nextflow-config","title":"Get and replace placeholder values from the Nextflow config","text":"

    You can find the values to replace the placeholders in the queue, jobRole, and workDir fields in the nextflow.config file in your Nextflow workspace. Directions for finding this file are at the bottom of the \"Welcome to Nextflow\" page that opens when your Nextflow workspace first opens. These placeholder values will need to be replaced in each of the various tutorial Nextflow notebooks.

    Note that you should only copy/paste the value that replaces the placeholder for each field; do not copy/paste larger sections of the nextflow.config, or there could be indentation problems that interfere with the code.
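
    As a rough sketch of the shape of these fields (all values are placeholders; the nextflow.config in your workspace is authoritative, so copy the real values from there):

     process {\n    executor = 'awsbatch'\n    queue = '<queue value from your workspace nextflow.config>'\n}\n\naws {\n    batch {\n        jobRole = '<jobRole value from your workspace nextflow.config>'\n    }\n}\n\nworkDir = '<workDir value from your workspace nextflow.config>'\n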

    "},{"location":"nextflow-tutorial-workflows/#midrc-credentials","title":"MIDRC credentials","text":"

    To download GUIDs in the workspace, you will first need to generate a MIDRC credentials file on the Profile page of the MIDRC portal. For this, please go to data.midrc.org, click on the user icon in the right corner (#1), and open the Profile page (#2). Click on Create API Key (#3). A pop-up window will appear with the key. If you scroll down slightly, you can see the button to download the credentials as a JSON. Credentials are valid for 1 month.

    "},{"location":"nextflow-tutorial-workflows/#example-nextflow-notebooks","title":"Example Nextflow notebooks","text":""},{"location":"nextflow-tutorial-workflows/#notebooks-with-no-containers","title":"Notebooks with no containers","text":"

    There are several general Nextflow notebooks that do not use containers at all. If you're new to Nextflow and just want to get started with workflow commands, try these notebooks: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks/non_containerized_nextflow_workflows

    "},{"location":"nextflow-tutorial-workflows/#notebooks-using-containers","title":"Notebooks using containers","text":"

    We have several containerized notebooks using CPU, and other containerized notebooks using GPU.

    "},{"location":"nextflow-tutorial-workflows/#notebooks-using-cpu","title":"Notebooks using CPU","text":"

    You can find the directory with containerized notebooks using CPU here: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks/containerized_cpu_workflows

    Some use cases in this directory include:

    • Cancer use case: The chip_cancer example tutorial here includes a Dockerfile, a requirements file, a Nextflow notebook, a collection of python scripts, and a README.
    • DICOM metadata extraction use case: The midrc_batch_demo example tutorial here includes a Dockerfile, a requirements file, two Nextflow notebooks, a collection of python scripts, and a README.
    "},{"location":"nextflow-tutorial-workflows/#notebooks-using-gpu","title":"Notebooks using GPU","text":"

    You can find the directory with containerized notebooks using GPU here: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks/containerized_gpu_workflows

    Some use cases in this directory include:

    • Pytorch/cuda test simple use case: The torch_cuda_test example tutorial here includes a Dockerfile, a requirements file, a Nextflow notebook, a simple python script, and a README.
    • COVID Challenge 2022 use case: The covid_challenge_container example tutorial here includes a Dockerfile, a requirements file, a Nextflow notebook, a python script, and a README.
    "},{"location":"nextflow-upload-docker/","title":"Nextflow - Upload Docker Image","text":""},{"location":"nextflow-upload-docker/#pushing-docker-images-to-aws-ecr","title":"Pushing Docker Images to AWS ECR","text":""},{"location":"nextflow-upload-docker/#overview","title":"Overview","text":"

    This guide is for users who have received temporary credentials granting access to push container images to a specific AWS Elastic Container Registry (ECR) repository.

    "},{"location":"nextflow-upload-docker/#prerequisites","title":"Prerequisites","text":"
    • Docker installed on your local machine.
    • AWS CLI installed locally
    • Temporary AWS credentials provided by the User Services team.
    • The URI of the ECR repository you have been given access to (shared in the AWS credentials)
    • An image you want to push, or a Dockerfile to build your image.
    "},{"location":"nextflow-upload-docker/#a-note-about-timing","title":"A note about timing","text":"

    Your temporary AWS credentials only last for 1 hour from when they were created; User Services should have provided an expiration time when sharing the credentials with you. You must fully complete the push to ECR before they expire, or you will need to request new credentials from User Services.

    If you do not complete pushing an image to the ECR before they expire, please ping @Sara Volk de Garcia in Slack so she knows not to monitor for an image to progress through scanning.

    "},{"location":"nextflow-upload-docker/#a-note-about-security-and-expiration-of-approved-docker-images","title":"A note about security and expiration of approved Docker images","text":"

    Because of the ever-updating nature of vulnerability detection, an image that has passed in the past is not guaranteed to always pass. Even if you are resubmitting an image that has passed previously, newly reported vulnerabilities may mean the image does not pass now. Best practice for the most efficient submission is to always examine an image with Docker Scout before pushing it.

    Similarly, because new vulnerabilities are always emerging, to protect the security of the Gen3 Workspace, approved containers will only remain available in the approved repo for 30 days. However, users can always request new credentials and resubmit their image for scanning.

    "},{"location":"nextflow-upload-docker/#set-aws-environment-variables","title":"Set AWS environment variables:","text":"

    The commands in this section are valid if you are using Linux or macOS. If you are using Windows, we will provide a separate set of commands for you to set the AWS environment variables.

    Before you can push your Docker image to the ECR repository, you need to configure the AWS CLI with the temporary credentials you received. In the credentials sent to you, there should be the commands needed to run this below the line \"Please run the following commands to set your AWS credentials:\". Copy those (they will look similar to the block below) and run them in the terminal.

      export AWS_ACCESS_KEY_ID=<AccessKeyId>\n  export AWS_SECRET_ACCESS_KEY=<SecretAccessKey>\n  export AWS_SESSION_TOKEN=<SessionToken>\n

    Note: the variables are set only as long as the terminal is open; export the variables again if you close and open a new terminal.

    "},{"location":"nextflow-upload-docker/#verify-configuration","title":"Verify configuration:","text":"

    Run aws sts get-caller-identity to verify that your CLI is using the temporary credentials. If you successfully set the variables, you should see output showing the AWS information - UserID, account, etc.
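
    If the credentials are set correctly, the output is a small JSON block along these lines (the values will be specific to your temporary credentials):

     {\n    \"UserId\": \"<UserId>\",\n    \"Account\": \"<account number>\",\n    \"Arn\": \"<ARN for your temporary session>\"\n}\n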

    "},{"location":"nextflow-upload-docker/#authenticate-docker-to-ecr","title":"Authenticate Docker to ECR","text":"

    Next, use the AWS CLI to retrieve an authentication token and authenticate your Docker client to your registry. In the credentials, there is a command below the line \"After setting credentials you will need to log in to your docker registry. Please run the following command:\". Copy that (it will look similar to the command below) and run it in the terminal.

     aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <repositoryUri>\n
    "},{"location":"nextflow-upload-docker/#preparing-to-push-your-docker-image","title":"Preparing to push your Docker image","text":"

    The specific steps you use to prepare to push your image depends on whether you have an image already built or if you will need to build from a Dockerfile.

    "},{"location":"nextflow-upload-docker/#if-you-already-built-a-local-docker-image-tag-your-docker-image","title":"If you already built a local Docker image: Tag your Docker image","text":"

    If you already have a locally-built Docker image, you will not need to run the docker build command included in the credentials. But, you do need to tag it with the ECR repository URI and the image tag you want to use. This command is not in the credentials file.

     docker tag <local-image>:<local-tag> <repositoryUri>:<image-tag>\n

    Replace < local-image > with the name of your local Docker image and < local-tag > with the tag you want to push.

    If you're not sure what your local image and tag names are, you can run docker images in your terminal. It will provide a list of all your saved docker images. The column called REPOSITORY is the local image name. The column called TAG is the local tag name for this image. Note: the image-tag you select will travel with your image to the approved repo. So, select a tag you are comfortable with.

    Replace < repositoryUri > with the ECR repository URI provided at the top of the credentials file, and < image-tag > with the image tag name you want to use in your ECR.

    Important note: If you do not want the most-recently-pushed image to replace an earlier version with the same tag in your ECR, image-tags should be unique. For example, you create an image with an image-tag batch-poc. If you later push another image to < repositoryUri >:batch-poc, it will overwrite the previous version of the image in your ECR (you will only have 1 container with the image tag \"batch-poc\"). If you do not want to overwrite, you can use versioned image-tags. For example: batch-poc-1.0, and then batch-poc-1.1. If you want to replace previous versions of your container, you can use the same image-tag.
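
    For example, to keep two versions side by side rather than overwriting, you might tag and push with versioned image-tags (the repository URI is the one from your credentials, and my_docker is the local image tagged earlier):

     docker tag my_docker <repositoryUri>:batch-poc-1.0\ndocker push <repositoryUri>:batch-poc-1.0\n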

    "},{"location":"nextflow-upload-docker/#if-you-need-to-build-your-docker-image","title":"If you need to build your Docker image:","text":"

    If you haven't already built your Docker image, you can use the docker build command that is included in your credentials, similar to what is shown below. The trailing . tells Docker to use the current directory as the build context, so run this command from the directory holding your Dockerfile. You will need to replace the < tag > in your command with the image tag name you want to use in your ECR. (Read more about image tags in the previous section.)

     docker build -t <repositoryUri>:<tag> .\n

    If you use this docker build command from your credentials, you do not need to use the docker tag command (described in the previous section).

    "},{"location":"nextflow-upload-docker/#push-the-docker-image-to-the-ecr","title":"Push the Docker image to the ECR","text":"

    Push the tagged image to the ECR repository. The docker push command is also in the credentials - you just need to specify the image tag you selected when either tagging or building the image in the previous section.

     docker push <repositoryUri>:<image-tag>\n

    If the push is successful, you will see output for each layer (\"layer 1\", \"layer 2\", etc.) along with progress indicators for the push. This can take minutes, depending on how large your container is.

    If the push fails, you will get a persistent message about \"Waiting for layer\". This usually means Docker cannot find the repository, so double-check that there is no typo, and that you have set your AWS environment variables since you most recently opened the terminal.

    "},{"location":"nextflow-upload-docker/#completion","title":"Completion","text":"

    Once the push completes, your Docker image will be available in the ECR repository (although you will not be able to see it). It will be scanned, and if it passes the security scanning, CTDS will move it to the nextflow-approved repo. When it's available in nextflow-approved, User Services will share a Docker URI that looks something like this: 143731057154.dkr.ecr.us-east-1.amazonaws.com/nextflow-approved/< your username >:< image-tag >. You can then use this new URI to run Nextflow workflows with your container in the BRH workspace.

    "},{"location":"nextflow-upload-docker/#how-to-use-an-approved-docker-uri","title":"How to use an approved Docker URI","text":"

    Once you have your Docker URI, you are ready to run your Nextflow workflow! You can take the Docker URI and make it the value for the \"container\" field(s) in your Nextflow notebook. For example, in the torch_cuda_batch Nextflow notebook, you would go to the nextflow.config section and replace the placeholder value for container with the approved Docker URI.
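
    For instance, the container line in the notebook's nextflow.config section might end up looking like this sketch (the username and image-tag are placeholders):

     process {\n    container = '143731057154.dkr.ecr.us-east-1.amazonaws.com/nextflow-approved/<your username>:<image-tag>'\n}\n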

    "},{"location":"nextflow-upload-docker/#support","title":"Support","text":"

    If you encounter any issues or require assistance, please reach out to the User Services team that provided you with the temporary credentials, or brhsupport@datacommons.io, or reach out on Slack. (Slack will result in the quickest reply.)

    Continue to Tutorial Workflows

    "}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Biomedical Research Hub Documentation","text":""},{"location":"#biomedical-research-hub-documentation","title":"Biomedical Research Hub Documentation","text":"

    The Biomedical Research Hub (BRH) is a cloud-based and multifunctional web interface that provides a secure environment for discovery and analysis of scientific results and data. It is designed to serve users with a variety of objectives, backgrounds, and specialties.

    The BRH represents a dynamic Data Ecosystem that aggregates and hosts metadata from multiple resources to make data discovery and access easy for users.

    The platform provides a way to search and query over study metadata and diverse data types, generated by different projects and organizations, and stored across multiple secure repositories.

    The BRH also offers a secure and cost-effective cloud-computing environment for data analysis, empowering collaborative research and development of new analytical tools. New workflows and results of analyses can be shared with the community.

    The BRH is powered by the open-source software \u201cGen3\u201d.

    Gen3 was created by and is actively developed at the University of Chicago\u2019s Center for Translational Data Science (CTDS) with the aim of creating interoperable cloud-based data resources for the scientific research community.

    "},{"location":"01-home/","title":"Home","text":""},{"location":"01-home/#biomedical-research-hub-documentation","title":"Biomedical Research Hub Documentation","text":"

    The Biomedical Research Hub (BRH) is a cloud-based and multifunctional web interface that provides a secure environment for discovery and analysis of scientific results and data. It is designed to serve users with a variety of objectives, backgrounds, and specialties.

    The BRH represents a dynamic Data Ecosystem that aggregates and hosts metadata from multiple resources to make data discovery and access easy for users.

    The platform provides a way to search and query over study metadata and diverse data types, generated by different projects and organizations, and stored across multiple secure repositories.

    The BRH also offers a secure and cost-effective cloud-computing environment for data analysis, empowering collaborative research and development of new analytical tools. New workflows and results of analyses can be shared with the community.

    The BRH is powered by the open-source software \u201cGen3\u201d.

    Gen3 was created by and is actively developed at the University of Chicago\u2019s Center for Translational Data Science (CTDS) with the aim of creating interoperable cloud-based data resources for the scientific research community.

    "},{"location":"02-types_of_shared_data/","title":"FAIR Data","text":""},{"location":"02-types_of_shared_data/#types-of-shared-data","title":"Types of Shared Data","text":"

    The BRH provides secure access to study metadata from multiple resources (Data Commons) and will be the driving engine for new discovery. The types of data represented are diverse and include scientific research across multiple disciplines.

    The BRH aims to make data more accessible by following the \"FAIR\" principles:

    Findable: Researchers are provided an intuitive interface to search over metadata for all studies and related datasets. Each study and dataset will be assigned a unique, persistent identifier.

    Accessible: Authenticated users can request and receive access to controlled-access data from data providers. Metadata can be accessed via an open API.

    Interoperable: Data can be easily exported to various workspaces for analysis using a variety of software tools.

    Reusable: Data can be easily reused to facilitate reproducibility of results, development and sharing of new tools, and collaboration between investigators.

    "},{"location":"03-data_and_repos/","title":"Data Management and Repositories","text":""},{"location":"03-data_and_repos/#data-and-repositories","title":"Data and Repositories","text":"

    The BRH securely exposes study metadata and data files stored on multiple FAIR repositories and Data Commons, i.e. data libraries or archives, to provide an easy way to connect different repositories on one single location.

    FAIR data repositories are traditionally part of a larger institution or working group established for research and data archiving, and to serve the data users of that organization.

    As of March 2023, the list of currently shared resources/Data Commons on BRH includes:

    • BioData Catalyst
    • CRDC Cancer Imaging Data Commons
    • CRDC Genomic Data Commons
    • CRDC Integrated Canine Data Commons
    • CRDC Proteomic Data Commons
    • IBD Commons
    • JCOIN
    • MIDRC
    • NIAID ClinicalData
    "},{"location":"04-BRH_overview/","title":"Quickstart - BRH Overview","text":""},{"location":"04-BRH_overview/#brh-overview","title":"BRH Overview","text":"

    You can get started with the Biomedical Research Hub by exploring the features described below.

    "},{"location":"04-BRH_overview/#register-for-workspaces","title":"Register for Workspaces","text":"

    Get a temporary free trial to BRH workspaces and simultaneously register for extended workspace access with NIH STRIDES

    "},{"location":"04-BRH_overview/#login-page","title":"Login Page","text":"

    Log in here to unlock controlled-access data and workspace access with your credentials

    "},{"location":"04-BRH_overview/#check-study-access-and-authorize-external-data-resources","title":"Check Study Access and Authorize External Data Resources","text":"

    Check study access and connect your account to other resources to access all studies for which you are authorized.

    "},{"location":"04-BRH_overview/#discovery-page","title":"Discovery Page","text":"

    Discover datasets across multiple resources and export selected data files to the analysis workspace.

    "},{"location":"04-BRH_overview/#workspaces-page","title":"Workspaces Page","text":"

    Access data across multiple resources and perform analyses in a secure, cloud-based environment

    "},{"location":"04-BRH_overview/#profile-page","title":"Profile Page","text":"

    Review data access permissions and generate API credentials files used for programmatic access.

    "},{"location":"05-workspace_registration/","title":"Workspace Page (Registration)","text":""},{"location":"05-workspace_registration/#register-for-brh-workspace","title":"Register for BRH Workspace","text":"

    To start exploring BRH Workspace right away, users can apply for a Temporary Trial Access. Extended access to BRH Workspace is granted using a persistent pay model workspace account (e.g., STRIDES or Direct Pay), which can be requested after trial access is provisioned. Please see below for more details.

    "},{"location":"05-workspace_registration/#requesting-temporary-trial-access-to-brh-workspace","title":"Requesting Temporary Trial Access to BRH Workspace","text":"

    For new users without workspace access, please follow these steps

    1. Login to BRH
    2. Click on the Workspace tab. That opens the Workspace Access Request form
    3. Fill in the details and submit the form shown below.

    4. The form should be completed only once. Following submission, users will see a success message and a link back to the Discovery page.

    5. Users will receive an email notifying them that the request has been received.

    6. Users will receive another email notifying them that the temporary trial access request has been approved. They should then be able to access workspaces on BRH. Please note that the timeline for this approval can be a few business days.

    "},{"location":"05-workspace_registration/#requesting-extended-access-to-brh-workspace-using-a-persistent-pay-model-eg-strides-direct-pay","title":"Requesting Extended Access to BRH Workspace using a Persistent Pay Model (e.g., STRIDES, Direct Pay)","text":"

    Please Note: The process for granting access for a workspace account can take 2-4 weeks for NIH STRIDES, and a month for Direct Pay.

    Find instructions for funding workspace accounts with any of the persistent pay models on the Workspace Accounts page.

    "},{"location":"06-loginoverview/","title":"Login Page","text":""},{"location":"06-loginoverview/#login-access-overview","title":"Login Access Overview","text":"

    All users are able to browse the study metadata on the Discovery Page without logging in.

    Users will need to log in and obtain authorization (access) in order to:

    • Access studies with controlled data
    • Perform analyses in Workspaces
    • Download data files and file manifests
    • Run interactive tutorial notebooks in the Workspaces

    Start by visiting the login page (https://brh.data-commons.org/login).

    • Login from Google: You may login using any Google account credentials, or a G-suite enabled institutional email. This option may or may not be available depending on the institution or organization the user is associated with.
    • Login via InCommon --> NIH eRA: When selecting the NIH/eRA (electronic Research Administration) login using InCommon, you will need access permissions through the eRA Commons account.

    After successfully logging in, your username will appear in the upper right-hand corner of the page.

    "},{"location":"07-how_to_check_request_access/","title":"Check Study Access and Authorize Resources","text":""},{"location":"07-how_to_check_request_access/#how-to-check-and-request-access","title":"How To Check and Request Access","text":"

    Users can find out which projects they have access to by navigating to the Discovery Page and using the column filters at the top of the table.

    "},{"location":"07-how_to_check_request_access/#access-to-individual-studies","title":"Access to individual Studies","text":"

    You can check access by clicking on a study in the Discovery Page, as shown below:

    The Study Page will display access permissions in the top right corner. Click the \u201cPermalink\u201d button in the upper right to copy the link to the clipboard.

    If you have access, a green box at the top of the Study Page will show \u201cYou have access to this study\u201d.

    Note: If you have access but cannot select the study to export to workspace, it is because the manifest is not yet available. Please use the API in these cases.

    "},{"location":"07-how_to_check_request_access/#authorize-to-gain-access-to-fair-enabled-repositoriesresources","title":"Authorize to Gain Access to FAIR-enabled Repositories/Resources","text":"

    BRH securely provides access to data stored on multiple FAIR repositories, resources, and Data Commons.

    Users must authorize these resources on their account in order to:

    1. run Jupyter Notebooks that utilize data stored in various FAIR repositories.
    2. export data that is stored in FAIR repositories from the Discovery Page to the Workspaces.
    3. download data that is stored in FAIR repositories from the Discovery Page.

    In order to authorize access to these repositories and data commons, navigate to the Profile Page. Authorize each relevant commons by clicking its corresponding button (e.g., the Refresh or Authenticate buttons in the image shown below).

    Authorization needs to be renewed after 30 days, as indicated after \"Status: expires in [..] days\".

    "},{"location":"08-discovery_page/","title":"Discovery Page","text":""},{"location":"08-discovery_page/#discovery-page","title":"Discovery Page","text":"

    The Discovery Page provides users a venue to search and find studies and datasets displayed on the Biomedical Research Hub. Users can browse through the publicly accessible study-level metadata without requiring authorization.

    Use text-based search, faceted search, and tags to rapidly and efficiently find relevant studies, discover new datasets across multiple resources, and easily export selected data files to the analysis workspace. Browse through datasets and study-level metadata and find studies using tags, advanced search, or the free text search field.

    "},{"location":"08-discovery_page/#search-features","title":"Search Features","text":"

    On the Discovery page, several features help you navigate and refine your search.

    1. Total number of studies: shows the number of studies the BRH is currently displaying.
    2. Total number of subjects: shows the number of subjects the BRH is currently displaying.
    3. Free Text Search: Use keywords or tags in the free-text-based search bar to find studies. The free-text search bar can be used to search for study name, ID number, Data Commons, or any keyword that is mentioned in the metadata of the study.
    4. Data Resources/Data Commons Tags: view these by selecting \"Study Characteristics\". Click on a tag to filter by a Data Resource/Data Commons. Selecting multiple tags works in an \"OR\" logic (e.g., \"find AnVIL OR BioData Catalyst studies\").
    5. Export Options: Login first to leverage the export options. Select one or multiple studies and download a file manifest or export the data files to a secure cloud environment \"Workspace\" to start your custom data analysis in Python or R.
    6. Data Availability: Filter on available, pending, and not-yet-available datasets.
    • Studies: This table presents all current studies on BRH. Click on any study to show useful information about the study (metadata).
    "},{"location":"08-discovery_page/#find-available-study-level-metadata","title":"Find available Study-level Metadata","text":"

    Clicking on any study will display the available study-level and dataset metadata.

    "},{"location":"08-discovery_page/#find-accessible-datasets","title":"Find accessible Datasets","text":"

    Users can select and filter studies from multiple resources and conduct analyses on the selected datasets in a workspace. Users can search, but not interact with, data they do not have access to. By selecting the data access button in the top right corner of the study page, users can display their access. The Discovery Page will automatically update the list of studies that are accessible.

    "},{"location":"09-workspace_page/","title":"Getting Started in Workspace","text":""},{"location":"09-workspace_page/#workspaces","title":"Workspaces","text":"

    To use the workspaces, users must register for a workspace account, as described on the Workspace Registration page.

    BRH workspaces are secure data analysis environments in the cloud that can access data from one or more data resources. By default, Workspaces include Jupyter notebooks, Python and R, but can be configured to host virtually any application, including analysis workflows, data processing pipelines, or data visualization apps.

    New to Jupyter? Learn more about the popular tool for data scientists on Jupyter.org (disclaimer: CTDS is not responsible for the content).

    "},{"location":"09-workspace_page/#guideline-to-get-started-in-workspaces","title":"Guideline to get started in Workspaces","text":"

    Once users have access to workspaces, use this guide below to get started with analysis work in workspaces.

    1. Users need to log in via https://brh.data-commons.org/login to access workspaces.

    2. After navigating to https://brh.data-commons.org/workspace, users will discover a list of pre-configured virtual machine (VM) images, as shown below.

      • (Generic) Jupyter Notebook with R kernel: Choose this VM if you are familiar with setting up Python- or R-based Notebooks, or if you just exported one or multiple studies from the Discovery Page and want to start your custom analysis.
      • Tutorial Notebooks: Explore our Jupyter Notebook tutorials written in Python or R, which pull data from various sources of the Biomedical Research Hub to leverage statistical programs and data analysis tools. These are excellent resources for code to pull and analyze data from BRH, and examples that illustrate the variety of data and analyses available through BRH.
    3. Click \u201cLaunch\u201d on any of the workspace options to spin up a copy of that VM. The status of launching the workspace is displayed after clicking on \u201cLaunch\u201d. Note: Launching the VM may take several minutes.

    4. After launching, the home folders are displayed. One of these folders is the user's persistent drive (\"/pd\").

    5. Select the /pd folder. New files or licenses should be saved in the /pd directory if users need to access them after restarting the workspaces. Only files saved in the /pd directory will remain available after termination of a workspace session.


      • Attention: Any personal files in the folder \u201cdata\u201d will be lost. Personal files in the directory /pd will persist.
      • Do not save files in the \"data\" or \u201cdata/brh.data-commons.org\u201d folders.
      • The folder \u201cbrh.data-commons.org\u201d in the \u201cdata\u201d folder will host the data files you have exported from the Discovery Page. Move these files to the /pd directory if you do not want to have to export them again.
      • /pd has a capacity limit of 10GB.
    6. Start a new notebook under \u201cNotebook\u201d in the Launcher tab. Click the tiles in the launcher and choose between Python 3 or R Studio as the base programmatic language. Note: You can open and run multiple notebooks in your workspace. However, the generic, tutorial, and Nextflow workspace images are currently separate Docker images, so there is no functionality to combine them or run Nextflow in the tutorial or generic images. This may be available in the future, after further testing and development activities.

    7. Experiment away! Code blocks are entered in cells, which can be executed individually or all at once. Code documentation and comments can also be entered in cells, and the cell type can be set to support Markdown.

      Results, including plots, tables, and graphics, can be generated in the workspace and downloaded as files.

8. Do not forget to terminate your workspace once your work is finished. Unterminated workspaces continue to accrue computational costs. Note that Workspaces automatically shut down after 90 minutes of idle time.

    Further reading: read more about how to download data files into the Workspaces here.

    "},{"location":"09-workspace_page/#upload-save-and-download-filesnotebooks","title":"Upload, save, and download Files/Notebooks","text":"

    Users can upload data files or Notebooks from the local machine to the home directory by clicking on \u201cUpload\u201d in the top left corner. Access the uploaded content in the Notebook (see below).

Then run the following in a cell, for example:

import os
import pandas as pd

# change into the directory containing the uploaded file
os.chdir('/data')

# read the uploaded tab-separated file; note the relative path,
# since we already changed into its directory
demo_df = pd.read_csv('this_is_a_demo.txt', sep='\t')
demo_df.head()

    Users can save the notebook by clicking \"File\" - \"Save as\", as shown below.

    Users can download notebooks by clicking \"File\" - \"Download\", as shown below. Download the notebook, for example, as \".ipynb\".

    "},{"location":"09-workspace_page/#environments-languages-and-tools","title":"Environments, Languages, and Tools","text":"

    The following environments are available in the workspaces:

    • Jupyter Lab

    The following programmatic languages are available in Jupyter Notebooks:

    • R
    • Python 3

    The following tools are available in Jupyter Notebooks:

    • GitHub (read GitHub documentation)
    "},{"location":"09-workspace_page/#python-3-and-r-in-jupyter","title":"Python 3 and R in Jupyter","text":"

    Both Python 3 and R are available in Jupyter Notebooks.

Users can expect to be able to install and use typical Python or R packages from repositories such as PyPI or CRAN. For Python and R, users can start a new notebook with a tile under "Notebook", as shown below.

    "},{"location":"09-workspace_page/#automatic-workspace-shutdown","title":"Automatic Workspace Shutdown","text":"

    Warning: When a BRH Workspace reaches the STRIDES Credits limit for STRIDES Credits Workspaces, or reaches the Hard Limit for STRIDES Grant Workspaces, the Workspace will be automatically terminated. Please be sure to save any work before reaching the STRIDES Credit or Hard Limit.

    Warning: Workspaces will also automatically shut down after 90 minutes of idle time. A pop-up window will remind users to navigate back to the workspaces page in order to save the data.

    "},{"location":"10-profile_page/","title":"Profile Page","text":""},{"location":"10-profile_page/#profile-page","title":"Profile Page","text":"

    On the profile page users will find information regarding their access to projects, access to Gen3-specific tools (e.g. access to the Workspace), and the function to create API keys for credential downloads. API keys are necessary for the download of files using the Gen3 Python SDK.

Users can view their study access on the Profile Page, where API keys can also be viewed, created, and downloaded.
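
As one illustration of how a downloaded API key is used, the gen3 command-line tool installed with the Gen3 Python SDK can authenticate with the downloaded credentials file. This is a sketch only; the credentials filename and GUID below are placeholders:

# authenticate with a downloaded API key and pull a single file by its GUID
# (credentials filename and GUID are hypothetical)
gen3 --auth credentials.json drs-pull object <GUID>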

    "},{"location":"11-downloading_data_files/","title":"Downloading Data Files","text":""},{"location":"11-downloading_data_files/#downloading-data-files","title":"Downloading Data Files","text":"

Users can download data files for work in the provided Workspace. Workspaces leverage the CTDS-owned Python software development kit (SDK) as well as a cloud-based computing platform.

Note: accessing data files requires linked access to all FAIR-enabled repositories, as described here.

    "},{"location":"11-downloading_data_files/#download-data-files-into-a-workspace-with-the-python-sdk","title":"Download Data Files into a Workspace with the Python SDK","text":"

    Users can load data files from a manifest created on the Discovery Page directly into a Workspace. Below are the steps to do so.

    1. Navigate to the Discovery Page. Link your accounts to FAIR repositories as described here.

    2. Find the study or studies of interest by using the search features or the list of accessible studies.

    3. Select the clickable boxes next to the studies. Click on \"Open in Workspace\", which will initiate the Workspace Launcher.

4. The Workspace will be prepared, and the selected data will be made available via a manifest placed in a time/date-stamped directory in the following path: pd/data/brh.data-commons.org/exported-manifest-(time/date stamp). Please do not navigate away from this page until the download is complete. The created directory may take several minutes to load.

5. Once loaded, users can navigate into the directory and access either the manifest or an automatically generated notebook (e.g., data.ipynb) with instructions to download the data. Note that the Gen3 SDK (gen3-sdk) is used in this notebook to download the data.
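
For orientation, the generated notebook's download step boils down to the Gen3 SDK's DRS download tooling. A minimal sketch of the equivalent terminal usage (the manifest filename is a placeholder for the file in your exported-manifest directory; inside a workspace, authentication is typically handled for you):

# from inside the exported-manifest directory, pull all files listed in the manifest
# (manifest filename is hypothetical)
gen3 drs-pull manifest manifest.json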

    "},{"location":"12-contact/","title":"Contact","text":""},{"location":"12-contact/#contact-brh-support","title":"Contact BRH Support","text":"

    Need help? Please contact our help desk - brhsupport@datacommons.io

    "},{"location":"13-workspace_accounts/","title":"Workspace Accounts - Trial and Persistent Pay Models (STRIDES, Direct Pay)","text":""},{"location":"13-workspace_accounts/#workspace-accounts-trial-and-persistent-pay-models-strides-direct-pay","title":"Workspace Accounts - Trial and Persistent Pay Models (STRIDES, Direct Pay)","text":""},{"location":"13-workspace_accounts/#four-different-pay-models-for-brh-workspace-accounts","title":"Four Different Pay Models for BRH Workspace Accounts","text":"

    We have 4 different pay models for workspace accounts:

    • Trial Access (free for user, limited to 2 months)
    • OCC Direct Pay (persistent pay model paid by credit card through OCC Payment Portal)
    • STRIDES Grant/Award Funded (persistent pay model paid by organizations with NIH grant funds)
    • STRIDES Credits (persistent pay model paid directly by NIH)

    Instructions for requesting funding for each pay model are provided below.

    Please Note:

• The process for granting access for a workspace account can take 2-4 weeks for NIH STRIDES, and up to a month for Direct Pay, although it may also be faster.
• The account from each pay model will have its own workspace storage directory (/pd; read about /pd here); data is not shared between accounts with different funding types. However, you can import and export data among accounts as long as they are active and have funding.
    "},{"location":"13-workspace_accounts/#request-trial-access","title":"Request Trial Access","text":"

Trial Access is granted for 2 months when you request access to the BRH Workspace page. The instructions for requesting workspace access are on the Workspace Registration page. Note that your trial access will become inactive as soon as you have a workspace account funded through a persistent pay model.

    "},{"location":"13-workspace_accounts/#requesting-funding-for-workspace-accounts-through-any-persistent-pay-model","title":"Requesting Funding for Workspace Accounts Through Any Persistent Pay Model","text":"
1. Once they have access to the Workspace page, users can request a workspace account by first logging in to BRH (#1), going to the Workspace page (#2), then clicking the Workspace Account Manager link (#3). Click "Log in" and "Yes, I authorize" to open the portal in a new tab.

      Some STRIDES users may receive an invitation via email to register for an NIH STRIDES workspace account. These users can also click the link in the invitation email to get to the BRH Workspace Account Manager.

    2. In the BRH Workspace Account Manager, users can see their persistent workspace accounts and any available credits or funds in the accounts. If you only have trial access, you will not see any active accounts.

      To request a workspace account with a persistent paymodel, click the \"Request New Workspace\" button.

    3. Choose from any of the 3 persistent pay model funding options: a) STRIDES Grant/Award Funded; b) STRIDES Credits; or c) OCC Direct Pay to request a funded workspace account.

    "},{"location":"13-workspace_accounts/#occ-direct-pay-funded-workspace-account","title":"OCC Direct Pay Funded Workspace Account","text":"

    The OCC Direct Pay form can be selected if a user wants to pay with a personal or organizational credit card. OCC Direct Pay only requires a valid credit card.

    Funding a workspace account with OCC Direct Pay has 2 major parts:

    1. Request BillingID
    2. Use BillingID to provision Direct Pay funds for the Workspace Account
    "},{"location":"13-workspace_accounts/#request-billingid","title":"Request BillingID","text":"

    Note: It can take up to a month to receive a BillingID if you act promptly to complete each step.

    1. Go to https://payments.occ-data.org or click on the Payment Portal link (red arrow below) on the OCC Direct Pay tab for the workspace account request form.

    2. Create an account for the OCC Payment Portal: Under Create Account, enter your email address and click \u201cRequest Token\u201d. Note: This should be the email address you use to log into BRH; Direct Pay is not currently compatible with ORCID login. If you already logged in with ORCID to request a workspace account, please log out and authenticate with either the InCommon or Google option. Please monitor this address, as relevant alerts will be sent here.

3. You will receive an email with a 6-digit token within a couple of minutes of your request. It may go to spam, so watch the Spam folder as well.
    4. Copy your token from your email. Paste it into the Enter Token field on the OCC Payment Portal. Click Sign In. (Note: You will be asked to enter a token at each log-in.)
    5. Successful sign-in will open a Profile page for your account on the OCC Payment Portal. When you first create your account on the payment portal, you will not have any access requests.

    6. Click the \u201crequest access\u201d button. The form shown below will open. Complete the form and click Submit.

For Role, indicate your role within your organization/institution. If you don't have an institutional affiliation, you can put "independent data analyst".

    7. Once the form is submitted, a message will appear indicating successful submission. You will also receive an email (again, check spam).

    8. If you return to the Profile page in the OCC Payment Portal now, you\u2019ll see there is an active request in the table at the bottom. Click \u201cCheck Status\u201d to view progress on the steps toward final approval and provisioning of your request. You can view what happens at each stage of processing here: https://payments.occ-data.org/processing-stages/.
    9. When you click Check Status, you can see the progress of your request. At first, you will see that they are processing your request (indicated by an orange color). Once OCC finishes processing your access request, you will receive 2 emails, and the progress tracker at the bottom will show that Submit Access Request is completed (green). Complete E-Doc is now colored orange. The first email indicates that your access request status has progressed, and the second email has a link to an electronic document.

      Important: Review the Agreement carefully to understand the terms. PLEASE READ THIS DOCUMENT VERY CAREFULLY BEFORE SIGNING! This document presents the terms governing how your Direct Pay funds will be allocated, among other things. Be sure you understand all the terms before you sign and submit. If you have any questions or concerns about the terms and conditions, please email billing@occ-data.org before you sign.

      Once you submit the signed document, it could take up to 5 days to finish processing receipt of the signed document and update your progress tracker. You will receive an email when processing is complete.

However, you will quickly receive an email confirming that the document has been signed and providing a link to download the signed document for your records. If you do not receive that email within 5 minutes (be sure to check your spam folder), please return to the document and verify that you fully signed and submitted it. Please save the document so you can reference it as needed.

    10. When your request has been fully approved, the \u201cReceived Approval\u201d step will be green, you\u2019ll receive an email, and your BillingID field will have been populated on the Profile page of the OCC Payment Portal.

    You may now use your BillingID to provision a Direct Pay workspace account in BRH.

    "},{"location":"13-workspace_accounts/#use-billingid-to-provision-direct-pay-funds-for-the-workspace-account","title":"Use BillingID to provision Direct Pay funds for the Workspace Account","text":"

Note 1: Before you request a workspace account through any persistent pay model (e.g., Direct Pay, STRIDES), be sure to back up all data in your /pd for your workspace. Once your persistent pay model is funded, you will no longer have access to the /pd used during trial access. (What's a /pd?)

    Note 2: It can take up to 12 business days to provision funds from a BillingID.

    1. Copy your BillingID from your User Profile page in the OCC Payment Portal.
    2. Return to the Workspace Account Manager, and click Request New Workspace to open the Workspace Account Request Form.
    3. Click the OCC Direct Pay tab.
4. Paste your BillingID in the field, and enter the first 3 characters of the email address associated with your BillingID. (For example, if your email was john.smith@gmail.com, you would enter "joh".)
    5. Click Confirm BillingID. Once your BillingID is confirmed, the bottom part of the form will open to allow you to enter the details for provisioning your account.
    6. Be sure to check the box that says that you agree to be invoiced. The amount of the invoice is taken from the value you entered when you made your access request on the Payment Portal.

      • Enter a title for your project and a brief summary. This is to be used to help you keep track of your requests in case you have multiple accounts for different projects.
      • Identify whether your workspace use is personal or organizational.
• Indicate whether you have a credit card you are allowed to use to pay for provisioning the workspace account. If your workspace is personal and you have any credit card, the answer will be yes. If your workspace is organizational, make sure you are not using a departmental card or similar without permission.
      • Indicate what role you have as a researcher on this project.

    7. Once you submit this form, you will receive an email with the invoice. (It can take up to 5 business days to be sent.) There will be a secure link in the invoice to submit your credit card information and pay the invoice. When you pay the invoice, OCC will apply the funds, create an AWS account for this project\u2019s workspace, and send that information to BRH to provision your account. This can take up to 7 business days after you have signed the form. You will receive an email when your account is set up and ready to be used in your workspace.

      When you submit this form, you will also see a new entry in the OCC Direct Pay Accounts table in the Workspace Account Manager. The request status for your request will be Pending until the invoice is paid and the account is finalized.

8. Once your Direct Pay request is funded, your workspace will be shown as Active in the Request Status column in the Workspace Account Manager.

    "},{"location":"13-workspace_accounts/#strides-grantaward-funded-workspace-account","title":"STRIDES Grant/Award Funded Workspace Account","text":"

    The STRIDES Grant/Award Funded form can be selected if researchers have received NIH funding (e.g. a grant, contract, cooperative agreement, or other transaction agreement) and intend to use these funds for the BRH workspace account. With this option, the researchers' organization will be responsible for payment.

    Submit the request form. Note that the process of granting access for a workspace account can take 2-4 weeks and users will be notified by email. Following approval, users will see the provisioned workspace account in the BRH Workspace Accounts Manager.

    "},{"location":"13-workspace_accounts/#strides-credits-funded-workspace-account","title":"STRIDES Credits Funded Workspace Account","text":"

Select the STRIDES Credits form to request credits from the NIH STRIDES Initiative for the BRH Workspace account. With this option, once the request is approved, a new account with a set spending limit will be provisioned directly by NIH for usage.

    Submit the request form. Note that the process of granting access for a workspace account can take 2-4 weeks and users will be notified by email. Following approval, users will see the provisioned workspace account in the BRH Workspace Accounts Manager.

    "},{"location":"13-workspace_accounts/#what-is-the-nih-strides-initiative","title":"What is the NIH STRIDES Initiative?","text":"

    The NIH STRIDES initiative (NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability) can provide funding for BRH Workspace Accounts. The NIH STRIDES Initiative enables researchers with NIH grants to cost-effectively leverage the use of cloud environments by partnering with commercial providers, such as Amazon Web Services.

    By leveraging the STRIDES Initiative, NIH and NIH-funded institutions can begin to create a robust, interconnected ecosystem that breaks down silos related to generating, analyzing, and sharing research data.

    NIH-funded researchers with an active NIH award may take advantage of the STRIDES Initiative for their NIH-funded research projects. Eligible researchers include NIH intramural researchers and awardees of NIH contracts, other transaction agreements, grants, cooperative agreements, and other agreements. More information on NIH STRIDES options and how to gain access can be found here.

    "},{"location":"14-lauch_with_persistent_paymodel/","title":"Launch a Workspace with a Persistent Pay Model","text":""},{"location":"14-lauch_with_persistent_paymodel/#launch-a-workspace-with-a-persistent-pay-model","title":"Launch a Workspace with a Persistent Pay Model","text":"

Important note: regardless of your account type, remember to terminate your workspace session when you are done using it to avoid accruing additional compute charges (i.e., "usage"). You can read more about terminating your workspace on our Getting Started in Workspace page.

    You can select which workspace account pay model to use from the Workspace page.

    • Log in and click on Workspace to open the workspace page.

    • Click on Account Information to expand that tab.

    • Under \"Account\" and \"Apply for an account\", you can click the dropdown to select which workspace account you want to use to launch the VM.

      The dropdown will only show workspace accounts that are active.

      When you only have trial access, launching a new VM will always use your Trial Access workspace account.

However, once you have a workspace account funded through a persistent pay model, you will lose the trial access option, and you will only be able to launch a new VM using a funded workspace account.

      Reminder: Each workspace account will have its own /pd for storing data beyond each workspace session. You can download data from one and upload it to another. But note: once you have a funded workspace account, you will no longer have access to the /pd for your trial access workspace account. Download your data from your trial access account /pd before you receive funding in another workspace account.

    Looking for information on how to use the Workspace? Check out our Getting Started page.

    "},{"location":"15-monitor_compute_usage/","title":"Monitor Compute Usage in Persistent Pay Models","text":""},{"location":"15-monitor_compute_usage/#monitor-compute-usage-with-a-persistent-pay-model","title":"Monitor Compute Usage with a Persistent Pay Model","text":""},{"location":"15-monitor_compute_usage/#where-can-you-see-your-usage","title":"Where can you see your usage?","text":"

    You can see and monitor your compute usage in two places:

    1. Workspace Page
    2. Workspace Account Manager

Note that the usage reported in either of these places generally does not update immediately as you use the workspace. AWS reports updated usage about three times per day.

    "},{"location":"15-monitor_compute_usage/#1-monitor-usage-on-the-workspace-page","title":"1. Monitor usage on the Workspace page","text":"

    You can see your total usage for any active paid accounts by logging in to the BRH platform, clicking on Workspace, then clicking on Account Information (in the upper left). You can see your total usage and your spending limit.

    If you have more than one funded workspace account, you can use the dropdown to select different active pay models and view their usage and spending limits.

    "},{"location":"15-monitor_compute_usage/#2-monitor-usage-on-the-workspace-account-manager","title":"2. Monitor usage on the Workspace Account Manager","text":"

    You can see your usage details on the Workspace Account Manager Accounts page.

    After logging in, you will be on the Accounts page. There are different tables for each of the three persistent pay models. Find the table for the persistent pay model account you want to monitor, then look for the active account line in that table.

    The table includes 4 columns relevant for monitoring usage:

    • Total Usage: This column reports on your compute usage for this account as of the most recent AWS report.
    • Compute Purchased (or STRIDES Credits): This is the total purchase amount you have made for this account. For STRIDES Credits accounts, this is the total amount of credit you were awarded.

    The soft and hard limit columns deserve their own section; see below.

    "},{"location":"15-monitor_compute_usage/#hard-and-soft-limits","title":"Hard and Soft Limits","text":"

    You can see your hard and soft limits in the tables on the Workspace Account Manager Accounts page. They each have a column:

    • Soft Limit: This is the limit at which an alert will be triggered for you that you are approaching the spending limit for the account. For STRIDES accounts, users can edit this column to set the alert threshold. For OCC Direct Pay, the soft limit is set in accordance with your Direct Pay User Agreement (signed during Direct Pay BillingID setup).

    • Hard Limit: This is the limit at which a workspace VM will be immediately terminated and the workspace account will be shut down such that no VMs can be launched with it. For STRIDES accounts, users can edit this column to set the threshold at which the account becomes inactive. For OCC Direct Pay, the hard limit is set in accordance with your Direct Pay User Agreement (signed during Direct Pay BillingID setup).

    What happens when you exceed your funding for a workspace account?

    "},{"location":"16-usage_exceeds_funding/","title":"If Compute Usage Exceeds Funding","text":""},{"location":"16-usage_exceeds_funding/#if-compute-usage-exceeds-funding","title":"If Compute Usage Exceeds Funding","text":"

    Coming Soon! We will describe:

• Data protections if your workspace is shut down
    • A terminology guide for shutdown vs terminate
    "},{"location":"17-workspace_faq/","title":"Workspace FAQ","text":""},{"location":"17-workspace_faq/#workspace-faqs","title":"Workspace FAQs","text":""},{"location":"17-workspace_faq/#workspace-questions-any-funding-source","title":"Workspace Questions (any funding source)","text":""},{"location":"17-workspace_faq/#what-happens-if-my-trial-access-or-funding-runs-out","title":"What happens if my trial access or funding runs out?","text":"

    When you have used all of your allotted time (trial access) or funding (persistent pay models), you will no longer be able to launch a workspace. You will not have access to your /pd folder. You will still be able to open the BRH Workspace page, and log in to the Workspace Account Manager. You can request new account funding from the Workspace Account Manager.

    "},{"location":"17-workspace_faq/#is-the-persistent-directory-pd-shared-among-all-the-pay-model-options-will-my-data-be-available-in-all-my-workspace-accounts","title":"Is the persistent directory (/pd) shared among all the pay model options? Will my data be available in all my workspace accounts?","text":"

    No. Currently, the /pd folder is separate for each workspace pay model account. So, the data from your /pd for your trial access account will not automatically be available in the /pd for your Direct Pay workspace account or your STRIDES workspace account.

    "},{"location":"17-workspace_faq/#how-should-i-cite-research-done-in-brh-workspace","title":"How should I cite research done in BRH workspace?","text":"

    Make sure to cite the author of the data used in your research, the repository housing the data, and the BRH platform enabling your access to the data. See details here.

    "},{"location":"17-workspace_faq/#direct-pay-workspace-payment-portal-billingid-questions","title":"Direct Pay workspace, Payment Portal, & BillingID questions","text":""},{"location":"17-workspace_faq/#why-does-my-hard-limit-not-match-the-amount-of-funding-i-purchased","title":"Why does my Hard Limit not match the amount of funding I purchased?","text":"

As described in your Direct Pay User Agreement, some amount of your purchase is set aside for account operational expenses beyond compute costs. If you have further questions about this, please contact billing@occ-data.org.

    "},{"location":"17-workspace_faq/#can-i-get-a-refund-on-any-unused-compute-funding-i-purchased","title":"Can I get a refund on any unused compute funding I purchased?","text":"

    Unfortunately, as described in your Direct Pay User Agreement, OCC cannot offer refunds for unused compute time already purchased.

    "},{"location":"17-workspace_faq/#ive-been-using-my-workspace-all-day-but-my-total-usage-number-hasnt-changed-at-all","title":"I've been using my workspace all day, but my total usage number hasn't changed at all.","text":"

Compute usage is monitored by AWS, and AWS makes updates several times a day. This means that you can use the workspace for a number of hours before the reported usage is updated. Also, the update is sometimes delayed -- that is, a usage update from AWS could still omit usage from the hours preceding it.

    "},{"location":"17-workspace_faq/#what-if-i-dont-pay-the-renewal-invoice-before-my-usage-exceeds-the-hard-limit","title":"What if I don't pay the renewal invoice before my usage exceeds the Hard Limit?","text":"

    If you do not pay the renewal invoice before you reach the Hard Limit, you will lose access to the Direct Pay account /pd, and you will be unable to launch any workspaces from the Direct Pay account. If you still have not renewed funding 2 months (60 days) after reaching your Hard Limit, we may delete the contents of the /pd for that account.

    "},{"location":"17-workspace_faq/#why-is-the-occ-payment-portal-a-separate-account","title":"Why is the OCC Payment Portal a separate account?","text":"

    The OCC payment portal is for our third-party AWS reseller, the Open Commons Consortium (OCC). The site is payments.occ-data.org. This is a separate site and account because we want to protect your financial information; the OCC payment portal site follows more rigorous financial best practices. Please note that neither this site nor OCC as an organization will hold any of your payment information; when you pay an invoice, the actual payment information will go through a separate secure payment processor and is not stored anywhere after the charge is processed.

    "},{"location":"17-workspace_faq/#can-i-change-my-email-address-associated-with-my-occ-payment-portal-account","title":"Can I change my email address associated with my OCC Payment Portal account?","text":"

    No. Unfortunately, you cannot change the email address associated with your BillingID or OCC Payment Portal account. You can create a new OCC Payment Portal account with another email address, but you will need to request a new BillingID for that account.

    "},{"location":"17-workspace_faq/#i-didnt-receive-my-token-for-the-payment-portal-login","title":"I didn\u2019t receive my token for the Payment Portal login.","text":"

    It may take a couple minutes to send, but not more than that. Check your spam folder. If you don't see it there, you can request another token by clicking the Request Token button again.

    "},{"location":"17-workspace_faq/#my-token-for-the-occ-payment-portal-expired","title":"My token for the OCC Payment Portal expired","text":"

    You can request another token by clicking Request Token again.

    "},{"location":"17-workspace_faq/#how-can-i-find-the-terms-for-the-direct-pay-billing-agreement","title":"How can I find the terms for the Direct Pay billing agreement?","text":"

    When that document is available, it will be posted on the OCC Direct Pay site. We will link to it when available.

    "},{"location":"17-workspace_faq/#at-what-point-am-i-actually-paying-for-workspace-funds-what-step-does-the-actual-charge-take-place","title":"At what point am I actually paying for workspace funds - what step does the actual charge take place?","text":"

    You designate how much you want to purchase when you initially request your BillingID on the OCC Payment Portal. However, the invoice for this amount is not actually sent until after you have done both of the following:

    1. receive a BillingID on the OCC Payment Portal
    2. use this BillingID to request a Direct-Pay-funded workspace account in the Workspace Account Manager

Once you have completed both of these, you will be sent an invoice; the actual charge takes place when you pay this invoice by submitting your credit card number.

    "},{"location":"17-workspace_faq/#what-credit-cards-can-be-used-to-pay-for-direct-pay","title":"What credit cards can be used to pay for Direct Pay?","text":"

    We accept all major credit cards, including Visa, MasterCard, Discover, and American Express.

    We also accept ACH electronic payments from a bank account, Apple Pay, PayPal, and Venmo.

    "},{"location":"17-workspace_faq/#what-are-the-different-websites-involved-in-purchasing-and-using-direct-pay-for-workspaces","title":"What are the different websites involved in purchasing and using Direct Pay for workspaces?","text":"

    BRH Platform: This is the main BRH site, with all the BRH data and the Workspace page. The site is brh.data-commons.org, and you can reach the Workspace page by clicking on the Workspace button at the top of the portal.

    BRH Workspace Account Manager: This is where you can request or view details of your Workspace accounts funded with persistent pay models. The site is brh-portal.org, and you can also reach this site from the BRH Workspace page by clicking on the Workspace Account Manager link at the top left corner, in the Account Information section. This page will be built into the BRH Platform Workspace page in a future update (i.e., it won't be a separate page).

    OCC Payment Portal: This is the payment portal site for our third-party AWS reseller, the Open Commons Consortium (OCC). Here, you can request a BillingID, specify the amount of Direct Pay funding you will want on your Direct Pay account, and track the status of your BillingID request. The site is payments.occ-data.org. Note: Neither this site nor OCC will hold any of your payment information; when you pay an invoice, the actual payment information will go through a separate secure payment processor and is not stored anywhere after the charge is processed.

    "},{"location":"17-workspace_faq/#strides-questions","title":"STRIDES Questions","text":""},{"location":"17-workspace_faq/#what-is-the-strides-program","title":"What is the STRIDES Program?","text":"

    The NIH STRIDES Initiative is a program to help NIH-funded researchers accelerate biomedical research by reducing barriers to utilizing for-fee cloud services, like the BRH Workspace.

    The STRIDES program gives cost discounts and a host of other benefits to researchers with NIH grants and contracts.

    "},{"location":"17-workspace_faq/#what-are-the-benefits-for-using-the-strides-program","title":"What are the benefits for using the STRIDES program?","text":"

    STRIDES program benefits include:

    • Cost discounts on AWS services (e.g., compute, storage, and egress fees)
    • AWS Enterprise Support
    • Training and education programs
    • and more! See the STRIDES program benefits page for more information
    "},{"location":"17-workspace_faq/#who-is-eligible-for-using-the-strides-program","title":"Who is eligible for using the STRIDES program?","text":"

    Anyone with any NIH funding, for any NIH award type, is eligible for the benefits of the STRIDES Initiative.

    "},{"location":"17-workspace_faq/#what-is-the-difference-between-strides-credit-and-strides-grantaward-funded","title":"What is the difference between STRIDES Credit and STRIDES Grant/Award Funded?","text":"

    The main difference is who is responsible for paying to provision the Workspace account.

STRIDES Grant/Award Funded: This is the most common STRIDES account model. Here, the Workspace account is provisioned by the organization receiving the NIH grant funds (generally a PI's institution). Researchers who have received NIH funding (e.g., a grant, contract, cooperative agreement, or other transaction agreement) can use these funds for the BRH workspace account. With this option, the researchers' organization will be responsible for payment.

STRIDES Credits: This is less common. Here, the payment for provisioning the Workspace account is made directly by NIH. Generally, this is discussed with NIH ahead of submitting the request. With this option, once the request is approved, a new account with a set spending limit will be provisioned directly by NIH for usage.

    "},{"location":"nextflow-create-docker/","title":"Nextflow - Create Dockerfile","text":""},{"location":"nextflow-create-docker/#create-a-dockerfile","title":"Create a Dockerfile","text":""},{"location":"nextflow-create-docker/#overview","title":"Overview","text":"

    This guide is for users who want to build Docker containers for use in Gen3 workspaces.

    "},{"location":"nextflow-create-docker/#prerequisites","title":"Prerequisites","text":"
    • Docker installed on your local machine
    • Clone or download the bio-nextflow repo
    "},{"location":"nextflow-create-docker/#start-with-a-security-validated-base-image","title":"Start with a security-validated base image","text":"

    Gen3 offers a collection of FedRAMP security-compliant base images. We re-assess these base images regularly for security compliance. Building on these base images makes it easier for your customized Docker image to pass the security scanning.

    You can access the URLs to pull these images using Docker here:

    https://github.com/uc-cdis/containers/blob/eec9789a57c5bb196a91f035e4cb069cfaa5abcd/nextflow-base-images/allowed_base_images.txt

    "},{"location":"nextflow-create-docker/#how-to-choose-your-base-image","title":"How to choose your base image","text":"

    GPU vs. CPU

    Not sure what these are? Here's a nice overview.

    In our BRH Workspace, we offer workspace images for CPU and GPU tools. You can read more about this on our Getting Started page. Choose the appropriate workspace image (CPU or GPU) for your Docker image and tools.

    GPU images

    We have 3 base images in our current selection that offer CUDA support for running on GPUs -- these have \"cuda\" in the image name, followed by the CUDA version. When possible, please choose the latest version of CUDA compatible with your tools.

    gen3-cuda-12.3-ubuntu22.04-openssl (preferred)

    gen3-cuda-12.3-torch2.2-ubuntu22.04-openssl (also preferred)

    gen3-cuda-11.8-ubuntu22.04-openssl (only use if your tools require a lower version of CUDA)

    CPU images

    We have one base image that is available for running workflows on CPUs.

    amazonlinux-base

    "},{"location":"nextflow-create-docker/#test-pulling-the-docker-image","title":"Test pulling the Docker image","text":"

Before you proceed with using this URL in your Dockerfile, you want to make sure you can pull the image. You can verify this by running the docker pull command in your terminal while Docker is running.

    First, open your Docker Desktop application (just to be sure Docker is running).

    Next, open your terminal. Run docker pull <image URL>, where the image URL is the full line as displayed in the file of security-validated base images. If it's working, you will see language that it is pulling (see below). When it's complete (and successfully pulled), there will be a line that says Status: Downloaded <image> (see yellow highlight below). If you see this, you know that all the steps necessary to pull your image work. If you don't see this, reach out to us on Slack.
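
For example, the command has this shape (the URL below is a placeholder -- paste the real, full line from the allowed base images file):

docker pull <image URL from allowed_base_images.txt>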

    "},{"location":"nextflow-create-docker/#test-using-docker-scout-to-evaluate-image-vulnerabilities","title":"Test using Docker Scout to evaluate image vulnerabilities","text":"

    At the end of your test fetch, Docker offers a suggestion to use Docker Scout to examine your image for vulnerabilities (see red box above). We have already evaluated the security compliance for our image, so it's not necessary for security here. However, since you will want to use Docker Scout to evaluate your custom build later, now is a convenient time to test this tool and make sure you are fully set up to run Docker Scout.

    Note: If you don't seem to have access to Docker Scout, check whether you're using the latest Docker version.

    "},{"location":"nextflow-create-docker/#run-docker-scout","title":"Run Docker Scout","text":"

    To run Docker Scout, you must:

    • have Docker running (for example, the desktop application open)
    • be signed in to Docker (in the desktop application, there is a Sign In button in the upper right corner)
    • have created a Docker account (when you sign in for the first time, you will be asked to create an account).

    Once you are signed in to Docker, you can run the command they suggest after pulling an image (for example, see the command in blue text in the red box above, docker scout quickview <image URL>). If the command runs successfully, you should see output similar to the screenshot below. This is a summary of the vulnerabilities in your image.

    You can run the next suggested command (shown in red box above, docker scout cves...) to see the full list of vulnerabilities.
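
For reference, both Docker Scout commands have the same shape; for the base image you just pulled, they would look like this (the image URL is a placeholder):

docker scout quickview <image URL>
docker scout cves <image URL>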

    Images should be able to pass Gen3 security scanning if there are no Critical vulnerabilities.

    Want to know more about Docker Scout? Check out the documentation.

    "},{"location":"nextflow-create-docker/#build-your-image-locally-on-top-of-the-base-image","title":"Build your image locally on top of the base image","text":"

    To build your own image, you need to create a Dockerfile. To build your image on a base image, the first line of the Dockerfile should reference the base image tag. The Dockerfile you create typically lives in the Git repository where you have your code to make it easier to copy into your container.

    "},{"location":"nextflow-create-docker/#unfamiliar-with-creating-dockerfiles","title":"Unfamiliar with creating Dockerfiles?","text":"

    If you are unfamiliar with creating Dockerfiles, we encourage you to explore the excellent tutorial here, as well as review the Dockerfile documentation here, before you proceed. We have identified only key highlights below.

    One reminder: Dockerfiles are typically named just \"Dockerfile\" - with a capital D and no file extension.

    "},{"location":"nextflow-create-docker/#example-build-an-image-with-a-dockerfile-and-a-requirementstxt","title":"Example: Build an image with a Dockerfile and a requirements.txt","text":"

    In our example here, we will have you build your image using a requirements.txt to identify the software tools you want to add to the base image, as well as a Dockerfile that pulls in the base image, adds the software tools specified in the requirements file, copies relevant code files, and establishes some setup parameters.

Our example will use the files in the torch_cuda_test directory of the bio-nextflow repository. You can review the README file in this directory for more information. It is a simple example that builds up from our base image by adding PyTorch. The Nextflow script will ultimately use a Python script that checks the version of CUDA in the GPU instance and checks whether it is compatible with the version of PyTorch and CUDA available in the container.

First, in the terminal, navigate to the directory where you cloned the bio-nextflow repository (see Prerequisites section). Next, navigate to where the downloaded Dockerfile and requirements.txt are located:

    cd bio-nextflow/nextflow_notebooks/containerized_gpu_workflows/torch_cuda_test

    If you open the Dockerfile, note that the first line of the Dockerfile references the URL for one of our GPU base images. This is always how you will reference a base image -- with FROM and the URL.
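
To illustrate the overall shape, here is a simplified sketch, not the exact Dockerfile from the repo -- the base image URL must be replaced with a real URL from the allowed base images file, and the script name below is hypothetical:

# start from a security-validated base image, referenced by its full URL
FROM <base image URL from allowed_base_images.txt>

# install the Python packages listed in requirements.txt
# (assumes pip is available in the base image)
COPY requirements.txt .
RUN pip install -r requirements.txt

# copy in the script(s) your workflow will run
COPY my_script.py .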

    Then, run the Docker build command. For example:

    docker build . -t my_docker

will build a Docker image from the Dockerfile in the current directory and tag the resulting image my_docker.

    "},{"location":"nextflow-create-docker/#example-examine-built-image-for-vulnerabilities","title":"Example: Examine built image for vulnerabilities","text":"

    You now have a new Docker image built upon our security-compliant base image. To more rapidly identify and address any security concerns in your customized image, we encourage all users to locally scan their image for vulnerabilities using Docker Scout, as described in our test above. Here, we have tagged our new image with my_docker. So, we would run the Docker Scout quickview command on the image using this command:

    docker scout quickview my_docker

    And to identify the specific vulnerabilities and recommendations, you would run:

    docker scout cves my_docker

    "},{"location":"nextflow-create-docker/#my-image-passes-the-local-security-scanning","title":"My image passes the local security scanning","text":"

    Once your custom image is security-compliant based on the analysis from Docker Scout, you are ready to request credentials to submit your Docker image for Gen3 security scanning.

    Continue to Request Credentials

    "},{"location":"nextflow-getting-started/","title":"Nextflow - Getting Started","text":""},{"location":"nextflow-getting-started/#getting-started-with-workflows-on-gen3","title":"Getting started with workflows on Gen3","text":"

Please note: Nextflow features are only available to users with a Direct Pay workspace account. See our documentation for persistent pay models to learn more about getting a Direct Pay workspace account.

    "},{"location":"nextflow-getting-started/#background","title":"Background","text":""},{"location":"nextflow-getting-started/#what-is-gen3","title":"What is Gen3?","text":"

    The Gen3 platform consists of open-source software services that make up data commons and data ecosystems (also called meshes or fabrics). A data commons is a platform that co-locates both data and compute resources so researchers can bring algorithms to the data. Data ecosystems or meshes are systems that researchers can use to search and query across multiple data commons in one location.

    More information about Gen3 can be found here. A list of data platforms using the Gen3 technology can be found here.

    "},{"location":"nextflow-getting-started/#what-are-workflows","title":"What are workflows?","text":"

A workflow is a computational pipeline that consists of a series of steps to be executed. It can run using a software container: a standalone, self-contained piece of software containing all the executables needed for the workflow.

Many workflow languages have been developed in recent years. Common examples include Common Workflow Language (CWL), Workflow Description Language (WDL), and Nextflow. We will be using Nextflow for our exercises.

    "},{"location":"nextflow-getting-started/#workflow-execution-in-gen3","title":"Workflow execution in Gen3","text":"

Gen3 is based on Kubernetes and is container-based. A container is a standalone, self-contained collection of software that contains specific software you may need for your application (e.g., Pydicom/DICOM, NumPy, SciPy). We are testing a new workflow execution system in Gen3 that researchers can use to run containers on the cloud for various applications in a secure and isolated manner. We developed an isolation process so that each user's workflow is separate from other users' workflows, from the Gen3 core system, and from Gen3 data, except when approved and required for the specific task. The testing and development of workflows is currently underway in the Biomedical Research Hub (BRH), one of the first data ecosystems (or meshes) built at CTDS.

    "},{"location":"nextflow-getting-started/#what-is-nextflow-what-is-aws-batch","title":"What is Nextflow? What is AWS Batch?","text":"

    The workflow execution in Gen3 is powered using Nextflow, a framework for writing data-driven computational pipelines using software containers. It is a very popular and convenient framework for specifying containers, inputs and outputs, and running jobs on the cloud. Researchers have used Nextflow for several years, and 2023 has continued to see a rapid gain in its popularity per a recent survey. The scalability of workflows in Gen3 comes from AWS Batch, an AWS service capable of running compute jobs over large datasets on the Cloud.

    "},{"location":"nextflow-getting-started/#steps-to-run-workflows-in-gen3","title":"Steps to run workflows in Gen3","text":"

    To run workflows in Gen3, you will need the following:

    • Access to the BRH workspace (covered on this page)
    • A funded workspace account (covered on this page)
    • A Docker image uploaded to an ECR created for you (start with Create Dockerfile)

    Depending on your specific workflows, you may also need additional tools, resources, or access.

    "},{"location":"nextflow-getting-started/#get-access-to-the-brh-workspace-and-set-up-a-funded-account","title":"Get access to the BRH workspace and set up a funded account","text":""},{"location":"nextflow-getting-started/#1-request-access-to-the-brh-workspace","title":"1) Request access to the BRH workspace","text":"

    The BRH exposes a computational workspace that researchers can use to run simple Jupyter notebooks and submit workflows. To submit workflow jobs, you need access to the BRH workspace.

    Follow these instructions to request trial access to the BRH workspace. After you have submitted your request, please ping @Sara Volk de Garcia in Slack to alert her to look for your request and approve it.

    "},{"location":"nextflow-getting-started/#2-establish-a-workspace-account-with-a-persistent-pay-model-in-brh","title":"2) Establish a workspace account with a persistent pay model in BRH","text":"

When you are initially granted workspace access in BRH, it is trial access that is free for the user (paid by CTDS). However, the trial access pay model does not permit access to the Nextflow image. To gain access to the Nextflow image needed for testing, you must request a workspace account with a persistent pay model, so that the cost of compute jobs in your project can accrue to the right account. BRH currently supports several persistent pay models, such as NIH STRIDES (payment through grant funds) and Direct Pay (credit card payment). If you're curious, see here for more information about pay models.

    For MIDRC, we have already established a Direct-Pay-type* of workspace account for testing. When you receive workspace access, Sara will work with the Nextflow team to add a Direct Pay account to your workspace.

    * Note about this Direct-Pay-type of account: It is not an ACTUAL Direct Pay account, and it does not go through the normal Direct Pay account route, nor through OCC, at all. It is funded with MIDRC contract funds, but will be labeled Direct Pay in your workspace.

    "},{"location":"nextflow-getting-started/#3-launch-a-workspace-with-the-persistent-paymodel","title":"3) Launch a workspace with the persistent paymodel","text":"

Once you have been notified that you have a workspace account provisioned with persistent pay model funds, you can proceed.

    • Log in to BRH and open the workspace page.
    • In the dropdown under \"Account\" in top left, select \"Direct Pay\" as your paymodel (#1 in screenshot below).
    • Once you select the Direct Pay workspace account, you should see a new option for workspace image: \"(Beta) Nextflow with CPU instances\"
    • Click the Launch button for this Nextflow workspace image (#2).
    • When you click the button, the workspace will begin to launch. This can take 5-10 minutes. You will know you successfully started the launch because you will see 3 animated dots rippling in the Launch button (see yellow highlight).

      If it takes longer than 10 minutes, try refreshing the screen and re-trying the launch. If it seems to stall out (longer than 10 min again), or if you get an error, reach out to CTDS staff through the Slack channel (but don't close the tab with the launch).

    "},{"location":"nextflow-getting-started/#quick-orientation-to-the-the-workspace","title":"Quick orientation to the the workspace","text":"

    Before using the workspace, we strongly encourage you to review the BRH Workspace documentation.

    There are several key points we want you to be aware of:

    "},{"location":"nextflow-getting-started/#store-all-data-in-the-persistent-directory-pd","title":"Store all data in the persistent directory (/pd)","text":"

    Store all files you want to keep after the workspace closes in the /pd directory; only files saved in the /pd directory will persist. Any personal files in the folder data will be lost.

    "},{"location":"nextflow-getting-started/#automated-shutdown-for-idle-workspaces","title":"Automated shutdown for idle workspaces","text":"

    Workspaces will automatically be shut down (and all workflows terminated) after 90 minutes of idle time.

    "},{"location":"nextflow-getting-started/#gpu-vs-cpu-nextflow-workspace-images","title":"GPU vs CPU Nextflow workspace images","text":"

    As you can see in the screenshot above, there are 2 Nextflow workspace images: A CPU image and a GPU image. If your workflow requires GPU (e.g., deep learning or other AI/ML models), please use the GPU instance; otherwise, use CPU. You can read more about CPU and GPU options in Gen3 Nextflow here.

    Continue to Overview of Containers in Gen3

    "},{"location":"nextflow-overview-containers/","title":"Nextflow - Containers in Gen3","text":""},{"location":"nextflow-overview-containers/#overview-developing-and-deploying-containers-in-gen3","title":"Overview: Developing and Deploying Containers in Gen3","text":"

    Locally build and test container:

    Gen3 provides several FedRAMP security-compliant base images that users can pull and customize.

    Request credentials and push container to Gen3 staging:

    Users can email Gen3 to request short-term credentials that permit them to authenticate Docker in their terminal to upload the local Docker image to a Gen3 staging repo for security review.

    Container is security-scanned; Gen3 sends approved container URI:

Gen3 completes the security scan of the container. Typically, the scanning completes within a couple of hours; however, it takes longer for larger images with more layers. If the image is security-compliant, it is moved to an "approved" ECR repo from which the container can be run, and Gen3 staff will send a container URI to the user for use in Nextflow workflows.

    If there are problems that make the image non-compliant with security requirements, a report of the vulnerabilities is provided to the user for remediation and resubmission. Users are responsible for resolving image vulnerabilities and resubmitting for scanning.

    Run workflow using approved container URI:

In the BRH workspace, use a Nextflow Jupyter notebook to run Nextflow workflows in the approved container using the approved container URI. Some example notebooks can be found here, and specific examples that use an approved image URI can be found here and here.
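
To give a concrete sense of where the approved URI goes: in a Nextflow configuration, the container directive points at it. A minimal sketch (the URI placeholder stands in for whatever Gen3 staff send you):

// nextflow.config sketch; the URI below is a placeholder
process {
    container = '<approved container URI from Gen3>'
}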

    Continue to Create Dockerfile

    "},{"location":"nextflow-request-creds/","title":"Nextflow - Request Credentials","text":""},{"location":"nextflow-request-creds/#request-credentials-for-uploading-a-docker-container","title":"Request Credentials for Uploading a Docker Container","text":"

    Please copy and paste the email template below into a new email and send to brhsupport@datacommons.io. Please be sure to add the relevant information to the bolded fields.

    Hello, User Services,

    Please create new temporary AWS credentials to permit me to upload a Nextflow container.

    The email address or ORCID I use to log in to BRH is: [BRH login email here]

    I understand that these credentials will last for 1 hour, once created. If I continue to need access to upload after they expire, I will request new credentials.

Since the credentials will ONLY last 1 hour after creation, you may prefer we send them at a certain time of day. Please delete whichever of these do NOT apply:

    • Please generate and send my credentials tomorrow morning
    • Please generate and send my credentials in the afternoon
    • Please generate and send my credentials ASAP

    To ensure prompt attention, I will also ping @Sara Volk de Garcia on the Slack channel after I have sent my email.

    Thanks!

    [your name]

    Please note: If you receive credentials but you are not able to successfully upload an image before they expire, please ping @Sara on Slack to let her know she does not need to monitor your submitted image.

    Continue to Upload Docker Image

    "},{"location":"nextflow-tutorial-workflows/","title":"Nextflow - Tutorials Workflows","text":""},{"location":"nextflow-tutorial-workflows/#tutorial-nextflow-workflows","title":"Tutorial Nextflow Workflows","text":"

    We have a collection of notebooks using Nextflow in Gen3 here: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks

    Before you start executing any tutorial notebooks, review the information in the Get Set section.

    "},{"location":"nextflow-tutorial-workflows/#get-set-download-necessary-credentials-and-software","title":"Get set: Download necessary credentials and software","text":"

    Be ready to execute the tutorial workflows below by gathering credentials and installing necessary software.

    "},{"location":"nextflow-tutorial-workflows/#get-and-replace-placeholder-values-from-the-nextflow-config","title":"Get and replace placeholder values from the Nextflow config","text":"

You can find the values to replace the placeholders in the queue, jobRole, and workDir fields in the nextflow.config file in your Nextflow workspace. Directions for finding this file are at the bottom of the "Welcome to Nextflow" page that appears when your Nextflow workspace first opens. These placeholder values will need to be replaced in each of the various tutorial Nextflow notebooks.

Note that you should only copy/paste the value that replaces the placeholder for each field; do not copy/paste larger sections of the nextflow.config, or there could be indentation problems that interfere with the code.
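
For orientation, the relevant parts of a nextflow.config for AWS Batch generally look like the sketch below. This is an illustration of the general shape only, not a copy of the file in your workspace; always copy the real values from your own nextflow.config:

// all values below are placeholders -- copy the real ones from your workspace's nextflow.config
process {
    executor = 'awsbatch'
    queue = '<your queue value>'
}
aws {
    batch {
        jobRole = '<your jobRole value>'
    }
}
workDir = '<your workDir value>'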

    "},{"location":"nextflow-tutorial-workflows/#midrc-credentials","title":"MIDRC credentials","text":"

To download GUIDs in the workspace, you will first need to generate MIDRC credentials on the Profile Page of the MIDRC portal. To do this, go to data.midrc.org, click on the user icon in the right corner (#1), and open the Profile Page (#2). Click on Create API Key (#3). A pop-up window will appear with the key. If you scroll down slightly, you can see the button to download the credentials as a JSON file. Credentials are valid for 1 month.

    "},{"location":"nextflow-tutorial-workflows/#example-nextflow-notebooks","title":"Example Nextflow notebooks","text":""},{"location":"nextflow-tutorial-workflows/#notebooks-with-no-containers","title":"Notebooks with no containers","text":"

    There are several general Nextflow notebooks that do not use containers at all. If you're new to Nextflow and just want to get started with workflow commands, try these notebooks: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks/non_containerized_nextflow_workflows

    "},{"location":"nextflow-tutorial-workflows/#notebooks-using-containers","title":"Notebooks using containers","text":"

    We have several containerized notebooks that use CPU, and others that use GPU.

    "},{"location":"nextflow-tutorial-workflows/#notebooks-using-cpu","title":"Notebooks using CPU","text":"

    You can find the directory with containerized notebooks using CPU here: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks/containerized_cpu_workflows

    Some use cases in this directory include:

    • Cancer use case: The chip_cancer example tutorial here includes a Dockerfile, a requirements file, a Nextflow notebook, a collection of Python scripts, and a README.
    • DICOM metadata extraction use case: The midrc_batch_demo example tutorial here includes a Dockerfile, a requirements file, two Nextflow notebooks, a collection of Python scripts, and a README.
    "},{"location":"nextflow-tutorial-workflows/#notebooks-using-gpu","title":"Notebooks using GPU","text":"

    You can find the directory with containerized notebooks using GPU here: https://github.com/uc-cdis/bio-nextflow/tree/master/nextflow_notebooks/containerized_gpu_workflows

    Some use cases in this directory include:

    • PyTorch/CUDA simple test use case: The torch_cuda_test example tutorial here includes a Dockerfile, a requirements file, a Nextflow notebook, a simple Python script, and a README.
    • COVID Challenge 2022 use case: The covid_challenge_container example tutorial here includes a Dockerfile, a requirements file, a Nextflow notebook, a Python script, and a README.
    "},{"location":"nextflow-upload-docker/","title":"Nextflow - Upload Docker Image","text":""},{"location":"nextflow-upload-docker/#pushing-docker-images-to-aws-ecr","title":"Pushing Docker Images to AWS ECR","text":""},{"location":"nextflow-upload-docker/#overview","title":"Overview","text":"

    This guide is for users who have received temporary credentials granting access to push container images to a specific AWS Elastic Container Registry (ECR) repository.

    "},{"location":"nextflow-upload-docker/#prerequisites","title":"Prerequisites","text":"
    • Docker installed on your local machine.
    • AWS CLI installed on your local machine.
    • Temporary AWS credentials provided by the User Services team.
    • The URI of the ECR repository you have been given access to (shared with the AWS credentials).
    • An image you want to push, or a Dockerfile to build your image.
    "},{"location":"nextflow-upload-docker/#a-note-about-timing","title":"A note about timing","text":"

    Your temporary AWS credentials only last for 1 hour from when they were created; User Services should have provided an expiration time when sharing the credentials with you. You must fully complete the push to ECR before they expire, or you will need to request new credentials from User Services.

    If you do not complete pushing an image to the ECR before your credentials expire, please ping @Sara Volk de Garcia in Slack so she knows not to monitor for an image to progress through scanning.

    "},{"location":"nextflow-upload-docker/#a-note-about-security-and-expiration-of-approved-docker-images","title":"A note about security and expiration of approved Docker images","text":"

    Because of the ever-updating nature of vulnerability detection, an image that has passed in the past is not guaranteed to always pass. Even if you are resubmitting an image that passed previously, newly reported vulnerabilities may mean the image does not pass now. For the most efficient submission, best practice is to always examine an image with Docker Scout before pushing it.

    Similarly, because new vulnerabilities are always emerging, to protect the security of the Gen3 Workspace, approved containers will only remain available in the approved repo for 30 days. However, users can always request new credentials and resubmit their image for scanning.

    "},{"location":"nextflow-upload-docker/#set-aws-environment-variables","title":"Set AWS environment variables:","text":"

    The commands in this section are valid if you are using Linux or macOS. If you are using Windows, we will provide a separate set of commands for you to set the AWS environment variables.

    Before you can push your Docker image to the ECR repository, you need to configure the AWS CLI with the temporary credentials you received. The credentials sent to you should include the commands you need, below the line \"Please run the following commands to set your AWS credentials:\". Copy those commands (they will look similar to the block below) and run them in the terminal.

      export AWS_ACCESS_KEY_ID=<AccessKeyId>\n  export AWS_SECRET_ACCESS_KEY=<SecretAccessKey>\n  export AWS_SESSION_TOKEN=<SessionToken>\n

    Note: the variables are set only as long as the terminal is open; export the variables again if you close it and open a new terminal.
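
    If you want to double-check that the variables are present in your current shell (for example, after switching terminals), a quick listing like the one below should show all three; this is merely a convenience check, not a required step.

     env | grep '^AWS_'\n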

    "},{"location":"nextflow-upload-docker/#verify-configuration","title":"Verify configuration:","text":"

    Run aws sts get-caller-identity to verify that your CLI is using the temporary credentials. If you successfully set the variables, you should see output showing your AWS identity: UserId, Account, and Arn.
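
    For reference, a successful response looks roughly like the block below; the identifiers shown here are made up, and your values will differ.

     $ aws sts get-caller-identity\n {\n     \"UserId\": \"AROAEXAMPLEID:your-session\",\n     \"Account\": \"123456789012\",\n     \"Arn\": \"arn:aws:sts::123456789012:assumed-role/example-role/your-session\"\n }\n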

    "},{"location":"nextflow-upload-docker/#authenticate-docker-to-ecr","title":"Authenticate Docker to ECR","text":"

    Next, use the AWS CLI to retrieve an authentication token and authenticate your Docker client to your registry. In the credentials, there is a command below the line \"After setting credentials you will need to log in to your docker registry. Please run the following command:\". Copy it (it will look similar to the command below) and run it in the terminal.

     aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <repositoryUri>\n
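
    As a worked example with hypothetical values filled in (your credentials file contains the actual region and registry host), the command might look like this; a \"Login Succeeded\" message indicates the authentication worked.

     aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 143731057154.dkr.ecr.us-east-1.amazonaws.com\n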
    "},{"location":"nextflow-upload-docker/#preparing-to-push-your-docker-image","title":"Preparing to push your Docker image","text":"

    The specific steps you use to prepare to push your image depend on whether you have an image already built or whether you need to build from a Dockerfile.

    "},{"location":"nextflow-upload-docker/#if-you-already-built-a-local-docker-image-tag-your-docker-image","title":"If you already built a local Docker image: Tag your Docker image","text":"

    If you already have a locally-built Docker image, you do not need to run the docker build command included in the credentials. However, you do need to tag the image with the ECR repository URI and the image tag you want to use. This command is not in the credentials file.

     docker tag <local-image>:<local-tag> <repositoryUri>:<image-tag>\n

    Replace <local-image> with the name of your local Docker image and <local-tag> with the tag you want to push.

    If you're not sure what your local image and tag names are, run docker images in your terminal. It will list all your saved Docker images: the REPOSITORY column is the local image name, and the TAG column is the local tag name for that image. Note: the image-tag you select will travel with your image to the approved repo, so select a tag you are comfortable with.
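
    For example, the output of docker images looks something like this (the image name here is hypothetical):

     $ docker images\n REPOSITORY      TAG       IMAGE ID       CREATED        SIZE\n my-analysis     latest    3f2c1a9b8d7e   2 hours ago    1.2GB\n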

    Replace <repositoryUri> with the ECR repository URI provided at the top of the credentials file, and <image-tag> with the image tag name you want to use in your ECR.

    Important note: If you do not want the most-recently-pushed image to replace an earlier version with the same tag in your ECR, image-tags must be unique. For example, suppose you push an image with the image-tag batch-poc. If you later push another image to <repositoryUri>:batch-poc, it will overwrite the previous version of the image in your ECR (you will only have 1 container with the image tag \"batch-poc\"). If you do not want to overwrite, use versioned image-tags, for example batch-poc-1.0 and then batch-poc-1.1. If you do want to replace previous versions of your container, you can reuse the same image-tag.
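
    Putting this together, a complete tag command might look like the sketch below, where my-analysis:latest is a hypothetical local image and batch-poc-1.0 is a versioned image-tag chosen for the ECR:

     docker tag my-analysis:latest <repositoryUri>:batch-poc-1.0\n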

    "},{"location":"nextflow-upload-docker/#if-you-need-to-build-your-docker-image","title":"If you need to build your Docker image:","text":"

    If you haven't already built your Docker image, you can use the docker build command included in your credentials, similar to what is shown below. Note that you should run this command from the directory holding your Dockerfile; the trailing dot tells Docker to use the current directory as the build context. You will need to replace the <tag> in your command with the image tag name you want to use in your ECR. (Read more about image tags in the previous section.)

     docker build -t <repositoryUri>:<tag> .\n

    If you use this docker build command from your credentials, you do not need to use the docker tag command (described in the previous section).
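
    As a sketch, reusing the hypothetical versioned tag from the previous section, the filled-in build command (run from the directory containing your Dockerfile) would be:

     docker build -t <repositoryUri>:batch-poc-1.0 .\n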

    "},{"location":"nextflow-upload-docker/#push-the-docker-image-to-the-ecr","title":"Push the Docker image to the ECR","text":"

    Push the tagged image to the ECR repository. The docker push command is also in the credentials; you just need to specify the image tag you selected when tagging or building the image in the previous sections.

     docker push <repositoryUri>:<image-tag>\n

    If the push is successful, you will see output for each layer (\"layer 1\", \"layer 2\", etc.) along with progress indicators. The push can take several minutes, depending on how large your container is.
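
    Successful output looks roughly like the block below; the layer hashes and digest are illustrative placeholders, and the number of layers depends on your image.

     $ docker push <repositoryUri>:batch-poc-1.0\n The push refers to repository [<repositoryUri>]\n 5f70bf18a086: Pushed\n 8d3ac3489996: Pushed\n batch-poc-1.0: digest: sha256:<digest> size: 1573\n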

    If the push fails, you will get a persistent message about \"Waiting for layer\". This usually means Docker cannot find the repository, so double-check that there is no typo in the URI and that you have set your AWS environment variables since you most recently opened the terminal.

    "},{"location":"nextflow-upload-docker/#completion","title":"Completion","text":"

    Once the push completes, your Docker image will be available in the ECR repository (although you will not be able to see it). It will be scanned, and if it passes the security scanning, CTDS will move it to the nextflow-approved repo. When it is available in nextflow-approved, User Services will share a Docker URI that looks something like this: 143731057154.dkr.ecr.us-east-1.amazonaws.com/nextflow-approved/<your username>:<image-tag>. You can then use this new URI to run Nextflow workflows with your container in the BRH workspace. (Note that you need to copy the whole URI into the container field of the Nextflow notebook, as described in the next section.)

    "},{"location":"nextflow-upload-docker/#how-to-use-an-approved-docker-uri","title":"How to use an approved Docker URI","text":"

    Once you have your Docker URI, you are ready to run your Nextflow workflow! Copy the entire Docker URI and make it the value of the \"container\" field(s) in your Nextflow notebook. For example, in the torch_cuda_batch Nextflow notebook, you would go to the nextflow.config section and replace the placeholder value for container with the approved Docker URI.

    Please note that you will need to replace all placeholder values in the nextflow.config with values specific to your workspace. Please see the section \"Get and replace placeholder values from the Nextflow config\" on the Tutorials page for more information.
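
    For instance, using the approved-repo URI pattern shown in the previous section, the container setting in the nextflow.config section of your notebook might look like this sketch (the bracketed parts are placeholders, not real values):

     process {\n     container = '143731057154.dkr.ecr.us-east-1.amazonaws.com/nextflow-approved/<your username>:<image-tag>'\n }\n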

    "},{"location":"nextflow-upload-docker/#support","title":"Support","text":"

    If you encounter any issues or require assistance, please contact the User Services team that provided you with the temporary credentials, email brhsupport@datacommons.io, or reach out on Slack. (Slack will get the quickest reply.)

    Continue to Tutorial Workflows

    "}]} \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml index 55f09c0..dcfd87b 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -2,122 +2,122 @@ https://brh.data-commons.org/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/01-home/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/02-types_of_shared_data/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/03-data_and_repos/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/04-BRH_overview/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/05-workspace_registration/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/06-loginoverview/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/07-how_to_check_request_access/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/08-discovery_page/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/09-workspace_page/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/10-profile_page/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/11-downloading_data_files/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/12-contact/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/13-workspace_accounts/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/14-lauch_with_persistent_paymodel/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/15-monitor_compute_usage/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/16-usage_exceeds_funding/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/17-workspace_faq/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/nextflow-create-docker/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/nextflow-getting-started/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/nextflow-overview-containers/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/nextflow-request-creds/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/nextflow-tutorial-workflows/ - 2024-04-30 + 2024-05-01 daily https://brh.data-commons.org/nextflow-upload-docker/ - 2024-04-30 + 2024-05-01 daily \ No newline at end of file diff --git a/sitemap.xml.gz b/sitemap.xml.gz index f0591eae5e60d97110e558642a59af38ac2b4fc9..6d9b1900b9c09980c07492e2cb90e7b751e35196 100644 GIT binary patch delta 486 zcmVBw$9M5kyf+W~DLMpNrDTtXq9+>P z%5t9PHClo}jkpf6D^d69faN_{R_oWId03^D3*2_swRb@`xeBH-zhb{Oihq>QVHkqH zmD-m%W%-6e8iHvy)n+TIov7-Ra*r+2bgz5|ZG=-}ly#SVxrp3Q%D*iCOjXsQ4+C(e z+X%XhfHJfadlkXRoWqs&QRJ9Z-pB}JK(3ry@$K_NhGdd23o}>BotQ&B;;1yFc6Jsi zDoUp2%KIc-=&a}7*5_pSS$|THX|(BM>g;I(B&oCpWUJU0VJzz0+KU`j$97)5E;P%S zv#!J#Lu(yx%37E(dlyvxd0!{L%*T5raiTMAL8@Hi332^o9rNnG(HoHc+!*-k#YjaMt~^FqO`Ix)rc zaHiIUS?CCgAdNA7gvcGX1ETcl1(J#chCA+`i`H%ETAxM)cjWtJE6c*LhovbEF{|9w caXG@CcpCFv;lH`<`aq!OH}~e%-*gZF0RKk$v;Y7A delta 486 zcmV4!AAgejMB)QF6gMB*Sx_a~79vZINS~8`UnxlouU!gi zu|&|s1Nwbw?cdLXTYW)98~mZD*Hy6suY<9^I}|^Ed>1dpd-JfLqC=onO7?gtdZO{I zEa!P%qa_&Bi0cr$5_O*rSl)wWwSFy{hgDj+z-@P3dlz(*t6(bgEB0%nNPh_(h9US{ zsePGKmTxGeA(&=UZFZvCiEWiq?y*Ih?v?MLjc{s=vhK1k7m*uE`IqINsj6D^VF0dl z8$p*5P=;1wuOb+kbGXuOMUF}3jf^k`OD&dwr5 zMak4$d7p#}o%P(?`kV|uOMeP7jW&Huojpx}B$d{HY!&+=j76PWdy%8+*v_lhg=QIZ z)|D7zXszQ-Sql?p?}Exd@9X53nVG9f>rFa(Qku8|ZWr=Qh7Xe`w$0oLrzYLY*-|gZ ze4j%CPf4(s1OkpL_oGOAOCib>9w!1hA!Cm{i7TFjvj#9I+sS9QaaYBCUMRR#C#IMl z&eXav3mribq%o$C5V^y4K$JecKvI#waL4^~(Yg&?>(hwfj(optWmy>Zur#G1W|g}- cE=SlCPh-9-{5Q8<9|+X^23yICzH|@(0C$n&?*IS*