updated machine spec recommendations #23

Open
wants to merge 10 commits into
base: master
nainathangaraj
Contributor

No description provided.

README.md Outdated

Minimal Ansible version: 2.0.

This program is intended for Ubuntu 14.04 and 16.04, and has been tested on Red Hat 7.4/7.5 and OLE (Oracle Linux Enterprise) 7. It has not been tested on any other versions.
Contributor

Let's add something like: It has not been tested on any other versions but it should work with most of the Linux OS releases.

Contributor Author

Added! :)

README.md Outdated

More information and tutorials about the DNAnexus platform can be found at the [DNAnexus wiki page](https://wiki.dnanexus.com/Home).

The remote-user that the role is run against must possess READ access to monitored_folder and WRITE access to disk for logging and temporary storage of tar files. These are typically stored under the remote-user's home directory, and is specified in the file monitor_run_config.template or as given explicitly by the variables local_tar_directory and local_log_directory.
Contributor

Remote-user is a little bit confusing, let's define it and find a better name

Contributor Author

I have updated this phrase to local user

Contributor

@alphabdiallo left a comment

LGTM overall, you can add those changes and ask for a last review by Samantha, thanks

Contributor

@slzarate left a comment

My computer is about to die, so I didn't get to everything. But I think you have enough to work with here. :)

dx-streaming-upload
===================

The dx-streaming-upload Ansible role packages the streaming upload module for increamentally uploading a RUN directory from an Illumina sequencer onto the DNAnexus platform.
Contributor

--> "The dx-streaming-upload Ansible role packages the streaming upload module for incrementally uploading a RUN directory from an Illumina sequencer onto the DNAnexus platform."


Instruments that this module support include the Illumina MiSeq, NextSeq, HiSeq-2500, HiSeq-4000, HiSeq-X and NovaSeq.
Contributor

--> "Instruments that this module supports include the Illumina MiSeq, NextSeq, HiSeq-2500, HiSeq-4000, HiSeq-X, and NovaSeq."


## Requirements

Users of this module needs a DNAnexus account and its accompanying authentication. To register for a trial account, visit the [DNAnexus homepage](https://platform.dnanexus.com/register).
Contributor

"needs" --> "need"


The local user utilizing this package should possess READ access to monitored_folder and WRITE access to disk for logging and temporary storage of tar files. These are typically stored under the local user's home directory, as specified in the file monitor_run_config.template or as given explicitly by the variables local_tar_directory and local_log_directory.

The machine that this role is deployed to should have sufficient free memory depending on the throughput of the sequencing instrument. For Novaseq and HiSeqs we recommend a machine with atleast 8 cores, 32 GB of RAM, and 500GB - 1TB of storage.
Contributor

--> "The machine to which this role is deployed should have sufficient memory available depending on the throughput of the sequencing instrument. For Novaseq and HiSeqs, we recommend a machine with at least 8 cores, 32 GB of RAM, and 500GB - 1TB of storage."

#### Role Variables
- `mode`: `{deploy, debug}` In the *debug* mode, monitoring cron job is triggered every minute; in *deploy mode*, monitoring cron job is triggered every hour.
- `upload_project`: ID of the DNAnexus project that the RUN folders should be uploaded to. The ID is of the form `project-BpyQyjj0Y7V0Gbg7g52Pqf8q`
- `dx_token`: API token for the DNAnexus user to be used for data upload. The API token should give minimally UPLOAD access to the `{{ upload project }}`, or CONTRIBUTE access if `downstream_applet` is specified. Instructions for generating a API token can be found at [DNAnexus wiki](https://wiki.dnanexus.com/UI/API-Tokens). This value is overriden by `dx_user_token` in `monitored_users`.
Contributor

--> - dx_token: API token for the DNAnexus user to be used for data upload. The API token should give at a minimum UPLOAD access to the {{ upload project }}, or CONTRIBUTE access if downstream_applet is specified. Instructions for generating an API token can be found at DNAnexus wiki. This value is overridden by dx_user_token in monitored_users.
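
As an illustration of the role variables quoted above, here is a minimal Python sketch of how `mode` could map to a cron schedule and how an `upload_project` value could be sanity-checked. The cron expressions, the ID pattern (inferred only from the example ID in the README), and the function names are assumptions for illustration, not part of dx-streaming-upload.

```python
import re

# Assumed cron expressions; the README only says "every minute" / "every hour".
CRON_SCHEDULES = {
    "debug": "* * * * *",   # every minute
    "deploy": "0 * * * *",  # at minute 0 of every hour
}

# Assumed ID shape, inferred from the example project ID in the README:
# the literal "project-" followed by 24 alphanumeric characters.
PROJECT_ID_RE = re.compile(r"project-[0-9A-Za-z]{24}")


def cron_schedule(mode):
    """Return the (assumed) cron expression for a `mode` value."""
    try:
        return CRON_SCHEDULES[mode]
    except KeyError:
        raise ValueError("mode must be 'deploy' or 'debug', got %r" % (mode,))


def looks_like_project_id(value):
    """Check that a string has the same shape as the README's example ID."""
    return PROJECT_ID_RE.fullmatch(value) is not None
```

A check like this only catches obvious typos; whether the token actually grants UPLOAD or CONTRIBUTE access to the project can only be confirmed against the platform itself.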

For an applet, the `executable_input` hash to the `run` command will be prepopulated with the key-value pair {"`upload_sentinel_record`": `$record_id`} where `$record_id` is the DNAnexus file-id of the sentinel record generated for the uploaded RUN directory (see section titled **Files generated**).
For a workflow, the `executable_input` hash will be prepopulated with the key-value pair {"`0.upload_sentinel_record`": `$record_id`} where `$record_id` is the DNAnexus file-id of the sentinel record generated for the uploaded RUN directory (see section titled **Files generated**).
**It is the user's responsibility to ensure that the specified applet/workflow has an appropriate input contract which accepts a DNAnexus record with the input name of `upload_sentinel_record`**
Additional input/options can be specified, statically using the Ansible variable `downstream_input`. This should be provided as a JSON string, parsable, at the top level, as a Python dict of `str` to `str`.
Contributor

"specified, statically" --> "specified statically"

Example of a properly formatted `downstream_input` for an `applet`
Contributor

Add colon to end of this and next

- ```{"input_name1": "value1", "input_name2": "value2"}```
Example of a properly formatted `downstream_input` for a `workflow`
- ```{"0.step0_input": "value1", "1.step2_input": "value2"}```
*Note the numerical index prefix necessary when specifying input for an `workflow`, which disambiguates which step in the workflow an input is targeted to*
Contributor

--> "Note that the numerical index prefix is necessary when specifying input for a workflow; it disambiguates to which step in the workflow an input is targeted."
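
The `downstream_input` contract discussed in this thread (a JSON string that parses, at the top level, to a Python dict of `str` to `str`, with a numerical step prefix on workflow inputs) can be checked before deployment. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the order in which the sentinel key and the extra inputs are combined is assumed, and none of this is taken from the dx-streaming-upload source.

```python
import json
import re

# A workflow input name is assumed to look like "<step index>.<input name>".
STEP_PREFIX_RE = re.compile(r"\d+\..+")


def parse_downstream_input(raw, is_workflow=False):
    """Parse `downstream_input` and enforce the str-to-str contract."""
    parsed = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(parsed, dict):
        raise ValueError("downstream_input must be a JSON object at the top level")
    for key, value in parsed.items():
        if not isinstance(value, str):
            raise ValueError("value for %r must be a string" % (key,))
        if is_workflow and not STEP_PREFIX_RE.fullmatch(key):
            raise ValueError("workflow input %r needs a step prefix such as '0.'" % (key,))
    return parsed


def build_executable_input(record_id, raw_downstream_input="{}", is_workflow=False):
    """Combine the static inputs with the prepopulated sentinel record key."""
    executable_input = parse_downstream_input(raw_downstream_input, is_workflow)
    sentinel_key = "0.upload_sentinel_record" if is_workflow else "upload_sentinel_record"
    # Assumption: the sentinel record takes precedence over a user-supplied key.
    executable_input[sentinel_key] = record_id
    return executable_input
```

For example, `build_executable_input("record-xxxx", '{"0.step0_input": "value1"}', is_workflow=True)` yields a dict containing both `0.step0_input` and `0.upload_sentinel_record`, while a workflow input without the numerical prefix is rejected. The `record-xxxx` value is a placeholder, not a real ID.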

(specified in monitor_run_config.template file)
- no persistent files (tar files stored transiently, deleted upon successful upload to DNAnexus)
```
**Files Streamed to DNAnexus project**
Contributor

--> either make "streamed" lowercase or "project" title case

#### Files generated
We use a hypothetical example of a local RUN folder named `20160101_M000001_0001_000000000-ABCDE`, that was placed into the `monitored_directory`, after the `dx-streaming-upload` role has been set up.
Contributor

--> " We use a hypothetical example of a local RUN folder named 20160101_M000001_0001_000000000-ABCDE, which was placed into the monitored_directory after the dx-streaming-upload role has been set up. "

3 participants