June 26, 2018, Tuesday
Liya Wang edited this page Jun 30, 2018 · 3 revisions
- Bugs
- Jbrowse_out
- Maybe just keep it at CSHL? Otherwise will need to copy it and uncompress it
- If so, need to remove both annotation and assembly workflows from de.sciapps.org
- FastQC_out
- Folder cannot be visualized
- For the visualization window, we might need to browse down to files for both 'link' and 'visualization'
- TACC samtools index does not always work; not sure why we get the following error
- http://datacommons.cyverse.org/browse/iplant/home/lwang/sci_data/results/STAR_align_Stampede2-2.5.3_7329c7f7-c480-49c9-a7c1-d33cd05236ca/job-for-star_align_stampede2-2-5-3-6785646041787526680-242ac113-0001-007.err
- samtools: error while loading shared libraries: /usr/local/bin/../lib/./libssl.so.1.0.0: unsupported version 0 of Verneed record
- https://github.com/openssl/openssl/issues/4170
- To do
- Push to maizecode branch, and deploy on de2.sciapps.org
- Adjust history job name length so it won't wrap to two lines
- Modify link and visualize buttons
- Can we let the link button copy the link to history (not clickable but downloadable)?
- Visualize button will open the file (disable it for genome browser files, BAM, etc.)
- Plan for ENCODE DCC
- iRODS web front end (might be tricky; we can rely on icommands, DE, Cyberduck)
- Workflow API
- Command line version of SciApps
- Taking RNA-Seq as an example: we have the workflow JSON for one replicate, and now we want to apply the workflow to another replicate (paired reads)
- curl -sk -H "Authorization: Bearer $ACCESS_TOKEN" -X POST -F "[email protected]" https://public.tenants.agaveapi.co/jobs/v2?pretty=true
- Workflow JSON
- New set of reads (replace inside JSON)
- Workflow name (replace inside JSON)
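A minimal sketch of the "replace inside JSON" step, written in Python. The SciApps workflow JSON schema is not described in these notes, so everything about the layout here is an assumption: the workflow name is assumed to sit under a top-level `name` key, and read paths are assumed to appear as plain string values somewhere in the document. The function name `retarget_workflow` and the sample file names are hypothetical.

```python
import json


def retarget_workflow(workflow, path_map, new_name, name_key="name"):
    """Return a new workflow dict with read paths swapped and a new name.

    path_map maps each old input path to its replacement. Any string value
    in the JSON that exactly equals an old path is replaced; everything
    else is copied unchanged. The top-level `name_key` is an assumption.
    """
    def swap(node):
        if isinstance(node, dict):
            return {key: swap(value) for key, value in node.items()}
        if isinstance(node, list):
            return [swap(item) for item in node]
        if isinstance(node, str):
            return path_map.get(node, node)
        return node

    updated = swap(workflow)
    updated[name_key] = new_name
    return updated


if __name__ == "__main__":
    # Hypothetical workflow layout, for illustration only.
    workflow = {
        "name": "RNA-Seq replicate 1",
        "steps": [
            {"inputs": {"reads_1": "rep1_R1.fastq.gz",
                        "reads_2": "rep1_R2.fastq.gz"}},
        ],
    }
    path_map = {
        "rep1_R1.fastq.gz": "rep2_R1.fastq.gz",
        "rep1_R2.fastq.gz": "rep2_R2.fastq.gz",
    }
    print(json.dumps(
        retarget_workflow(workflow, path_map, "RNA-Seq replicate 2"),
        indent=2,
    ))
```

Matching on exact string values avoids hard-coding where inputs live in the JSON, at the cost of also replacing any other field that happens to hold the same path.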
- Flow
- Domain we will use: data.maizecode.org (brie)
- We have a workflow JSON with inputs from the CyVerse Data Store and outputs archived there too
- We have metadata attached to inputs (reads)
- /iplant/home/shared/maizecode/B73v4/RNA-seq/B73LongRampage/BioProject_PRJNA438108/BioSample1/BioSample1Library1
- Most metadata @BioSample1, some @BioProject_PRJNA438108, and @BioSample1Library1 (distinguish RNA-Seq from RAMPAGE)
- Can we generate an 'experiment JSON'?
- Workflow JSON id
- Replicate JSON id for each input data: combines metadata for each input file
- Do we need to use the DS UUID for each file instead of the path? Maybe we will use the path for now
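One possible shape for the 'experiment JSON', sketched in Python. Nothing here is a settled format: the key names (`workflow_id`, `replicates`, `replicate_id`, `path`, `metadata`), the IDs, and the metadata values are all placeholders. The sketch assumes metadata from the BioProject, BioSample, and Library levels is merged per input file, with the more specific level winning on conflicts, and that files are identified by path for now.

```python
import json


def merge_metadata(*levels):
    """Merge metadata dicts ordered from least to most specific.

    Later (more specific) levels override earlier ones on key conflicts.
    """
    merged = {}
    for level in levels:
        merged.update(level)
    return merged


def build_experiment(workflow_id, replicates):
    """Build an experiment record.

    replicates is a list of (replicate_id, input_path, metadata) tuples.
    """
    return {
        "workflow_id": workflow_id,
        "replicates": [
            {"replicate_id": rid, "path": path, "metadata": meta}
            for rid, path, meta in replicates
        ],
    }


if __name__ == "__main__":
    # Placeholder metadata; real values would come from the Data Store.
    project_meta = {"bioproject": "PRJNA438108"}
    sample_meta = {"organism": "B73", "tissue": "root"}
    library_meta = {"assay": "RNA-Seq"}

    path = ("/iplant/home/shared/maizecode/B73v4/RNA-seq/B73LongRampage/"
            "BioProject_PRJNA438108/BioSample1/BioSample1Library1")
    experiment = build_experiment(
        "workflow-0001",  # hypothetical workflow JSON id
        [("replicate-0001", path,
          merge_metadata(project_meta, sample_meta, library_meta))],
    )
    print(json.dumps(experiment, indent=2))
```

Switching from paths to DS UUIDs later would only change the value stored under `path` (or add a sibling key), so the rest of the record could stay the same.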
- Search
- Elasticsearch vs. Solr vs. MySQL (preferred)
- Assume we have run two sets of RNA-Seq data (data_a (root) and data_b (shoot))
- Search for 'root' should return the following page
- Left: filter by organism (B73, W22, NC350, Til11) and tissue
- Right: list of workflow names (some metadata, workflow_id, summary or workflow description)
- Example:
- /iplant/home/shared/maizecode/B73v4/RNA-seq/B73LongRampage/BioProject_PRJNA438108
- https://www.encodeproject.org/search/?searchTerm=h3k4me3
- Clicking on a workflow will bring up the workflow page
- Example: https://www.encodeproject.org/experiments/ENCSR285FZP/
- Question: Do we use one workflow for one replicate or one workflow for all replicates?
- We will use one workflow for all replicates
- However, each replicate has one BioSample ID @ NCBI SRA
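A rough sketch of the search behavior described above, in Python. It uses the standard-library `sqlite3` module purely as a stand-in for MySQL so that it runs without a server; the table name, columns, workflow IDs, and descriptions are all hypothetical. It covers the two pieces of the page: the result list on the right (a keyword match with optional organism and tissue filters) and the counts for the filters on the left.

```python
import sqlite3

SEARCH_WHERE = "(name LIKE ? OR tissue LIKE ? OR description LIKE ?)"


def make_demo_db():
    """Create an in-memory table with the two example RNA-Seq runs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workflows ("
        "workflow_id TEXT PRIMARY KEY, name TEXT, organism TEXT, "
        "tissue TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO workflows VALUES (?, ?, ?, ?, ?)",
        [
            ("wf_a", "RNA-Seq data_a", "B73", "root", "Paired-end reads"),
            ("wf_b", "RNA-Seq data_b", "B73", "shoot", "Paired-end reads"),
        ],
    )
    return conn


def search(conn, term, organism=None, tissue=None):
    """Return workflows matching `term`, narrowed by optional filters."""
    sql = ("SELECT workflow_id, name, organism, tissue FROM workflows "
           "WHERE " + SEARCH_WHERE)
    params = [f"%{term}%"] * 3
    if organism is not None:
        sql += " AND organism = ?"
        params.append(organism)
    if tissue is not None:
        sql += " AND tissue = ?"
        params.append(tissue)
    sql += " ORDER BY name"
    return conn.execute(sql, params).fetchall()


def facet_counts(conn, term, column):
    """Count matching workflows per value of `column` (for the left panel)."""
    if column not in ("organism", "tissue"):
        raise ValueError("unsupported facet column")
    sql = (f"SELECT {column}, COUNT(*) FROM workflows "
           f"WHERE {SEARCH_WHERE} GROUP BY {column} ORDER BY {column}")
    return conn.execute(sql, [f"%{term}%"] * 3).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    print(search(conn, "root"))
    print(facet_counts(conn, "RNA-Seq", "tissue"))
```

With one workflow covering all replicates, each row here would correspond to one workflow; per-replicate BioSample IDs would need a separate table or column, which this sketch leaves out.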
- Page rendering
- Download
- For inputs, we can make them public and direct the user to the Data Commons landing page
- For outputs, they are already public through the workflow, so we can direct the user to the Data Commons landing page
- Visualize
- Direct user to SciApps to visualize results