Commit

fixed bug in seqspec init which did not initialize the sequence spec.…
… updated documentation, greatly improved the tutorial, still WIP
sbooeshaghi committed Apr 24, 2024
1 parent e1e9a4b commit 853775d
Showing 3 changed files with 491 additions and 85 deletions.
31 changes: 30 additions & 1 deletion docs/DOCUMENTATION.md
@@ -52,7 +52,36 @@ seqspec check [-o OUT] yaml
- optionally, `-o OUT` can be used to write the output to a file.
- `yaml` corresponds to the `seqspec` file.

For an explanation of possible errors, see the [TUTORIAL.md](https://github.com/IGVF/seqspec/blob/main/docs/TUTORIAL.md).
A list of possible errors is shown below:

```bash
# The "assay" value was not specified in the spec
[error 1] None is not of type 'string' in spec['assay']

# The "modalities" are not using the controlled vocabulary
[error 2] 'Ribonucleic acid' is not one of ['rna', 'tag', 'protein', 'atac', 'crispr'] in spec['modalities'][0]

# The "region_type" is not using the controlled vocabulary
[error 3] 'link_1' is not one of ['atac', 'barcode', 'cdna', 'crispr', 'fastq', 'gdna', 'hic', 'illumina_p5', 'illumina_p7', 'index5', 'index7', 'linker', 'ME1', 'ME2', 'methyl', 'nextera_read1', 'nextera_read2', 'poly_A', 'poly_G', 'poly_T', 'poly_C', 'protein', 'rna', 's5', 's7', 'tag', 'truseq_read1', 'truseq_read2', 'umi'] in spec['library_spec'][0]['regions'][3]['region_type']

# The "sequence_type" is not using the controlled vocabulary
[error 4] 'linker' is not one of ['fixed', 'random', 'onlist', 'joined'] in spec['library_spec'][0]['regions'][3]['sequence_type']

# The "region_id" is not unique across the spec
[error 5] region_id 'cell_bc' is not unique across all regions

# The length of the given "sequence" is less than the "min_len" specified for the sequence
[error 6] 'sample_bc' sequence 'NNNNNNNN' length '8' is less than min_len '10'

# The "filename" for the specified "onlist" does not exist in the same location as the spec
[error 7] i5_index_onlist.txt does not exist

# The provided "sequence" contains invalid characters (only A, C, G, T, N, and X are permitted)
[error 8] 'NNNNNNNNZN' does not match '^[ACGTNX]+$' in spec['library_spec'][0]['regions'][4]['sequence']

# The "md5" for the given "onlist" file is not a valid md5sum
[error 9] '7asddd7asd7' does not match '^[a-f0-9]{32}$' in spec['library_spec'][0]['regions'][8]['onlist']['md5']
```
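
Several of the errors above (5, 6, 8, and 9) come down to simple pattern, length, and uniqueness checks. The following is a minimal Python sketch of those checks, not the implementation used by `seqspec check`. The two regular expressions are copied from the error messages above; the function names `check_region` and `duplicate_region_ids`, and the plain-dictionary representation of a region, are assumptions made for illustration.

```python
import re

# Patterns copied verbatim from the error messages for errors 8 and 9.
SEQUENCE_PATTERN = re.compile(r"^[ACGTNX]+$")
MD5_PATTERN = re.compile(r"^[a-f0-9]{32}$")


def check_region(region):
    """Return a list of error strings for one region (hypothetical helper).

    `region` is assumed to be a plain dict with the keys used in the
    messages above: region_id, sequence, min_len, and an optional onlist.
    """
    errors = []
    region_id = region.get("region_id")
    sequence = region.get("sequence", "")
    min_len = region.get("min_len", 0)

    # Error 8: only A, C, G, T, N, and X are permitted.
    if not SEQUENCE_PATTERN.match(sequence):
        errors.append(f"'{sequence}' does not match '{SEQUENCE_PATTERN.pattern}'")

    # Error 6: the sequence must be at least min_len characters long.
    if len(sequence) < min_len:
        errors.append(
            f"'{region_id}' sequence '{sequence}' length '{len(sequence)}' "
            f"is less than min_len '{min_len}'"
        )

    # Error 9: an onlist md5 must be 32 lowercase hexadecimal characters.
    onlist = region.get("onlist") or {}
    md5 = onlist.get("md5")
    if md5 is not None and not MD5_PATTERN.match(md5):
        errors.append(f"'{md5}' does not match '{MD5_PATTERN.pattern}'")

    return errors


def duplicate_region_ids(regions):
    """Return region_ids that occur more than once (error 5), in first-seen order."""
    seen = set()
    duplicates = []
    for region in regions:
        region_id = region.get("region_id")
        if region_id in seen and region_id not in duplicates:
            duplicates.append(region_id)
        seen.add(region_id)
    return duplicates
```

Running `check_region` on each region and `duplicate_region_ids` on the full list of regions before calling `seqspec check` can help localize which field triggered a given message.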

#### Examples
