Commit

fixup! merge: Support sequences
victorlin committed Jan 7, 2025
1 parent 6ce8d2c commit 3d55c60
Showing 2 changed files with 29 additions and 0 deletions.
4 changes: 4 additions & 0 deletions augur/merge.py
@@ -617,6 +617,10 @@ def merge_sequences(
print_info(f"Reading sequences from {s.path!r}…")
subprocess.Popen(cat(s.path), stdout=f)

# Add a newline character to support FASTA files that are missing one at the end.
# Extraneous newline characters are stripped by seqkit.
print(file=f)

print_info(f"Merging sequences and writing to {output_sequences!r}…")
process = seqkit('rmdup', temp_file.name, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

25 changes: 25 additions & 0 deletions tests/functional/merge/cram/merge-sequences.t
@@ -58,3 +58,28 @@ Duplicates are not allowed within individual sequence inputs.
'three'

[2]

FASTA files without trailing newlines are supported.

$ cat >y.fasta <<~~
> >two
> ATCG
> >three
> GCTA
> ~~

$ truncate -s -1 x.fasta
$ truncate -s -1 y.fasta

$ ${AUGUR} merge \
> --sequences x.fasta y.fasta \
> --output-sequences - > merged.fasta \
> --quiet

$ cat merged.fasta
>two
ATCG
>three
GCTA
>one
ATCG
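
To illustrate why the extra newline matters, here is a minimal standalone sketch in Python. It is not augur's implementation: `concat_fasta` and `parse_fasta` are hypothetical helpers, and the contents of `x` are an assumption, chosen to resemble the test inputs after `truncate -s -1`. Without a separator, the last sequence line of one file fuses with the first header of the next, so a record is silently lost.

```python
import io


def concat_fasta(chunks):
    """Concatenate FASTA texts, writing a newline after each input.

    Hypothetical helper: mirrors the idea of the commit (always emit a
    newline after each file) without calling cat or seqkit.
    """
    out = io.StringIO()
    for text in chunks:
        out.write(text)
        out.write("\n")  # harmless if the input already ended with a newline
    return out.getvalue()


def parse_fasta(text):
    """Tiny FASTA parser for illustration only; blank lines are skipped."""
    records = {}
    name = None
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


# Assumed inputs, both missing their trailing newline.
x = ">one\nATCG"
y = ">two\nATCG\n>three\nGCTA"

naive = parse_fasta(x + y)               # header ">two" is glued onto "ATCG"
safe = parse_fasta(concat_fasta([x, y]))  # every record keeps its own header
```

In the naive case the record `two` disappears and its header text ends up inside the sequence for `one`. With the added newline all three records survive, and the resulting blank lines are simply ignored by this toy parser (the commit comment notes that seqkit strips extraneous newlines in the real pipeline).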
