6. Adding order and family classification schemes from Fishes of the World (5th edition)
shenjean edited this page Mar 16, 2022 · 12 revisions
- We use R to add the order and family numbers from the classification scheme used in "Fishes of the World, 5th edition" (Nelson et al. 2016), a widely used reference book in fish systematics.
- In this scheme, fish orders are numbered from 1 to 85, while families are numbered from 1 to 536.
- Data files for the classification scheme are available in the db.scripts/FotW5_classification folder in the mitohelper repository:
  - A detailed version of the classification scheme (FotW5Classification.pdf)
  - Tab-separated text file of the order classification scheme (FotW5_order.tsv)
  - Tab-separated text file of the family classification scheme (FotW5_family.tsv)
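
Before merging, it may be worth checking that a lookup table agrees with the numbering described above (orders 1 to 85, families 1 to 536) and has no repeated names, since a repeated key would duplicate records in a left merge. The sketch below is only an illustration: `check_lookup` is a hypothetical helper, the toy table uses placeholder names and numbers rather than real FotW5 entries, and the column names `Order` and `OrderID` are assumed from the merge key and the output column.

```r
# Hypothetical helper: report duplicated keys and IDs outside 1..max_id.
check_lookup <- function(tbl, key_col, id_col, max_id) {
  keys <- tbl[[key_col]]
  ids  <- tbl[[id_col]]
  list(
    duplicated_keys = unique(keys[duplicated(keys)]),
    out_of_range    = keys[is.na(ids) | ids < 1 | ids > max_id]
  )
}

# Toy stand-in for FotW5_order.tsv; names and numbers are placeholders.
order_tbl <- data.frame(
  Order   = c("OrderA", "OrderB", "OrderB", "OrderC"),
  OrderID = c(1L, 2L, 2L, 99L),
  stringsAsFactors = FALSE
)

# Orders are numbered 1 to 85; use max_id = 536 for the family table.
problems <- check_lookup(order_tbl, "Order", "OrderID", max_id = 85)
print(problems)
```

With the real files, the same call could be applied to the data frames returned by `read.delim`; both list elements being empty would suggest the table is safe to merge.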
```r
# Import relevant files
mitofish=read.delim("mitofish.ref.tsv",header=TRUE,sep="\t")
order=read.delim("FotW5_order.tsv",header=TRUE,sep="\t")
family=read.delim("FotW5_family.tsv",header=TRUE,sep="\t")
# Merge tables (all.x=TRUE keeps every MitoFish record; unmatched taxa get NA)
ordermerge=merge(mitofish,order,by="Order",all.x=TRUE)
familymerge=merge(ordermerge,family,by="Family",all.x=TRUE)
# Output new table
write.table(familymerge,"mitofish.ref2.tsv",sep="\t")
```

```sh
# Strip quotes, shift the header past the row-name column, reorder columns, and restore the "Gene definition" header
cat mitofish.ref2.tsv | tr -d "\"" | sed "s/^Family/#Family/" | tr "#" "\t" | awk -F "\t" '{OFS="\t"}{print $4,$5,$6,$7,$8,$9,$3,$2,$10,$11,$12,$13,$14}' | sed "s/Gene.definition/Gene definition/" >mitofish.final.tsv
```
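
The quote stripping, header realignment, and column reordering done by the shell pipeline can also be handled inside R. The following is a minimal sketch of that alternative, not the method used on this page: it uses small made-up data frames in place of the three input files, assumes the lookup tables carry columns named `Order`/`OrderID` and `Family`/`FamilyID`, and writes to a temporary file. The column order mirrors the field order in the awk call.

```r
# Toy stand-ins for mitofish.ref.tsv, FotW5_order.tsv and FotW5_family.tsv.
# All values are placeholders, not real records or real FotW5 numbers.
mitofish <- data.frame(
  Accession = c("ACC1", "ACC2"),
  Gene.definition = c("example gene 1", "example gene 2"),
  taxid = c(1L, 2L),
  Superkingdom = "Eukaryota",
  Phylum = "Chordata",
  Class = "Actinopteri",
  Order = c("OrderA", "OrderB"),
  Family = c("FamilyA", "FamilyB"),
  Genus = c("GenusA", "GenusB"),
  Species = c("GenusA speciesa", "GenusB speciesb"),
  Sequence = c("ACGT", "TTGA"),
  stringsAsFactors = FALSE
)
order_tbl  <- data.frame(Order = "OrderA", OrderID = 1L, stringsAsFactors = FALSE)
family_tbl <- data.frame(Family = "FamilyA", FamilyID = 1L, stringsAsFactors = FALSE)

# Left merges: every record is kept, unmatched taxa receive NA.
merged <- merge(mitofish, order_tbl, by = "Order", all.x = TRUE)
merged <- merge(merged, family_tbl, by = "Family", all.x = TRUE)

# Reorder columns explicitly instead of relying on field positions.
final_cols <- c("Accession", "Gene.definition", "taxid", "Superkingdom", "Phylum",
                "Class", "Order", "Family", "Genus", "Species", "Sequence",
                "OrderID", "FamilyID")
final <- merged[, final_cols]

# read.delim turns "Gene definition" into "Gene.definition"; restore the space.
names(final)[names(final) == "Gene.definition"] <- "Gene definition"

# quote = FALSE and row.names = FALSE avoid the quotes and the extra
# row-name column that the shell pipeline otherwise has to remove.
out_file <- tempfile(fileext = ".tsv")
write.table(final, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Selecting columns by name keeps the output stable if the input gains or loses a column, whereas positional awk fields would silently shift.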
- The column OrderID contains order numbers assigned by "Fishes of the World, 5th edition" (Nelson et al. 2016).
- The column FamilyID contains family numbers assigned by "Fishes of the World, 5th edition" (Nelson et al. 2016).
- "Fishes of the World, 5th edition" (Nelson et al. 2016) was last updated in 2018 and is outdated.
- Therefore, some of the new or re-classified fish orders and families are not yet classified in the book.
- These have NA values in the OrderID or FamilyID columns.
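
To see which taxa ended up without a number, the NA entries can be summarised after the merge. This is a small sketch on a made-up table with placeholder names and IDs; with real data, the merged data frame from the R step would take the place of `final`.

```r
# Toy stand-in for the merged table; names and IDs are placeholders.
final <- data.frame(
  Order    = c("OrderA", "OrderB", "OrderB", "OrderC"),
  Family   = c("FamilyA", "FamilyB", "FamilyB", "FamilyC"),
  OrderID  = c(1L, NA, NA, 3L),
  FamilyID = c(10L, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Distinct orders and families that found no match in the lookup tables.
unmatched_orders   <- sort(unique(final$Order[is.na(final$OrderID)]))
unmatched_families <- sort(unique(final$Family[is.na(final$FamilyID)]))

# Number of records affected in each ID column.
na_counts <- colSums(is.na(final[, c("OrderID", "FamilyID")]))

print(unmatched_orders)
print(unmatched_families)
print(na_counts)
```

A family can lack a FamilyID even when its order is numbered, as with the placeholder `FamilyC` here, so the two columns are worth checking separately.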
| Accession | Gene definition | taxid | Superkingdom | Phylum | Class | Family | Order | Genus | Species | Sequence | OrderID | FamilyID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| U46868 | Macquaria novemaculeata mitochondrial DNA control region haplotype | 45783 | Eukaryota | Chordata | Actinopteri | 'Percalatidae' | Centrarchiformes | Percalates | Percalates novemaculeata | ACACCATACATTTATATTAACCATATCAGGGGTATTCAAGGACATATATGTTTTATCAACATTTCTCGTATTACACCATTCATATATCACTTAAACAAGAAGAATTCCATAACCCATTAAAAGTATACCGTATATAAATGAAATCTGGGATGGGCGAAATTTAAGACCGAGCACAATCACTCATAAGGTTAAGATATACCAGGACTCAACATATAGACATTCTTCACAATCTTAATGTAGTAAGAACCGACCAACAGTGATTTCTTAATGCATACTCTCATTG | NA | NA |
| DQ107935 | Macquaria novemaculeata voucher BIOUG<CAN>:BW-A568 cytochrome oxidase subunit 1 (COI) gene, partial cds; mitochondrial | 45783 | Eukaryota | Chordata | Actinopteri | 'Percalatidae' | Centrarchiformes | Percalates | Percalates novemaculeata | CCTCTATCTTGTATTTGGTGCCTGGGCCGGAATAGTAGGCACGGCTTTAAGTTTGCTCATTCGAGCAGAGCTTAGCCAGCCAGGCGCCCTCCTTGGGGATGACCAAATTTATAATGTAATTGTTACAGCACATGCATTTGTAATAATTTTCTTTATGGTAATGCCTATCATAATTGGGGGCTTTGGAAACTGACTCATCCCCCTTATGATCGGTGCCCCTGATATAGCTTTCCCTCGCATAAATAATATAAGCTTTTGACTTCTCCCCCCCTCTTTCCTGCTTCTCCTTGCTTCTTCTGGAGTAGAGGCTGGTGCCGGAACCGGGTGAACAGTGTATCCGCCCCTAGCAGGCAATTTAGCTCACGCAGGAGCATCTGTTGATCTAACCATCTTCTCCCTTCATCTAGCAGGGGTTTCCTCAATTCTTGGGGCCATTAACTTCATCACTACTATTATTAACATAAAGCCTCCAGCTACTTCTCAATATCAAACCCCCCTATTCGTGTGAGCAGTTCTGATTACCGCTGTCCTTCTTCTCCTTTCCCTCCCAGTCCTTGCCGCTGGTATTACAATACTACTCACAGATCGTAATCTTAACACTACCTTTTTTGACCCCGCGGGGGGAGGAGACCCAATTCTCTATCAGCACTTGTTC | NA | NA |