fasta_reference
Dave Lawrence edited this page Nov 21, 2022 · 1 revision
For fast random access, we need indexed FASTA files, which means they must be compressed with bgzip (BGZF) rather than plain gzip.
If your files are gzipped, indexing fails with the error:
[E::fai_build3_core] Cannot index files compressed with gzip, please use bgzip
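Both formats normally use the `.gz` extension, so it can help to check a file before trying to index it. The sketch below is not part of the original workflow: `is_bgzf` is a hypothetical helper that inspects the first 16 bytes of the file. It assumes the BGZF `BC` extra subfield comes first in the gzip header, which is how bgzip writes it.

```shell
# Hypothetical helper: succeeds (exit 0) if the file starts with a BGZF block header.
# BGZF is gzip with the FEXTRA flag set (bytes 1f 8b 08 04) and a "BC" extra subfield
# (bytes 42 43) at offsets 12-13. Assumes "BC" is the first extra subfield.
is_bgzf() {
    # Read the first 16 bytes as one continuous hex string
    header=$(od -An -tx1 -N16 "$1" | tr -d ' \n')
    case "$header" in
        1f8b0804????????????????4243*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (commented out so the snippet has no side effects):
# if is_bgzf my_genome.fna.gz; then echo "bgzip"; else echo "not bgzip"; fi
```

If the check fails for a gzipped file, it can be recompressed by decompressing it and piping the output through `bgzip`, as in the download command further down.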
Pick your FASTA from the NCBI human genome assemblies.
You can download and bgzip it in one step via:
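# Assembly to download (directory name on the NCBI FTP site)
FASTA_VERSION=GCF_000001405.40_GRCh38.p14
FASTA_FILE=${FASTA_VERSION}_genomic.fna.gz
# Stream the gzipped file, decompress it, and recompress with bgzip
wget --quiet -O - "https://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Homo_sapiens/all_assembly_versions/${FASTA_VERSION}/${FASTA_FILE}" | gzip -d | bgzip > "${FASTA_FILE}"
# Build the index next to the FASTA file
samtools faidx "${FASTA_FILE}"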
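For a bgzipped FASTA, `samtools faidx` writes two files next to the input: `${FASTA_FILE}.fai` and `${FASTA_FILE}.gzi`. As a quick sanity check that the download and indexing completed, the `.fai` file can be summarised. It is tab-delimited, with the sequence name in column 1 and its length in column 2. This is a minimal sketch, and `fai_summary` is a hypothetical helper name rather than part of samtools.

```shell
# Hypothetical helper: print the number of sequences and total bases listed in a .fai index.
fai_summary() {
    # Column 2 of each .fai line is the sequence length.
    # %.0f is used for the total because some awk versions cap %d at 32 bits.
    awk -F '\t' '{ n++; total += $2 } END { printf "%d sequences, %.0f bases\n", n, total }' "$1"
}

# Example usage (commented out; requires the index built above):
# fai_summary "${FASTA_FILE}.fai"
```

A truncated download usually shows up here as fewer sequences, or a smaller total, than the assembly report lists.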