From d48e8575fa6cdac0c0806568417f7ef016ea1cab Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Fri, 25 Apr 2025 16:29:14 -0400
Subject: [PATCH 1/8] Changed version numbers

---
 lessons/03_sequence_alignment_theory.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lessons/03_sequence_alignment_theory.md b/lessons/03_sequence_alignment_theory.md
index 80e9313..00e8ae0 100644
--- a/lessons/03_sequence_alignment_theory.md
+++ b/lessons/03_sequence_alignment_theory.md
@@ -200,8 +200,8 @@ Next, we need to **add the modules** that we will be using for alignment:
 ```
 # Load modules
-module load gcc/6.2.0
-module load bwa/0.7.17
+module load gcc/14.2.0
+module load bwa/0.7.18
 ```
 > NOTE: On O2, many of the common tools were compiled using `GCC` version 6.2.0, so to be able to access them, we first need to load the `GCC` module.
@@ -270,8 +270,8 @@ bwa mem \
 #SBATCH -o bwa_alignment_normal_%j.out
 #SBATCH -e bwa_alignment_normal_%j.err
 # Load modules
-module load gcc/6.2.0
-module load bwa/0.7.17
+module load gcc/14.2.0
+module load bwa/0.7.18
 # Assign files to bash variables
 REFERENCE_SEQUENCE=/n/groups/hbctraining/variant_calling/reference/GRCh38.p7.fa
 LEFT_READS=/home/$USER/variant_calling/raw_data/syn3_normal_1.fq.gz
@@ -319,8 +319,8 @@ $ sed 's/normal/tumor/g' bwa_alignment_normal.sbatch > bwa_alignment_tumor.sbat
 #SBATCH -o bwa_alignment_tumor_%j.out
 #SBATCH -e bwa_alignment_tumor_%j.err
 # Load modules
-module load gcc/6.2.0
-module load bwa/0.7.17
+module load gcc/14.2.0
+module load bwa/0.7.18
 # Assign files to bash variables
 REFERENCE_SEQUENCE=/n/groups/hbctraining/variant_calling/reference/GRCh38.p7.fa
 LEFT_READS=/home/$USER/variant_calling/raw_data/syn3_tumor_1.fq.gz

From d8049fee38ed51fcfb91e0911ad28361950a6704 Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Fri, 25 Apr 2025 16:39:52 -0400
Subject: [PATCH 2/8] Updated GATK version number

---
 lessons/07_variant_calling.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lessons/07_variant_calling.md b/lessons/07_variant_calling.md
index e3128f1..5f9cf37 100644
--- a/lessons/07_variant_calling.md
+++ b/lessons/07_variant_calling.md
@@ -145,7 +145,7 @@ Next to add the `GATK4` module we are going to load:
 ```
 # Load the GATK module
-module load gatk/4.1.9.0
+module load gatk/4.6.1.0
 ```
 And now, we need to create our variables:

From e5bc4a305cd2d01267432547203b18bde656f8ed Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Fri, 25 Apr 2025 16:41:54 -0400
Subject: [PATCH 3/8] Updated GATK and snpEff versions

---
 lessons/08_variant_filtering.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lessons/08_variant_filtering.md b/lessons/08_variant_filtering.md
index 642be64..31481b9 100644
--- a/lessons/08_variant_filtering.md
+++ b/lessons/08_variant_filtering.md
@@ -67,8 +67,8 @@ Next, we need to add the modules that we will be loading:
 ```
 # Load modules
-module load gatk/4.1.9.0
-module load snpEff/4.3g
+module load gatk/4.6.1.0
+module load snpEff/5.2f
 ```
 Next, we will add our variables:

From 4f59f5f3b78197d7b043fff7a3de47a8bf085dcb Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Fri, 25 Apr 2025 16:43:11 -0400
Subject: [PATCH 4/8] Updated snpEff version

---
 lessons/09_variant_annotation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lessons/09_variant_annotation.md b/lessons/09_variant_annotation.md
index c20521e..2daeb28 100644
--- a/lessons/09_variant_annotation.md
+++ b/lessons/09_variant_annotation.md
@@ -40,7 +40,7 @@ The first step in annotating your VCF file is finding the appropriate SnpEff dat
 To see if your genome of interest is in the `SnpEff` database, we first need to load the `SnpEff` module:
 ```
-module load snpEff/4.3g
+module load snpEff/5.2f
 ```
 With the `SnpEff` module loaded, let's use the following command to browse all of the currently available genomes:

From 2013d6996dac2950cbf5aaca78f9a8cb40322962 Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Mon, 28 Apr 2025 09:23:43 -0400
Subject: [PATCH 5/8] Updated annotation

---
 lessons/09_variant_annotation.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lessons/09_variant_annotation.md b/lessons/09_variant_annotation.md
index 2daeb28..3488efa 100644
--- a/lessons/09_variant_annotation.md
+++ b/lessons/09_variant_annotation.md
@@ -49,13 +49,13 @@ With the `SnpEff` module loaded, let's use the following command to browse all o
 java -jar $SNPEFF/snpEff.jar databases | less
 ```
-The first column is the database name and the second column in the `Genus_species` for the organism. There is also a database download link where the database can be downloaded at but this can be ignored as SnpEff will automatically download the database if needed. As you can see there are tens of thousands of these pre-built databases. So let's exit the `less` buffer page and **see which GRCh databases are available**:
+The first column is the database name and the second column is the `Genus_species` for the organism. There is also a download link for each database, but this can be ignored because SnpEff will automatically download the database if needed. As you can see, there are tens of thousands of these pre-built databases. So let's exit the `less` buffer page and **see which hg38 databases are available**:
 ```
-java -jar $SNPEFF/snpEff.jar databases | grep "GRCh"
+java -jar $SNPEFF/snpEff.jar databases | grep "hg38"
 ```
-We can see that this build of SnpEff has five possible GRCh databases that we can use for annotation, including one for GRCh38.p7 called GRCh38.p7.RefSeq. Now that we have found the database that we would like to use for our analysis, we can run `SnpEff`.
+We can see that this build of SnpEff has three possible hg38 databases that we can use for annotation. We will use the one labelled hg38. Now that we have found the database that we would like to use for our analysis, we can run `SnpEff`.
 ### Running SnpEff
@@ -119,7 +119,7 @@ Next, we will add the line to load the modules that we will need:
 # Load modules
 module load gcc/9.2.0
 module load bcftools/1.14
-module load snpEff/4.3g
+module load snpEff/5.2f
 ```
 Also, we will add our variables:
@@ -348,7 +348,7 @@ Let's explain each part of this command:
 # Load modules
 module load gcc/9.2.0
 module load bcftools/1.14
-module load snpEff/4.3g
+module load snpEff/5.2f
 # Assign variables
 REPORTS_DIRECTORY=/home/$USER/variant_calling/reports/snpeff/
 SAMPLE_NAME=mutect2_syn3_normal_syn3_tumor

From f92ce818b2c4758b89cc720cdc84a4daae7bb659 Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Mon, 28 Apr 2025 09:27:07 -0400
Subject: [PATCH 6/8] Updated versions

---
 lessons/09_variant_annotation.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lessons/09_variant_annotation.md b/lessons/09_variant_annotation.md
index 3488efa..ed8e6de 100644
--- a/lessons/09_variant_annotation.md
+++ b/lessons/09_variant_annotation.md
@@ -117,8 +117,8 @@ Next, we will add the line to load the modules that we will need:
 ```
 # Load modules
-module load gcc/9.2.0
-module load bcftools/1.14
+module load gcc/14.2.0
+module load bcftools/1.21
 module load snpEff/5.2f
 ```
@@ -279,8 +279,8 @@ Let's discuss each part of this command:
 Click here to see how to annotate our VCF file with the dbSNP annotation in bcftools
 The first thing we are going to need to do is load the modules that we will be using:
-module load gcc/9.2.0
-module load bcftools/1.14
+module load gcc/14.2.0
+module load bcftools/1.21
 
Assuming we have already indexed our dbSNP VCF file, the first thing that we are going to need to do is compress the VCF file that we wish to annotate with:
@@ -346,8 +346,8 @@ Let's explain each part of this command:
 #SBATCH -o variant_annotation_syn3_normal_syn3_tumor_%j.out
 #SBATCH -e variant_annotation_syn3_normal_syn3_tumor_%j.err
 # Load modules
-module load gcc/9.2.0
-module load bcftools/1.14
+module load gcc/14.2.0
+module load bcftools/1.21
 module load snpEff/5.2f
 # Assign variables
 REPORTS_DIRECTORY=/home/$USER/variant_calling/reports/snpeff/

From 81d0bcc947d5c32be414f1282246f0fcb916ff4a Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Mon, 28 Apr 2025 09:28:08 -0400
Subject: [PATCH 7/8] Updated packages

---
 lessons/09_variant_annotation.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lessons/09_variant_annotation.md b/lessons/09_variant_annotation.md
index ed8e6de..8f0a593 100644
--- a/lessons/09_variant_annotation.md
+++ b/lessons/09_variant_annotation.md
@@ -230,8 +230,8 @@ Next, we are going to index this file. While it is not necesscary for us to inde
 We are going to be using tabix, which is part of the HTSlib module. First, we will need to load the HTSlib module, which also requires us to load the gcc module as well:
-module load gcc/9.2.0
-module load htslib/1.14
+module load gcc/14.2.0
+module load htslib/1.21
 
 In order to index our dbSNP file using tabix, we just need to run the following command:

From bac022fcce1b3760e36af3e880c043dd18a05c58 Mon Sep 17 00:00:00 2001
From: Will Gammerdinger
Date: Tue, 29 Apr 2025 14:41:42 -0400
Subject: [PATCH 8/8] Updated version number

---
 lessons/10_variant_prioritization.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lessons/10_variant_prioritization.md b/lessons/10_variant_prioritization.md
index 73c7731..004c740 100644
--- a/lessons/10_variant_prioritization.md
+++ b/lessons/10_variant_prioritization.md
@@ -32,7 +32,7 @@ Before we do anything, let's move to the directory with our VCF files and load t
 ```
 cd /n/scratch/users/${USER:0:1}/$USER/variant_calling/vcf_files/
-module load snpEff/4.3g
+module load snpEff/5.2f
 ```
 **SnpSift filter** is one of the most useful SnpSift commands. Using SnpSift filter you can filter VCF files **using arbitrary expressions.** In the most simple case, you can filter your SnpEff annotated VCF file based upon any of the **first seven fields** of the VCF file: