Changed version numbers #19

Open: wants to merge 8 commits into base: main
12 changes: 6 additions & 6 deletions lessons/03_sequence_alignment_theory.md
@@ -200,8 +200,8 @@ Next, we need to **add the modules** that we will be using for alignment:

 ```
 # Load modules
-module load gcc/6.2.0
-module load bwa/0.7.17
+module load gcc/14.2.0
+module load bwa/0.7.18
 ```

 > NOTE: On O2, many of the common tools were compiled using `GCC` version 6.2.0, so to be able to access them, we first need to load the `GCC` module.
@@ -270,8 +270,8 @@ bwa mem \
 #SBATCH -o bwa_alignment_normal_%j.out
 #SBATCH -e bwa_alignment_normal_%j.err<br>
 # Load modules
-module load gcc/6.2.0
-module load bwa/0.7.17<br>
+module load gcc/14.2.0
+module load bwa/0.7.18<br>
 # Assign files to bash variables
 REFERENCE_SEQUENCE=/n/groups/hbctraining/variant_calling/reference/GRCh38.p7.fa
 LEFT_READS=/home/$USER/variant_calling/raw_data/syn3_normal_1.fq.gz
@@ -319,8 +319,8 @@ $ sed 's/normal/tumor/g' bwa_alignment_normal.sbatch > bwa_alignment_tumor.sbat
 #SBATCH -o bwa_alignment_tumor_%j.out
 #SBATCH -e bwa_alignment_tumor_%j.err<br>
 # Load modules
-module load gcc/6.2.0
-module load bwa/0.7.17<br>
+module load gcc/14.2.0
+module load bwa/0.7.18<br>
 # Assign files to bash variables
 REFERENCE_SEQUENCE=/n/groups/hbctraining/variant_calling/reference/GRCh38.p7.fa
 LEFT_READS=/home/$USER/variant_calling/raw_data/syn3_tumor_1.fq.gz
2 changes: 1 addition & 1 deletion lessons/07_variant_calling.md
@@ -145,7 +145,7 @@ Next to add the `GATK4` module we are going to load:

 ```
 # Load the GATK module
-module load gatk/4.1.9.0
+module load gatk/4.6.1.0
 ```

 And now, we need to create our variables:
4 changes: 2 additions & 2 deletions lessons/08_variant_filtering.md
@@ -67,8 +67,8 @@ Next, we need to add the modules that we will be loading:

 ```
 # Load modules
-module load gatk/4.1.9.0
-module load snpEff/4.3g
+module load gatk/4.6.1.0
+module load snpEff/5.2f
 ```

 Next, we will add our variables:
28 changes: 14 additions & 14 deletions lessons/09_variant_annotation.md
@@ -40,7 +40,7 @@ The first step in annotating your VCF file is finding the appropriate SnpEff dat
 To see if your genome of interest is in the `SnpEff` database, we first need to load the `SnpEff` module:

 ```
-module load snpEff/4.3g
+module load snpEff/5.2f
 ```

 With the `SnpEff` module loaded, let's use the following command to browse all of the currently available genomes:
@@ -49,13 +49,13 @@ With the `SnpEff` module loaded, let's use the following command to browse all o
 java -jar $SNPEFF/snpEff.jar databases | less
 ```

-The first column is the database name and the second column in the `Genus_species` for the organism. There is also a database download link where the database can be downloaded at but this can be ignored as SnpEff will automatically download the database if needed. As you can see there are tens of thousands of these pre-built databases. So let's exit the `less` buffer page and **see which GRCh databases are available**:
+The first column is the database name and the second column in the `Genus_species` for the organism. There is also a database download link where the database can be downloaded at but this can be ignored as SnpEff will automatically download the database if needed. As you can see there are tens of thousands of these pre-built databases. So let's exit the `less` buffer page and **see which hg38 databases are available**:

 ```
-java -jar $SNPEFF/snpEff.jar databases | grep "GRCh"
+java -jar $SNPEFF/snpEff.jar databases | grep "hg38"
 ```

-We can see that this build of SnpEff has five possible GRCh databases that we can use for annotation, including one for GRCh38.p7 called GRCh38.p7.RefSeq. Now that we have found the database that we would like to use for our analysis, we can run `SnpEff`.
+We can see that this build of SnpEff has three possible hg38 databases that we can use for annotation. We will use the one labelled hg38. Now that we have found the database that we would like to use for our analysis, we can run `SnpEff`.

 ### Running SnpEff

@@ -117,9 +117,9 @@ Next, we will add the line to load the modules that we will need:

 ```
 # Load modules
-module load gcc/9.2.0
-module load bcftools/1.14
-module load snpEff/4.3g
+module load gcc/14.2.0
+module load bcftools/1.21
+module load snpEff/5.2f
 ```

 Also, we will add our variables:
@@ -230,8 +230,8 @@ Next, we are going to index this file. While it is not necesscary for us to inde
 We are going to be using <code>tabix</code>, which is part of the <code>HTSlib</code> module. First, we will need to load the <code>HTSlib</code> module, which also requires us to load the <code>gcc</code> module as well:

 <pre>
-module load gcc/9.2.0
-module load htslib/1.14
+module load gcc/14.2.0
+module load htslib/1.21
 </pre>

 In order to index our dbSNP file using <code>tabix</code>, we just need to run the following command:
@@ -279,8 +279,8 @@ Let's discuss each part of this command:
 <summary><b>Click here to see how to annotate our VCF file with the dbSNP annotation in <code>bcftools</code></b></summary>
 The first thing we are going to need to do is load the modules that we will be using:<br>
 <pre>
-module load gcc/9.2.0
-module load bcftools/1.14
+module load gcc/14.2.0
+module load bcftools/1.21
 </pre>
 Assuming we have already indexed our dbSNP VCF file, the first thing that we are going to need to do is compress the VCF file that we wish to annotate with:
 <pre>
@@ -346,9 +346,9 @@ Let's explain each part of this command:
 #SBATCH -o variant_annotation_syn3_normal_syn3_tumor_%j.out
 #SBATCH -e variant_annotation_syn3_normal_syn3_tumor_%j.err<br>
 # Load modules
-module load gcc/9.2.0
-module load bcftools/1.14
-module load snpEff/4.3g<br>
+module load gcc/14.2.0
+module load bcftools/1.21
+module load snpEff/5.2f<br>
 # Assign variables
 REPORTS_DIRECTORY=/home/$USER/variant_calling/reports/snpeff/
 SAMPLE_NAME=mutect2_syn3_normal_syn3_tumor
2 changes: 1 addition & 1 deletion lessons/10_variant_prioritization.md
@@ -32,7 +32,7 @@ Before we do anything, let's move to the directory with our VCF files and load t

 ```
 cd /n/scratch/users/${USER:0:1}/$USER/variant_calling/vcf_files/
-module load snpEff/4.3g
+module load snpEff/5.2f
 ```

 **SnpSift filter** is one of the most useful SnpSift commands. Using SnpSift filter you can filter VCF files **using arbitrary expressions.** In the most simple case, you can filter your SnpEff annotated VCF file based upon any of the **first seven fields** of the VCF file:
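
Because the same version strings recur across several lessons, one way to check that none were missed is to search the lesson files for the old `module load` targets. The sketch below is not part of this PR. The function name `find_stale_modules` is hypothetical, the default directory `lessons` is taken from the file paths in this diff, and the pattern is assembled only from the versions that appear in the removed lines above.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the PR): print every line under a directory
# that still references one of the module versions replaced in this diff.
# Exits with grep's status: 0 if stale references were found, 1 if none were.
find_stale_modules() {
    local dir="${1:-lessons}"
    # Old version strings, copied from the removed lines of the diff.
    local pattern='gcc/(6|9)\.2\.0|bwa/0\.7\.17|gatk/4\.1\.9\.0|snpEff/4\.3g|bcftools/1\.14|htslib/1\.14'
    grep -rnE "$pattern" "$dir"
}

# Example usage, run from the repository root:
#   find_stale_modules lessons || echo "No stale module versions found."
```

The pattern only matches `name/version` strings, so prose that names a version without the module prefix, such as the NOTE about `GCC` version 6.2.0 in the first hunk, would need a separate search.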