From f6ce90023bf58e165a3026096e56766407a9dca0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dan=20Ros=C3=A9n?=
Date: Fri, 16 Oct 2020 18:12:13 +0200
Subject: [PATCH 1/2] Update ht19 -> ht20

---
 ht20/project.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ht20/project.md b/ht20/project.md
index 9e6afded..45ce20a8 100644
--- a/ht20/project.md
+++ b/ht20/project.md
@@ -35,7 +35,7 @@ from a family at risk of carrying mutations related to the disease.

Your task is to write a Python program that will extract a disease-causing transcript from the CFTR gene, translate the gene sequence to its corresponding amino-acid sequence and based on the reference amino-acid sequence determine whether any of the five given individuals is affected.

-

Download the lecture slides from here.

+

Download the lecture slides from here.

@@ -52,7 +52,7 @@ Human reference annotation file (`GTF` format):
 If you are not familiar with the file formats, read up online on how the files are structured. For example, here you can find a short description of the different (tab-delimited) fields of a GTF file.

-Some of the tasks involve outputting long sequences. To make sure they are correct, use the utils.check_answers package (from the downloads folder from the course topics website). You can import it that way:
+Some of the tasks involve outputting long sequences. To make sure they are correct, use the utils.check_answers package (from the downloads folder from the course topics website). You can import it that way:
from utils import check_answers
More detailed instructions are given with each task that uses the package.
@@ -196,7 +196,7 @@ In the annotation file (the GTF file), the CFTR gene has the id `ENSG00000001626

-6. Translate the above sequence of all exons into amino acids, using an implementation of the translation table from the utils.rna package (from the downloads folder from the course topics website).
+6. Translate the above sequence of all exons into amino acids, using an implementation of the translation table from the utils.rna package (from the downloads folder from the course topics website).
Tip

From eccbd846232ef0d104d90f090d38dca3f8dc7925 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dan=20Ros=C3=A9n?=
Date: Fri, 16 Oct 2020 18:13:13 +0200
Subject: [PATCH 2/2] Add a very first warm up exercise, fixes #11

---
 ht20/project.md | 38 +++++++++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/ht20/project.md b/ht20/project.md
index 45ce20a8..f584b9b0 100644
--- a/ht20/project.md
+++ b/ht20/project.md
@@ -58,12 +58,44 @@ More detailed instructions are given with each task that uses the package.
 # Warmup {#warmup}

-1. What is the length of chromosome 7 on the reference sequence?
+1. Make a directory for the project for you to work in and put the project
+   files there.
Tip
-

Open the reference fasta file and read it line by line.

+
+   The commands below are for Mac and Linux and should also work on Windows Subsystem for Linux.
+   You can get help in the terminal by writing the command name followed by --help, such as cd --help.
+   Naturally you can also search the web!
+
+
    +
+  1. Download the files (see instructions above).
+
+  2. Open a terminal and navigate to a directory where you want your project to be.
+     Use cd to change directory and pwd to print the working (current) directory.
+
+  3. Make a folder for the project.
+     Use mkdir or a file explorer. Check that the files are there by listing the directory's contents with ls.
+
+  4. Move the files to this project.
+     Move files using mv. Again, use ls to see that the files end up where you want them to be.
+
+  5. Unpack any compressed files: the file extension .gz indicates
+     gzip compression, which can be decompressed using gunzip.
+     The command for .zip files is unzip.
+
+  6. Examine the file contents using cat, head and tail.
+
+
+
+
+2. What is the length of chromosome 7 on the reference sequence?
+
+
+ Tip +
+

Open the reference fasta file and read it line by line. Study the example in the lecture!

In a loop, ignore the first line and get the length of each following line.

Don't forget to remove the trailing newline character from each line.

Sum up all the lengths you found.
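
The steps in the tip above could look roughly like this in Python. This is a minimal sketch, not the course's reference solution: the function name `fasta_sequence_length` and the tiny in-memory record are made up for illustration, and the sketch skips every line that starts with `>` instead of only the first line, which amounts to the same thing when the file holds a single record.

```python
import io


def fasta_sequence_length(lines):
    """Return the total number of sequence characters in a FASTA stream.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    total = 0
    for line in lines:
        # Header lines start with ">" and are not part of the sequence.
        if line.startswith(">"):
            continue
        # Strip the trailing newline (and any other surrounding whitespace)
        # before counting characters.
        total += len(line.strip())
    return total


# Hypothetical two-line record standing in for the real reference file.
demo = io.StringIO(">demo record\nACGT\nAC\n")
print(fasta_sequence_length(demo))
```

For the real reference file, pass an open file handle (from `open(...)` inside a `with` block) instead of the `io.StringIO` object; iterating over the handle reads one line at a time, so the whole chromosome never has to sit in memory at once.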

@@ -78,7 +110,7 @@ More detailed instructions are given with each task that uses the package.
-2. How many genes are annotated in the GTF file?
+3. How many genes are annotated in the GTF file?
Tip