Computer Programming for Biologists

  • Course: BIOL 7800, LSU
  • Time/Location: T/Th, 10:30 - 11:50 AM | 0206 Williams
  • Instructor: Brant Faircloth
  • Need help?
    • Slack, don't email
    • Problem with the syllabus? File an issue
  • Office Hours T/Th 12:00 - 1:30 PM | 220 Life Sciences

Course description

The analysis of large data sets in biological research is becoming common, particularly as new sequencing technologies and data collection strategies exponentially increase the amount of data that can be collected by an individual researcher. Programmatic approaches are often needed to format and analyze these large data sets, yet few biologists receive training in applying programming languages to these tasks. Programming for Biologists is meant to introduce graduate or advanced undergraduate students to the practice of computer programming as it is applied to biological problems using a common programming language (Python, R) and programmatic techniques and algorithms.

Course credo

This course is going to challenge and frustrate you. A lot. I promise. You are learning a new language really quickly - that's a hard thing to do. Along with the hard parts of learning a new language, in this case, comes having to learn a number of new tools that you have not (likely) been exposed to. That's also really hard. You're also going to have to actually think on top of all that. But, if you think, and work, and collaborate with your classmates to understand what's going on, you will end up learning much, much more in a shorter period of time than you expected.

Teaching philosophy / Communication

I'm here to help you learn to program a computer. It's up to you to learn how to make that work for you. I view my role as providing guidance and direction and your role as using that guidance and direction to get where you want/need to go. If you decide that you like this sort of thing, you will be teaching yourself this way for the rest of your life. Better to learn how to do that now.

Along those lines, I am not going to answer questions about this or that program/technique/assignment via email. All class communication should happen on Slack, and almost all of that communication should happen in an open channel where your classmates can help you answer your question. You should each be able to create additional #channels, if needed. You should also take some time to learn about the features Slack offers, like code-formatting, etc.

A wise person once said that "99% of bioinformatics is learning how to google", and that idea is just as important when talking about computer programming. Learn how to answer a question for yourself, test out some new ideas if you're close but not quite there, and you'll be kicking ass in no time.

Textbook

Think Python: How to Think Like a Computer Scientist by Allen Downey

This is a freely-available textbook. We will follow parts of it for the class. It is also an invaluable reference text when you need to remind yourself of relatively simple Python details.

Primary language

Python 3.5.1

We are using the Anaconda Python Distribution. Be sure to download and install Python 3.5.x for your operating system (Linux/OSX/Windows). The Anaconda installers are linked below:

Why are we using Python 3.5.x?

There exists a weird schism in the world where a now (much) older version of a programming language (Python 2.7.x) is used by many developers versus the newer (and mostly improved) version of that same language (Python 3.5.x). The reasons for this are many and varied, but they largely dealt with the unavailability of many important packages in Python 3.5.x until "recently".

I would argue that the time is right for scientists to make the move to Python 3.5.x from Python 2.7.x. So, we're starting that movement. I also need to teach myself what has changed, and this is a pretty good way to do that.
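
Once the installation is done, it can help to confirm which interpreter you are actually running. The following is a minimal sketch, not part of the course materials; the function name `is_supported` is made up for illustration, and it deliberately avoids syntax that is newer than Python 3.5.

```python
import sys


def is_supported(version_info=sys.version_info, minimum=(3, 5)):
    """Return True when the interpreter is at least the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


# Report the running interpreter and whether it meets the course requirement.
print("Running Python {}.{}.{}".format(*sys.version_info[:3]))
if is_supported():
    print("This interpreter is Python 3.5 or newer.")
else:
    print("This interpreter is older than Python 3.5; install Python 3.5.x.")
```

If this reports a 2.7.x interpreter, your shell is probably finding a different Python than the one Anaconda installed.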

Why are we using Anaconda?

  1. Because it generally "just works"
  2. Offers a package manager for installing packages it's missing
  3. Has convenient "virtual" environments for testing
  4. Comes with many important, precompiled/preinstalled packages (ipython, numpy, requests, etc.)
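
One way to see which of these preinstalled packages your environment can find is sketched below. This is an illustration only: `find_missing` is a hypothetical helper, and the list holds just the three packages named above (note that the import name for ipython is `IPython`).

```python
import importlib.util


def find_missing(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the packages mentioned in the list above.
course_packages = ["IPython", "numpy", "requests"]

missing = find_missing(course_packages)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Anything reported as missing can be installed with Anaconda's package manager.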

Software License

We are releasing the contents of this course (e.g. all my notes, all of our code) under an open-source license (BSD). As a member of this course, you agree to make your assignments, code review comments, and your final projects open-source, as well. The reasons we are doing this are many, but it's one (very small) way that we can repay the debt we owe everyone else contributing to open source projects.

One requirement of your Software Project is that you release it on GitHub as an open-source project.

Grading

In accordance with the LSU grading policy, grades will be assigned using an A-F scale and the +/- system. Grading is pretty simple:

| Item | Points | # of assignments | % of grade |
|------|--------|------------------|------------|
| Class assignments | 15 each | 23 | 34.5 % |
| Code reviews | 15 each | 23 | 34.5 % |
| Project Proposal | 50 points | 1 | 5.00 % |
| Software Project Presentation | 60 points | 1 | 6.00 % |
| Software Project | 200 points | 1 | 20.0 % |
| Total | 1000 points | | 100 % |
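
The percentages follow directly from the point values. As a quick check, here is a small sketch that recomputes them; the numbers are copied from the grading table, and the variable and function names are illustrative only.

```python
# (item, points each, number of assignments) copied from the grading table
ITEMS = [
    ("Class assignments", 15, 23),
    ("Code reviews", 15, 23),
    ("Project Proposal", 50, 1),
    ("Software Project Presentation", 60, 1),
    ("Software Project", 200, 1),
]


def item_totals(rows):
    """Return (name, total points) pairs, one per graded item."""
    return [(name, points * count) for name, points, count in rows]


totals = item_totals(ITEMS)
course_total = sum(points for _, points in totals)

# Print each item's share of the course total.
for name, points in totals:
    share = 100 * points / course_total
    print("{:<30} {:>4} points {:>5.1f} %".format(name, points, share))
print("{:<30} {:>4} points".format("Total", course_total))
```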

Your Software Project grade will be assigned based on:

  1. my assessment of your work according to a rubric we will discuss in class (50 %)
  2. your classmates' assessment of your work based on this same rubric (50 %)
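
To make the 50/50 split concrete, here is a hedged sketch of how the two assessments could be combined. The syllabus does not say how several classmates' scores are merged, so averaging them is an assumption, and `project_grade` is a hypothetical name.

```python
def project_grade(instructor_points, classmate_points):
    """Blend the instructor's rubric score with classmates' rubric scores, 50/50.

    classmate_points is a list of scores on the same point scale as
    instructor_points. Averaging the classmates' scores is an assumption.
    """
    if not classmate_points:
        raise ValueError("need at least one classmate score")
    peer_mean = sum(classmate_points) / len(classmate_points)
    return 0.5 * instructor_points + 0.5 * peer_mean


# Example: the instructor gives 180 of 200; two classmates give 160 and 200.
print(project_grade(180, [160, 200]))
```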

There is no extra-credit.

Grading expectations

A significant portion of this course requires you to read, understand, and evaluate the work of others. The points that you earn for Code reviews are based on how well you do this code review. If you fail to conduct that review or your review is sloppy, you will receive reduced (or zero) points for that Code review assignment.

Although your classmates will be evaluating your assignments, I will assess the code review that you perform, as well as your performance on the assignment. I will assign a final grade for both the assignment and code review, and I will post the grades for each assignment (your submission and your code review) to Moodle, so that you can track your grade.

Grading scale

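| Points | Letter Grade Assigned |
|--------|-----------------------|
| 970-1000 | A+ |
| 930-969 | A |
| 900-929 | A- |
| 870-899 | B+ |
| 830-869 | B |
| 800-829 | B- |
| 770-799 | C+ |
| 730-769 | C |
| 700-729 | C- |
| 670-699 | D+ |
| 630-669 | D |
| 600-629 | D- |
| < 600 | F |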
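
The scale is a simple lookup from points to a letter. A minimal sketch follows, with the cutoffs copied from the grading scale. How a fractional total such as 929.5 is handled is not stated in the syllabus; this sketch assumes it falls into the lower band.

```python
# (minimum points, letter) pairs copied from the grading scale, highest first
GRADE_SCALE = [
    (970, "A+"), (930, "A"), (900, "A-"),
    (870, "B+"), (830, "B"), (800, "B-"),
    (770, "C+"), (730, "C"), (700, "C-"),
    (670, "D+"), (630, "D"), (600, "D-"),
]


def letter_grade(points):
    """Return the letter grade for a point total out of 1000."""
    for minimum, letter in GRADE_SCALE:
        if points >= minimum:
            return letter
    return "F"  # anything below 600 points


for total in (985, 930, 815, 599):
    print(total, letter_grade(total))
```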

Absentee policy

If you are in the field during the first portion of class, I will work with you. Otherwise, if you don't turn in the assignments on time, you will lose points. Class is optional, I guess. But it will greatly benefit you to show up in class for the discussion and exercises that will give you a head-start on your assignments.

There are no tests. There are no exams.

Course overview

The course will be a mix of lecture, in-class "active" learning, individual assignments, group assignments, and group projects. That keeps it fun for all of us. You will be expected to participate and complete the assignments given to you or your group. You will also be expected to contribute equally during any and all group work. If you do not, I will ensure your grade reflects that lack of participation. See Commitment to Community and Academic Integrity regarding my expectations with respect to being civil to your classmates and doing your own work.

Lecture

Some portions of our class will be lecture-based. These lectures will, for the most part, derive from the Textbook chapter or the URL provided in the Schedule. I, of course, will elaborate on some items and focus less on others, as I feel is appropriate. It would be wise for you to read the assigned reading prior to coming to class. You may want to read the same chapter, again, after lecture. Repetition is one key to learning a new language efficiently.

Class assignments

We will have assignments associated with almost every class period. To receive credit for those assignments, you will need to turn them in on time. Your submission will be assessed by your classmates, who will be doing Code Reviews for each of your assignments.

I will assign final grades for all assignments and code reviews, and I will post those to Moodle. The score that you receive on any given assignment will be based on a rubric that is located in each assignment's directory.

Code reviews

One good way to learn how to write computer code is to read, understand, and test the computer code of others. To facilitate this learning experience, you are going to be doing "in-house" code reviews of each other's assignments.

Two code reviewers will be assigned systematically to review a given assignment (for a given person). Each reviewer will assess the work of the classmate they are grading, based on a rubric I provide. The rubric will be provided along with the assignment when it is posted on [GitHub](https://github.com) (e.g. assignment-1 rubric).

I will assess the quality of each code review and assign you a grade for your review. These reviews will start out relatively simple and get more complex as the course proceeds.
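
The syllabus does not describe how reviewers are assigned, so the following is only one possible systematic scheme, shown for illustration: a simple rotation in which each submission is reviewed by the next two people on the class list. The function name and the student names are hypothetical.

```python
def assign_reviewers(students, reviewers_per_submission=2):
    """Map each author to the classmates who review that author's submission.

    Uses a rotation: the author at position i is reviewed by the students at
    positions i+1, i+2, ... (wrapping around the end of the list). Nobody
    reviews their own work, and everyone performs the same number of reviews.
    """
    n = len(students)
    if n <= reviewers_per_submission:
        raise ValueError("need more students than reviewers per submission")
    return {
        author: [
            students[(i + offset) % n]
            for offset in range(1, reviewers_per_submission + 1)
        ]
        for i, author in enumerate(students)
    }


# Hypothetical class list
roster = ["ana", "ben", "cy", "di"]
for author, reviewers in sorted(assign_reviewers(roster).items()):
    print(author, "is reviewed by", ", ".join(reviewers))
```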

Software project proposal

For your final assignment, you will be responsible for putting together a Software Project for this class that builds upon what you've learned during the course.

Prior to starting your Software Project (or your group's Software Project), and about half-way through the course (due Mar 29), you will write a 2-page proposal that:

  • Describes the problem your Software Project will attempt to solve
  • Gives the rationale for the solution you propose
  • Explains how (roughly) you plan to go about implementing a solution
  • Lists potential user-groups of the code you will write

This proposal should be heavy on the description of the problem you intend to solve; why it's important; and why it will benefit other people. The details can be lighter on implementation.

Software project

As mentioned above, your final assignment of this course will be to complete the Software Project you proposed for this class. This Software Project should build upon what you've learned during class but it is also very important that the Software Project incorporate things that we did not explicitly cover in class - your goal here is to move beyond only those things we covered in class. That could mean writing a software package that uses a new package that we never covered, creating a new package to use, scraping parts of the web in interesting ways, etc.

As part of your software project, you will make a short presentation on the last day of class that describes the problem your software package solves, your rationale for the approach you used, and how you implemented a solution, and that gives a (live) example of the program in action.

You will have several days after the live demo to fix any remaining problems with the package and address any comments from your classmates prior to the review of your final project code (your Software Project is due 4 May).

Submitting assignments

  1. Fork and clone the appropriate Assignment repository (e.g. assignment-1) to your computer
  2. Open that repository in GitHub Desktop
  3. In your cloned fork of the assignment repository, create a directory inside the answers directory and name it with your GitHub username
  4. Navigate to this directory on your computer
  5. Add the answers to the assignment questions in the README.md (e.g. https://github.com/biolprogramming/assignment-1)
  6. Commit all of the changes to your repository
  7. Push/Sync that to GitHub
  8. Make a pull request to the main biolprogramming repository
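
The directory in step 3 can be created by hand, but as a small illustration it can also be done from Python. This is a sketch under stated assumptions: `make_answer_dir` is a hypothetical helper, and the repository path and username in the usage comment are placeholders, not real values.

```python
from pathlib import Path


def make_answer_dir(repo_root, github_username):
    """Create answers/<github_username>/ inside a cloned assignment repository."""
    answer_dir = Path(repo_root) / "answers" / github_username
    # parents=True also creates "answers" if it is missing;
    # exist_ok=True makes it safe to run the function twice.
    answer_dir.mkdir(parents=True, exist_ok=True)
    return answer_dir


# Usage (placeholders, adjust to your own clone and username):
# make_answer_dir("path/to/your/clone/of/assignment-1", "your-github-username")
```

The commit, push, and pull request steps still happen in GitHub Desktop as described in the list.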

Schedule

| Date | Subject | Chapter | Due Assignment | Due Code Review |
|------|---------|---------|----------------|-----------------|
| 14 Jan | Syllabus; Prep; Installations | --- | --- | --- |
| 19 Jan | Introduction to the CLI & REPL | This [OSX] or This [Win] | Assign 1 | |
| 21 Jan | Regular Expressions & Pseudocode | This and re module | Assign 2 | Assign 1 |
| 26 Jan | Python Variables/Expressions | Chap 1 & 2 | Assign 3 | Assign 2 |
| 28 Jan | Functions Part I | Chap 3 | Assign 4 | Assign 3 |
| 2 Feb | Conditionals and Recursion | Chap 5 | Assign 5 | Assign 4 |
| 4 Feb | Functions Part II | Chap 6 | Assign 6 | Assign 5 |
| 9 Feb | (Mardi Gras) | --- | --- | --- |
| 11 Feb | Iteration | Chap 7 | Assign 7 | Assign 6 |
| 16 Feb | Class was cancelled | --- | --- | --- |
| 18 Feb | Class was cancelled | --- | --- | --- |
| 23 Feb | Strings & Lists | Chap 8 & 10 | Assign 8 | Assign 7 |
| 25 Feb | Dictionaries & Tuples | Chap 11 & 12 | Assign 9 | Assign 8 |
| 1 Mar | Files | Chap 14 | Assign 10 | Assign 9 |
| 3 Mar | Classes & objects | Chap 15 & Chap 16 | Assign 11 | Assign 10 |
| 8 Mar | Classes & methods | Chap 17 & Chap 18 | Assign 12 | Assign 11 |
| 10 Mar | The Kitchen Sink | Chap 19 | Assign 13 | Assign 12 |
| 15 Mar | PEP8, programs, modules, practices | PEP 8 | Assign 14 | Assign 13 |
| 17 Mar | TDD and Documentation | | Assign 15 | Assign 14 |
| 22 Mar | (Spring Break) | --- | --- | --- |
| 24 Mar | (Spring Break) | --- | --- | --- |
| 29 Mar | BioPython | BioPython Cookbook | Project proposal | --- |
| 31 Mar | BioPython | BioPython Cookbook | Assign 16 | Assign 15 |
| 5 Apr | numpy | numpy user guide | Assign 17 | Assign 16 |
| 7 Apr | numpy + pandas | pandas user guide [1] | Assign 18 | Assign 17 |
| 12 Apr | requests | requests manual | Assign 19 | Assign 18 |
| 14 Apr | BioPython + NCBI | --- | Assign 20 | Assign 19 |
| 19 Apr | subprocess | subprocess | Assign 21 | Assign 20 |
| 21 Apr | itertools & sqlite3 | itertools & sqlite3 | Assign 22 | Assign 21 |
| 26 Apr | speed, timing, and multiprocessing | timeit & multiprocessing | Assign 23 | Assign 22 |
| 28 Apr | Software Project Demos | --- | Software project demo | Assign 23 |
| 30 Apr | (Classes end) | --- | --- | --- |
| 4 May | Final Software Projects Due | --- | Software project | --- |

[1] No, I do not expect you to read all 1800+ pages of the pandas user guide. Read Chapters 5, 6, 8, 9, 10. Experiment w/ the examples.

Conduct

Commitment to Community

You should be familiar with the LSU Commitment to Community, which is outlined here. You should also be familiar with the LSU Code of Student Conduct, which is available here. You are expected to follow the Commitment to Community during your time in this class and when working on assignments outside of class. Students who do not respect the instructor(s) or other members of the class will be asked to leave the lecture immediately. This includes using the telephone, texting, or using the internet for non-class-related purposes during the lecture.

Academic Integrity

I take academic integrity seriously. You are expected to reference sources appropriately in your written work. You are absolutely expected to reference any third party computer code that you include in your assignments. You may reference sources in your writing using any method that you prefer (footnotes, Chicago-style, MLA-format), although I expect any referenced material to be paraphrased and cited appropriately. You may reference any computer code that you use using a URL link to the source of the code.

If you need guidance relative to appropriately paraphrasing sources, please see this link. If you need guidance relative to appropriately citing sources, please see this link. If you have remaining questions or concerns, I am happy to help you during my office hours.

You are expected to submit your own work for evaluation (or the work of your group, if a group assignment).

Academic Misconduct

If I suspect that you have committed Academic Misconduct, I am required to report the incident to the Student Advocacy and Accountability office, and they will follow up. Definitions of academic misconduct are provided here.

About

Syllabus for Computer Programming for Biologists (BIOL 7800) at Louisiana State University
