This is the course syllabus for the Fall 2018 edition of STAT 545 (click here for the STAT 547M syllabus). You should use this syllabus to:
- find information about the course, and
- navigate the course.
Both STAT 545 and STAT 547M make use of the following tools:
Tool + Link | Description |
---|---|
http://stat545.com/Classroom (here) | Think of this as the course "home" -- and this syllabus as your launch pad to other destinations. Contains lecture notes, assignments, and course information. |
Assignments and participation information | Assignments, assignment info, and participation info can be found here. |
Discussion-Internal GitHub repository | For internal discussion. The world cannot see this. |
Discussion GitHub repository | For public discussion. The world can see this. |
STAT545-UBC-students GitHub Organization | This will contain one GitHub repository per student, for you to submit homework to and give peer reviews. |
UBC canvas | This is for grade management. You'll be interacting with it by submitting a link to your homework. |
http://stat545.com | This holds course content, such as tutorials. Think of this as a textbook. We'll point you there when needed. It previously also held the information now found in Classroom, but that eventually became confusing. Some headers there are being deprecated. |
This framework of tools is under construction, as we move to a solution that's more scalable in terms of future iterations and multiple collaborators. We appreciate your patience and welcome your feedback!
STAT 545 is "Part I" of learning how to
- explore, groom, visualize, and analyze data
- make all of that reproducible, reusable, and shareable
- using R
Part II is STAT 547M -- hope to see you there, too!
Credits: 1.5
- Introduction to R and the RStudio IDE: scripts, the workspace, RStudio Projects, daily workflow
- Generate reports from R scripts and R Markdown
- Coding style, file and project organization
- Data frames or "tibbles" are the core data structure for data analysis: care for them with the tidyverse
- Data visualization with ggplot2
- How to write functions and work with R in a functional style
- Version control with Git; collaboration via GitHub
This course runs from September 4 until October 18, 2018.
We'll meet as a class every Tuesday and Thursday, 09:30-11:00, in ESB 2012.
I'll aim to end class at 10:45.
There are no official pre-requisites for STAT 545A, but most students will have had at least one prior statistics course or comparable experience.
Jenny Bryan deserves a huge amount of credit for founding and developing both STAT 545 and 547M over many years, along with her TAs. Some of their content is even used in this very syllabus. Thank you!
Here are the topics and links to the notes for each class meeting.
Warning: adjustments are still being made!
Meeting No. | Date | TAs | Topic | Resources |
---|---|---|---|---|
01 | sep-04 tues | Chad, Sherrie | Intro to course and software | |
02 | sep-06 thurs | Chad, Hossam | Markdown and GitHub | Tutorials for getting started with markdown and GitHub |
03 | sep-11 tues | Rashedul, Sherrie | Getting familiar with R & RMarkdown | R: stat545: hello r, or adv-r: data structures for a more advanced intro. Rmd: stat545: Rmd test drive. |
04 | sep-13 thurs | Rashedul, Hossam | The git workflow; collab with GitHub | |
05 | sep-18 tues | Chad, Sherrie | Working with data in R; dplyr and the tidyverse | |
06 | sep-20 thurs | Chad, Rashedul | ggplot2, Part I | |
07 | sep-25 tues | Hossam, Sherrie | ggplot2, Part II | |
08 | sep-27 thurs | Hossam, Rashedul | Grouping of tibbles | |
09 | oct-02 tues | Rashedul, Sherrie | Tidy data, reshaping | |
10 | oct-04 thurs | Rashedul, Chad, Hossam | Guest Lecture: Rashedul, on table joins | |
11 | oct-09 tues | Hossam, Sherrie | Advanced R programming; file I/O | |
12 | oct-11 thurs | Hossam, Rashedul | The joy of Factors | |
13 | oct-16 tues | Rashedul, Sherrie | Revisit ggplot, practicalities of daily figure-making | |
14 | oct-18 thurs | Rashedul, Chad | The model-fitting paradigm in R; broom; deep thoughts about data analytic work | |
Expectations:
- Show up to every class, and
- Bring a laptop to every class
To gain marks in this course, you'll be completing five assignments, and submitting two peer reviews for each assignment. Participation counts too!
NOTE: To earn marks in this course, you must first have a (free) GitHub account and then complete the course survey.
Here's the breakdown of your course grade:
Assessment | Weight |
---|---|
5 Assignments | 75% (15% per assignment) |
10 Peer Reviews | 15% (3% per assignment) |
Participation | 10% |
There is no final exam.
Auditing students must still complete and submit all assessments, to be graded on a pass/fail basis.
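As a rough illustration of how the weights in the table above combine, here is a minimal Python sketch. It assumes every component is scored as a percentage and that each assignment (and each assignment's pair of peer reviews) carries equal weight, as the per-assignment figures suggest. The function name `course_grade` and the example scores are hypothetical and not part of any course tooling.

```python
# Weights in percentage points, taken from the assessment table.
ASSIGNMENT_WEIGHT = 15.0     # per assignment; 5 assignments -> 75
PEER_REVIEW_WEIGHT = 3.0     # per assignment's pair of reviews; 5 pairs -> 15
PARTICIPATION_WEIGHT = 10.0
N_ASSIGNMENTS = 5


def course_grade(assignment_scores, peer_review_scores, participation):
    """Combine component scores (each 0-100) into an overall percentage.

    Hypothetical helper: peer_review_scores holds one score per assignment,
    covering both reviews submitted for that assignment.
    """
    if len(assignment_scores) != N_ASSIGNMENTS:
        raise ValueError("expected 5 assignment scores")
    if len(peer_review_scores) != N_ASSIGNMENTS:
        raise ValueError("expected 5 peer review scores")
    total = sum(s / 100 * ASSIGNMENT_WEIGHT for s in assignment_scores)
    total += sum(s / 100 * PEER_REVIEW_WEIGHT for s in peer_review_scores)
    total += participation / 100 * PARTICIPATION_WEIGHT
    return total


# Made-up scores, for illustration only.
print(round(course_grade([80, 90, 85, 95, 100], [100, 100, 100, 50, 100], 90), 1))
```

How individual assignments and reviews are actually marked is described on the assignments page; this sketch only shows the weighted sum.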
Assignments and peer review: For information about and links to assignments and peer reviews, go to the assignments page.
Participation: See the participation page.
Here is your dedicated teaching team!
Teaching Member | Position | Contact |
---|---|---|
Vincenzo Coia | Instructor | Email: [email protected] GitHub: @vincenzocoia Twitter: @VincenzoCoia LinkedIn: vincenzocoia |
Chad Fibke | Teaching Assistant | GitHub: @ChadFibke |
Hossameldin Mohammed | Teaching Assistant | GitHub: @hsmohammed; LinkedIn |
Rashedul Islam | Teaching Assistant | GitHub: @rashedul; LinkedIn |
Sherrie Lau | Teaching Assistant | GitHub: @sherrie9 |
Please see the "Conversation" section below to determine who to get in touch with for what, and how.
Office hours: Want to talk about the course outside of lecture? Let's talk during these dedicated times (generally, 11:00-12:00 every Monday, Tuesday, and Wednesday). You're always welcome to schedule alternative times, too.
Teaching Member | Date | Time | Place |
---|---|---|---|
Rashedul | Tue, Sept 04 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Sept 05 | 11:00 - 12:00 | ESB 1043 |
Sherrie | Mon, Sept 10 | 11:00 - 12:00 | ESB 3174 |
Chad | Tue, Sept 11 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Sept 12 | 11:00 - 12:00 | ESB 1043 |
Hossam | Mon, Sept 17 | 11:00 - 12:00 | ESB 3174 |
Rashedul | Tue, Sept 18 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Sept 19 | 11:00 - 12:00 | ESB 1043 |
Sherrie | Mon, Sept 24 | 11:00 - 12:00 | ESB 3174 |
Rashedul | Tue, Sept 25 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Sept 26 | 11:00 - 12:00 | ESB 1043 |
Chad | Mon, Oct 01 | 11:00 - 12:00 | ESB 3174 |
Hossam | Tue, Oct 02 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Oct 03 | 11:00 - 12:00 | ESB 1043 |
Sherrie | Mon, Oct 08 | 11:00 - 12:00 | ESB 3174 |
Chad | Tue, Oct 09 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Oct 10 | 11:00 - 12:00 | ESB 1043 |
Chad | Mon, Oct 15 | 11:00 - 12:00 | ESB 3174 |
Hossam | Tue, Oct 16 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Oct 17 | 11:00 - 12:00 | ESB 1043 |
Sherrie | Mon, Oct 22 | 11:00 - 12:00 | ESB 3174 |
Rashedul | Tue, Oct 23 | 11:00 - 12:00 | ESB 3174 |
Vincenzo | Wed, Oct 24 | 11:00 - 12:00 | ESB 1043 |
Do you need my permission to enroll in or audit this course? Space is limited, but I'm happy to sign the Change of Registration form if there's still room.
Are you stuck? First, try to get unstuck by yourself by following the advice of stat545.com: help-general.
While you are getting started, we recommend you seek help within the STAT545 community first, before turning elsewhere (e.g., posting to external forums). We are more cuddly.
The instructor and TAs stand ready to assist you, but your peers will also be a great source of good questions and answers. For that reason, we encourage you to seek help in ways that are visible to others.
The options below are listed roughly in order of preference. But we realize every situation is different and your comfort level with these approaches will change as you learn more.
- Want to talk about content/coding issues? Post an Issue in the Discussion (public) repository.
- Want to talk about the course? Post an Issue in the Discussion-Internal (private) repository.
- Want to talk in person? Come visit us during office hours!
- Want to privately contact Vincenzo? Feel free to send me an email.
- I look forward to receiving your email, though I do encourage you to post in one of the Discussion repositories unless it's really not appropriate for either platform.
Some advice on opening an Issue on GitHub:
- Give it a specific title.
- BAD: "aaaaaarrrrrrgh!", "things not working", "i need help"
- GOOD: 'error when indexing a matrix: "incorrect number of dimensions"'
- Stay specific and be complete-but-concise in the body of the description. Don't expect your helper to play 20 questions with you.
- (Optional) Tag someone:
- To get the attention of the teaching team, add the @2018_teaching_team tag to notify all five of us.
- To get the attention of your fellow students, add the @2018_students tag to notify them.
- Don't just create Issues -- also respond to them! Think about this in terms of adding to the conversation, not in terms of "correctness".
- Don't forget to click "Submit new issue"!
Typically, this will trigger an email to the person/team you tagged. The title of your issue will be in the subject line, so I repeat, make it specific. Your description will become the body of the email. At the bottom will be a link to the issue on GitHub.
If all goes well, your helper will respond. I almost always do this directly via GitHub, though simply replying to the email basically works. In any case, this back-and-forth will show up as a series of comments on your original issue. It's like an email dialogue but better:
- It's embedded in a relevant Organization/project/repo, so it will be easier to find later vs. digging out of your giant vat of unfiled email.
- It's potentially visible to others (depending on the repo), which could save us from asking/answering the same questions repeatedly.
- The whole discussion will be mirrored via email, so that still serves as a great way to prompt participants to tune in.
- Later you can get fancy and refer to commits and other issues within the repo in slick ways.
Once the problem is resolved, the issue can be closed. Note that closed issues remain accessible, in case anyone needs to consult them in the future.
Here are the resources we will be referring to throughout the course, along with a brief description of the resource.
Overarching resources:
- http://stat545.com
- As mentioned earlier, this website can be thought of as a textbook for STAT 545/547.
- R for Data Science (aka "r4ds"), by Garrett Grolemund and Hadley Wickham.
- STAT 545/547 closely mirrors the topics of this book, making this book more of a true "textbook" for the course.
Resources for more specific topics:
- Happy git with R, by Jenny Bryan and the STAT 545 TAs
- Great for marrying git, GitHub, R, and RStudio in your workflow.
- Advanced R, by Hadley Wickham
- If you want to learn more about R as a programming language, this is a very readable and concise way of doing so.
- ggplot2 book, by Hadley Wickham
- Useful for digging deeper into ggplot2.
- RMarkdown book, by Yihui Xie et al.
- Brand new! Looks like a comprehensive resource for everything R Markdown related.