diff --git a/homework.html b/homework.html new file mode 100644 index 0000000..f8766b3 --- /dev/null +++ b/homework.html @@ -0,0 +1,647 @@ + + + + + + + + + + + + + +SYS 6018: Homework + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +



+

+All assignments are submitted on the course +Canvas +site. +

+
+
+

Homework #0 (.qmd)

+
+

Due: Thu Jan 16 3:30pm (optional; not for credit)

+
+
+
+

Homework #1 (.qmd)

+
+

Due: Thu Jan 30 3:30pm

+
+
+
+

Homework #2 (.qmd)

+
+

Due: Thu Feb 06 3:30pm

+
+
+
+

Homework #3 (.qmd)

+
+

Due: Thu Feb 13 3:30pm

+
+
+
+

Homework #4 (.qmd)

+
+

Due: Thu Feb 20 3:30pm
Note: This is an independent assignment. +All work done on your own.

+
+
+
+

Homework #5 (.qmd)

+
+

Due: Thu Feb 27 3:30pm

+
+
+
+

Homework #6 (.qmd)

+
+

Due: Thu Mar 06 3:30pm

+
+
+
+

Homework #7 (.qmd)

+
+

Due: Thu Mar 27 3:30pm

+
+
+
+

Homework #8 (.qmd)

+
+

Due: Thu Apr 03 3:30pm
Note: This is an independent assignment. +All work done on your own.

+
+
+
+

Homework #9 (.qmd)

+
+

Due: Thu Apr 10 3:30pm

+
+
+
+

Homework #10 (.qmd)

+
+

Due: Thu Apr 17 3:30pm

+
+
+
+

Final Exam

+
+

Due: Sat May 03 5pm
Note: This is an independent assignment. +All work done on your own.

+
+
+ + + + + + + + + +
+ + + + + + + + + + + + + + + diff --git a/index.html b/index.html index ad399ce..1823cfa 100644 --- a/index.html +++ b/index.html @@ -13,4 +13,956 @@ SYS 6018: Schedule + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + +



+
+

Course Schedule

+ ++++++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ClassDateDayTopicReading (Pre-Class)Notes
1Jan 13MonIntroductionPre-Req material; R and RStudio
Modern Data Science with +R sections: 2,3,4,6,7,9
HW 0 assigned; Download textbooks; update software; +review pre-req material
2Jan 15WedLinear RegressionISL 3.1-3.6R formula +interface
Jan 16ThuHW 0 Due
Jan 20MonNo Class
3Jan 22WedSupervised Learning IISL 1, 2.1, 7.1
ESL 1, 2.1-2.4
4Jan 27MonSupervised Learning IIISL 2.2-2.3
ESL 2.5-2.9
5Jan 29WedResampling: Bootstrap and SplinesISL 5.2, 7.2-7.4
ESL 7.11, 5.1-5.3
(optional) +Bootstrapping +Regression Models
Jan 30ThuHW 1 Due
6Feb 03MonResampling: Cross-Validation and Model SelectionISL 5.1
ESL 7.10
ISL Chap 3 (review)
7Feb 05WedPenalized RegressionISL 6.1-6.2
ESL 3.3-3.4
Feb 06ThuHW 2 Due
8Feb 10MonPenalized RegressionISL 6.4-6.5 (optional 6.3)
ESL 3.5-3.6
9Feb 12WedTree-Based MethodsISL 8.1, 8.3.1-8.3.2
ESL 9.2
Feb 13ThuHW 3 Due
10Feb 17MonTree-Based MethodsISL 8.2, 8.3.3-8.3.4
ESL 15
11Feb 19WedClassification: Probability ModelingISL 4.1-4.3
ESL 4.1-4.4
Feb 20ThuHW 4 Due
12Feb 24MonClassification: Decision TheoryESL 9.1
13Feb 26WedSupport Vector Machines (SVM)ISL 9.1-9.6
ESL 12.1-12.3 or MMDS 12.3
Feb 27ThuHW 5 Due
14Mar 03MonPrediction Bias and CalibrationCalibration: +the Achilles heel of predictive analytics
15Mar 05WedReview
Mar 06ThuHW 6 Due
Mar 10MonNo Class
Mar 12WedNo Class
16Mar 17MonNo Class
17Mar 19WedEnsembles and StackingISL 8.2
18Mar 24MonBoostingISL 8.2.3; ESL 10.1-10.9
LogitBoost +paper
19Mar 26WedBoostingESL 10.10-10.14
XGBoost, CatBoost, LightGBM
Mar 27ThuHW 7 Due
20Mar 31MonSpecial topics in boostingGeneralized +Additive Models (GAM)
21Apr 02WedFeature Importance and XAIInterpretable +Machine Learning: Permutation Feature Importance
Apr 03ThuHW 8 Due
22Apr 07MonFeature EngineeringISLR 6.3
23Apr 09WedDensity/Probability EstimationIPSUR: +pg. 209-218 (MLE intro)
Maximum +Likelihood Estimation: pg. 221-223
IPSUR +Chap. 5-7 (Random Variable Review, if necessary)
Apr 10ThuHW 9 Due
24Apr 14MonDensity/Probability EstimationSilverman +2.1-2.6
25Apr 16WedGenerative ClassifiersISL 4.4-4.5; ESL 6.6
Apr 17ThuHW 10 Due
26Apr 21MonClustering 1Primary: ISL 12.4-12.5; ITDM 7.1; 7.5
Secondary: +ESL 14.3.1-14.3.8; 14.3.12; ITDM 7.2-7.3
27Apr 23WedClustering 2Primary: ESL 8.5.1
Secondary: Gaussian Mixture Models: +11.1-11.3
28Apr 28MonTBD
May 03SatFinal Exam
Due 5pm
+
+ +
+ + + + + + + + + +
+ + + + + + + + + + + + + + diff --git a/syllabus.html b/syllabus.html new file mode 100644 index 0000000..3200dbc --- /dev/null +++ b/syllabus.html @@ -0,0 +1,2270 @@ + + + + + + + + + + + + + +SYS-6018: Syllabus + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + + + +



+
+

Course Info

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + +
Course Info
Class Time:Mon, Wed 12:30 - 1:45pm
Class Location:Rice 011
Course Canvas site:https://canvas.its.virginia.edu/courses/133118
Course Teams site:SYS 6018 Teams (find join code on +Canvas)
+ ++++ + + + + + + + + + + + + + + + + + + + + +
InstructorDr. Michael D. Porter
Email:mdp2u {at} virginia.edu
Office:TBD and Zoom
Office Hours:Wed 2:00-3:00pm (and by appt.)
+ ++++ + + + + + + + + + + + + + + + + +
TASamarth Singh
Email:ss9vz {at} virginia.edu
Office Hours:TBD
+
+
+
+

Course Delivery

+

The course will be delivered live and in-person. There may be +occasional remote synchronous or pre-recorded lectures. Any such +lectures will be recorded and available on Canvas.

+
+
+
+

Course Prerequisites

+

Students taking this course should have prior knowledge in linear +regression analysis (e.g., SYS/STAT 4021/6021, STAT 5120), statistical +inference (e.g., APMA 3120), and linear algebra (e.g., APMA 3080). +Students should also have a basic working knowledge in a scientific +programming language (e.g., R, Python, Matlab). All course examples will +be in R (tidyverse dialect).

+
+
+
+

Course Description

+

Fundamentals of data mining and machine learning within a common +statistical framework. Topics include regression, classification, +clustering, resampling, regularization, tree-based methods, ensembles, +boosting, and predictive bias/calibration.

+
+
+
+

Student Learning Objectives

+

Students will learn how and when to use common data mining and +statistical learning methods, understand their comparative strengths and +weaknesses, and how to critically evaluate their performance. Students +completing this course should be able to: (i) construct and apply modern +statistical learning methods for predictive modeling, (ii) use +unsupervised learning methods to find patterns and structure in data, +(iii) properly select, tune, and evaluate models.

+
+
+
+

Required Textbooks

+
    +
  1. An Introduction to Statistical Learning (2nd) by +James, Witten, Hastie and Tibshirani. +
      +
    • An electronic version of this book is freely available at https://statlearning.com/. This book provides a less +technical description of common statistical learning methods.
    • +
  2. +
+
    +
  1. The Elements of Statistical Learning: Data Mining, +Inference, and Prediction (2nd Edition) by Hastie, Tibshirani, +and Friedman. +
  2. +
  3. Mining of Massive Datasets by Jure Leskovec, Anand +Rajaraman, Jeff Ullman +
      +
    • An electronic version of this book is freely available at http://www.mmds.org/. We +will only cover some parts of this text.
    • +
  4. +
  5. Introduction to Data Mining (Second Edition) by +Tan, Steinbach, Karpatne, and Kumar. +
  6. +
+
+
+
+

Other Course Materials

+
    +
  • This course requires the use of the following statistical and +typesetting software:

    +
      +
    • R (http://cran.us.r-project.org) is a free programming +language for statistical computing, graphics, and machine learning. I am +using R 4.4.2. It is recommended that you update to +this version or newer.
    • +
    • RStudio is a free IDE for R (https://posit.co/downloads/). I am using RStudio +2024.09.0+375. It is recommended that you update to this +version or newer.
    • +
  • +
  • Quarto (https://quarto.org/docs/get-started/) is a free technical +publishing system that replaces RMarkdown. We will use Quarto documents +for homework. Version 1.5.56 or higher is required.

    +

  • +
  • Other course materials and reading assignments will come from +instructor notes and recent journal articles.

  • +
+ +
    +
  • The free textbook Modern Data Science with +R by Baumer, Kaplan, and Horton is an undergrad level “Intro to Data +Science” course. It covers tidyverse, statistical inference, and basic +intro to many of the methods we will study this semester. This would +provide a good overall preparation or handy reference.

  • +
  • The free textbook Feature +Engineering and Selection: A Practical Approach for Predictive +Models by Kuhn and Johnson provides a more in-depth coverage of +feature engineering than we will be able to do in this course.

  • +
  • The free textbook Hands-on Machine Learning +with R by Boehmke and Greenwell gives R code with some helpful +details for most of the methods we will cover. This can be a handy +reference.

  • +
  • The free textbook Interpretable +Machine Learning by Christoph Molnar is described as A Guide for +Making Black Box Models Explainable and covers topics such as +feature importance and how to measure the influence of a feature on the +predictions (e.g., Shapley, Partial Dependence).

  • +
  • The free textbook Introduction to +Modern Statistics by Mine Çetinkaya-Rundel and Johanna Hardin is an +accessible introduction to modern (i.e., resampling based) statistical +inference. If you feel you are still missing the big picture of +statistical inference, this is a good place to start.

  • +
  • The free textbook Math for +Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng +Soon Ong is a good reference for the mathematical concepts helpful for +machine learning. Chapters 1-7 provide a good foundation for this +course.

  • +
  • The free textbook Forecasting: +Principles and Practice 3e by Rob J Hyndman and George +Athanasopoulos provides a great introduction to time series data and +forecasting.

  • +
+
+
+
+

Course Assessment

+
    +
  • The course grade will be based on ten homework assignments (65%), +reading quizzes (10%), course participation (in class and on Teams) +(5%), and a final exam (20%).

  • +
  • A: \(\ge\) 92%, A-: 90-91%, B+: +88-89%, B: 82-87%, B-: 80-81%, etc.

    +
      +
    • A+: awarded rarely for exceptional work
    • +
  • +
  • There is no grade “curving” in this course.

    +
      +
    • There will be no make-up homework, exams, projects, or quizzes.
    • +
    • Note: There will be no “extra credit” assignments; spend your time +on the assigned work.
    • +
  • +
  • All homework assignment dates are posted in the course homework page. Note these now so there +are no conflicts.

  • +
  • All assignment submissions will be made through Canvas. You are +given a grace period of 3 minutes for late submissions; the +time stamps produced by Canvas will be the authoritative reference for +all such decisions. If you have special circumstances (e.g., a +documented physical condition) that prevent you from adhering to the +posted deadlines, please inform me at least 1 week in advance of the +deadline so that I can make arrangements to accommodate you.

  • +
+
+

Homework

+
    +
  • The 10 homeworks are each worth 50 pts (500 pts total). Your +homework percentage will be min(HW total, 475)/475 allowing you to +effectively drop low scoring problems. Another way to view this policy +is that receiving an unadjusted 95% will give you full homework credit. +
      +
    • Several homeworks (see homework page) +will be treated like an exam; they are required and must be +completed independently (with no help from classmates).
    • +
    • You can discuss and work with classmates on the other homework +assignments, but what you submit must be in your own words (and code). +See Honor Code for more details.
    • +
  • +
  • Homework will be submitted as Quarto source (which will contain the +code) and the compiled html. +
      +
    • Quarto will +produce the html and contain the code.
    • +
    • All code must be easy to follow (e.g., by good commenting)
    • +
    • Mathematical symbols follow LaTeX +notation.
    • +
  • +
  • You will self-assess your homework assignments. The purpose +of this is to allow you to actively compare your answers to the +solutions as the course progresses (instead of reviewing occasionally or +only at end of the course). This will provide immediate guidance if your +solutions are incorrect, show you improved coding, and give you +additional questions to ponder. +
      +
    • The TA will assign points; your only responsibility is to +indicate what you did wrong or didn’t complete. This is also a place to +ask questions if you aren’t sure if your solution is correct.
    • +
    • You will receive (+2) bonus points on each homework assignment that +you accurately self-grade within 2 days of the posted +solutions.
    • +
  • +
+
+
+
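The capped homework formula above can be checked with a short calculation. The sketch below is illustrative only and is written in Python rather than the course's R. The function name `homework_percentage` and the example scores are hypothetical. It assumes "HW total" is the plain sum of the ten homework scores; whether the self-grading bonus points count toward that total is not stated here.

```python
HW_CAP = 475  # cap from the syllabus: min(HW total, 475) / 475


def homework_percentage(scores):
    """Return the homework percentage (0-100) under the capped-total policy."""
    hw_total = sum(scores)  # assumption: HW total is the plain sum of scores
    return 100 * min(hw_total, HW_CAP) / HW_CAP


# Hypothetical scores (out of 50 each) for the ten homeworks
example_scores = [50, 45, 48, 50, 40, 50, 47, 50, 50, 45]
print(f"{homework_percentage(example_scores):.1f}%")
```

The example scores total 475 of 500 points (95% unadjusted), which the cap maps to full homework credit.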

Quizzes

+
    +
  • There will be around 24 pre-class reading quizzes (due before the +start of class) each worth 1 point. Your quiz percentage will be +min(Quiz Total, 20)/20.

    +
      +
    • This effectively allows you to drop the 4 lowest quiz scores.
    • +
  • +
  • The pre-class quizzes are to encourage you to prepare for the +lectures.

  • +
  • Quizzes will be completed in Canvas/Quizzes.

  • +
+
+
+

Course Participation

+

Your course participation grade is to encourage robust discussion +about the course material. I’ve found that students and the professor +often learn valuable insights from open discussion. You can earn your +participation score from in-class activity and/or posting questions or +responses on the course Teams page.

+

Full credit is equivalent to participating at least 1 time per +week.

+
+
+

Final Exam

+

The final exam will be a comprehensive review of all course materials +including lectures, readings, and homeworks.

+
+
+
+
+
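To see how the four assessment components combine, the weights from the Course Assessment section can be applied in a few lines. This is a rough sketch in Python under stated assumptions, not the instructor's grading code: the function name `course_percentage` is hypothetical, and participation and the final exam are assumed to be expressed as fractions between 0 and 1.

```python
# Component weights from the Course Assessment section
WEIGHTS = {"homework": 0.65, "quizzes": 0.10, "participation": 0.05, "final": 0.20}


def course_percentage(hw_total, quiz_total, participation, final_exam):
    """Combine the components into a course percentage (0-100).

    hw_total: sum of homework points (capped at 475)
    quiz_total: sum of quiz points (capped at 20)
    participation, final_exam: fractions in [0, 1] (assumed representation)
    """
    components = {
        "homework": min(hw_total, 475) / 475,
        "quizzes": min(quiz_total, 20) / 20,
        "participation": participation,
        "final": final_exam,
    }
    return 100 * sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical student: 450 homework points, 18 quiz points,
# full participation, and 85% on the final exam
print(round(course_percentage(450, 18, 1.0, 0.85), 1))
```

The sketch stops at the percentage. Mapping it to a letter grade would need the rounding rule at the cutoffs, which is not specified here.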

Course Outline

+
    +
  • Bias-Variance Trade-off
  • +
  • Penalized Regression
  • +
  • Nonparametric Methods
  • +
  • Classification and Probability Modeling
  • +
  • Support Vector Machines
    +
  • +
  • Trees and Random Forest
  • +
  • Ensembles and Boosting
  • +
  • Resampling Methods
  • +
  • Feature Engineering and Importance
  • +
  • Predictive model evaluation
  • +
  • Non-parametric Density Estimation
  • +
  • Clustering
  • +
+
+
+
+

Course Management

+
    +
  • Most course material will be available from the class webpage
  • +
  • All assignments (e.g., homeworks, quizzes, exams) will be submitted +in Canvas
  • +
  • Announcements may be made by email or on Teams
  • +
  • Course Discussion on Teams +
      +
    • We will be using Teams +for class discussion. Rather than emailing questions to the teaching +staff, I encourage you to post your questions here.
    • +
    • The teaching staff will always check discussions during our office +hours and possibly at other times.
    • +
    • Please feel free to answer questions from other students, but use +your discretion in not directly providing specific solutions to a +homework problem (e.g., don’t give the code that directly answers a +question).
    • +
    • Also, please post any discussion questions or material that you want +input from the class and instructors.
    • +
  • +
+
+
+
+

Recording of classroom lectures

+

In the event that I or a large number of students cannot attend class +in person, I will record the lecture on Zoom.

+

Because lectures may include fellow students, you and they may be +personally identifiable on the recordings. These recordings may +only be used for the purpose of individual or group +study with other students enrolled in this class during this semester. +You may not distribute them in whole or in part through any other +platform or to any persons outside of this class, nor may you make your +own recordings of this class unless written permission has been obtained +from the Instructor and all participants in the class have been informed +that recording will occur. If you want additional details on this, +please see Provost Policy +005.

+
+
+
+

Academic Calendar

+

Important dates for the semester can be found on the academic +calendar: http://www.virginia.edu/registrar/calendar.html

+
+
+
+

Policy on Academic Misconduct (Honor Code)

+

I trust every student in this course to fully comply with all +provisions of the University’s Honor Code and work together to maintain +UVA’s Community of +Trust. By enrolling in this course, you have agreed to abide by and +uphold the Honor System of the University of Virginia, as well as the +following policies specific to this course.

+
    +
  • All submitted work must be pledged.
  • +
  • You are not permitted to submit any work after you have accessed the +solutions. Be careful not to accidentally view the solutions before your +final submission.
  • +
  • All work must be completed individually unless specific permissions +are given on the assignment. +
      +
    • Homework and in-class exercises can be discussed with classmates, +but the final write-up, code, and solutions must be your own. List the +names of who you worked with (like a citation).
    • +
    • The individual homework sets must be done completely on your own. +You are not to discuss exams with anyone except the teaching staff.
    • +
    • You are not permitted to copy code. You will no doubt come across +examples on the internet. You can consult them to help understand the +concept or process, but code in your own words.
    • +
  • +
  • It is a scholarly responsibility to attribute all your work. This +includes figures, code, ideas, etc. Think of it this way: Will someone +who reads your submission think that it is your original idea, figure, +code, etc.? Add a link and/or reference to all sources you used to solve +a problem. It is really of no value to you when you just copy someone +else’s solutions (other than to preserve a grade that you didn’t +earn).
  • +
  • It is not always easy to tell what qualifies as a violation, so do +not be afraid to talk to me about it. Such discussions do not imply +guilt of any kind.
  • +
  • All suspected violations will be forwarded to the Honor Committee, +and you may, at my discretion, receive an immediate zero on that +assignment regardless of any action taken by the Honor Committee.
  • +
+

Please let me know if you have any questions regarding the course +Honor policy. If you believe you may have committed an Honor Offense, +you may wish to file a Conscientious Retraction by calling the Honor +Offices at (434) 924-7602. For your retraction to be considered valid, +it must, among other things, be filed with the Honor Committee before +you are aware that the act in question has come under suspicion by +anyone. More information can be found at http://honor.virginia.edu. Your Honor representatives +can be found at: http://honor.virginia.edu/representatives.

+
+

Generative AI Policy

+

Generative AI (GenAI), like ChatGPT, is a new disruptive technology +that has the potential to fundamentally change how we learn, code, and +do data science. However, there is little guidance on when and how to +use GenAI for learning. As such, I don’t feel very confident in +recommending or restricting its use. Therefore, there are no +Generative AI restrictions in this course. However, be sure to +follow the honor policy as stated above. You cannot copy code and must +attribute and detail if and how you used GenAI in the assignments.

+

GenAI tools can be an especially great resource for troubleshooting +and improving code. However, they can also limit your ability to +learn good coding if you become too dependent. You know you are +depending too heavily on GenAI when you can’t think how to begin a +problem.

+

I do not think GenAI is currently reliable enough to trust for +conceptual understanding. I still recommend the assigned reading and +references found in the course notes for additional learning resources. +If GenAI hallucinates in producing code, you will be able to see right +away that it does not produce the desired result. However, if it +hallucinates about how a model works or perpetuates common +misconceptions on methodology, you may not know about it for a long +time.

+
+
+
+
+

Disability Statement

+

The University of Virginia strives to provide accessibility to all +students. If you require an accommodation to fully access this course, +please contact the Student Disability Access Center (SDAC) at (434) +243-5180 or . If you are unsure if you require an +accommodation, or to learn more about their services, you may contact +the SDAC at the number above or by visiting their website at http://studenthealth.virginia.edu/student-disability-access-center/faculty-staff.

+
+
+
+

Your Well Being

+

The University of Virginia and SEAS serve as a safe space for +students and aim to promote your well-being. If you are feeling +overwhelmed, stressed, or isolated, there are many individuals here who +are ready and wanting to help. If you wish, you can make an appointment +with me to discuss in private. Alternatively, the Student Health Center +offers Counseling and Psychological Services (CAPS) https://www.studenthealth.virginia.edu/caps. If you +prefer to speak anonymously and confidentially over the phone, call +Madison House’s HELP Line 24/7 at 434-295-8255 https://www.madisonhouse.org/overview-helpline/. +Engineering undergraduates are supported through an array of student +support services including peer-to-peer tutoring, professional +academic coaching, access to mental health support, and dedicated +advising. Graduate Engineering students can find similar student +support resources. If you are in another school, you can contact the +above Engineering resources and they will help direct you to the +appropriate resources.

+

If you or someone you know is struggling with gender, sexual, or +domestic violence, there are many community and University of Virginia +resources available. The Office of +the Dean of Students, Sexual +Assault Resource Agency (SARA), and UVA Women’s Center are +ready and eager to help. Contact the Director of Sexual and Domestic +Violence Services at 434-982-2774.

+
+
+
+

Discrimination and power-based violence

+

The University of Virginia is dedicated to providing a safe and +equitable learning environment for all students. To that end, it is +vital that you know two values that I and the University hold as +critically important:

+
    +
  1. Power-based personal violence will not be tolerated.
  2. +
  3. Everyone has a responsibility to do their part to maintain a safe +community on Grounds.
  4. +
+

If you or someone you know has been affected by power-based personal +violence, more information can be found on the UVA Sexual Violence +website that describes reporting options and resources available +<www.virginia.edu/sexualviolence>. As your professor and as a +person, know that I care about you and your well-being and stand ready +to provide support and resources as I can. As a faculty member, I am a +responsible employee, which means that I am required by University +policy and federal law to report what you tell me to the University’s +Title IX Coordinator. The Title IX Coordinator’s job is to ensure that +the reporting student receives the resources and support that they need, +while also reviewing the information presented to determine whether +further action is necessary to ensure survivor safety and the safety of +the University community. If you wish to report something that you have +seen, you can do so at the Just Report It portal. The +worst possible situation would be for you or your friend to remain +silent when there are so many here willing and able to help.

+
+
+
+

Religious Accommodations

+

Students who wish to request academic accommodation for a religious +observance should submit their request to me by email as far in advance +as possible. If you have questions or concerns about your request, you +can contact the University’s Office for Equal Opportunity and Civil +Rights (EOCR) https://eocr.virginia.edu/accommodations-religious-observance. +Accommodations do not relieve you of the responsibility for completion +of any part of the coursework you miss as the result of a religious +observance.

+
+ + + + + + + + +
+
+ +
+ + + + + + + + + + + + + + + + diff --git a/syllabus_online.html b/syllabus_online.html new file mode 100644 index 0000000..13b986a --- /dev/null +++ b/syllabus_online.html @@ -0,0 +1,2268 @@ + + + + + + + + + + + + + +SYS-6018: Syllabus + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + + + +



+
+

Course Info

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + +
Course Info
Class Time:Lectures posted Mon, Wed (Asynchronous)
Class Location:Online
Course Canvas site:https://canvas.its.virginia.edu/courses/133118
Course Teams site:SYS 6018 Teams (find join code on +Canvas)
+ ++++ + + + + + + + + + + + + + + + + + + + + +
InstructorDr. Michael D. Porter
Email:mdp2u {at} virginia.edu
Office:TBD and Zoom
Office Hours:Wed 2:00-3:00pm (and by appt.)
+ ++++ + + + + + + + + + + + + + + + + +
TASamarth Singh
Email:ss9vz {at} virginia.edu
Office Hours:TBD
+
+
+
+

Course Delivery

+

The course lectures will be delivered asynchronously. Recorded +lectures will be posted in Canvas on Mondays and Wednesdays.

+
+
+
+

Course Prerequisites

+

Students taking this course should have prior knowledge in linear +regression analysis (e.g., SYS/STAT 4021/6021, STAT 5120), statistical +inference (e.g., APMA 3120), and linear algebra (e.g., APMA 3080). +Students should also have a basic working knowledge in a scientific +programming language (e.g., R, Python, Matlab). All course examples will +be in R (tidyverse dialect).

+
+
+
+

Course Description

+

Fundamentals of data mining and machine learning within a common +statistical framework. Topics include regression, classification, +clustering, resampling, regularization, tree-based methods, ensembles, +boosting, and predictive bias/calibration.

+
+
+
+

Student Learning Objectives

+

Students will learn how and when to use common data mining and +statistical learning methods, understand their comparative strengths and +weaknesses, and how to critically evaluate their performance. Students +completing this course should be able to: (i) construct and apply modern +statistical learning methods for predictive modeling, (ii) use +unsupervised learning methods to find patterns and structure in data, +(iii) properly select, tune, and evaluate models.

+
+
+
+

Required Textbooks

+
    +
  1. An Introduction to Statistical Learning (2nd) by +James, Witten, Hastie and Tibshirani. +
      +
    • An electronic version of this book is freely available at https://statlearning.com/. This book provides a less +technical description of common statistical learning methods.
    • +
  2. +
+
    +
  1. The Elements of Statistical Learning: Data Mining, +Inference, and Prediction (2nd Edition) by Hastie, Tibshirani, +and Friedman. +
  2. +
  3. Mining of Massive Datasets by Jure Leskovec, Anand +Rajaraman, Jeff Ullman +
      +
    • An electronic version of this book is freely available at http://www.mmds.org/. We +will only cover some parts of this text.
    • +
  4. +
  5. Introduction to Data Mining (Second Edition) by +Tan, Steinbach, Karpatne, and Kumar. +
  6. +
+
+
+
+

Other Course Materials

+
    +
  • This course requires the use of the following statistical and +typesetting software:

    +
      +
    • R (http://cran.us.r-project.org) is a free programming +language for statistical computing, graphics, and machine learning. I am +using R 4.4.2. It is recommended that you update to +this version or newer.
    • +
    • RStudio is a free IDE for R (https://posit.co/downloads/). I am using RStudio +2024.09.0+375. It is recommended that you update to this +version or newer.
    • +
  • +
  • Quarto (https://quarto.org/docs/get-started/) is a free technical +publishing system that replaces RMarkdown. We will use Quarto documents +for homework. Version 1.5.56 or higher is required.

    +

  • +
  • Other course materials and reading assignments will come from +instructor notes and recent journal articles.

  • +
+ +
    +
  • The free textbook Modern Data Science with +R by Baumer, Kaplan, and Horton is an undergrad level “Intro to Data +Science” course. It covers tidyverse, statistical inference, and basic +intro to many of the methods we will study this semester. This would +provide a good overall preparation or handy reference.

  • +
  • The free textbook Feature +Engineering and Selection: A Practical Approach for Predictive +Models by Kuhn and Johnson provides a more in-depth coverage of +feature engineering than we will be able to do in this course.

  • +
  • The free textbook Hands-on Machine Learning +with R by Boehmke and Greenwell gives R code with some helpful +details for most of the methods we will cover. This can be a handy +reference.

  • +
  • The free textbook Interpretable +Machine Learning by Christoph Molnar is described as A Guide for +Making Black Box Models Explainable and covers topics such as +feature importance and how to measure the influence of a feature on the +predictions (e.g., Shapley, Partial Dependence).

  • +
  • The free textbook Introduction to +Modern Statistics by Mine Çetinkaya-Rundel and Johanna Hardin is an +accessible introduction to modern (i.e., resampling based) statistical +inference. If you feel you are still missing the big picture of +statistical inference, this is a good place to start.

  • +
  • The free textbook Math for +Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng +Soon Ong is a good reference for the mathematical concepts helpful for +machine learning. Chapters 1-7 provide a good foundation for this +course.

  • +
  • The free textbook Forecasting: +Principles and Practice 3e by Rob J Hyndman and George +Athanasopoulos provides a great introduction to time series data and +forecasting.

  • +
+
+
+
+

Course Assessment

+
    +
  • The course grade will be based on ten homework assignments (65%), +reading quizzes (10%), course participation (in class and on Teams) +(5%), and a final exam (20%).

  • +
  • A: \(\ge\) 92%, A-: 90-91%, B+: +88-89%, B: 82-87%, B-: 80-81%, etc.

    +
      +
    • A+: awarded rarely for exceptional work
    • +
  • +
  • There is no grade “curving” in this course.

    +
      +
    • There will be no make-up homework, exams, projects, or quizzes.
    • +
    • Note: There will be no “extra credit” assignments; spend your time +on the assigned work.
    • +
  • +
  • All homework assignment dates are posted in the course homework page. Note these now so there +are no conflicts.

  • +
  • All assignment submissions will be made through Canvas. You are +given a grace period of 3 minutes for late submissions; the +time stamps produced by Canvas will be the authoritative reference for +all such decisions. If you have special circumstances (e.g., a +documented physical condition) that prevent you from adhering to the +posted deadlines, please inform me at least 1 week in advance of the +deadline so that I can make arrangements to accommodate you.

  • +
+
+

Homework

+
    +
  • The 10 homeworks are each worth 50 pts (500 pts total). Your +homework percentage will be min(HW total, 475)/475 allowing you to +effectively drop low scoring problems. Another way to view this policy +is that receiving an unadjusted 95% will give you full homework credit. +
      +
    • Several homeworks (see homework page) +will be treated like an exam; they are required and must be +completed independently (with no help from classmates).
    • +
    • You can discuss and work with classmates on the other homework +assignments, but what you submit must be in your own words (and code). +See Honor Code for more details.
    • +
  • +
  • Homework will be submitted as Quarto source (which will contain the +code) and the compiled html. +
      +
    • Quarto will +produce the html and contain the code.
    • +
    • All code must be easy to follow (e.g., by good commenting)
    • +
    • Mathematical symbols follow LaTeX +notation.
    • +
  • +
  • You will self-assess your homework assignments. The purpose +of this is to allow you to actively compare your answers to the +solutions as the course progresses (instead of reviewing occasionally or +only at the end of the course). This will provide immediate guidance if your +solutions are incorrect, show you improved coding, and give you +additional questions to ponder. +
      +
    • The TA will assign points; your only responsibility is to +indicate what you did wrong or didn’t complete. This is also a place to +ask questions if you aren’t sure if your solution is correct.
    • +
    • You will receive (+2) bonus points on each homework assignment that +you accurately self-grade within 2 days of the posted +solutions.
    • +
  • +
+
+
+
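The capped homework rule above, min(HW total, 475)/475, can be sketched in a few lines of Python. This is only an illustration. The function name and arguments are hypothetical, and adding the (+2) self-assessment bonus to the raw total before the cap is applied is an assumption that the syllabus does not spell out.

```python
def homework_percentage(scores, self_graded=None, cap=475, bonus=2):
    """Homework percentage under the capped-total rule min(HW total, cap) / cap.

    scores: points earned on each homework (each out of 50 per the syllabus).
    self_graded: optional list of booleans, True where the +2 self-assessment
        bonus was earned. Assumption: bonus points are added to the raw total
        before the cap is applied.
    """
    if self_graded is None:
        self_graded = [False] * len(scores)
    if len(self_graded) != len(scores):
        raise ValueError("scores and self_graded must have the same length")
    total = sum(scores) + bonus * sum(self_graded)
    return 100.0 * min(total, cap) / cap


# Hypothetical scores: a raw total of 479 exceeds the cap, so credit is full.
print(homework_percentage([50, 48, 45, 50, 47, 50, 44, 49, 50, 46]))

# Hypothetical scores: 40 on every homework, with and without the bonus.
print(homework_percentage([40] * 10))
print(homework_percentage([40] * 10, self_graded=[True] * 10))
```

Under this reading, the cap leaves 25 points of slack across the ten assignments before the homework percentage drops below 100%.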

Quizzes

+
    +
  • There will be around 24 pre-class reading quizzes (due before the +start of class) each worth 1 point. Your quiz percentage will be +min(Quiz Total, 20)/20.

    +
      +
    • This effectively allows you to drop the 4 lowest quiz scores.
    • +
  • +
  • The pre-class quizzes are to encourage you to prepare for the +lectures.

  • +
  • Quizzes will be completed in Canvas/Quizzes.

  • +
+
+
+
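The quiz rule min(Quiz Total, 20)/20 works the same way as the homework cap. The short Python sketch below uses a hypothetical function name and a hypothetical record of 24 all-or-nothing quizzes to show why missing four of them costs nothing.

```python
def quiz_percentage(scores, cap=20):
    """Quiz percentage under the rule min(Quiz Total, cap) / cap."""
    return 100.0 * min(sum(scores), cap) / cap


# Hypothetical record: 24 one-point quizzes, four of them missed entirely.
scores = [1] * 20 + [0] * 4
print(quiz_percentage(scores))

# Hypothetical record: only 15 of the 24 quizzes earned their point.
print(quiz_percentage([1] * 15 + [0] * 9))
```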

Course Participation

+

Your course participation grade is to encourage robust discussion +about the course material. I’ve found that students and the professor +often learn valuable insights from open discussion. You can earn your +participation score from in-class activity and/or posting questions or +responses on the course Teams page.

+

Full credit is equivalent to participating at least 1 time per +week.

+
+
+

Final Exam

+

The final exam will be a comprehensive review of all course materials +including lectures, readings, and homeworks.

+
+
+
+
+

Course Outline

+
    +
  • Bias-Variance Trade-off
  • +
  • Penalized Regression
  • +
  • Nonparametric Methods
  • +
  • Classification and Probability Modeling
  • +
  • Support Vector Machines
    +
  • +
  • Trees and Random Forest
  • +
  • Ensembles and Boosting
  • +
  • Resampling Methods
  • +
  • Feature Engineering and Importance
  • +
  • Predictive model evaluation
  • +
  • Non-parametric Density Estimation
  • +
  • Clustering
  • +
+
+
+
+

Course Management

+
    +
  • Most course material will be available from the class webpage
  • +
  • All assignments (e.g., homeworks, quizzes, exams) will be submitted +in Canvas
  • +
  • Announcements may be made by email or on Teams
  • +
  • Course Discussion on Teams +
      +
    • We will be using Teams +for class discussion. Rather than emailing questions to the teaching +staff, I encourage you to post your questions here.
    • +
    • The teaching staff will always check discussions during our office +hours and possibly at other times.
    • +
    • Please feel free to answer questions from other students, but use +your discretion in not directly providing specific solutions to a +homework problem (e.g., don’t give the code that directly answers a +question).
    • +
    • Also, please post any discussion questions or material on which you want +input from the class and instructors.
    • +
  • +
+
+
+
+

Recording of classroom lectures

+

I will be recording every lecture to accommodate students who will be +learning remotely. Because lectures may include fellow students, you and +they may be personally identifiable on the recordings. These recordings +may only be used for the purpose of individual or group +study with other students enrolled in this class during this semester. +You may not distribute them in whole or in part through any other +platform or to any persons outside of this class, nor may you make your +own recordings of this class unless written permission has been obtained +from the Instructor and all participants in the class have been informed +that recording will occur. If you want additional details on this, +please see Provost Policy +005.

+
+
+
+

Academic Calendar

+

Important dates for the semester can be found on the academic +calendar: http://www.virginia.edu/registrar/calendar.html

+
+
+
+

Policy on Academic Misconduct (Honor Code)

+

I trust every student in this course to fully comply with all +provisions of the University’s Honor Code and work together to maintain +UVA’s Community of +Trust. By enrolling in this course, you have agreed to abide by and +uphold the Honor System of the University of Virginia, as well as the +following policies specific to this course.

+
    +
  • All submitted work must be pledged.
  • +
  • You are not permitted to submit any work after you have accessed the +solutions. Be careful not to accidentally view the solutions before your +final submission.
  • +
  • All work must be completed individually unless specific permissions +are given on the assignment. +
      +
    • Homework and in-class exercises can be discussed with classmates, +but the final write-up, code, and solutions must be your own. List the +names of those you worked with (like a citation).
    • +
    • The individual homework sets must be done completely on your own. +You are not to discuss exams with anyone except the teaching staff.
    • +
    • You are not permitted to copy code. You will no doubt come across +examples on the internet. You can consult them to help understand the +concept or process, but code in your own words.
    • +
  • +
  • It is a scholarly responsibility to attribute all your work. This +includes figures, code, ideas, etc. Think of it this way: Will someone +who reads your submission think that it is your original idea, figure, +code, etc.? Add a link and/or reference to all sources you used to solve +a problem. It is really of no value to you when you just copy someone +else’s solutions (other than to preserve a grade that you didn’t +earn).
  • +
  • It is not always easy to tell what qualifies as a violation, so do +not be afraid to talk to me about it. Such discussions do not imply +guilt of any kind.
  • +
  • All suspected violations will be forwarded to the Honor Committee, +and you may, at my discretion, receive an immediate zero on that +assignment regardless of any action taken by the Honor Committee.
  • +
+

Please let me know if you have any questions regarding the course +Honor policy. If you believe you may have committed an Honor Offense, +you may wish to file a Conscientious Retraction by calling the Honor +Offices at (434) 924-7602. For your retraction to be considered valid, +it must, among other things, be filed with the Honor Committee before +you are aware that the act in question has come under suspicion by +anyone. More information can be found at http://honor.virginia.edu. Your Honor representatives +can be found at: http://honor.virginia.edu/representatives.

+
+

Generative AI Policy

+

Generative AI (GenAI), like ChatGPT, is a new disruptive technology +that has the potential to fundamentally change how we learn, code, and +do data science. However, there is little guidance on when and how to +use GenAI for learning. As such, I don’t feel very confident in +recommending or restricting its use. Therefore, there are no +Generative AI restrictions in this course. However, be sure to +follow the honor policy as stated above. You cannot copy code and must +attribute and detail if and how you used GenAI in the assignments.

+

GenAI tools can be an especially great resource for troubleshooting +and improving code. However, they can also limit your ability to +learn good coding if you become too dependent. You know you are +depending too heavily on GenAI when you can’t think how to begin a +problem.

+

I do not think GenAI is currently reliable enough to trust for +conceptual understanding. I still recommend the assigned reading and +references found in the course notes for additional learning resources. +If GenAI hallucinates in producing code, you will be able to see right +away that it does not produce the desired result. However, if it +hallucinates about how a model works or perpetuates common +misconceptions on methodology, you may not know about it for a long +time.

+
+
+
+
+

Disability Statement

+

The University of Virginia strives to provide accessibility to all +students. If you require an accommodation to fully access this course, +please contact the Student Disability Access Center (SDAC) at (434) +243-5180 or . If you are unsure if you require an +accommodation, or to learn more about their services, you may contact +the SDAC at the number above or by visiting their website at http://studenthealth.virginia.edu/student-disability-access-center/faculty-staff.

+
+
+
+

Your Well Being

+

The University of Virginia and SEAS serve as a safe space for +students and aim to promote your well-being. If you are feeling +overwhelmed, stressed, or isolated, there are many individuals here who +are ready and wanting to help. If you wish, you can make an appointment +with me to discuss in private. Alternatively, the Student Health Center +offers Counseling and Psychological Services (CAPS) https://www.studenthealth.virginia.edu/caps. If you +prefer to speak anonymously and confidentially over the phone, call +Madison House’s HELP Line 24/7 at 434-295-8255 https://www.madisonhouse.org/overview-helpline/. +Engineering undergraduates are supported through an array of student +support services including peer-to-peer tutoring, professional +academic coaching, access to mental health support, and dedicated +advising. Graduate Engineering students can find similar student +support resources. If you are in another school, you can contact the +above Engineering resources and they will help direct you to the +appropriate resources.

+

If you or someone you know is struggling with gender, sexual, or +domestic violence, there are many community and University of Virginia +resources available. The Office of +the Dean of Students, Sexual +Assault Resource Agency (SARA), and UVA Women’s Center are +ready and eager to help. Contact the Director of Sexual and Domestic +Violence Services at 434-982-2774.

+
+
+
+

Discrimination and power-based violence

+

The University of Virginia is dedicated to providing a safe and +equitable learning environment for all students. To that end, it is +vital that you know two values that I and the University hold as +critically important:

+
    +
  1. Power-based personal violence will not be tolerated.
  2. +
  3. Everyone has a responsibility to do their part to maintain a safe +community on Grounds.
  4. +
+

If you or someone you know has been affected by power-based personal +violence, more information can be found on the UVA Sexual Violence +website that describes reporting options and resources available +<www.virginia.edu/sexualviolence>. As your professor and as a +person, know that I care about you and your well-being and stand ready +to provide support and resources as I can. As a faculty member, I am a +responsible employee, which means that I am required by University +policy and federal law to report what you tell me to the University’s +Title IX Coordinator. The Title IX Coordinator’s job is to ensure that +the reporting student receives the resources and support that they need, +while also reviewing the information presented to determine whether +further action is necessary to ensure survivor safety and the safety of +the University community. If you wish to report something that you have +seen, you can do so at the Just Report It portal. The +worst possible situation would be for you or your friend to remain +silent when there are so many here willing and able to help.

+
+
+
+

Religious Accommodations

+

Students who wish to request academic accommodation for a religious +observance should submit their request to me by email as far in advance +as possible. If you have questions or concerns about your request, you +can contact the University’s Office for Equal Opportunity and Civil +Rights (EOCR) https://eocr.virginia.edu/accommodations-religious-observance. +Accommodations do not relieve you of the responsibility for completion +of any part of the coursework you miss as the result of a religious +observance.

+
+ + + + + + + + +
+
+ +
+ + + + + + + + + + + + + + + + diff --git a/syllabus_undergrad.html b/syllabus_undergrad.html new file mode 100644 index 0000000..3b8f199 --- /dev/null +++ b/syllabus_undergrad.html @@ -0,0 +1,2272 @@ + + + + + + + + + + + + + +SYS-6018: Syllabus + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + +
+
+
+
+
+ +
+ + + + + + + + + + +



+
+

Course Info

+ ++++ + + + + + + + + + + + + + + + + + + + + + + + + +
Course Info
Class Time: Mon, Wed 12:30 - 1:45pm
Class Location: Rice 011
Course Canvas site: https://canvas.its.virginia.edu/courses/133118
Course Teams site: SYS 6018 Teams (find join code on +Canvas)
+ ++++ + + + + + + + + + + + + + + + + + + + + +
Instructor: Dr. Michael D. Porter
Email: mdp2u {at} virginia.edu
Office: TBD and Zoom
Office Hours: Wed 2:00-3:00pm (and by appt.)
+ ++++ + + + + + + + + + + + + + + + + +
TA: Samarth Singh
Email: ss9vz {at} virginia.edu
Office Hours: TBD
+
+
+
+

Course Delivery

+

The course will be delivered live and in-person. There may be +occasional remote synchronous or pre-recorded lectures. Any such +lectures will be recorded and available on Canvas.

+
+
+
+

Course Prerequisites

+

Students taking this course should have prior knowledge in linear +regression analysis (e.g., SYS/STAT 4021/6021, STAT 5120), statistical +inference (e.g., APMA 3120), and linear algebra (e.g., APMA 3080). +Students should also have a basic working knowledge in a scientific +programming language (e.g., R, Python, Matlab). All course examples will +be in R (tidyverse dialect).

+
+
+
+

Course Description

+

Fundamentals of data mining and machine learning within a common +statistical framework. Topics include regression, classification, +clustering, resampling, regularization, tree-based methods, ensembles, +boosting, and predictive bias/calibration.

+
+
+
+

Student Learning Objectives

+

Students will learn how and when to use common data mining and +statistical learning methods, understand their comparative strengths and +weaknesses, and learn how to critically evaluate their performance. Students +completing this course should be able to: (i) construct and apply modern +statistical learning methods for predictive modeling, (ii) use +unsupervised learning methods to find patterns and structure in data, and +(iii) properly select, tune, and evaluate models.

+
+
+
+

Required Textbooks

+
    +
  1. An Introduction to Statistical Learning (2nd) by +James, Witten, Hastie and Tibshirani. +
      +
    • An electronic version of this book is freely available at https://statlearning.com/. This book provides a less +technical description of common statistical learning methods.
    • +
  2. +
+
    +
  1. The Elements of Statistical Learning: Data Mining, +Inference, and Prediction (2nd Edition) by Hastie, Tibshirani, +and Friedman. +
  2. +
  3. Mining of Massive Datasets by Jure Leskovec, Anand +Rajaraman, Jeff Ullman +
      +
    • An electronic version of this book is freely available at http://www.mmds.org/. We +will only cover some parts of this text.
    • +
  4. +
  5. Introduction to Data Mining (Second Edition) by +Tan, Steinbach, Karpatne, and Kumar. +
  6. +
+
+
+
+

Other Course Materials

+
    +
  • This course requires the use of the following statistical and +typesetting software:

    +
      +
    • R (http://cran.us.r-project.org) is a free programming +language for statistical computing, graphics, and machine learning. I am +using R 4.4.2. It is recommended that you update to +this version or newer.
    • +
    • RStudio is a free IDE for R (https://posit.co/downloads/). I am using RStudio +2024.09.0+375. It is recommended that you update to this +version or newer.
    • +
  • +
  • Quarto (https://quarto.org/docs/get-started/) is a free technical +publishing system that replaces RMarkdown. We will use Quarto documents +for homework. Version 1.5.56 or higher is required.

    +

  • +
  • Other course material and reading assignments will come from +instructor notes and recent journal articles.

  • +
+ +
    +
  • The free textbook Modern Data Science with +R by Baumer, Kaplan, and Horton is an undergrad-level “Intro to Data +Science” text. It covers the tidyverse, statistical inference, and a basic +intro to many of the methods we will study this semester. This would +provide good overall preparation or a handy reference.

  • +
  • The free textbook Feature +Engineering and Selection: A Practical Approach for Predictive +Models by Kuhn and Johnson provides a more in-depth coverage of +feature engineering than we will be able to do in this course.

  • +
  • The free textbook Hands-on Machine Learning +with R by Boehmke and Greenwell gives R code with some helpful +details for most of the methods we will cover. This can be a handy +reference.

  • +
  • The free textbook Interpretable +Machine Learning by Christoph Molnar is described as A Guide for +Making Black Box Models Explainable and covers topics such as +feature importance and how to measure the influence of a feature on the +predictions (e.g., Shapley, Partial Dependence).

  • +
  • The free textbook Introduction to +Modern Statistics by Mine Çetinkaya-Rundel and Johanna Hardin is an +accessible introduction to modern (i.e., resampling based) statistical +inference. If you feel you are still missing the big picture of +statistical inference, this is a good place to start.

  • +
  • The free textbook Mathematics for +Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng +Soon Ong is a good reference for the mathematical concepts helpful for +machine learning. Chapters 1-7 provide a good foundation for this +course.

  • +
  • The free textbook Forecasting: +Principles and Practice 3e by Rob J Hyndman and George +Athanasopoulos provides a great introduction to time series data and +forecasting.

  • +
+
+
+
+

Course Assessment

+
    +
  • The course grade will be based on ten homework assignments (65%), +reading quizzes (10%), course participation (in class and on Teams) +(5%), and a final exam (20%).

  • +
  • A: \(\ge\) 92%, A-: 90-91%, B+: +88-89%, B: 82-87%, B-: 80-81%, etc.

    +
      +
    • A+: awarded rarely for exceptional work
    • +
  • +
  • There is no grade “curving” in this course.

    +
      +
    • There will be no make-up homework, exams, projects, or quizzes.
    • +
    • Note: There will be no “extra credit” assignments; spend your time +on the assigned work.
    • +
  • +
  • All homework assignment dates are posted in the course homework page. Note these now so there +are no conflicts.

  • +
  • All assignment submissions will be made through Canvas. You are +given a grace period of 3 minutes for late submissions; the +time stamps produced by Canvas will be the authoritative reference for +all such decisions. If you have special circumstances (e.g., a +documented physical condition) that prevent you from adhering to the +posted deadlines, please inform me at least 1 week in advance of the +deadline so that I can make arrangements to accommodate you.

  • +
+
+

Homework

+
    +
  • The 10 homeworks are each worth 50 pts (500 pts total). Your +homework percentage will be min(HW total, 450)/450 allowing you to +effectively drop low scoring problems. Another way to view this policy +is that receiving an unadjusted 90% will give you full homework credit. +
      +
    • Several homeworks (see homework page) +will be treated like an exam; they are required and must be +completed independently (with no help from classmates).
    • +
    • You can discuss and work with classmates on the other homework +assignments, but what you submit must be in your own words (and code). +See Honor Code for more details.
    • +
  • +
  • Homework will be submitted as Quarto source (which will contain the +code) and the compiled html. +
      +
    • Quarto will +produce the html and contain the code.
    • +
    • All code must be easy to follow (e.g., by good commenting)
    • +
    • Mathematical symbols follow LaTeX +notation.
    • +
  • +
  • You will self-assess your homework assignments. The purpose +of this is to allow you to actively compare your answers to the +solutions as the course progresses (instead of reviewing occasionally or +only at the end of the course). This will provide immediate guidance if your +solutions are incorrect, show you improved coding, and give you +additional questions to ponder. +
      +
    • The TA will assign points; your only responsibility is to +indicate what you did wrong or didn’t complete. This is also a place to +ask questions if you aren’t sure if your solution is correct.
    • +
    • You will receive (+2) bonus points on each homework assignment that +you accurately self-grade within 2 days of the posted +solutions.
    • +
  • +
+
+
+
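One practical use of the min(HW total, 450)/450 rule is checking, partway through the semester, how many points are still needed for full homework credit. The Python sketch below is only an illustration. The function name, its arguments, and the example scores are hypothetical, and it ignores the (+2) self-assessment bonus.

```python
def points_needed_for_full_credit(scores_so_far, cap=450, n_homeworks=10, max_per_hw=50):
    """Points still required on the remaining homeworks to reach the cap.

    Returns a tuple (needed, available): the points still needed for 100%
    homework credit, and the maximum the remaining homeworks can supply.
    """
    remaining = n_homeworks - len(scores_so_far)
    if remaining < 0:
        raise ValueError("more scores than homeworks")
    needed = max(0, cap - sum(scores_so_far))
    available = remaining * max_per_hw
    return needed, available


# Hypothetical scores on the first four homeworks (each out of 50).
needed, available = points_needed_for_full_credit([45, 40, 50, 38])
print(needed, available)
print("full credit still reachable:", needed <= available)
```

With a cap of 450 out of 500 possible points, the slack equals one full 50-point homework.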

Quizzes

+
    +
  • There will be around 24 pre-class reading quizzes (due before the +start of class) each worth 1 point. Your quiz percentage will be +min(Quiz Total, 20)/20.

    +
      +
    • This effectively allows you to drop the 4 lowest quiz scores.
    • +
  • +
  • The pre-class quizzes are to encourage you to prepare for the +lectures.

  • +
  • Quizzes will be completed in Canvas/Quizzes.

  • +
+
+
+

Course Participation

+

Your course participation grade is to encourage robust discussion +about the course material. I’ve found that students and the professor +often learn valuable insights from open discussion. You can earn your +participation score from in-class activity and/or posting questions or +responses on the course Teams page.

+

Full credit is equivalent to participating at least 1 time per +week.

+
+
+

Final Exam

+

The final exam will be a comprehensive review of all course materials +including lectures, readings, and homeworks.

+

The undergrad version of the course, SYS 4582, will have a reduced +final exam in comparison to the graduate course.

+
+
+
+
+

Course Outline

+
    +
  • Bias-Variance Trade-off
  • +
  • Penalized Regression
  • +
  • Nonparametric Methods
  • +
  • Classification and Probability Modeling
  • +
  • Support Vector Machines
    +
  • +
  • Trees and Random Forest
  • +
  • Ensembles and Boosting
  • +
  • Resampling Methods
  • +
  • Feature Engineering and Importance
  • +
  • Predictive model evaluation
  • +
  • Non-parametric Density Estimation
  • +
  • Clustering
  • +
+
+
+
+

Course Management

+
    +
  • Most course material will be available from the class webpage
  • +
  • All assignments (e.g., homeworks, quizzes, exams) will be submitted +in Canvas
  • +
  • Announcements may be made by email or on Teams
  • +
  • Course Discussion on Teams +
      +
    • We will be using Teams +for class discussion. Rather than emailing questions to the teaching +staff, I encourage you to post your questions here.
    • +
    • The teaching staff will always check discussions during our office +hours and possibly at other times.
    • +
    • Please feel free to answer questions from other students, but use +your discretion in not directly providing specific solutions to a +homework problem (e.g., don’t give the code that directly answers a +question).
    • +
    • Also, please post any discussion questions or material on which you want +input from the class and instructors.
    • +
  • +
+
+
+
+

Recording of classroom lectures

+

If I or a large number of students cannot attend class +in person, I will record the lecture on Zoom.

+

Because lectures may include fellow students, you and they may be +personally identifiable on the recordings. These recordings may +only be used for the purpose of individual or group +study with other students enrolled in this class during this semester. +You may not distribute them in whole or in part through any other +platform or to any persons outside of this class, nor may you make your +own recordings of this class unless written permission has been obtained +from the Instructor and all participants in the class have been informed +that recording will occur. If you want additional details on this, +please see Provost Policy +005.

+
+
+
+

Academic Calendar

+

Important dates for the semester can be found on the academic +calendar: http://www.virginia.edu/registrar/calendar.html

+
+
+
+

Policy on Academic Misconduct (Honor Code)

+

I trust every student in this course to fully comply with all +provisions of the University’s Honor Code and work together to maintain +UVA’s Community of +Trust. By enrolling in this course, you have agreed to abide by and +uphold the Honor System of the University of Virginia, as well as the +following policies specific to this course.

+
    +
  • All submitted work must be pledged.
  • +
  • You are not permitted to submit any work after you have accessed the +solutions. Be careful not to accidentally view the solutions before your +final submission.
  • +
  • All work must be completed individually unless specific permissions +are given on the assignment. +
      +
    • Homework and in-class exercises can be discussed with classmates, +but the final write-up, code, and solutions must be your own. List the +names of those you worked with (like a citation).
    • +
    • The individual homework sets must be done completely on your own. +You are not to discuss exams with anyone except the teaching staff.
    • +
    • You are not permitted to copy code. You will no doubt come across +examples on the internet. You can consult them to help understand the +concept or process, but code in your own words.
    • +
  • +
  • It is a scholarly responsibility to attribute all your work. This +includes figures, code, ideas, etc. Think of it this way: Will someone +who reads your submission think that it is your original idea, figure, +code, etc.? Add a link and/or reference to all sources you used to solve +a problem. It is really of no value to you when you just copy someone +else’s solutions (other than to preserve a grade that you didn’t +earn).
  • +
  • It is not always easy to tell what qualifies as a violation, so do +not be afraid to talk to me about it. Such discussions do not imply +guilt of any kind.
  • +
  • All suspected violations will be forwarded to the Honor Committee, +and you may, at my discretion, receive an immediate zero on that +assignment regardless of any action taken by the Honor Committee.
  • +
+

Please let me know if you have any questions regarding the course +Honor policy. If you believe you may have committed an Honor Offense, +you may wish to file a Conscientious Retraction by calling the Honor +Offices at (434) 924-7602. For your retraction to be considered valid, +it must, among other things, be filed with the Honor Committee before +you are aware that the act in question has come under suspicion by +anyone. More information can be found at http://honor.virginia.edu. Your Honor representatives +can be found at: http://honor.virginia.edu/representatives.

+
+

Generative AI Policy

+

Generative AI (GenAI), like ChatGPT, is a new disruptive technology +that has the potential to fundamentally change how we learn, code, and +do data science. However, there is little guidance on when and how to +use GenAI for learning. As such, I don’t feel very confident in +recommending or restricting its use. Therefore, there are no +Generative AI restrictions in this course. However, be sure to +follow the honor policy as stated above. You cannot copy code and must +attribute and detail if and how you used GenAI in the assignments.

+

GenAI tools can be an especially great resource for troubleshooting +and improving code. However, they can also limit your ability to +learn good coding if you become too dependent. You know you are +depending too heavily on GenAI when you can’t think how to begin a +problem.

+

I do not think GenAI is currently reliable enough to trust for +conceptual understanding. I still recommend the assigned reading and +references found in the course notes for additional learning resources. +If GenAI hallucinates in producing code, you will be able to see right +away that it does not produce the desired result. However, if it +hallucinates about how a model works or perpetuates common +misconceptions on methodology, you may not know about it for a long +time.

+
+
+
+
+

Disability Statement

+

The University of Virginia strives to provide accessibility to all +students. If you require an accommodation to fully access this course, +please contact the Student Disability Access Center (SDAC) at (434) +243-5180 or . If you are unsure if you require an +accommodation, or to learn more about their services, you may contact +the SDAC at the number above or by visiting their website at http://studenthealth.virginia.edu/student-disability-access-center/faculty-staff.

+
+
+
+

Your Well Being

+

The University of Virginia and SEAS serve as a safe space for +students and aim to promote your well-being. If you are feeling +overwhelmed, stressed, or isolated, there are many individuals here who +are ready and wanting to help. If you wish, you can make an appointment +with me to discuss in private. Alternatively, the Student Health Center +offers Counseling and Psychological Services (CAPS) https://www.studenthealth.virginia.edu/caps. If you +prefer to speak anonymously and confidentially over the phone, call +Madison House’s HELP Line 24/7 at 434-295-8255 https://www.madisonhouse.org/overview-helpline/. +Engineering undergraduates are supported through an array of student +support services including peer-to-peer tutoring, professional +academic coaching, access to mental health support, and dedicated +advising. Graduate Engineering students can find similar student +support resources. If you are in another school, you can contact the +above Engineering resources and they will help direct you to the +appropriate resources.

+

If you or someone you know is struggling with gender, sexual, or +domestic violence, there are many community and University of Virginia +resources available. The Office of +the Dean of Students, Sexual +Assault Resource Agency (SARA), and UVA Women’s Center are +ready and eager to help. Contact the Director of Sexual and Domestic +Violence Services at 434-982-2774.

+
+
+
+

Discrimination and power-based violence

+

The University of Virginia is dedicated to providing a safe and +equitable learning environment for all students. To that end, it is +vital that you know two values that I and the University hold as +critically important:

+
    +
  1. Power-based personal violence will not be tolerated.
  2. +
  3. Everyone has a responsibility to do their part to maintain a safe +community on Grounds.
  4. +
+

If you or someone you know has been affected by power-based personal +violence, more information can be found on the UVA Sexual Violence +website that describes reporting options and resources available +<www.virginia.edu/sexualviolence>. As your professor and as a +person, know that I care about you and your well-being and stand ready +to provide support and resources as I can. As a faculty member, I am a +responsible employee, which means that I am required by University +policy and federal law to report what you tell me to the University’s +Title IX Coordinator. The Title IX Coordinator’s job is to ensure that +the reporting student receives the resources and support that they need, +while also reviewing the information presented to determine whether +further action is necessary to ensure survivor safety and the safety of +the University community. If you wish to report something that you have +seen, you can do so at the Just Report It portal. The +worst possible situation would be for you or your friend to remain +silent when there are so many here willing and able to help.

+
+
+
+

Religious Accommodations

+

Students who wish to request academic accommodation for a religious +observance should submit their request to me by email as far in advance +as possible. If you have questions or concerns about your request, you +can contact the University’s Office for Equal Opportunity and Civil +Rights (EOCR) https://eocr.virginia.edu/accommodations-religious-observance. +Accommodations do not relieve you of the responsibility for completion +of any part of the coursework you miss as the result of a religious +observance.

+
+ + + + + + + + +
+
+ +
+ + + + + + + + + + + + + + + +