Syllabus 📋

Course Details 📌

Section: CSC 10800 (LEC): Foundations of Data Science
Dates: Tue/Thu, 3:30-4:45pm, Aug 28 - Dec 21
Location: Marshak Science Building, Rm 410
Instructor: Prof. Zach Muhlbauer | [email protected]
Office Hours: Wed 3-5pm over Zoom, or in person by appointment

Course Description 📄

This course introduces the fundamental concepts and computational techniques of data science to all students, including those majoring in the Arts, Humanities, and Social Sciences. Students engage with data arising from real-world phenomena (including literary corpora, spatial datasets, and social network data) to learn analytical skills such as inferential thinking and computational thinking.

The competencies learned in this course will provide students with skills that will be of use in their professional careers, as well as tools to better understand, quantitatively and qualitatively, the social world around them. Finally, by teaching critical concepts and skills in computer programming and statistical inference, the class prepares students for further coursework in technology-aware fields of study, from Python programming and cultural analytics to the big umbrella of the Digital Humanities. The course is therefore designed for students who are new to statistics and programming. Students will make use of the Python programming language, but there are no computer science prerequisites.

This course does not satisfy degree requirements for Computer Science students, who should not be enrolled in this course.

Course Materials 🗂️

All required reading materials, activities, and instructions are provided on the Schedule page. Additionally, datasets are provided on the [Datasets](https://zmuhls.github.io/ccny-data-science/datasets/) page, and assets for the course website are hosted here.

Technical Readings: These readings draw from Melanie Walsh's open-access Introduction to Cultural Analytics and Python (2021), an online textbook written for students in humanities and social sciences to gain a practical introduction to the Python programming language within the context of cultural analysis. The textbook demonstrates how Python can be applied to a wide range of cultural materials, such as magazine articles, classic novels, TV scripts, technical manuals, social networks, and more.

Critical Readings: These readings engage with the complex social and political dimensions of "big data" in contemporary U.S. society. Through them, we will explore how data has evolved into the world's most valuable commodity. Authors of these pieces will therefore challenge us to critically engage with the ethical concerns, power imbalances, and hidden costs associated with today's data-driven economy.

Grading Distribution 🧮

The grading distribution below offers a glimpse of how your work will be evaluated over the semester:

  • Collaborative Annotations: 150 pts (15%)
  • Programming Activities: 500 pts (50%)
    • 100 pts (10%) for notebook and reflection
  • Social Coding Portfolio: 250 pts (25%)
  • Participation & Attendance: 100 pts (10%)

Total Available Points: 1000 (100% or A)
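As a quick check of the arithmetic in the grading distribution above, the sketch below tallies the four categories and converts earned points into a course percentage. The point values come from the list above; the function name `percentage` and the scores in `example` are hypothetical and used for illustration only.

```python
# Point values copied from the grading distribution above.
CATEGORY_POINTS = {
    "Collaborative Annotations": 150,
    "Programming Activities": 500,
    "Social Coding Portfolio": 250,
    "Participation & Attendance": 100,
}

TOTAL_POINTS = sum(CATEGORY_POINTS.values())  # 1000


def percentage(earned):
    """Convert points earned per category into a course percentage."""
    for category, points in earned.items():
        if category not in CATEGORY_POINTS:
            raise KeyError(f"Unknown category: {category}")
        maximum = CATEGORY_POINTS[category]
        if not 0 <= points <= maximum:
            raise ValueError(f"{category}: {points} is outside 0-{maximum}")
    return 100 * sum(earned.values()) / TOTAL_POINTS


# Hypothetical scores, for illustration only (not real student data).
example = {
    "Collaborative Annotations": 140,
    "Programming Activities": 450,
    "Social Coding Portfolio": 230,
    "Participation & Attendance": 90,
}
print(percentage(example))  # 91.0
```

Because the total is 1000 points, every 10 points earned is worth one percentage point of the final grade.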

Schedule 📅

Assigned readings, activities, and projects are linked below. All work is due on the date in the same row as the required reading or activity. If you encounter a dead link, give me a holler!

Date Topic/Theme Critical Readings Technical Readings Due by Class
08/29 (Thur) Introduction
09/02 (Mon) No Class. College Closed.
09/03 (Tue) What is Data Science? Data 8: What is Data Science? (Ch. 1.1-1.2) —Join Hypothesis group via invite link & annotate readings
09/05 (Thur) Navigating Command Line Melanie Walsh:
(1) The Command Line
(2) How to Use Jupyter Notebooks
—Annotate reading
09/10 (Tue) Data Feminism D'Ignazio & Klein: Why Data Science Needs Feminism —Annotate reading
09/12 (Thur) What is Python? Walsh: Anatomy of a Python Script
09/17 (Tue) Digital Humanities 2.0 (1) Digital Humanities Manifesto 2.0
(2) Ted Underwood: Seven ways humanists are using computers to understand text
—Annotate reading
09/19 (Thur) Data Types & String Methods Walsh:
(1) Data Types
(2) String Methods
09/24 (Tue) Metaphors of Data Annette Markham: Undermining 'data' —Annotate reading
09/26 (Thur) Computational Thinking Jeanette Wing: Computational Thinking Walsh: Comparisons & Conditionals
10/01 (Tue) GitHub as a Platform Nadia Eghbal: "GitHub as a Platform" from Working in Public, (Read: PDF in Discord via #announcements) —Sign up for GitHub
10/03 (Thu) No classes scheduled
10/08 (Tue) Lists & Loops Walsh: Lists and Loops Activity 1: Building Blocks
10/10 (Thu) Batteries Included Walsh:
(1) Functions
(2) Common Python Errors
10/15 (Tue) No Classes Scheduled - Monday Schedule
10/17 (Thu) Python Jeopardy! Activity 2: Python Primer
10/22 (Tue) Against Cleaning Rawson & Muñoz: Against Cleaning
10/24 (Thur) Cleaning Data Alice Zhao: Data Cleaning
10/29 (Tue) (Re)Humanizing Data Anelise Hanson Shrout: (Re)Humanizing Data: Digitally Navigating the Bellevue Almshouse —Annotate reading(s)
10/31 (Thur) Practicing Pandas Walsh: Pandas Basics I
11/05 (Tue) Data Visualized Basic data visualizations Walsh: Pandas Basics I (cont'd)
11/07 (Thur) Data Ethics VPRO Documentary: Shoshana Zuboff on Surveillance Capitalism (Optional) Walsh: Users’ Data Activity 3: Practicing Pandas
11/12 (Tue) Distant Reading Stephen Ramsay: The Hermeneutics of Screwing Around; or What To Do With a Million Books —Annotate reading
11/14 (Thur) Digital Hermeneutics Walsh: TF-IDF with HathiTrust Data Activity 4: Writing Docs
11/19 (Tue) Sentiment Analysis Simone Rebora: Sentiment Analysis in Literary Studies: A Critical Survey Walsh: Sentiment Analysis
11/21 (Thur) Network Analysis Walsh: Network Analysis —Explore Social Network Datasets
11/26 (Tue) Network Visualization Walsh: Interactive Network Visualization
11/28 (Thur) No Class. College Closed. Activity 5: Data Visualization
12/03 (Tue) Reddit API Workshop Reddit Data
12/05 (Thur) Coworking Lab
12/10 (Tue) Lightning Talks Send slides by Sun
12/12 (Thur) Lightning Talks Send slides by Tue
12/18 (Wed) Last day to submit work Final: Social Coding Portfolio
Last updated: 2024 Sep 26 (9:09pm)

Class Technology 🛠️

Hardware 🔌

Bring a laptop or personal computer (nearly) every time we meet.

Need a spare laptop? Try CCNY's Laptop Loaner Program.

Phones or Tablets: I strongly recommend using a laptop for this course, even though it is technically feasible to take notes and run snippets of code on smart tablets or mobile devices. Relying solely on these devices will present challenges when you need to view or modify Jupyter notebooks, run Python files, or stage coding environments that require computational resources that smart devices lack. Still, if you're unable to rent a laptop, then please reach out to me as early as possible so that we can discuss alternative arrangements.

Software 🖥️

Jupyter Notebook: In-class lessons and homework are done in Jupyter Notebook so that we can use Markdown and Python simultaneously. The notebooks assume a Python 3 installation with the standard modules from the Anaconda installation (e.g. NLTK, pandas, NumPy, and Matplotlib) linked on the schedule.
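One way to confirm that your setup matches the packages listed above is to run a short check in a notebook cell. This is a minimal sketch, not an official course script: the helper name `check_environment` is hypothetical, and the list covers only the four example packages named above.

```python
import importlib.util
import sys

# Import names for the example packages listed above (all lowercase on import).
REQUIRED_MODULES = ["nltk", "pandas", "numpy", "matplotlib"]


def check_environment(modules=REQUIRED_MODULES):
    """Return {module_name: True/False} without importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if sys.version_info.major < 3:
    print("These notebooks assume Python 3.")

for name, found in check_environment().items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```

If any package prints as MISSING, installing Anaconda from the link on the schedule should provide it.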

Hypothesis Annotation: Expect to post 2-3 annotations for most critical readings we do in this class, each about 25-50 words in length and assessed on thoughtfulness, style and craft, and a demonstrated effort to respond to others. Hypothesis can also be a useful research tool and means of information management, so I encourage you to engage with it throughout the course.

Class Policies ⚐

**Respect and accountability are crucial to productive class discussions.** I expect that, as co-producers of knowledge, we will practice respect for each other and be accountable for our words and actions. The classroom space is a learning space that can be, at times, uncomfortable, especially as we speak through our different perspectives and experiences. As long as we strive to be respectful to each other and accountable for the opinions, comments, questions, and concerns we share, this learning space will become a great place for us to nudge our boundaries.

Attendance and participation are required. Learning a new programming language requires consistent practice, and your understanding of the material will be greatly facilitated by your participation in class. Students are expected to come to class prepared, which includes completing the assigned reading before class and being ready to engage in class discussions.

Absences. If you are unable to attend class, please email me in advance. If you are unable to email me in advance, please let me know as soon as possible. I do not require proof; I just need to know that you will not be in class. You may miss up to 3 classes. Missing more than 3 classes may impact your grade in the class.

Plagiarism and academic integrity. Plagiarism is copying and using other people’s words without proper acknowledgment or citation, as defined in the CUNY Policy on Academic Integrity. All writing submitted for this course is understood to be your original work. Knowing acts of plagiarism carry serious consequences that may result in required resubmission, a failing grade, or worse. If I suspect your writing or work is plagiarized, then I will request a consultation with you to discuss the matter, at which point we will discuss next steps with care and best intentions in mind. In return, I expect that you will take time to read and, more importantly, adhere to CCNY’s Policy on Academic Integrity.

Generative AI Policy. For permission to use large language models (LLMs) or generative artificial intelligence for programming activities or reflections, please write an email or speak with me in person about your interest in the technology. With that condition satisfied, I politely request that when you use ChatGPT-4x or Gemini you also share a link to the conversation or enclose a screenshot of the prompts used to generate output, whether it be text, code, or multimedia. See the citation below for one such case where coauthorship of an image is attributed to ChatGPT-4.0 and myself, with the conversation linked after the date of use.

Muhlbauer, Zachary and ChatGPT-4.0. "AI Hype 2050." OpenAI, July 2024, Input: Generate an image for a syllabus section that humorously makes light of AI hype in the year 2050.

Grading policy. I use a 100% grading scale to assess final course grades, which corresponds to the 1000 total points that you can earn from class proceedings. If you have a question or concern about your grade in the class, please bring it to my attention immediately.

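| Grade | Range (%) |
|-------|-----------|
| A     | 93-100    |
| A-    | 90-92     |
| B+    | 87-89     |
| B     | 83-86     |
| B-    | 80-82     |
| C+    | 77-79     |
| C     | 73-76     |
| C-    | 70-72     |
| D+    | 67-69     |
| D     | 60-66     |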
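For anyone who wants to practice conditionals and loops on the scale above, here is a minimal sketch that maps a final percentage to a letter. It rests on two assumptions that the syllabus does not state: fractional percentages are compared against the lower bound of each range without rounding, and scores below 60 return `None` because the scale lists no grade there. The function name `letter_grade` is hypothetical.

```python
# Lower bounds taken from the grading scale above, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
]


def letter_grade(percent):
    """Map a percentage (0-100) to a letter using the lower bounds above.

    Assumption: fractional scores are not rounded, so 89.9 stays a B+.
    Returns None below 60, since the scale above lists no grade there.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return None


print(letter_grade(91))    # A-
print(letter_grade(86.5))  # B
print(letter_grade(59))    # None
```

Since the course total is 1000 points, a point total can be converted first by dividing by 10 (for example, 910 points is 91%).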

When learning a new area of study in addition to a programming language, it is inevitable that we end up with mistakes or “fail” to obtain our desired ends. For this reason and others, grades are awarded primarily on process and effort rather than accuracy or "correctness," with ample opportunities to participate in seminar and lab meetings.

Student Support Services 🛟

The AccessAbility Center/Student Disability Services ♿

Please see the following information for those in need of accommodations or services:

The AccessAbility Center/Student Disability Services ensures equal access and full participation to all of City College's programs, services, and activities by coordinating and implementing appropriate accommodations. If you are a student with a disability who requires accommodations and services, please visit the office in NAC 1/218, or contact AAC/SDS via email ([email protected]), or phone (212-650-5913 or TTY/TTD 212-650-8441).

CCNY Writing Center 🖋️

The CCNY Writing Center offers a supportive learning environment where students can have one-on-one tutoring sessions with writing consultants. It is a great resource for you to obtain extra help as you write and revise your papers. They DO NOT proofread your papers, but offer assistance on improving certain aspects of them. They also offer ESL tutoring. To set up an appointment, visit their online booking system or call 212-784-6065. I strongly advise you to contact them as soon as possible, even if you don’t have anything specific you need assistance with yet.