Day Four: Text as Data

Day: 0 | 1 | 2 | 3 | 4 | 5

Text analysis, digital reading, computation.

1. Reflection / discussion

Ted Underwood, paceofchange (2015). (GitHub repository sharing code to reproduce the analysis reported in his article "How Quickly Do Literary Standards Change?". The article explicitly explains the research process, from collecting and selecting data to analysis. It builds a supervised machine learning classification model not in order to predict, but in order to interrogate the model itself.)
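To make "interrogating a model" concrete for discussion, here is a minimal sketch in Python. It is not the code in the paceofchange repository: it swaps in a much simpler technique (smoothed per-word log-odds between two labeled groups, as in a Naive Bayes classifier), and the function name `word_log_odds` and the toy texts are hypothetical. The point is only that, after fitting to labeled examples, we read the weights rather than ask for predictions.

```python
import math
from collections import Counter


def word_log_odds(docs_a, docs_b):
    """Smoothed log-odds of each word occurring in group A rather than group B."""
    counts_a = Counter(w for d in docs_a for w in d.lower().split())
    counts_b = Counter(w for d in docs_b for w in d.lower().split())
    vocab = set(counts_a) | set(counts_b)
    # Add-one smoothing so unseen words do not produce log(0).
    total_a = sum(counts_a.values()) + len(vocab)
    total_b = sum(counts_b.values()) + len(vocab)
    return {
        w: math.log((counts_a[w] + 1) / total_a)
        - math.log((counts_b[w] + 1) / total_b)
        for w in vocab
    }


# Hypothetical toy data, invented for illustration only.
group_a = ["the dark sea and the silent stone", "a dark and silent night"]
group_b = ["sweet love and gentle heart", "the gentle heart of sweet spring"]

scores = word_log_odds(group_a, group_b)

# "Interrogate" the model: which words pull most strongly toward each group?
ranked = sorted(scores, key=scores.get, reverse=True)
print("Most characteristic of group A:", ranked[:3])
print("Most characteristic of group B:", ranked[-3:])
```

Positive scores mark words that lean toward the first group and negative scores mark words that lean toward the second. Reading those lists is a small-scale version of asking what a classifier has learned.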

2. Text / Data Visualization

VPOD PoemChoice

Thinking with visualizations vs. communicating with visualizations: analytics <-> communication

New ways of reading: statistics, visualizations, NLP

Visualize text(s):

Text Big Data:

Think about the three V’s of Big Data (volume, variety, and velocity). Advances are driven by businesses seeking to process unstructured text data (the web, social media) to extract value.

Natural Language Processing / Machine Learning:

Programming as inquiry

Lunch Break

3. Unstructured text to data

OpenRefine Sonnets project

See Fetch and Parse Data with OpenRefine
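For those who want to see the same idea outside OpenRefine, the following is a minimal Python sketch of turning unstructured text into rows of data. It is an illustration only, not part of the OpenRefine exercise; the function names `tokenize` and `lines_to_rows` are hypothetical, and the sample is the opening two lines of Shakespeare's Sonnet 18.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def lines_to_rows(lines):
    """Turn each line of text into a row: line number, token count, tokens."""
    rows = []
    for number, line in enumerate(lines, start=1):
        tokens = tokenize(line)
        rows.append({"line": number, "n_tokens": len(tokens), "tokens": tokens})
    return rows


poem = [
    "Shall I compare thee to a summer's day?",
    "Thou art more lovely and more temperate:",
]

rows = lines_to_rows(poem)

# Once the text is structured, counting becomes a one-liner.
counts = Counter(token for row in rows for token in row["tokens"])

for row in rows:
    print(row["line"], row["n_tokens"], row["tokens"])
print(counts.most_common(3))
```

Each row now has the same columns, so the poem can be sorted, filtered, or exported like any other table.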

4. Text and Machine Learning

OpenRefine sentiment analysis project

Visualize the numbers with Rawgraphs

5. Project Work and Discussion

Distant versus close reading. Digital editing and annotating, versus computation.

Resources

Gideon Lewis-Kraus, The Great A.I. Awakening, NYTimes Magazine, December 2016.

Ted Underwood, "Seven Ways Humanists are Using Computers to Understand Text" (2015). (intro overview of types of computational analysis)

Ted Underwood, paceofchange (2015). (GitHub repository sharing code to reproduce analysis reported in his article "How Quickly Do Literary Standards Change?". The article explicitly explains the research process, from collecting/selecting data to analysis.)

Ted Underwood, "The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us", in New Literary History (2014).

Jeffrey M. Binder, "Alien Reading: Text Mining, Language Standardization, and the Humanities" in Debates in the Digital Humanities (2016).

Stanford Literary Lab Pamphlets. (ongoing series of publications relating to "computational criticism")

Sunspring (a film script written by AI)

Periscopic data ("socially-conscious data visualization")

Tools directories:

Textbooks:

Syllabus / assignments:

Data: