lot-language-learning-2023

Materials for LOT School 2023, "Language Learning: A Data-Driven Approach"

Course Description

In this course, we will examine early language learning through the lens of new data resources that facilitate quantitative studies. Our framework will be the "Standard Model" of Kachergis, Marchman, and Frank (2022) that links language input to processing and learning outcomes, and we will consider the strengths and weaknesses of this model for describing vocabulary learning as well as the learning of some morphology and syntax. Our hands-on approach will involve learning the use of CHILDES and childes-db for studying language input, Wordbank for studying language outcomes, and Peekbank for studying processing.

Prerequisite: Some knowledge of R sufficient to manipulate datasets from these resources.

Additional useful tools: familiarity with GitHub for version control and the tidyverse for data manipulation and visualization.

Learning Goals

Discuss the "standard model" framework for early word learning, focusing on input, processing, and uptake constructs,
Compare different instruments and approaches for measuring child language,
Learn a reproducible workflow for exploring language acquisition data in R, and
Explore data from Wordbank, CHILDES, and Peekbank as a source of insights into language learning.

Software

Before we start, please ensure you have installed a recent version of:

R and RStudio - follow the instructions here: https://posit.co/download/rstudio-desktop/
The tidyverse - run install.packages("tidyverse") in R once you have R and RStudio installed

If these are not working on your computer, you won't be able to do any of the in-class assignments, which will make up the bulk of the course.
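A quick way to check your setup before class is sketched below. It uses only base R and does not install anything; the package list is an assumption drawn from the packages named elsewhere in this README.

```r
# Report the R version and which course packages are already installed.
# The package names come from this README; adjust the list as needed.
pkgs <- c("tidyverse", "childesr", "peekbankr")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

print(R.version.string)
print(installed)

# Packages that still need to be installed
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
}
```

If tidyverse shows FALSE, install it with install.packages("tidyverse") before the first session.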

Course Schedule

Day 1: Foundations and workflow

Readings:

Agenda:

Introduce the data-driven perspective on early language learning
Discuss the three instruments/data sources used in the course: the MacArthur-Bates CDI, CHILDES, and the Looking While Listening paradigm
Practice the toolset (GitHub, R Markdown, and the tidyverse) that we will use for the remainder of the course

Day 2: Characterizing vocabulary growth with Wordbank

Readings:

Bates & Goodman (1997), "On the inseparability of grammar and the lexicon"
Frank et al. (2021), Chapter 13, "Morphology, Grammar, and the Lexicon"

Agenda:

Take an in-depth look at Wordbank and the use of CDI data
Explore how to create reproducible pipelines with Wordbank data
Reproduce analyses on grammar/lexicon correspondences from Bates & Goodman (1997)

Day 3: Accessing language input with CHILDES and childes-db

We'll be using CHILDES and accessing it via childes-db. You can install childesr from CRAN via install.packages("childesr").
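As a rough sketch of what a childesr query can look like, the helper below wraps childesr::get_types(). The helper name fetch_word_types is hypothetical, and the corpus and child ("Brown", "Adam" in the "Eng-NA" collection) are illustrative choices rather than the ones used in class. Calling the helper requires childesr to be installed and a network connection to childes-db.

```r
# Hypothetical helper: fetch type counts for a set of words from one child's
# transcripts. No speaker-role filter is applied, so counts cover all speakers.
fetch_word_types <- function(words,
                             collection = "Eng-NA",
                             corpus = "Brown",
                             target_child = "Adam") {
  childesr::get_types(
    collection = collection,
    corpus = corpus,
    target_child = target_child,
    type = words
  )
}

# Example call (not run here because it needs network access):
# color_types <- fetch_word_types(c("red", "blue", "green"))
# str(color_types)
```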

Readings:

Agenda:

Learn about CHILDES and the CHAT format
Discuss issues of frequency and frequency estimation from corpus data
Reproduce analyses of the development of disjunction from Jasbi, Jaggi, Clark, & Frank (2022).
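To make the frequency-estimation point concrete, here is a minimal sketch of normalizing raw counts to occurrences per million tokens. The counts and corpus size are invented for illustration and are not real CHILDES values; base R is used so the snippet runs without extra packages.

```r
# Toy counts (invented, not real CHILDES data)
counts <- data.frame(
  word  = c("red", "blue", "green"),
  count = c(30, 20, 10)
)
total_tokens <- 20000  # hypothetical number of tokens in the corpus sample

# Relative frequency scaled to occurrences per million tokens
counts$per_million <- counts$count / total_tokens * 1e6

print(counts)
```

Normalizing by the total token count matters because raw counts are not comparable across corpora, or across age bins, that contain different amounts of speech.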

Day 4: Exploring online processing using Peekbank

We'll be working with data from Peekbank and using the peekbankr package, which can be installed via remotes::install_github("langcog/peekbankr").

Readings:

Agenda:

Introduce the looking-while-listening paradigm
Discuss the role of online language processing in language learning
Reproduce and extend results from Swingley & Aslin (2002).
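A common summary measure in looking-while-listening analyses is the proportion of target looking within a time window after the target word begins. The sketch below computes it for a single toy trial. The data, the column names (t_ms, aoi), and the 400-1800 ms window are all assumptions made for illustration; they are not the Peekbank schema or the window used in any particular paper.

```r
# Toy gaze samples for one trial (invented data, hypothetical column names)
looks <- data.frame(
  t_ms = seq(0, 1800, by = 200),  # time relative to target word onset
  aoi  = c("distractor", "distractor", "target", "target", "missing",
           "target", "target", "distractor", "target", "target")
)

# Keep samples inside the analysis window (hypothetical: 400-1800 ms)
in_window <- looks$t_ms >= 400 & looks$t_ms <= 1800

# Keep only samples where the child looked at one of the two pictures
on_picture <- looks$aoi %in% c("target", "distractor")

# Proportion of those samples that fall on the target
keep <- in_window & on_picture
prop_target <- mean(looks$aoi[keep] == "target")

print(prop_target)
```

Samples coded as missing are dropped from both the numerator and the denominator, so the measure reflects only the time spent looking at one of the two pictures.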

We will devote ten minutes at the end of class to talking about the group projects on Friday. By the end of the day, please form a group and send me an email with the names of the people in your group and a paragraph about what you hope to do; I'll try to get you comments.

Day 5: Group projects

On the final day of the course, we will primarily be doing group projects. The goal of a group project is to work together to develop some of the ideas we have discussed.

Groups will be 2-3 people (with more, it becomes impossible to code together while all looking at the same screen).

You are encouraged to come up with your own project idea, and I am happy to talk with you about how to use these resources to explore your own interests. Here are a few "starter ideas".

Easier:

Estimate the frequencies of color terms (or some other interesting set of words) in speech to children over age (CHILDES)
Explore cohort effects on vocabulary size using the date_of_test field (Wordbank)
Look at grammar/lexicon relationships within specific lexical subcategories, perhaps for languages beyond English (Wordbank) - this was the challenge problem for Day 2

Harder:

Explore effects of maternal education on the growth of vocabulary in different categories (Wordbank)
Characterize the developmental trajectory of children's lexical diversity (e.g., MTLD) and how it differs by gender (CHILDES)
Measure whether there are sex differences in vocabulary variability (MADM)
Check on the presence of a noun bias in the new ASL CDI
