Unix-Like Data Processing 2017

Challenges

Jon Bentley, a renowned computer scientist known for the k-d tree data structure, once asked Donald Knuth, another renowned computer scientist known for his foundational volumes of The Art of Computer Programming, to write a program that reads a file, determines the n most frequently used words in that file, and prints a sorted list of those words along with their frequencies.

Knuth's original program used a fancy data structure and required more than 10 "pages" of Pascal. Doug McIlroy, one of the Unix pioneers, responded with the following 6-stage shell pipeline:

tr -cs A-Za-z '\n' | \
  tr A-Z a-z | \
  sort | \
  uniq -c | \
  sort -rn | \
  sed ${1}q
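
To make the six stages easier to follow, here is a minimal sketch that wraps the same pipeline in a shell function and annotates each stage. The function name `top_words` is hypothetical (it is not part of McIlroy's original), and the sketch assumes the text arrives on standard input and that the first argument is n.

```shell
#!/bin/sh
# top_words N: read text on stdin, print the N most frequent words.
# "top_words" is a hypothetical name used only for this illustration.
top_words() {
  tr -cs 'A-Za-z' '\n' |   # turn every run of non-letters into one newline: one word per line
    tr 'A-Z' 'a-z' |       # fold upper case to lower case
    sort |                 # bring identical words next to each other
    uniq -c |              # collapse each run into "count word"
    sort -rn |             # order by count, highest first
    sed "${1}q"            # print the first N lines, then quit
}
```

For example, `printf 'the cat and the dog and the bird\n' | top_words 2` should list `the` with a count of 3, followed by `and` with a count of 2. The exact padding in front of each count depends on the local `uniq` implementation.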

We encourage you to present us with similar challenges in our Issues. In our experience, these make for great exercises in our notes.