stefan-schroedl edited this page Oct 13, 2014 · 5 revisions

A set of Unix command-line tools for quick and convenient batch processing of tabular text files (tab-delimited, CSV, or other flat-file formats) with a header line. The tools detect the delimiter and compression of their input and let you reference columns by name.
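To make the idea of delimiter and compression detection concrete, here is a minimal Python sketch. It is not the tools' own implementation: the function names `detect_delimiter` and `open_table`, the candidate delimiter list, and the "most frequent character in the header line" heuristic are all assumptions chosen for illustration.

```python
import csv
import gzip
import io


def detect_delimiter(header_line, candidates="\t,;|"):
    """Guess the delimiter as the candidate that occurs most often in the header.

    Assumed heuristic for illustration only; ties go to the earlier candidate.
    """
    return max(candidates, key=header_line.count)


def open_table(data):
    """Return (column_names, rows) from raw bytes that may be gzip-compressed."""
    # gzip streams begin with the magic bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    text = data.decode("utf-8")
    delimiter = detect_delimiter(text.splitlines()[0])
    # DictReader uses the header line so each row can be addressed by column name.
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)
    return reader.fieldnames, rows


# Usage: the same call handles a gzipped tab-delimited table and a plain CSV.
gz_header, gz_rows = open_table(gzip.compress(b"name\tqty\na\t1\nb\t2\n"))
csv_header, csv_rows = open_table(b"name,qty\na,1\n")
```

Reading the header once and then addressing fields as `row["qty"]` is what makes name-based column references possible regardless of column order.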

  • tblmap: Per-line ("map") computation: derive columns through an expression, delete or reorder columns, and filter rows.

  • tblred: Compute ("reduce") aggregations (e.g., sum, average) over groups defined by key columns.

  • tbldesc: Summarize the columns of a file (e.g., proportion of character/numeric values, min/mean/median/max, missing values, correlation with a target column).

  • tbljoin: Relational join without the need for pre-sorting or column matching.

  • ...
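The operations behind the tools listed above can be sketched in a few lines of Python. This only illustrates the concepts (per-row map, grouped reduce, and a join that needs no sorting); it does not show the tools' command-line interface or how they are implemented. The function names `map_rows`, `reduce_rows`, and `hash_join`, the sample data, and the choice of a hash-based join on a shared column name are all assumptions.

```python
from collections import defaultdict


def map_rows(rows, new_col, expr, keep=lambda row: True):
    """Per-row step: add new_col = expr(row) to every row that passes the filter."""
    return [{**row, new_col: expr(row)} for row in rows if keep(row)]


def reduce_rows(rows, key_cols, value_col, agg=sum):
    """Group rows by key_cols and aggregate value_col within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in key_cols)].append(row[value_col])
    return {key: agg(values) for key, values in groups.items()}


def hash_join(left, right, key):
    """Inner join on a shared column name.

    Indexing the right side in a dict means neither input has to be sorted.
    """
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical sample data.
orders = [
    {"region": "east", "qty": 2, "price": 3.0},
    {"region": "west", "qty": 1, "price": 5.0},
    {"region": "east", "qty": 4, "price": 1.5},
]
managers = [
    {"region": "east", "manager": "Ann"},
    {"region": "west", "manager": "Bo"},
]

with_total = map_rows(orders, "total", lambda r: r["qty"] * r["price"])
totals = reduce_rows(with_total, ["region"], "total")
joined = hash_join(orders, managers, "region")
```

Here `totals` maps each key tuple, such as `("east",)`, to the sum of `total` in that group, and `joined` carries the `manager` column onto every order row.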

See the README for examples.
