This library is a collection of unit-tested, thoroughly commented CSV parsing functions that I have developed off and on since 2006. It is extremely small and easy to integrate, and it includes unit tests for most of the odd CSV edge cases. The library supports custom delimiters, text qualifiers, and embedded newlines, and it can read and write DataTables.
Why use this library? A few reasons:
- Compatible with .NET Framework / C# 2.0 and later, which makes it easy to integrate into extremely old legacy projects.
- Between 16 and 32 kilobytes in size, depending on the target framework.
- No dependencies.
- Handles all the horrible edge cases from poorly written CSV generating software: custom delimiters, embedded newlines, and doubled-up text qualifiers.
- Reads via streams, optionally using asynchronous I/O. You can parse CSV files larger than you can hold in memory without thrashing.
- Ability to pipe DataTables directly into SQL Server using table parameter inserts.
- Fastest with direct string parsing or asynchronous I/O, but still performs well when reading from MemoryStreams.
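Under the hood, a table parameter insert is just ADO.NET's structured-parameter feature. A minimal sketch with plain `System.Data.SqlClient` follows; the stored procedure name (`dbo.InsertMyRows`) and the user-defined table type (`dbo.MyRowType`) are assumptions for illustration, not part of this library's API:

```csharp
using System.Data;
using System.Data.SqlClient;

static class TvpDemo
{
    // Pass an entire DataTable to SQL Server in one round trip.
    // The server must define a matching table type and a procedure
    // that accepts it (names here are assumed for the example).
    public static void BulkInsert(DataTable dt, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.InsertMyRows", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            var p = cmd.Parameters.Add("@rows", SqlDbType.Structured);
            p.TypeName = "dbo.MyRowType"; // user-defined table type on the server
            p.Value = dt;                 // the whole DataTable travels as one parameter
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

This avoids one INSERT statement per row, which matters when a parsed CSV file has millions of lines.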
This library was designed to handle edge cases I experienced when working with partner files.
| Case | Example |
|---|---|
| CSV files larger than available memory, streamed off disk | 10TB files |
| Pipe-delimited files | `field1\|field2` |
| Hand-written CSV with spaces after delimiters | `"field1", "field2", "field3"` |
| Embedded newlines within a text qualifier | `"field1\r\nanother line","field2"` |
| Text qualifiers as regular characters within a field | `"field1",field2 "field2" field2,"field3"` |
| Doubled-up text qualifiers within a qualified field | `"field1","field2 ""field2"" field2","field3"` |
| Different line separators | CR, LF, something else |
| Different text encodings | UTF-8, UTF-16, ASCII |
| `sep=` lines for European CSV files | `sep=;\r\n` |
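To make the doubled-qualifier rule concrete: inside a qualified field, a pair of qualifier characters stands for one literal qualifier. Here is a minimal sketch of that rule in isolation (this is an illustration, not this library's implementation):

```csharp
using System;
using System.Text;

static class QualifierDemo
{
    // Strip the surrounding qualifiers from a single field and collapse
    // each doubled qualifier ("") into one literal qualifier (").
    public static string Unquote(string field, char qualifier = '"')
    {
        if (field.Length < 2 || field[0] != qualifier || field[field.Length - 1] != qualifier)
            return field; // not a qualified field; return as-is

        var sb = new StringBuilder();
        for (int i = 1; i < field.Length - 1; i++)
        {
            if (field[i] == qualifier && i + 1 < field.Length - 1 && field[i + 1] == qualifier)
                i++; // doubled qualifier: emit one character, skip the second
            sb.Append(field[i]);
        }
        return sb.ToString();
    }
}
```

For example, `QualifierDemo.Unquote("\"field2 \"\"field2\"\" field2\"")` yields `field2 "field2" field2`, matching the doubled-up qualifier row in the table above.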
Want to get started? Here are a few walkthroughs.
Do you have files that use the pipe symbol as a delimiter, or does your application need qualifiers around every field? No problem!
```csharp
var settings = new CSVSettings()
{
    FieldDelimiter = '|',
    TextQualifier = '\'',
    ForceQualifiers = true
};
var s = array.ToCSVString(settings);
```
The latest asynchronous I/O frameworks allow you to stream CSV data off disk without blocking. Here's how to use the asynchronous I/O features of .NET 5.0:
```csharp
using (var cr = CSVReader.FromFile(filename, settings)) {
    await foreach (string[] line in cr) {
        // Do whatever you want with this one line - the buffer will
        // only hold a small amount of memory at once, so you can
        // iterate at your own pace!
    }
}
```
Don't worry if your project isn't yet able to use asynchronous foreach loops. You can still use the existing reader logic:
```csharp
using (CSVReader cr = new CSVReader(sr, settings)) {
    foreach (string[] line in cr) {
        // Process this one line
    }
}
```
You can serialize and deserialize between `List<T>` and CSV strings. Serialization supports all basic value types, and it can even optionally support storing null values in CSV cells.
```csharp
// MyClass can be any class whose public properties match the CSV columns
var list = new List<MyClass>();

// Serialize a list of objects to a CSV string
string csv = CSV.Serialize<MyClass>(list);

// Deserialize a CSV string back into a sequence of objects
foreach (var myObject in CSV.Deserialize<MyClass>(csv)) {
    // Use the objects
}
```
For those of you who work in older frameworks that still use DataTables, this feature is still available:
```csharp
// This code assumes the file is on disk, and that the first row
// of the file contains the names of the columns
DataTable dt = CSV.LoadDataTable(myfilename);

// Save a DataTable to a file
dt.SaveAsCSV(myfilename, true);
```
The static `CSV` class contains many useful functions for hand-rolling your own CSV-related code. You can call any of the functions in the `CSV` class directly.