diff --git a/docs/src/content/docs/guides/Examples/bean-mapping.mdx b/docs/src/content/docs/guides/Examples/bean-mapping.mdx
index 8b657625..468cc5d0 100644
--- a/docs/src/content/docs/guides/Examples/bean-mapping.mdx
+++ b/docs/src/content/docs/guides/Examples/bean-mapping.mdx
@@ -8,7 +8,7 @@ Many CSV libraries come with built-in support for mapping CSV records to Java be
 While this is a convenient feature, the reflection-based approach used by most libraries comes with a heavy performance penalty,
 which contradicts FastCSV’s design goal of being fast.
 
-Thanks to Java stream mapping, FastCSV can provide a similar feature without sacrificing performance.
+Thanks to Java stream mapping, FastCSV can provide similar functionality without sacrificing performance.
 
 ## Example
 
diff --git a/docs/src/content/docs/guides/Examples/byte-order-mark.mdx b/docs/src/content/docs/guides/Examples/byte-order-mark.mdx
index c27f6863..53a948f8 100644
--- a/docs/src/content/docs/guides/Examples/byte-order-mark.mdx
+++ b/docs/src/content/docs/guides/Examples/byte-order-mark.mdx
@@ -8,10 +8,12 @@ FastCSV is capable of reading CSV files with a [Byte order mark](https://en.wiki
 (BOM) header.
 
 :::note
-A BOM header is a sequence of bytes at the beginning of a text file that indicates the file's encoding,
-such as UTF-8, UTF-16, or UTF-32.
-Although UTF-8 is the default encoding for most text files today,
-some applications still use the BOM header to specify the file's encoding.
+A byte order mark (BOM) is a sequence of 2 to 4 bytes at the start of a text file
+that serves as a header to indicate the file's Unicode encoding, such as UTF-8, UTF-16, or UTF-32.
+For UTF-16 and UTF-32, the BOM header also indicates the byte order (big-endian or little-endian).
+
+While UTF-8 is the standard encoding for most text files today,
+some applications still use the BOM header to explicitly specify the file's encoding.
 :::
 
 Enabling automatic BOM header detection can impact performance.
@@ -22,6 +24,16 @@ You may also want to check out the corresponding
 [Javadoc](https://javadoc.io/doc/de.siegmar/fastcsv/latest/de.siegmar.fastcsv/de/siegmar/fastcsv/reader/CsvReader.CsvReaderBuilder.html#detectBomHeader(boolean))
 for more information.
 
+The following table shows the BOM headers for different Unicode encodings that FastCSV can detect:
+
+| Encoding    | BOM header (hex) |
+|-------------|------------------|
+| UTF-8       | `EF BB BF`       |
+| UTF-16 (BE) | `FE FF`          |
+| UTF-16 (LE) | `FF FE`          |
+| UTF-32 (BE) | `00 00 FE FF`    |
+| UTF-32 (LE) | `FF FE 00 00`    |
+
 ## Example
 
 In the following example, a CSV file with a BOM header is created and read using FastCSV.
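
To illustrate how the BOM table added in this hunk can be applied, here is a minimal sketch that maps the leading bytes of a file to an encoding name. `BomSniffer` and its methods are hypothetical names chosen for this example; this is not FastCSV's implementation, only one plausible way to match the five byte sequences listed in the table.

```java
import java.util.Optional;

/** Hypothetical helper (not part of FastCSV): matches leading bytes against the BOM table. */
final class BomSniffer {

    private BomSniffer() {
    }

    /** Returns the encoding name indicated by a BOM at the start of {@code head}, if any. */
    static Optional<String> detect(final byte[] head) {
        // The UTF-32 checks come first: FF FE 00 00 (UTF-32 LE) also begins with FF FE (UTF-16 LE).
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) {
            return Optional.of("UTF-32BE");
        }
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) {
            return Optional.of("UTF-32LE");
        }
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return Optional.of("UTF-8");
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return Optional.of("UTF-16BE");
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            return Optional.of("UTF-16LE");
        }
        return Optional.empty();
    }

    /** Compares the first bytes of {@code data} with {@code prefix}, treating bytes as unsigned. */
    private static boolean startsWith(final byte[] data, final int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The order of the checks matters in this sketch because the UTF-32 (LE) sequence shares its first two bytes with UTF-16 (LE). One consequence of that choice is that a UTF-16 (LE) file whose first character after the BOM is U+0000 would be reported as UTF-32 (LE).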