[Website] Correct statement about compression in FAQ (#541)
ianmcook authored Sep 14, 2024
1 parent 917e1ed commit 47eb6fb
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions faq.md
@@ -180,10 +180,12 @@ This efficiency comes at the cost of relatively expensive reading into memory,
 as Parquet data cannot be directly operated on but must be decoded in
 large chunks.
 
-Conversely, Arrow is an in-memory format meant for direct and efficient use
-for computational purposes. Arrow data is not compressed (or only lightly so,
-when using dictionary encoding) but laid out in natural format for the CPU,
-so that data can be accessed at arbitrary places at full speed.
+Conversely, Arrow is an in-memory format meant primarily for direct and
+efficient use for computational purposes. Arrow data is typically not
+compressed but laid out in natural format for the CPU, so that data can be
+accessed at arbitrary places at full speed. (However, Arrow does provide a
+limited set of options for increasing space efficiency, including
+dictionary encoding, run-end encoding, and buffer compression.)
 
 Therefore, Arrow and Parquet complement each other
 and are commonly used together in applications. Storing your data on disk
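The revised FAQ text names dictionary encoding and run-end encoding as ways Arrow can save space while keeping data directly addressable. The sketch below illustrates the two ideas in plain Python. It is not Arrow's implementation or API: the function names `dictionary_encode`, `run_end_encode`, and `run_end_lookup` are hypothetical, and Python lists stand in for Arrow buffers.

```python
from bisect import bisect_right


def dictionary_encode(values):
    """Store each distinct value once; represent the column as integer indices."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> its index in `dictionary`
    indices = []      # one small integer per original element
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def run_end_encode(values):
    """Collapse consecutive repeats into (run_ends, run_values).

    run_ends[k] is the exclusive logical index at which run k ends,
    so the run ends are cumulative lengths.
    """
    run_ends, run_values = [], []
    for i, value in enumerate(values):
        if run_values and run_values[-1] == value:
            run_ends[-1] = i + 1          # extend the current run
        else:
            run_values.append(value)      # start a new run
            run_ends.append(i + 1)
    return run_ends, run_values


def run_end_lookup(run_ends, run_values, i):
    """Random access by binary search: find the first run whose end exceeds i."""
    return run_values[bisect_right(run_ends, i)]


cities = ["NYC", "NYC", "LA", "NYC", "LA"]
dictionary, indices = dictionary_encode(cities)
decoded = [dictionary[k] for k in indices]   # decoding is a plain index lookup

flags = ["a", "a", "a", "b", "b", "a"]
run_ends, run_values = run_end_encode(flags)
```

In both schemes an element can still be read without decoding a large chunk: dictionary encoding needs one extra index lookup, and run-end encoding needs a binary search over the run ends. This is the sense in which such encodings are lighter than the block-oriented decoding the FAQ attributes to Parquet.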
