DataFrame Support

Support for Microsoft.Data.Analysis has been available since v4.8.

What's Supported?

Because DataFrame is generally less expressive than Parquet, only primitive (atomic) columns are currently supported. If DataFrame gains more functionality in the future (see related links below), this integration can be extended.

When reading or writing, the integration ignores any columns that are not atomic.

Writing

The conversion is handled internally, so you only need to call the WriteAsync() extension method on a DataFrame and pass the destination stream, like so:

// df is an existing, populated DataFrame; stream is any writable System.IO.Stream.
await df.WriteAsync(stream);

Reading

As with writing, the conversion is handled internally. Call the ReadParquetAsDataFrameAsync() extension method on a System.IO.Stream to read a Parquet stream into a DataFrame:

// fs is a readable System.IO.Stream containing Parquet data.
DataFrame df = await fs.ReadParquetAsDataFrameAsync();
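
The write and read calls can be combined into an in-memory round trip. The following is a minimal sketch rather than the library's documented usage: it assumes the extension methods are exposed from the `Parquet` namespace, and the column names and values are made up for illustration. Only primitive columns are used, since non-atomic columns are ignored.

```csharp
using System.IO;
using Microsoft.Data.Analysis;
using Parquet; // assumption: namespace that exposes the DataFrame extension methods

// Build a small DataFrame from primitive (atomic) columns only.
// The column names "id" and "score" are hypothetical sample data.
var df = new DataFrame(
    new Int32DataFrameColumn("id", new[] { 1, 2, 3 }),
    new DoubleDataFrameColumn("score", new[] { 0.5, 1.5, 2.5 }));

// Write the DataFrame to an in-memory stream as Parquet.
using var ms = new MemoryStream();
await df.WriteAsync(ms);

// Rewind the stream, then read the Parquet data back into a new DataFrame.
ms.Position = 0;
DataFrame roundTripped = await ms.ReadParquetAsDataFrameAsync();
```

A MemoryStream keeps the sketch self-contained; a FileStream opened for writing and then for reading would follow the same pattern.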

Samples

A sample Jupyter notebook is available that demonstrates reading Parquet files into a DataFrame and displaying them.

To run this notebook, you can use VS Code with the Polyglot Notebooks extension.

Related Links