While investigating a potential memory leak in my Azure Function app, I noticed that the Managed Memory Tool in Visual Studio tells me that I have many objects in the .NET managed heap. When looking through those objects (byte[]), they seem to have Microsoft.IO.RecyclableMemoryStreamManager as their root.
Parquet.Net uses RecyclableMemoryStreamManager in DataColumnWriter. Its documentation states the following:
The RecyclableMemoryStreamManager will use the properties MaximumFreeSmallPoolBytes and MaximumFreeLargePoolBytes to determine whether to put those buffers back in the pool, or let them go (and thus be garbage collected). It is through these properties that you determine how large your pool can grow. If you set these to 0, you can have unbounded pool growth, which is essentially indistinguishable from a memory leak.
It seems that the DataColumnWriter does not set these properties, so I guess that might be the reason for my app's high memory usage.
Should those MaximumFreeSmallPoolBytes and MaximumFreeLargePoolBytes properties be somehow user configurable? Maybe via ParquetOptions?
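For reference, here is a minimal sketch of what bounding the pool could look like. It assumes the settable MaximumFreeSmallPoolBytes and MaximumFreeLargePoolBytes properties quoted above (newer releases of Microsoft.IO.RecyclableMemoryStream may expose these limits differently). The limit values are placeholders for illustration, not recommendations, and this is not how DataColumnWriter is currently written:

```csharp
using System.IO;
using Microsoft.IO;

// Assumption: a manager created with default settings leaves both limits at 0,
// which the documentation describes as allowing unbounded pool growth.
var manager = new RecyclableMemoryStreamManager
{
    // Placeholder caps: once the free pool exceeds these sizes, returned
    // buffers are dropped and left to the garbage collector instead of pooled.
    MaximumFreeSmallPoolBytes = 16L * 1024 * 1024,  // illustrative value only
    MaximumFreeLargePoolBytes = 64L * 1024 * 1024,  // illustrative value only
};

// Streams are then obtained from the bounded manager as usual.
using (MemoryStream stream = manager.GetStream())
{
    stream.WriteByte(42);
}
```

If ParquetOptions were to carry these two values, the library could pass them through when it constructs its manager.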
Hey @aloneguid! We're also running into some weird memory issues in our app. We haven't done the same profiling as @TapaniAalto to be sure, but this seems like some low-hanging fruit that I'd be happy to tackle. How does it sound to set the defaults to this?
I'd also expose this through ParquetOptions to the library's users (we might bump this higher in our personal usage to 64MB & 192MB respectively). Would you prefer to use the same name for clarity, or something simplified like SmallBufferPoolLimit?
I'll open up a draft PR shortly, let me know any thoughts!