There have been a few reports of slow binary decoding performance on Discourse recently:
- https://discourse.dhall-lang.org/t/figuring-out-performance-bottlenecks/251
- https://discourse.dhall-lang.org/t/high-memory-use-when-decoding-dhall-expressions/171
In general, people seem to get only about 10 MB/s.
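For anyone who wants to reproduce a throughput number like this, here is a minimal timing sketch. It assumes the whole input fits in memory and that forcing the decoder's result to weak head normal form is enough to trigger the actual decoding work. `timeDecode`, `megabytesPerSecond`, and the `decode` argument are hypothetical names for illustration, not part of dhall or cborg.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString.Lazy as BL
import GHC.Clock (getMonotonicTimeNSec)

-- | Throughput in MB/s (1 MB = 10^6 bytes) for a byte count and an
-- elapsed time in nanoseconds.
megabytesPerSecond :: Integer -> Integer -> Double
megabytesPerSecond bytes nanos =
  (fromIntegral bytes / 1e6) / (fromIntegral nanos / 1e9)

-- | Time a decoding function over an input that is loaded up front.
-- `decode` is a placeholder for whatever decoder is under test. Only
-- the outermost constructor of its result is forced, so a lazy decoder
-- needs a wrapper that forces the result more deeply.
timeDecode :: (BL.ByteString -> a) -> BL.ByteString -> IO (a, Double)
timeDecode decode input = do
  -- Computing the length walks every chunk, so reading the input is
  -- not counted in the timed region below.
  size <- evaluate (toInteger (BL.length input))
  start <- getMonotonicTimeNSec
  result <- evaluate (decode input)
  end <- getMonotonicTimeNSec
  -- Guard against a zero-length interval on very small inputs.
  let elapsed = max 1 (toInteger (end - start))
  pure (result, megabytesPerSecond size elapsed)
```

A single run like this is noisy; repeating the measurement or using a benchmarking library would give more trustworthy numbers.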
I've tried profiling `dhall decode` with the new `--quiet` option, but the output only points at two functions from `cborg`:
COST CENTRE MODULE SRC %time %alloc
getDecodeAction Codec.CBOR.Decoding src/Codec/CBOR/Decoding.hs:311:1-55 81.5 88.7
deserialiseIncremental Codec.CBOR.Read src/Codec/CBOR/Read.hs:(165,1)-(167,46) 18.5 11.0
Things to investigate:
- Might high rates of branch mispredictions slow us down?
Things to try:
- `Map` and `Set` deserialization in the style of "Deserializing maps and sets" (haskell/containers#405)
- Adopt a UTF8-based string type for record and union labels etc.: "Try ShortText or Basement.String instead of Text for strings in Expr" (#1032)
- Tweak RTS options. A larger nursery might be useful?
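
As a rough illustration of the `Map` idea, here is a minimal sketch. It assumes the gist is to exploit keys that usually arrive already sorted in the encoded input, which may not match what haskell/containers#405 actually proposes. `strictlyAscending` and `mapFromDecodedPairs` are hypothetical helper names, and whether this beats plain `Map.fromList` in practice would need measuring.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | True when the keys are strictly increasing, i.e. sorted with no
-- duplicates. This is the precondition of 'Map.fromDistinctAscList'.
strictlyAscending :: Ord k => [(k, v)] -> Bool
strictlyAscending kvs =
  and (zipWith (\(a, _) (b, _) -> a < b) kvs (drop 1 kvs))

-- | Build a 'Map' from decoded key/value pairs. When the keys are
-- already strictly ascending, the map is built in linear time without
-- further key comparisons; otherwise fall back to the general
-- 'Map.fromList'.
mapFromDecodedPairs :: Ord k => [(k, v)] -> Map k v
mapFromDecodedPairs kvs
  | strictlyAscending kvs = Map.fromDistinctAscList kvs
  | otherwise             = Map.fromList kvs
```

The explicit check costs one extra pass over the pairs, but it keeps the function safe for unsorted or duplicated keys, where `Map.fromDistinctAscList` alone would silently build an invalid map.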