CI ByteString is slow #23

Open
winterland1989 opened this issue Oct 21, 2016 · 5 comments · May be fixed by #34

Comments

@winterland1989

Constructing a CI ByteString requests pinned memory, but the ByteString is usually short, so this behavior not only adds overhead but also contributes to heap fragmentation. I think we can do better here. Any ideas?

@basvandijk
Owner

Since we have to construct a new ByteString to foldCase the original we can't avoid asking for pinned memory.

What we could do is add an instance FoldCase ShortByteString. Care to write a PR?
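As a rough illustration of what such an instance could be built on, here is a minimal sketch that folds a ShortByteString byte by byte. The names `toLower8` and `foldCaseShort` are hypothetical, and the folded byte ranges (ASCII plus the Latin-1 upper-case letters) are an assumption rather than a statement of what the library does. In the package itself this function would presumably become the body of `foldCase` in the proposed instance.

```haskell
import qualified Data.ByteString.Short as SBS
import Data.Word (Word8)

-- | Lower-case a single byte. Assumption: ASCII A-Z plus the Latin-1
--   upper-case ranges are folded; every other byte is left unchanged.
toLower8 :: Word8 -> Word8
toLower8 w
  | 65 <= w && w <= 90   = w + 32  -- 'A'..'Z'
  | 192 <= w && w <= 214 = w + 32  -- Latin-1 upper case below the multiplication sign (215)
  | 216 <= w && w <= 222 = w + 32  -- Latin-1 upper case above the multiplication sign
  | otherwise            = w

-- | Hypothetical name. Goes through an intermediate list of bytes, so no
--   ByteString (and hence no pinned buffer) is allocated along the way.
foldCaseShort :: SBS.ShortByteString -> SBS.ShortByteString
foldCaseShort = SBS.pack . map toLower8 . SBS.unpack
```

Going through a list is simple but not the fastest route; a real implementation would likely write into the unpinned byte array directly.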

@winterland1989
Author

winterland1989 commented Oct 21, 2016

OK, I'll send one. Please reopen to track this.

BTW, what's the purpose of this rewrite rule?

{-# RULES "foldCase/ByteString" foldCase = foldCaseBS #-}

@basvandijk basvandijk reopened this Oct 21, 2016
@basvandijk
Owner

For some reason that RULE made the benchmark faster.

@winterland1989
Author

winterland1989 commented Oct 24, 2016

What if we implemented CI using a type family? Then we could keep the original ByteString slice and do a more efficient copy into the FoldedCase ByteString. I think this is the best option, but it has some compatibility issues. What do you think?

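{-# LANGUAGE TypeFamilies #-}

import qualified Data.ByteString       as B
import qualified Data.ByteString.Lazy  as BL
import qualified Data.ByteString.Short as Short
import qualified Data.Text             as T
import qualified Data.Text.Lazy        as TL

-- Maps each string-like type to the representation of its case-folded form.
type family FoldedCase a where
    FoldedCase B.ByteString = Short.ShortByteString
    FoldedCase BL.ByteString = [Short.ShortByteString]
    FoldedCase T.Text = T.Text
    FoldedCase TL.Text = TL.Text

data CI s = CI { original   :: !s -- ^ Retrieve the original string-like value.
               , foldedCase :: !(FoldedCase s) -- ^ Retrieve the case folded string-like value.
                                  --   (Also see 'foldCase').
               }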

Another reason I propose this solution is that the documentation of ShortByteString says: "It is suitable for use as an internal representation for code that needs to keep many short strings in memory, but it should not be used as an interchange type."
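To make the type-family proposal a little more concrete, here is a minimal sketch restricted to the strict ByteString case. The constructor function `mkCI`, the comparison `equalCI`, and the helper `toLower8` are hypothetical names, and the folded byte ranges are an assumption. The sketch shows the original slice being kept as-is while the folded copy is stored as a ShortByteString.

```haskell
{-# LANGUAGE TypeFamilies #-}

import qualified Data.ByteString       as B
import qualified Data.ByteString.Short as SBS
import Data.Word (Word8)

-- Only the strict ByteString equation is reproduced in this sketch.
type family FoldedCase a where
  FoldedCase B.ByteString = SBS.ShortByteString

data CI s = CI
  { original   :: !s               -- the untouched input, possibly a slice
  , foldedCase :: !(FoldedCase s)  -- the case-folded copy
  }

-- Assumption: ASCII and Latin-1 upper-case bytes are folded, others kept.
toLower8 :: Word8 -> Word8
toLower8 w
  | 65 <= w && w <= 90   = w + 32
  | 192 <= w && w <= 214 = w + 32
  | 216 <= w && w <= 222 = w + 32
  | otherwise            = w

-- Hypothetical constructor: folds through a list so that the folded copy
-- is built directly as a ShortByteString, without a second ByteString.
mkCI :: B.ByteString -> CI B.ByteString
mkCI bs = CI bs (SBS.pack (map toLower8 (B.unpack bs)))

-- Hypothetical comparison: two values are equal when their folded forms are.
equalCI :: CI B.ByteString -> CI B.ByteString -> Bool
equalCI a b = foldedCase a == foldedCase b
```

The compatibility concern is visible in the types: `foldedCase` no longer returns the same type as `original`, so existing callers that expect a ByteString back would need changes.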

@winterland1989
Author

winterland1989 commented Oct 24, 2016

Another approach is to provide a Data.CaseInsensitive.ByteString module that exports a specialized CIByteString type using ShortByteString internally. So, counting the ShortByteString instance, we have three options here.
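A sketch of what such a specialized type might look like follows. Everything here is an assumption made for illustration: the field names, the `mkCIByteString` constructor, the choice to store both the original and the folded bytes as ShortByteString, and the folded byte ranges. Module header and export list are omitted.

```haskell
import qualified Data.ByteString       as B
import qualified Data.ByteString.Short as SBS
import Data.Function (on)
import Data.Word (Word8)

-- Hypothetical specialized type: both copies live in unpinned memory.
data CIByteString = CIByteString
  { ciOriginal :: !SBS.ShortByteString  -- bytes as supplied by the caller
  , ciFolded   :: !SBS.ShortByteString  -- case-folded bytes used for comparison
  }

-- Equality and ordering look only at the folded bytes.
instance Eq CIByteString where
  (==) = (==) `on` ciFolded

instance Ord CIByteString where
  compare = compare `on` ciFolded

-- Assumption: ASCII and Latin-1 upper-case bytes are folded, others kept.
toLower8 :: Word8 -> Word8
toLower8 w
  | 65 <= w && w <= 90   = w + 32
  | 192 <= w && w <= 214 = w + 32
  | 216 <= w && w <= 222 = w + 32
  | otherwise            = w

-- Hypothetical constructor: copies the input once into a ShortByteString,
-- then derives the folded copy from that.
mkCIByteString :: B.ByteString -> CIByteString
mkCIByteString bs =
  let o = SBS.toShort bs
  in  CIByteString o (SBS.pack (map toLower8 (SBS.unpack o)))
```

Compared with the type-family variant, this leaves the existing CI type untouched, at the cost of a second, parallel API.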

ulidtko added a commit to ulidtko/haskell-case-insensitive that referenced this issue May 13, 2024
@ulidtko ulidtko linked a pull request May 13, 2024 that will close this issue