-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP Disambiguation production #63
base: master
Are you sure you want to change the base?
Conversation
If this would be desirable even for just nonterminals, then I'd be happy to neaten things up from the state it's in at the moment. |
Unhandled in parse at the moment
Ok, I neatened it up a little and limited disambiguation to nonterminals. Motivating example parsing an expression without precedence
I'm sure there's still some neatening + testing to do! |
As a further simplification to the implementation, I wonder if we could just make |
Ah, instead of making
I don't think this should be a performance disadvantage compared to running in |
This is a neat idea. I had a little think about the API and implementation: If we had expose :: Prod r e t a -> Prod r e t [a] and conceal :: Prod r e t [a] -> Prod r e t a (but perhaps with better chosen names), then it seems like we should be able to implement disambiguate :: ([a] -> [b]) -> Prod r e t a -> Prod r e t b
disambiguate f = conceal . fmap f . expose without it needing to be in grammar. This also seems to solve #62, because we can implement constraint :: (a -> Bool) -> Prod r e t a -> Prod r e t a
constraint = disambiguate . filter This is possible if we tweak the data Prod r e t a where
-- Applicative.
Terminal :: !(t -> Maybe a) -> !(Prod r e t (a -> [b])) -> Prod r e t b
NonTerminal :: !(r e t a) -> !(Prod r e t (a -> [b])) -> Prod r e t b
Pure :: [a] -> Prod r e t a
-- Monoid/Alternative. We have to special-case 'many' (though it can be done
-- with rules) to be able to satisfy the Alternative interface.
Alts :: ![Prod r e t a] -> !(Prod r e t (a -> [b])) -> Prod r e t b
Many :: !(Prod r e t a) -> !(Prod r e t ([a] -> [b])) -> Prod r e t b
-- Error reporting.
Named :: !(Prod r e t a) -> e -> Prod r e t a because then we get expose :: Prod r e t a -> Prod r e t [a]
expose (Terminal b p) = Terminal b (fmap (fmap pure) p)
expose (NonTerminal r p) = NonTerminal r (fmap (fmap pure) p)
expose (Pure x) = Pure [x]
expose (Alts as p) = Alts as (fmap (fmap pure) p)
expose (Many p q) = Many p (fmap (fmap pure) q)
expose (Named p n) = Named (expose p) n
conceal :: Prod r e t [a] -> Prod r e t a
conceal (Terminal b p) = Terminal b $ fmap (fmap concat) p
conceal (NonTerminal r p) = NonTerminal r $ fmap (fmap concat) p
conceal (Pure as) = Pure $ concat as
conceal (Alts as p) = Alts as $ fmap (fmap concat) p
conceal (Many p q) = Many p $ fmap (fmap concat) q
conceal (Named p n) = Named (conceal p) n but I have not tried updating the other modules (parser etc). Maybe it breaks down somewhere. |
I like It makes me worry a little bit from a performance point of view to be passing lists around everywhere in Just as an aside, @ollef, did you ever experiment with using |
mmm, I think that keeping this to nonterminals might be the pragmatic thing to do, with your scheme above it doesn't seem possible to push Attaching to nonterminals only seems important wrt |
I haven't thought it through, but I wonder if it would be possible to push disambiguations all the way down to |
hmm, I've got a testcase where: This works, and xx <-
disambiguate d $
(Foo <$ anyToken)
<|> (Bar <$ anyToken) This doesn't work, xx <-
disambiguate d $
(Foo <$ many anyToken)
<|> (Bar <$ anyToken) Sadly I've long since flushed the context needed to easily understand what's going on :( ... Oh, I still see several todos in the code, I guess I'm just hitting one of these cases past-me foresaw. ... On a hunch I removed the first case in |
Definitely WIP at the moment,
The thrust of it is that the continuations are explicitly
[a] -> [b]
now, which for most productions is just a map, however for disambiguation, this structure is exposed.Currently only nonterminals work, as, for example, the
Pure
case runs the continuation once for every individual element. What needs to change would be to maintain a list ofa
, and defer applyingargs
. I had a quick look at jiggling things around towards that end without luck.