Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prettyprinter and tests #67

Open
wants to merge 20 commits into
base: main
Choose a base branch
from
Open

Prettyprinter and tests #67

wants to merge 20 commits into from

Conversation

inariksit
Copy link
Contributor

@inariksit inariksit commented Sep 4, 2024

WIP, the Printer module is generated automatically from a throwaway BNFC grammar, and I just kept manually updating it as Lam4 CSTs evolved. I am still totally prepared to throw Printer.hs away.

The same PR also contains testing infra. I have made the following changes:

  • Move the file reading/parsing code from lam4-cli back into lam4-backend, as inspired by Preparations for proper tests. simala#13 . This so that it's possible to call those functions from tests and not have them in two different places.
  • Read all files in examples, parse them and prettyprint them, finally comparing the prettyprinted result against a golden file. Indirectly, this test also works as a test for the parser, because if the parser fails, the test will print the error message in the .actual file.

@inariksit inariksit marked this pull request as draft September 4, 2024 20:57
Comment on lines 174 to 176
-- Name is actually part of the Predicate.
-- doing this quick hack to move it to where it belongs.
Rec name (Predicate vars expr ref) -> prPrec i 0 (concatD [prt 0 (Predicate (name:vars) expr ref)])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ym-han I wonder about the reason to structure the AST like this. Here's an example of a tree with a Predicate:

Rec
    "`some predicate`"
    _4
    (Predicate
       []
       (BinExpr
          Gt
          (FunApp (Var "sumIntegers" _1) [ Lit (IntLit 1) , Lit (IntLit 2) ])
          (Lit (IntLit 2)))
       Nothing)

which comes from this Lam4 expression:

DECIDE `some predicate`
IF sumIntegers (1, 2) > 2

The string "`some predicate`" corresponds to the variable name on line 176, and the empty list is the vars. If I rendered name at the time of handling Rec, it would become like this instead:

`some predicate` DECIDE 
IF sumIntegers (1, 2) > 2

I can of course solve it by moving the name into the list of vars, and in rendering Predicate I know to treat the first element of the list specially. But changing the Haskell datatype sounds like a safer solution 😁

Copy link
Contributor

@ym-han ym-han Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of things here (and let's talk more about this after the meeting today if time permits):

  1. Re Decl: I just remembered that Simala also has a Render / Pretty instance that might be helpful: https://github.com/smucclaw/simala/blob/746a6e0c98923374e55d7590fe30d93b1c9a3bef/src/Simala/Expr/Render.hs#L33
    I would adopt a strategy like the one they take, at least for now, since the intent behind my design of the Decl vs Expr is pretty similar to theirs. In particular, the name in the Decl is not in some sense unique to Predicates. A Decl is basically just a Binding, except that it'd typically be a binding for a declaration (and in the case of Lam4, if I recall correctly, a top-level declaration, though this might change in the future).
    Sorry, I should have linked you to their Render code earlier.

  2. The Predicate example, though, does remind me that I probably have to augment the concrete syntax in the Haskell backend with more type signature / annotation constructs (and similarly with constructs like the FunDecl), if we really want to be able to support 'feature-complete' prettyprinting. The concrete syntax is a bit too 'abstract' in that sense (my original intent had just been to get something that would be easy to desugar even more and evaluate; I hadn't been aiming for bidirectionality).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I see, I wasn't aware of the Simala AST. That makes total sense, and as I said, I can get the things rendered correctly by just passing some arguments to lower levels. I will in that case just make that more intentional.
  2. Yes, I also noticed that. This example for instance
// placeholder function for illustration
Integer => Integer
sumIntegers(x, y) = x + y

GIVEN ()
DECIDE `some predicate`
IF sumIntegers(1, 2) > 2

becomes this after parsing and prettyprinting:

sumIntegers (x, y) = x + y

GIVEN ()
DECIDE `some predicate`
IF sumIntegers (1, 2) > 2

Copy link
Contributor

@ym-han ym-han Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I see, I wasn't aware of the Simala AST. That makes total sense, and as I said, I can get the things rendered correctly by just passing some arguments to lower levels. I will in that case just make that more intentional.
  2. Yes, I also noticed that. This example for instance
// placeholder function for illustration
Integer => Integer
sumIntegers(x, y) = x + y

GIVEN ()
DECIDE `some predicate`
IF sumIntegers(1, 2) > 2

becomes this after parsing and prettyprinting:

sumIntegers (x, y) = x + y

GIVEN ()
DECIDE `some predicate`
IF sumIntegers (1, 2) > 2

I could also write a transformation to a 'more concrete' version of the concrete syntax. I just need to think more about this / play around with this a bit / look at your code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order for the natural language rendering, keeping the Lam4 annotations seem to be the most important thing missing right now. Still working under the assumption that the desired input vs. output is something like this:

  • Lam4 code
    STRUCTURE Lottery {
      /- description: "how much can be won from the jackpot." -/
     total_jackpot: Integer
      /- description: "whether buying tickets from this lottery is tax deductible." -/
    `tax deductible status`: Boolean
    }
  • NL rendering:
    A Lottery is a type of game. Each Lottery has associated with it information like
       * its ‘total jackpot’; i.e. how much can be won from the jackpot
       * its ‘tax deductible status’; i.e. whether buying tickets from this lottery is tax deductible
    

Copy link
Contributor

@ym-han ym-han Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into this! Probably the most important annotations / metadata to preserve would be the descriptions, if I'm understanding the point of this example correctly. (The Integer and Boolean type annotations in this example should already be preserved.)

Also, re what / how to render: you can think of that NL rendering I suggested as a 'default' template. But we should give users the ability to control things like whether it gets rendered at all (this would be the Simala-style Transparency), and if so, what 'template' to use for the rendering / how to render it (the latter is obviously going to require more work though, heh). I imagine that for certain audiences (e.g. busy executives), one might not even want to render most, perhaps even all, of the records and data types

Copy link
Contributor

@ym-han ym-han Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just added a commit to PR #69 -- 93e9ec7 --- that partially addresses this (and tweaks the convention re metadata in the Langium grammar).

The new record decl metadata convention (this has also changed for function and predicate decls):

/- description: "Info about a lottery." -/
STRUCTURE Lottery {
  -- how much can be won from the jackpot.
  total_jackpot: Integer
  -- whether buying tickets from this lottery is tax deductible.  
  `tax deductible status`: Boolean
}

The backend parsing for the /- description: "Info about a lottery." -/ doesn't work yet --- I want to refactor that part of the Langium grammar first. But the inline descriptions do work (though maybe I should move the description up above the rest in the RowTypeDecl...):

[ TypeDecl
    "Lottery"
    _1
    (RecordDecl
       [ MkRowTypeDecl
           { name = "total_jackpot" _3
           , typeAnnot = BuiltinType BuiltinTypeInteger
           , metadata = Just "-- how much can be won from the jackpot."
           }
       , MkRowTypeDecl
           { name = "`tax deductible status`" _4
           , typeAnnot = BuiltinType BuiltinTypeBoolean
           , metadata =
               Just
                 "-- whether buying tickets from this lottery is tax deductible.  "
           }
       ]
       []
       RecordDeclMetadata
         { transparency = Transparent , description = Nothing })
]

Hopefully this is helpful, and thank you @inariksit for pointing this out!

Copy link
Contributor Author

@inariksit inariksit Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! I just changed the printer based on the new ConcreteSyntax.hs in add-eval branch. As of 93e9ec7 this

TypeDecl
    (MkName "Lottery" 1)
    (RecordDecl
       [ MkRowTypeDecl
           { name = MkName "total_jackpot" 2
           , typeAnnot = BuiltinType BuiltinTypeInteger
           , metadata = MkRowMetadata (Just "how much can be won from the jackpot")
           }
       , MkRowTypeDecl
           { name = MkName "`tax deductible status`" 4
           , typeAnnot = BuiltinType BuiltinTypeBoolean
           , metadata = MkRowMetadata (Just "whether buying tickets from this lottery is tax deductible")
           }
       ]
       []
       RecordDeclMetadata
         { transparency = Transparent , description = Just "game where you lose money" })

is prettyprinted into this:

/-
  description:  game where you lose money
-/
STRUCTURE Lottery
{
  /-
    description:  how much can be won from the jackpot
  -/
  total_jackpot : Integer
  
  /-
    description:  whether buying tickets from this lottery is tax deductible
  -/
  `tax deductible status` : Boolean
}

Copy link
Contributor

@ym-han ym-han Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah there is one little complication: the /- ... -/ is no longer permitted for the fields in the Langium grammar. I.e., it should be

/-
  description:  game where you lose money
-/
STRUCTURE Lottery
{
   -- how much can be won from the jackpot
  total_jackpot : Integer
  
   -- whether buying tickets from this lottery is tax deductible
  `tax deductible status` : Boolean
}

But if this new convention doesn't seem like a good idea, please do let me know!

Comment on lines 139 to 143
-- Function arguments are separated by newlines
newtype FunArg = FunArg {unFunArg :: String}

mkFunArg :: T.Text -> FunArg
mkFunArg t = FunArg [I.i|#{t} : TypeMissing|]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this, because I couldn't find the information about the argument types in the CST. Here's an example, when printing the file examples/infix_pred_app.l4.

  • Original:

    GIVEN (some_person: Person 
           another_person: Person)
    DECIDE some_person `has helped` another_person
    IF True
    
  • CST:

    Rec
    "has helped"
    _2
    (Predicate
       RuleMetadata
         { originalRuleRef = Nothing
         , transparency = Transparent
         , description = Nothing
         }
       [ "some_person" _3 , "another_person" _4 ]
       (Lit (BoolLit True)))
    

And this is how it's prettyprinted currently.

GIVEN (some_person : TypeMissing
       another_person : TypeMissing)
DECIDE `has helped`
IF True

(The CST is missing also the information about how the arguments are used in the body, but I will talk about that in another comment.)

, prt 0 (fmap (\a -> mkFunArg a.name) args) -- custom newtype to print args in a specific way
, doc (showString ")")
, doc (showString "\nDECIDE")
, prt 0 funname
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another example from the file as in the comment above: examples/infix_pred_app.l4.

  • Original:
    GIVEN (person: Person)
    DECIDE `is fun`
    IF True
    
  • CST:
    Rec
      "is fun"
      _8
      (Predicate
        RuleMetadata
          { originalRuleRef = Nothing
          , transparency = Transparent
          , description = Nothing
          }
        [ "person" _9 ]
        (Lit (BoolLit True)))
    

With the current line 300, this is prettyprinted as follows:

GIVEN (person : TypeMissing)
DECIDE `is fun`
IF True

so it happens to work in this case, except for the TypeMissing part. But for the other cases where the predicate is actually applied to its argument, it's not working.

(pinging @ym-han for this and the comment on lines 138-143)

@inariksit inariksit force-pushed the prettyprinter branch 2 times, most recently from 00f00f1 to d36eb3e Compare September 23, 2024 11:06
@inariksit inariksit changed the title Prettyprinter (eventually) Prettyprinter and tests Sep 23, 2024
@inariksit inariksit marked this pull request as ready for review September 23, 2024 11:12
@inariksit inariksit marked this pull request as draft September 23, 2024 12:25
@inariksit inariksit marked this pull request as ready for review September 23, 2024 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants