Native string interpolation syntax #570
Do you have any idea how your new syntax can be made extensible so it allows formatting, ala python "f-string", or https://hackage.haskell.org/package/PyF? |
That's a big bike shed to paint, I expect :-) I'm wondering if, these days, a feature like this could not begin as a library (likely using a GHC plugin). It'd be as opt-in as a language extension, a bit less convenient to obtain (but not too bad in a cabal project), but allow quicker iteration. When it has become reasonably stable and reasonably popular, turning it into a native feature (for even easier access and better error messages, I presume) can then be discussed. @JakobBruenker's plugin for banged monadic sub expressions is an example for that path.
Worth noting that syntax plugins, with what GHC offers at the moment, are a bit annoying if the GHC parser doesn't already accept what you want your syntax to be. (You'd need a pre-processor with The |
To me it sounds like the principles that lead people to avoid Template Haskell are fixable. I think the main complaint against template Haskell is that it can run arbitrary code at compile time. But that can be addressed by adding a language extension like Or are there more reasons to avoid quasi quotes? |
More reasons people avoid Template Haskell: slow compilation, bad recompilation, and bad cross-compile support. The first two are potentially fixable, but at least the last one would involve redesigning how TH works. I think PureQuasiQuote is a step in the right direction, but I don't know if it would solve those problems. Plus, as I mention in the proposal, it wouldn't allow reusing features like multiline string support, so you'd have to reimplement the multiline indentation algorithm (not a big deal, but still).
My feeling is that
|
@nomeata yeah, that's fair (which also relates to @ChickenProp's point). I might prototype a GHC plugin for this at some point |
I’d also like to advertise the idea of giving quasiquotes better multi-line syntax in #569 (comment) |
My concern is that if you interpolate in some |
@re-xyr yes, you're right. With that performance issue and @endgame's comments about I'll be prototyping this approach here — hopefully I'll be able to make progress on this in the near future |
@nomeata in this case, I think I'll need to use a preprocessor, so that I can prototype #569 at the same time, which isn't valid syntax currently. One thing I just realized: it doesn't seem like GHC supports multiple |
@brandonchinn178 I haven't tried it, but I think you can by writing a script that runs both pre-processors and use that as pre-processor. But admittedly that's annoying (and only works if the pre-processors are compatible with each other.) |
Yes, preprocessors are probably just good enough for demo prototypes, but not for production quality (like GHC plugin based approaches might be).
Apart from the already mentioned problems, most string interpolation libraries depend on Another (minor) problem with quasiquoters is that they don't have nice syntax highlighting and I don't see how that could be realistically implemented for 3rd party libraries. |
FYI I have a working prototype at https://github.com/brandonchinn178/string-syntax + I've updated the proposal |
"printf is partial and unsafe, which especially safety-conscious people might always stay away from anyway." You could patch "printf" to use the default formatting when the types don't match instead of crashing. E.g. Or #387 |
@yy0zz Other than you personally mentioning that it's ugly, I would also like to mention that even with that merged proposal it's still not as convenient as what @brandonchinn178's proposal achieves. Speaking of, how come there's no activity? It would be really nice if we get this into Haskell... |
far, far too saccharine for me (ie too much special syntactic sugar) -- whatever the semantics of this special s""" ... """ syntax should be, a normal variadic printf-like function (some_fn """ ... """) that takes a (multiline) string argument seems to be able to implement just as well. |
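For comparison, a minimal sketch of the printf-style alternative described in this comment, using `Text.Printf` from base. The function name `greeting` and the format string are made up for illustration. Note that a mismatch between the format string and the arguments is only detected at runtime, which is the partiality concern raised earlier in the thread.

```haskell
import Text.Printf (printf)

-- Hypothetical example: a variadic printf call that returns a String.
-- The format string is an ordinary string argument, so no new syntax is needed,
-- but the argument types are not checked against it at compile time.
greeting :: String -> Int -> String
greeting name age = printf "Name: %s, age: %d" name age
```

In GHCi, `putStrLn (greeting "Alice" 30)` prints `Name: Alice, age: 30`.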
(as an aside, from looking at/using various type safe printf libraries in Haskell, I'm not convinced most of them offer much more of a benefit compared to using |
Then here is design Explicit 3, featuring an
Explicit 3 / Implicit 2
https://play.haskell.org/saved/iHkkM7g3
class Buildable s => Interpolate a s where
  interpolate :: a -> Builder s
instance Interpolate String String where
  interpolate = toBuilder
instance Interpolate (Endo String) String where
  interpolate = id
instance Interpolate T.Text T.Text where
  interpolate = toBuilder
instance Interpolate T.Builder T.Text where
  interpolate = id
showS :: (Show a) => a -> Builder String
showS = Endo . shows
showT :: (Show a) => a -> Builder T.Text
showT = fromString . show
main :: IO ()
-- main = do
--   putStrLn s"a ${"b"} c ${showS d}"
--   T.putStrLn s"a ${"b"} c ${showT d}"
main = do
  putStrLn $ fromBuilder (toBuilder (fromString "a ") <> interpolate (fromString "b" :: String) <> toBuilder (fromString " c ") <> interpolate (showS (42 :: Int)))
  T.putStrLn $ fromBuilder (toBuilder (fromString "a ") <> interpolate (fromString "b" :: T.Text) <> toBuilder (fromString " c ") <> interpolate (showT (42 :: Int)))
Note that I can do the same for
This is to demonstrate that the Implicit design from the OP is almost this design; the only difference is whether or not we want the overlappable
That's obviously not what I want to get from Let's visualise control flow of interpolated value. Current proposal:
Explicit 2 with explicit interpolate call:
As you can see, there is an additional step "Int becomes s", where the original type information is lost because |
@sgraf812 type inference works in explicit 3 because of injectivity annotation on Builder, |
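To illustrate the point about injectivity, here is a small sketch, not the code from the playground link. It assumes a simplified `Buildable` class with a top-level injective type family `Builder`; the `String` instance and the name `inferredLength` are invented for the example. Because the builder type determines the string type, GHC can work out the result type of `fromBuilder` from its argument alone.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeFamilyDependencies #-}

import Data.Monoid (Endo (..))

-- Injective family (assumed shape): the builder type r determines the string type s.
type family Builder s = r | r -> s

type instance Builder String = Endo String

class Buildable s where
  toBuilder :: s -> Builder s
  fromBuilder :: Builder s -> s

instance Buildable String where
  toBuilder s = Endo (s ++)
  fromBuilder b = appEndo b ""

-- `length` does not fix the container type, so the result type of `fromBuilder`
-- must be inferred from its argument. The argument is an `Endo String`, and the
-- injectivity annotation lets GHC conclude that the result is a String.
inferredLength :: Int
inferredLength = length (fromBuilder (toBuilder "abc" <> Endo (shows (42 :: Int))))
```

Without the `| r -> s` annotation, the call inside `length` would be ambiguous, since several string types could in principle share a builder type.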
@s-and-witch I elaborated proposal Explicit 3 and made the use of I wonder if @michaelpj likes Explicit 3 better than Explicit 1. It certainly has better inference behavior than Explicit 2 while |
Personally I would consider it also an "implicit" version, it's just more measured about its conversions. I certainly like Explicit 3 (or should we call it Implicit 2 ;) ) more than Implicit 1. I think that is mostly due to the subtext about the use pattern. As I understand you, you intend that Explicit 3 would be used as follows:
Whereas I understood the use pattern of the original proposal to be:
For me, this doesn't answer the question of "are implicit conversions desirable here?", but I do think it's a better design if we want the implicit conversions. Incidentally, I suppose we could just rename your
I don't think this is a good example because it already has a generic mechanism for supporting multiple targets. That's just smuggling in the N*M work already. A better example is just something like
So the key advantage of Explicit 2 here is that yes, we already have lots of functions for converting between lots of string types in lots of ways. Probably way more than N*M. But! They already exist! You can just use them! You don't also define
Okay, then it seems like you have something different in mind than the |
Actually, N plus M done on the side of
This is not the responsibility of Oh, wait, just realised the source of our disagreement. I'm assuming that nobody should define Interpolate instances like this:
That is indeed a poor interface for a library that wants to format values. Instead, I think we can enforce this in the interface; see Implicit 3 at the bottom of my post.
I think the best way to show what I want is to define this particular example: https://play.haskell.org/saved/FmD1nNSW Design highlight:
instance Interpolate Int SqlQuery where
  interpolate i = Endo (addSqlArgument (SqlInt i))
Implicit 3
The inference works fine because
class Interpolate s where
  type InterpolationConstraint s :: Type -> Constraint
  interpolate :: InterpolationConstraint s a => a -> Builder s
instance Interpolate String where
  type InterpolationConstraint String = Show
  interpolate = Endo . shows
-- Demo of the explicit interpolation
instance Interpolate Text where
  type InterpolationConstraint Text = (~) Text
  interpolate = toBuilder
Pros:
|
Yes. At the heart of Implicit 2 is that people are free to choose not to import "implicit conversions". Personally, I find Implicit 2 and an orphan module a bit too inconvenient; I would rather have the orphan instance provided by default. But I do have an alternative proposal: How about we build on the
I tried that here: https://play.haskell.org/saved/gpdiCYwI The key is:
usingShow :: forall r. ((forall a. Show a => Interpolate a String) => r) -> r
usingShow f = withDict @(forall a. Show a => Interpolate a String) @(forall a. Show a => a -> Builder String) showS f
Alas, GHC is unable to figure out the quantified
Perhaps the low-tech solution of adapter orphan instances (for
I'm not a huge fan, I must say, exactly because of
which is pretty strange. The additional constraint IMO serves no real use other than to complicate matters. |
It actually improves inference, especially in the (~) case. That said, I don't insist on any specific interpolation; I'm fine with any of Implicit 1, 2, or 3, but I don't like any of the proposed explicit ones. BTW, here is the most powerful explicit one:
Explicit 4
https://play.haskell.org/saved/tn6Rlg6x
-- s"Name: ${name}, age: ${age}"
f_mono name age = fromBuilder $
  toBuilder "Name: " <> name <> toBuilder ", age: " <> age
f_poly name age = fromBuilder $
  toBuilder (fromString "Name: ") <> name <> toBuilder (fromString ", age: ") <> age
Pros:
P.S. I really hope that we all agree on Implicit 1/2 and close the topic. |
(I've edited Explicit 2 to add a simpler desugaring that I think is quite nice, and makes use of a generic function for concatenating with builders. There shouldn't be more overhead since the lists are short and
I see. You really want to do things differently. I guess my feeling about this is: this is going beyond the usage pattern of string interpolation that I thought this proposal was targeting. This is not something you could write today as a monoidal concatenation of literals and values, e.g.:
It seems to me that this says that the only way to interpolate into a (In the example you can see this, it prints:
I'm missing something, how is this different from Explicit 2? You've put the functional dependency back and not added the local function, but I think those accomplish more or less the same thing?
I think we have viable Implicit and Explicit designs, what we need is a way to decide what kind of design we want. I wonder if we should do a community poll or something? |
I don't get it. My types fit the Implicit 1/2 API, the only thing that we have to do to support it is not to change things.
Yes, but not only. With Explicit 2 I sometimes have to define interpolation for the intermediate builder, and sometimes for the final representation, and often define both because I can't have interpolation value either builder or the final one.
I agree, but it represents the idea that we delegate formatting to another class. And another option is to use
Oh, well, the main difference is desugaring.
Explicit 2 (old version):
f_mono name age = fromBuilder $
  toBuilder "Name: " <> toBuilder name <> toBuilder ", age: " <> toBuilder age
Explicit 4:
f_mono name age = fromBuilder $
  toBuilder "Name: " <> name <> toBuilder ", age: " <> age
fwiw, I agree with @michaelpj that implicit string conversion is badbadnotgood. I love how people discount the “yeah you might interpolate something in a form you didn’t want to show this way”, which is about 99% of the issues I have in production with interpolation. The interpolation types should be forced to the type of the whole string, forcing the user to interpolate only things that are of that type. Everything else is madness imo (not even talking about the type ambiguity problems that are gonna result, with the bad error messages that go with it …). I guess having an |
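As a rough sketch of the stricter rule argued for in this comment, the hand-written desugaring below treats every interpolated expression as already having the type of the whole string. The surface syntax in the comment and the name `greeting` are hypothetical; this is only an illustration of the idea, not a proposed implementation.

```haskell
-- Hypothetical desugaring of  s"Name: ${name}, age: ${show age}"
-- under a strict rule: each interpolated expression must already be a String,
-- so the conversion of `age` has to be written out by the user.
greeting :: String -> Int -> String
greeting name age = mconcat ["Name: ", name, ", age: ", show age]
```

Writing `${age}` directly would then be a type error rather than an implicit call to some conversion class.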
Apologies for the radio silence; I've been working on a refactor of String parsing, which will make it easier for me to prototype implementing string interpolation, which I haven't gotten to yet. I've also yet to internalize the last stream of comments around the various Implicit/Explicit proposals. Some quick random musings:
For transparency, here's my rough TODO list: <EDIT: Moved into PR body> |
Personally I'm unsure. I would probably use it, just explicitly. I would mostly just be sad. But it's not really my opinion that matters, I think we should ask the community. I mostly wanted to make the case that at least some people would prefer an |
Agreed. In my experience, the amount of implicit magic that is tolerable in a large codebase is much much smaller than people initially think. The low takeup of |
I still don't see how this is any more implicit magic than IMO string interpolation would be pretty unergonomic without "implicit" conversions, to the point where I don't see much advantage over using I also don't know any other language where you can only interpolate strings and I've never had any problems with string interpolation in other languages. |
I appreciate @Profpatsch's case for a more explicit design. If I understand him correctly, the Explicit 3/Implicit 2 design in #570 (comment) would be a viable compromise, because it advocates for explicit calls to First, do note that the proposal can only prescribe instances for In that, the overlapping A community poll might be worthwhile, however IMO it should be a binary choice or otherwise use some kind of ranked voting mechanism. I imagine it will be challenging to pre-select designs to vote on and properly characterise their pros and cons! |
Is there precedent for this? What I imagine happening if we do this, is that some library authors import that module, exposing the instance from some of their modules; and then app authors find the instance is available in some but not others of their modules. |
Yeah, |
I would consider a library that does this (i.e., importing the orphan instances) simply a bad library. The |
On orphan instances and their acceptability: I don't think I think that is the key: every less-controversial use of orphans is predictable in some way: (Personally, I am -1 on orphans unless there is no other option, but I wanted to lay out the wider context as I have seen it.) |
all of that is true here as well though? |
Fwiw I do expect libraries to use string interpolation, e.g. for error messages. |
On the other hand, I expect a lot of people to use string interpolation, and there is no chance of then not hitting the overlapping instance if it's in scope. (IMO orphan instances are just not a good way of doing this kind of configurability, since what you really want is something like a module-level switch.) |
Taking this in another direction, C++'s std::format at some point switched to a compile-time format string by default.
data Uninterpolated (a :: Symbol) = Uninterpolated
-- "an ${example}" = "${Uninterpolated @"an "}${example}"
This would allow extra constraints on what can go in beyond interpolated values - you could even rule them out entirely.
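A minimal sketch of how such a type-level literal piece could be rendered, assuming a made-up class `Piece` with a method `renderPiece` (neither is part of the proposal). The literal text lives in the type as a `Symbol` and is recovered with `symbolVal`, so an instance is free to inspect or reject it.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- The literal part of an interpolated string, carried at the type level.
data Uninterpolated (a :: Symbol) = Uninterpolated

-- Hypothetical class: anything that may appear as a piece of the string.
class Piece p where
  renderPiece :: p -> String

-- Literal pieces are rendered by reading the Symbol back as a String.
instance KnownSymbol a => Piece (Uninterpolated a) where
  renderPiece _ = symbolVal (Proxy :: Proxy a)

instance Piece Int where
  renderPiece = show

-- Hand-written equivalent of the hypothetical  "an ${42}"
example :: String
example = renderPiece (Uninterpolated :: Uninterpolated "an ") <> renderPiece (42 :: Int)
```

A target type that wants to forbid literal text entirely could simply omit the `Uninterpolated` instance.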
Fair enough, then it is perhaps less controversial to not provide these orphan instances with Implicit 2. Users are free to define them nonetheless. Perhaps in the future, a scoped library solution like
usingShow s"Hi ${name}! You're ${age} years old."
Needs someone to make it happen, but IMO it's a good trade-off.
I don't see what use case an interpolatable Do note that the C++ motivation is compile-time checking of format specifiers (e.g., no |
#570 (comment) I still think ultimately it would be better if one class received the whole input at once (e.g. to do a single allocation or restrict combinations of arguments). |
@sgraf812 For
newtype UsingShow s = UsingShow { usingShow :: s }
instance (Show a, Interpolate String s) => Interpolate a (UsingShow s) where ...
I could easily be missing something, but it feels like it solves the same problem as you're trying to, but with less magic. But I think a problem with both of our versions is that everything now uses
Hm, I previously suggested a
|
I want to throw my hat into the ring in terms of opinions. To me, string interpolation is a quick and easy way to embed string and non-string values into a stringy value
This means that explicit proposals are somewhat defeating the purpose here; I can see the strong arguments for them, but if I have to do
What I am more comfortable with is some kind of extensible string interpolator, like the
Something specific I want to bring to the table is to introduce a
Potential Display definition:
class Display a where
  display :: (IsString s, Monoid s) => a -> s
newtype DisplayViaShow a = MkDVS a
instance Show a => Display (DisplayViaShow a) where
  display (MkDVS a) = fromString (show a)
Another option I hadn't considered before now is a mix and match approach for the interpolation itself. For example, in
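To make the `Display` idea concrete, here is a small self-contained sketch of how a hand-desugared interpolation might use it. The `Name` type, its instance, and the `greet` function are invented for illustration. The class and the `DisplayViaShow` wrapper follow the definition in this comment, with `display` applying `show` to the wrapped value.

```haskell
import Data.String (IsString (..))

class Display a where
  display :: (IsString s, Monoid s) => a -> s

-- Wrapper that opts a type into Display through its Show instance.
newtype DisplayViaShow a = MkDVS a

instance Show a => Display (DisplayViaShow a) where
  display (MkDVS a) = fromString (show a)

-- Hypothetical user type with a hand-written, quote-free rendering.
newtype Name = Name String

instance Display Name where
  display (Name n) = fromString n

-- Hand-written equivalent of the hypothetical  s"Hi ${name}, you are ${age}"
greet :: Name -> Int -> String
greet name age =
  fromString "Hi " <> display name <> fromString ", you are " <> display (MkDVS age)
```

In GHCi, `greet (Name "Alice") 30` evaluates to `"Hi Alice, you are 30"`; the `Show`-based rendering is only used where the caller asks for it via `MkDVS`.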
Related to #569, but is an orthogonal feature that could be added or not added independently of it. Additionally, I expect this to be more controversial than multiline string support. Regardless, there hasn't been a proposal for this yet, so this will at least get discussion going on an official channel.
Related discussions:
Rendered
Updates
Rough timeline, copied from a comment I made below:
wip/interpolated-strings