Native string interpolation syntax #570

Open
wants to merge 4 commits into
base: master
from

brandonchinn178
Contributor

@brandonchinn178 brandonchinn178 commented Jan 11, 2023

Related to #569, but is an orthogonal feature that could be added or not added independently of it. Additionally, I expect this to be more controversial than multiline string support. Regardless, there hasn't been a proposal for this yet, so this will at least get discussion going on an official channel.

Related discussions:

Rendered

Updates

Rough timeline, copied from a comment I made below:

  1. Finish string parsing refactor
  2. Prototype string interpolation in GHC, experiment with the different options live
    • Branch: wip/interpolated-strings
    • 2024-09-20: Started work
    • 2024-12-24: Got single-line interpolated strings working end-to-end
  3. Compile the different options into a doc
  4. Do a community poll on the doc
  5. Update proposal with the winner of the poll + document other options under Alternatives

@guibou
Contributor

guibou commented Jan 11, 2023

Do you have any idea how your new syntax could be made extensible so that it allows formatting, à la Python's "f-string" or https://hackage.haskell.org/package/PyF?

@nomeata
Contributor

nomeata commented Jan 11, 2023

That's a big bike shed to paint, I expect :-)

I'm wondering if, these days, a feature like this could not begin as a library (likely using a GHC plugin). It'd be as opt-in as a language extension, a bit less convenient to obtain (but not too bad in a cabal project), but allow quicker iteration. When it has become reasonably stable and reasonably popular, turning it into a native feature (for even easier access and better error messages, I presume) can then be discussed.

@JakobBruenker's plugin for banged monadic sub expressions is an example for that path.

@JakobBruenker
Contributor

JakobBruenker commented Jan 11, 2023

Worth noting that syntax plugins, with what GHC offers at the moment, are a bit annoying if the GHC parser doesn't already accept what you want your syntax to be. (You'd need a pre-processor with -pgmF, but that doesn't easily handle Haskell sub-expressions...)

The s"..." syntax in particular though I don't think should be affected by that, since it's already valid syntax. (and my plugin wasn't affected by that, either)

@noughtmare
Contributor

noughtmare commented Jan 11, 2023

To me it sounds like the principles that lead people to avoid Template Haskell are fixable. I think the main complaint against template Haskell is that it can run arbitrary code at compile time. But that can be addressed by adding a language extension like PureQuasiQuotes that only allows quasiquotes that only require the Quote typeclass and not the full Quasi type class (which brings in MonadIO). If you really want to be sure no malicious code gets executed you can combine it with a sandbox/vm and Safe Haskell (for now).

Or are there more reasons to avoid quasi quotes?

@brandonchinn178
Contributor Author

More reasons people avoid Template Haskell: slow compilation, bad recompilation, and bad cross-compile support. The first two are potentially fixable, but at least the last one would involve redesigning how TH works.

I think PureQuasiQuotes is a step in the right direction, but I don't know if it would solve those problems. Plus, as I mention in the proposal, it wouldn't allow reusing features like multiline string support, so you'd have to reimplement the multiline indentation algorithm (not a big deal, but still).

@ChickenProp

My feeling is that

  1. In general I think doing interpolation through template haskell seems fine. It doesn't work for everyone, but are there enough people it doesn't work for, to justify reifying this as a language feature?
  2. This particular approach seems costly. The vast majority of packages are going to "need" to define at least one Interpolate instance, requiring a minor version bump (per PVP) and perhaps some CPP for backwards compatibility. "Need" in scare quotes because sure, they could just not do that... but users are going to want it. I'd be nervous about imposing this on people. (I also don't love that it embeds String still deeper into the ecosystem, but eh, probably that ship has sailed and if we ever manage to move away from it, this won't make things that much harder.)
  3. Relatedly, this approach looks untested? If we're doing this, I'd expect us to take an existing TH-based (or plugin-based) interpolation library and say "we now support this style of interpolation natively, and if you want a different interpolation style there's still TH". Or perhaps to offer some way of making many different interpolation styles available without TH. Why propose a whole new one instead of something that already exists?

@brandonchinn178
Contributor Author

I'm wondering if, these days, a feature like this could not begin as a library

@nomeata yeah, that's fair (which also relates to @ChickenProp's point). I might prototype a GHC plugin for this at some point

@nomeata
Contributor

nomeata commented Jan 12, 2023

I’d also like to advertise the idea of giving quasiquotes better multi-line syntax in #569 (comment)

@re-xyr

re-xyr commented Jan 12, 2023

My concern is that if you interpolate in some Text then they're first converted to Strings and then the whole thing is converted back to Text, which seems pretty inefficient to me.
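To make the round trip concrete, here is a minimal sketch of what a String-based design might look like. The class shape and the desugaring shown in `greet` are assumptions for illustration only, not the proposal's actual definitions; `T.pack` and `T.unpack` are the real functions from the text package.

```haskell
import qualified Data.Text as T

-- Hypothetical String-based class: every interpolated value is rendered to String.
class Interpolate a where
  interpolate :: a -> String

instance Interpolate T.Text where
  interpolate = T.unpack -- copies the Text into a String

instance Interpolate Int where
  interpolate = show

-- Assumed desugaring of s"Hello ${name}, you are ${age}" at type Text:
-- the Text argument is unpacked to String, then the whole result is packed again.
greet :: T.Text -> Int -> T.Text
greet name age =
  T.pack ("Hello " ++ interpolate name ++ ", you are " ++ interpolate age)
```

Every `Text` value that passes through `interpolate` is converted twice, once by `T.unpack` and once by the final `T.pack`, which is the overhead described above.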

@brandonchinn178
Contributor Author

@re-xyr yes, you're right. With that performance issue and @endgame's comments about Interpolate allowing string interpolation to get around safe interpolation (e.g. with SqlQuery), I decided to change the design to the Interpolate s/InterpolateValue s a version. It should be safer and better scalable, but it's certainly a degree more complex, using more recent features like MultiParamTypeClasses.

I'll be prototyping this approach here — hopefully I'll be able to make progress on this in the near future

@brandonchinn178
Contributor Author

I'm wondering if, these days, a feature like this could not begin as a library (likely using a GHC plugin).

@nomeata in this case, I think I'll need to use a preprocessor, so that I can prototype #569 at the same time, which isn't valid syntax currently.

One thing I just realized: it doesn't seem like GHC supports multiple -pgmF options, so if someone wants to try out both string interpolation and some other proposal, they won't be able to. That puts a damper on recommending this as a standard process.

@JakobBruenker
Contributor

@brandonchinn178 I haven't tried it, but I think you can by writing a script that runs both pre-processors and use that as pre-processor. But admittedly that's annoying (and only works if the pre-processors are compatible with each other.)

@nomeata
Contributor

nomeata commented Jan 13, 2023

Yes, preprocessors are probably just good enough for demo prototypes, but not for production quality (like GHC plugin based approaches might be).

@konsumlamm
Contributor

To me it sounds like the principles that lead people to avoid Template Haskell are fixable. I think the main complaint against template Haskell is that it can run arbitrary code at compile time. But that can be addressed by adding a language extension like PureQuasiQuotes that only allows quasiquotes that only require the Quote typeclass and not the full Quasi type class (which brings in MonadIO). If you really want to be sure no malicious code gets executed you can combine it with a sandbox/vm and Safe Haskell (for now).

Or are there more reasons to avoid quasi quotes?

Apart from the already mentioned problems, most string interpolation libraries depend on haskell-src-meta, which is a huge dependency. I think this is a big reason to avoid them, as the costs outweigh the gains (it's just nicer syntax for ++/show after all). If on the other hand, GHC had native string interpolation syntax, there wouldn't be many reasons not to use it. Another solution of course is to improve the ways to build interpolation quasiquoters (see https://gitlab.haskell.org/ghc/ghc/-/issues/20862), but given the comments on that issue, I don't have much hope that something like that will ever happen. haskell-src-meta also has the issue that it's not always up to date and more prone to have errors, e.g. see haskell-party/haskell-src-meta#4.

Another (minor) problem with quasiquoters is that they don't have nice syntax highlighting and I don't see how that could be realistically implemented for 3rd party libraries.

@brandonchinn178
Contributor Author

FYI I have a working prototype at https://github.com/brandonchinn178/string-syntax + I've updated the proposal

@yy0zz

yy0zz commented Jun 3, 2023

"printf is partial and unsafe, which especially safety-conscious people might always stay away from anyway."

You could patch "printf" to use the default formatting when the types don't match instead of crashing. E.g. printf "%d" "a" is "a" or replacement char. (Ugly, I know).

Or #387

@ribosomerocker

@yy0zz Other than you personally mentioning that it's ugly, I would also like to mention that even with that merged proposal it's still not as convenient as what @brandonchinn178's proposal achieves. Speaking of, how come there's no activity? It would be really nice if we get this into Haskell...

@aryah47

aryah47 commented Jun 2, 2024

far, far too saccharine for me (ie too much special syntactic sugar) -- whatever the semantics of this special s""" ... """ syntax should be, a normal variadic printf-like function (some_fn """ ... """) that takes a (multiline) string argument seems to be able to implement just as well.

@googleson78
Contributor

printf is not comparable to string interpolation, as it achieves a different thing, imo. For example, with a printf style, you still need to match up the index of a following argument with the corresponding % in the string you're printfing in if you want to know what you'll actually be putting in your final string.

(as an aside, from looking at/using various type safe printf libraries in Haskell, I'm not convinced most of them offer much more of a benefit compared to using Text.concat or a lot of invocations of (<>))
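As a small illustration of the positional matching described above, the sketch below builds the same message with `printf` from `Text.Printf` (in base) and with plain `(<>)`. The function names `viaPrintf` and `viaConcat` are made up for this example.

```haskell
import Text.Printf (printf)

-- With printf, the reader has to pair each % directive with the argument
-- at the same position after the format string.
viaPrintf :: String -> Int -> String
viaPrintf name age = printf "%s is %d years old" name age

-- With concatenation (or interpolation), each value sits where it is rendered.
viaConcat :: String -> Int -> String
viaConcat name age = name <> " is " <> show age <> " years old"
```

Note that a directive and argument of mismatched types (for example a `%d` paired with a String) still type-checks and only fails at run time, which is the partiality mentioned earlier in the thread.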

@sgraf812
Contributor

sgraf812 commented Aug 14, 2024

Explicit 1 has quadratic performance for "normal" String/Text interpolation. I don't think that's really acceptable, it's very easy to trigger.

Then here is design Explicit 3, featuring an Interpolate class but no implicit conversions (Edit: Note that this has been edited since the response of @s-and-witch below):

Explicit 3 / Implicit 2

https://play.haskell.org/saved/iHkkM7g3

class Buildable s => Interpolate a s where
  interpolate :: a -> Builder s 
  
instance Interpolate String String where
  interpolate = toBuilder
  
instance Interpolate (Endo String) String where
  interpolate = id
  
instance Interpolate T.Text T.Text where
  interpolate = toBuilder
  
instance Interpolate T.Builder T.Text where
  interpolate = id
  
showS :: (Show a) => a -> Builder String
showS = Endo . shows

showT :: (Show a) => a -> Builder T.Text
showT = fromString . show

main :: IO ()
-- main = do
--   putStrLn s"a ${"b"} c ${showS d}"
--   T.putStrLn s"a ${"b"} c ${showT d}"
main = do
  putStrLn $   fromBuilder (toBuilder (fromString "a ") <> interpolate (fromString "b" :: String) <> toBuilder (fromString " c ") <> interpolate (showS (42 :: Int)))
  T.putStrLn $ fromBuilder (toBuilder (fromString "a ") <> interpolate (fromString "b" :: T.Text) <> toBuilder (fromString " c ") <> interpolate (showT (42 :: Int)))

Note that instance Interpolate (Endo String) String means we can interpolate Builder String, which I use by writing a custom formatter showS that uses the Show instance. (It would be conceivable to use formatting here as well.)
I do not define the "implicit coercion" instance {-# OVERLAPPABLE #-} Show a => Interpolate a String, the impl of which would be the very showS.

I can do the same for Text, writing a custom formatter showT.

This is to demonstrate that the Implicit design from the OP is almost this design; the only difference is whether or not we want the overlappable Interpolate instance.
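The snippet above relies on a `Buildable` class that is not shown in this thread. A minimal self-contained sketch, restricted to `String` and assuming `Buildable` is a class with an injective associated `Builder` type (my guess at its shape, not necessarily the playground's definition), could look like this:

```haskell
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeFamilyDependencies #-}

import Data.Monoid (Endo (..))

-- Assumed shape of Buildable: a target type s with an injective builder type.
class Monoid (Builder s) => Buildable s where
  type Builder s = r | r -> s
  toBuilder :: s -> Builder s
  fromBuilder :: Builder s -> s

-- For String, the builder is a difference list (Endo String).
instance Buildable String where
  type Builder String = Endo String
  toBuilder s = Endo (s ++)
  fromBuilder b = appEndo b ""

class Buildable s => Interpolate a s where
  interpolate :: a -> Builder s

instance Interpolate String String where
  interpolate = toBuilder

instance Interpolate (Endo String) String where
  interpolate = id

-- Explicit formatter: the caller opts in to Show.
showS :: Show a => a -> Builder String
showS = Endo . shows

-- Hand-written desugaring of s"Name: ${name}, age: ${showS age}".
greeting :: String -> Int -> String
greeting name age =
  fromBuilder
    ( toBuilder "Name: "
        <> interpolate name
        <> toBuilder ", age: "
        <> interpolate (showS age)
    )
```

The injectivity annotation is what lets GHC solve the target type of each `interpolate` call from the surrounding `Builder String`, so no extra type annotations are needed in `greeting`.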

@s-and-witch
Contributor

Explicit 2 you simply do this explicitly: write some functions stringValue :: String -> SqlQuery, intValue :: Int -> SqlQuery that embed them appropriately.

That's obviously not what I want to get from SQL interpolation. I want to convert an Int into a Builder SqlQuery, not a SqlQuery, because the latter is an already cooked value with a query string and a list of arguments.

Let's visualise the control flow of an interpolated value. Current proposal:

Int becomes Builder s, gets concatenated with other Builder s values, and gets transformed into s

Explicit 2 with explicit interpolate call:

Int becomes s, then it becomes Builder s, gets concatenated with other Builder s values, and gets transformed into s

As you can see, there is an additional step "Int becomes s", where the original type information is lost, because s is supposed to be a final type with an optimised representation, unlike Builder s, which aims to hold the building context. To avoid that, you have to hold the building context inside s as well.

@s-and-witch
Contributor

s-and-witch commented Aug 14, 2024

@sgraf812 type inference works in Explicit 3 because of the injectivity annotation on Builder; the Interpolate class is unnecessary (or I didn't get your idea).

@sgraf812
Contributor

sgraf812 commented Aug 14, 2024

@s-and-witch I elaborated proposal Explicit 3 and made the use of Interpolate more useful. Note that Explicit 3 is almost Implicit from the OP; the only difference is the overlappable Interpolate instance for Show (which could well live in an orphan module).

I wonder if @michaelpj likes Explicit 3 better than Explicit 1. It certainly has better inference behavior than Explicit 2 while String perf is good (better than Explicit 1).
(Note however that I deactivated -XOverloadedStrings and thus automatic insertion of fromString; the proposed desugaring is the expected one with -XOverloadedStrings active, as in Implicit.)

@michaelpj
Contributor

@sgraf812

I wonder if @michaelpj likes Explicit 3 better than Explicit 1.

Personally I would consider it also an "implicit" version, it's just more measured about its conversions. I certainly like Explicit 3 (or should we call it Implicit 2 ;) ) more than Implicit 1.

I think that is mostly due to the subtext about the use pattern. As I understand you, you intend that Explicit 3 would be used as follows:

  • For a given string type S, the implementer of S might (should?) implement instance Interpolate S S and instance Interpolate (Builder S) S.
  • Optionally, the implementer of S might decide to offer some specific additional instances which provide other implicit conversions; these can either be bundled with the main type or in a separate module. For example, the author of a SqlQuery type might provide instance Interpolate Int SqlQuery.
  • So users of string interpolation will mostly be using functions to convert to S or Builder S, but there might be a few special extra cases.

Whereas I understood the use pattern of the original proposal to be:

  • For a given string type S, the implementer of S should implement many Interpolate instances for many different types, in order to make it as convenient as possible to interpolate almost anything into S.
  • So users of string interpolation will mostly be relying on Interpolate instances and not using functions.

For me, this doesn't answer the question of "are implicit conversions desirable here?", but I do think it's a better design if we want the implicit conversions.

Incidentally, I suppose we could just rename your Interpolate class to IsBuilder, by analogy with IsString. We could then view Explicit 3 as a combination of Explicit 2 plus "overloaded builders". The desugared expression would then look very consistent: we convert every part to a builder, making use of a combination of overloaded strings and overloaded builders.

@s-and-witch

Let's "write" interpolation instances for the fmt library (a lot of fancy libraries are built on top of it, and it supports multiple targets, so I suppose that it's a good example).

I don't think this is a good example because it already has a generic mechanism for supporting multiple targets. That's just smuggling in the N*M work already. A better example is just something like Text. When we add a new interpolatable type (e.g. the BigDecimal example from the proposal), do we write N instances for all the target types? When we add a new type that can be interpolated into (like Text), do we write M instances for all the interpolatable types? I don't see why you think this isn't N*M.

Other than that, all (oh, at least some) of these libraries appeared because their authors found existing formatters inconvenient and wanted to define their own behavior, so the logic can't really be shared.

So the key advantage of Explicit 2 here is that yes, we already have lots of functions for converting between lots of string types in lots of ways. Probably way more than N*M. But! They already exist! You can just use them! You don't also need to define Interpolate instances.

That's obviously not what I want to get from SQL interpolation. I want to convert an Int into a Builder SqlQuery, not a SqlQuery, because the latter is an already cooked value with a query string and a list of arguments.

Okay, then it seems like you have something different in mind than the SqlQuery example in the proposal. That example is I think entirely compatible with what I've been suggesting. I think you are asking for something new and stronger, which is a Builder-like thing that doesn't just avoid performance problems, but also has some kind of extra internal state (although I'm not sure I totally get what you want).

@s-and-witch
Contributor

s-and-witch commented Aug 15, 2024

That's just smuggling in the N*M work already.

Actually, N plus M done on the side of fmt.

A better example is just something like Text.

This is not the responsibility of Text to define custom interpolation values (unless we are going to merge text-display or something inside it). I'm totally fine with Implicit 2 here.

Oh, wait, just realised the source of our disagreement. I'm assuming that nobody should define Interpolate instances like this:

instance Interpolate Int String
instance Interpolate Bool String
instance Interpolate Double String
instance Interpolate Int Text
instance Interpolate Bool Text
instance Interpolate Double Text
... 

That is indeed a poor interface for a library that wants to format values. Instead, Interpolate should be an interface for calling other, better formatting libraries, where the output format determines which formatting library we use to interpolate values.

I think that we may enforce this in the interface, see Implicit 3 at the bottom of my post.

Okay, then it seems like you have something different in mind than the SqlQuery example in the proposal.

I think the best way to show what I want is to define this particular example: https://play.haskell.org/saved/FmD1nNSW

Design highlight:

  • The code doesn't concatenate QueryString until the very end (in fromBuilder, at the call to cookSqlAccum). I didn't define cookSqlAccum, but assuming that this SQL query goes to Postgres, it should insert $index in place of each argument.
  • Interpolation doesn't touch [QueryString], it just adds a new item to the list of the arguments:
instance Interpolate Int SqlQuery where
  interpolate i = Endo (addSqlArgument (SqlInt i))
  • It's actually unsafe to interpolate SqlQuery inside SqlQuery (and I don't define this interpolation) at least because it has its own argument indexation starting from 1.
  • It would be safe to interpolate SqlQueryAccum, but I just didn't define it.
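The accumulate-then-cook idea in the highlights above can be sketched roughly as follows. The names `addSqlArgument`, `SqlInt`, `SqlQueryAccum`, and `cookSqlAccum` come from this comment, but every definition below (including `Piece`, `addSqlLiteral`, `runSqlBuilder`, and `exampleQuery`) is a hypothetical reconstruction, not the code behind the playground link.

```haskell
import Data.Monoid (Endo (..))

data SqlValue = SqlInt Int | SqlText String
  deriving (Eq, Show)

-- A query under construction: literal fragments and argument holes, in order.
data Piece = Lit String | Arg SqlValue

newtype SqlQueryAccum = SqlQueryAccum [Piece]

-- The cooked result: text with $n placeholders plus the argument list.
data SqlQuery = SqlQuery {queryText :: String, queryArgs :: [SqlValue]}
  deriving (Eq, Show)

type SqlBuilder = Endo SqlQueryAccum

-- Both helpers prepend; since Endo composes right to left, the final
-- list ends up in source order.
addSqlLiteral :: String -> SqlQueryAccum -> SqlQueryAccum
addSqlLiteral s (SqlQueryAccum ps) = SqlQueryAccum (Lit s : ps)

addSqlArgument :: SqlValue -> SqlQueryAccum -> SqlQueryAccum
addSqlArgument v (SqlQueryAccum ps) = SqlQueryAccum (Arg v : ps)

-- Placeholders are numbered only here, at the very end.
cookSqlAccum :: SqlQueryAccum -> SqlQuery
cookSqlAccum (SqlQueryAccum ps) = SqlQuery (concat texts) args
  where
    (texts, args) = go 1 ps
    go :: Int -> [Piece] -> ([String], [SqlValue])
    go _ [] = ([], [])
    go n (Lit s : rest) = let (ts, vs) = go n rest in (s : ts, vs)
    go n (Arg v : rest) =
      let (ts, vs) = go (n + 1) rest in (("$" ++ show n) : ts, v : vs)

runSqlBuilder :: SqlBuilder -> SqlQuery
runSqlBuilder b = cookSqlAccum (appEndo b (SqlQueryAccum []))

-- Hand-written stand-in for an interpolated query with two Int holes.
exampleQuery :: Int -> Int -> SqlQuery
exampleQuery minAge maxAge =
  runSqlBuilder $
    Endo (addSqlLiteral "SELECT * FROM users WHERE age >= ")
      <> Endo (addSqlArgument (SqlInt minAge))
      <> Endo (addSqlLiteral " AND age <= ")
      <> Endo (addSqlArgument (SqlInt maxAge))
```

Because numbering happens in `cookSqlAccum`, splicing an already cooked `SqlQuery` into another one would clash on placeholder indices, which matches the safety remark in the third bullet.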

Implicit 3

The inference works fine because s is solvable from Builder's injectivity annotation: https://play.haskell.org/saved/DSwqiepK

class Interpolate s where
  type InterpolationConstraint s :: Type -> Constraint
  interpolate :: InterpolationConstraint s a => a -> Builder s
  
instance Interpolate String where
  type InterpolationConstraint String = Show
  interpolate = Endo . shows

-- Demo of the explicit interpolation
instance Interpolate Text where
  type InterpolationConstraint Text = (~) Text
  interpolate = toBuilder

Pros:

  • The Interpolate type class now has some intention; it's way less lawless
  • No more "N*M" in any case
  • It's very easy to make interpolation explicit for the particular target using (~) Text trick.

Cons:

  • It is not possible to define overlapping Interpolate String String

@sgraf812
Contributor

sgraf812 commented Aug 15, 2024

Optionally, the implementer of S might decide to offer some specific additional instances which provide other implicit conversions; these can either be bundled with the main type or in a separate module. For example, the author of a SqlQuery type might provide instance Interpolate Int SqlQuery.

Yes. At the heart of Implicit 2 is that people are free to choose not to import "implicit conversions".
In that case, they simply would not import Data.String.Interpolate.Show, where the orphan instance instance {-# OVERLAPPABLE #-} Show a => Interpolate a s where interpolate = Endo . shows is defined.

Personally, I find Implicit 2 and an orphan module a bit too inconvenient; I would rather have the orphan instance provided by default. But I do have an alternative proposal: How about we build on the withDict mechanism and provide usingShow :: forall r. ((forall a. Show a => Interpolate a String) => r) -> r, to be used like putStrLn $ usingShow s"${name} ${age}"?
In order for this to work, we'd need to remove the super class from Interpolate.

I tried that here: https://play.haskell.org/saved/gpdiCYwI The key is:

usingShow :: forall r. ((forall a. Show a => Interpolate a String) => r) -> r
usingShow f = withDict @(forall a. Show a => Interpolate a String) @(forall a. Show a => a -> Builder String) showS f

Alas, GHC is unable to figure out the quantified WithDict constraint:

Main.hs:49:15: error: [GHC-39999]
    • No instance for ‘WithDict
                         (forall a. Show a => Interpolate a String)
                         (forall a. Show a => a -> Endo String)’
        arising from a use of ‘withDict’

Perhaps the low-tech solution of adapter orphan instances (for Show, but also for formatting, etc.) isn't so bad.

@sgraf812
Contributor

sgraf812 commented Aug 15, 2024

Implicit 3

I'm not a huge fan, I must say, exactly because of

It is not possible to define overlapping Interpolate String String

which is pretty strange. The additional constraint IMO serves no real use other than to complicate matters.
By contrast, Implicit 2 would encourage defining adapter orphan instances (for Show, formatting, etc.) in different modules. No need to define N*M instances if libraries have already done it. I think that's good design.

@s-and-witch
Contributor

The additional constraint IMO serves no real use

It actually improves inference, especially in the (~) case.

Saying that, I don't insist on any specific interpolation, I'm fine with all from the Implicit 1,2,3 list, but I don't like any of the proposed explicit ones.

BTW, here is the most powerful explicit:

Explicit 4

https://play.haskell.org/saved/tn6Rlg6x

-- s"Name: ${name}, age: ${age}"
f_mono name age = fromBuilder $
  toBuilder "Name: " <> name <> toBuilder ", age: " <> age

f_poly name age = fromBuilder $
  toBuilder (fromString "Name: ") <> name <> toBuilder (fromString ", age: ") <> age

Pros:

  • the most flexible interpolation
  • It is possible to build implicit interpolation on top of it without overhead if we insert interpolate calls, as @michaelpj proposed above

Cons:

  • the most verbose usage of the proposed designs

P.S. I really hope that we all agree on Implicit 1/2 and close the topic.

@michaelpj
Contributor

(I've edited Explicit 2 to add a simpler desugaring that I think is quite nice, and makes use of a generic function for concatenating with builders. There shouldn't be more overhead since the lists are short and foldMap/mconcat can be efficient. https://play.haskell.org/saved/a4vy5ysh)

@s-and-witch

I think the best way to show what I want is to define this particular example

I see. You really want to do things differently. I guess my feeling about this is: this is going beyond the usage pattern of string interpolation that I thought this proposal was targeting. This is not something you could write today as a monoidal concatenation of literals and values, e.g.: "literal1" <> something <> "literal2". So it's perhaps out of scope. Brandon's version can be written that way, but yours is beyond that. You could use the "literals and concatenation" pattern to build up a SqlQueryAccum and then call cookSqlAccum on it... but you can also do that with Explicit 2, and I understand that you want to omit the cookSqlAccum call.

Implicit 3

It seems to me that this says that the only way to interpolate into a String is by having Show called on the arguments, which definitely seems bad. If I already have a String and I want to insert it I don't want the quotes from the Show instance! I think we mostly agree that we want multiple ways to interpolate in a value of some type, either driven by typeclasses or functions.

(In the example you can see this, it prints: Name: "John", age: 42, rather than Name: John, age: 42, and I can't see how you could make it print the latter.)
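The quoting behaviour is easy to reproduce with base alone. In this minimal sketch, `viaShow` and `rendered` are names invented for the example; the only thing it relies on is the standard `Show` instance for `String`.

```haskell
-- Render any value through its Show instance, as a Show-constrained
-- interpolation into String would.
viaShow :: Show a => a -> String
viaShow x = shows x ""

-- Show on a String adds the surrounding quotes; Show on an Int does not.
rendered :: String
rendered = "Name: " ++ viaShow ("John" :: String) ++ ", age: " ++ viaShow (42 :: Int)
-- rendered == "Name: \"John\", age: 42"
```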

BTW, here is the most powerful explicit:

I'm missing something, how is this different from Explicit 2? You've put the functional dependency back and not added the local function, but I think those accomplish more or less the same thing?

I really hope that we all agree on Implicit 1/2 and close the topic.

I think we have viable Implicit and Explicit designs, what we need is a way to decide what kind of design we want. I wonder if we should do a community poll or something?

@s-and-witch
Contributor

This is not something you could write today as a monoidal concatenation of literals and values, e.g.: "literal1" <> something <> "literal2".

I don't get it. My types fit the Implicit 1/2 API, the only thing that we have to do to support it is not to change things.

but you can also do that with Explicit 2, and I understand that you want to omit the cookSqlAccum call.

Yes, but not only that. With Explicit 2 I sometimes have to define interpolation for the intermediate builder, sometimes for the final representation, and often for both, because I can't have the interpolated value be either the builder or the final one.

It seems to me that this says that the only way to interpolate into a String is by having Show called on the arguments, which definitely seems bad.

I agree, but it represents the idea that we delegate formatting to another class. And another option is to use type InterpolationConstraint String = (~) String.

I'm missing something, how is this different from Explicit 2?

Oh, well, the main difference is desugaring. Explicit 2 (old version):

f_mono name age = fromBuilder $
  toBuilder "Name: " <> toBuilder name <> toBuilder ", age: " <> toBuilder age

Explicit 4:

f_mono name age = fromBuilder $
  toBuilder "Name: " <> name <> toBuilder ", age: " <> age
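To see what the two desugarings demand of the caller, here is a monomorphic sketch specialised to `String` with `Endo String` as the builder. That specialisation, and the names `explicit2` and `explicit4`, are simplifying assumptions; the playground versions are polymorphic.

```haskell
import Data.Monoid (Endo (..))

type Builder = Endo String

toBuilder :: String -> Builder
toBuilder s = Endo (s ++)

fromBuilder :: Builder -> String
fromBuilder b = appEndo b ""

-- Explicit 2 (old version): every hole is wrapped in toBuilder,
-- so interpolated values must already have the target type.
explicit2 :: String -> String -> String
explicit2 name age = fromBuilder $
  toBuilder "Name: " <> toBuilder name <> toBuilder ", age: " <> toBuilder age

-- Explicit 4: holes are spliced in as-is, so they must already be builders.
explicit4 :: Builder -> Builder -> String
explicit4 name age = fromBuilder $
  toBuilder "Name: " <> name <> toBuilder ", age: " <> age
```

With Explicit 4 the caller chooses how each value becomes a builder, for example `explicit4 (toBuilder "John") (Endo (shows (42 :: Int)))`, whereas with Explicit 2 the caller must first produce a `String`, such as `explicit2 "John" (show (42 :: Int))`.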

@Profpatsch

Profpatsch commented Aug 22, 2024

fwiw, I agree with @michaelpj that implicit string conversion is badbadnotgood.

I love how people discount the “yeah you might interpolate something in a form you didn’t want to show this way”, which is about 99% of the issues I have in production with interpolation.
I thought I had written a good human-readable text message and turns out half of it is some Show output nobody can read

The interpolation types should be forced to the type of the whole string, forcing the user to interpolate only things that are of that type. Everything else is madness imo (not even talking about the type ambiguity problems that are gonna result, with the bad error messages that go with it …).

I guess having an Interpolate fragment text class that can be used to allow for very restricted automatic conversions can be a way to go; but 100% it’s gonna take people about 10 seconds to add an Interpolate ByteString MyTextWrapper to every type in the world. They always do that for some reason and maybe that’s why god has left earth long ago.

@brandonchinn178
Contributor Author

brandonchinn178 commented Aug 29, 2024

Apologies for the radio silence; I've been working on a refactor of String parsing, which will make it easier for me to prototype implementing string interpolation, which I haven't gotten to yet.

I've also yet to internalize the last stream of comments around the various Implicit/Explicit proposals. Some quick random musings:

  • Coming back to this with fresher eyes, I'm not sure I like the overloaded Show instance anymore. The docs for Show say it should be a "syntactically correct Haskell expression", which is semantically different from "render it in a human friendly way". So I think the out-of-the-box instances would be individually implemented for various types (most of which probably would use show anyway, like numbers, but that's incidental)

  • The elegance of the Explicit proposal is admittedly compelling, but practically, I already hate building Text error messages with a bunch of numbers, and constantly needing to call showT (at least text merged Text.show finally 🎉). String interpolation is going to be much less attractive to me if this is the final interface.

  • @michaelpj Your wishful thinking about s"" and si"" is interesting to me. Just throwing this out there: Would it be a decent compromise if we added both syntaxes with -XStringInterpolation? It shouldn't be too much of a difference in maintenance burden between supporting just si"" vs both s"" + si"". Or, as much as you might dislike the implicit version, would you rather have just the implicit version than have two different ways to interpolate?

For transparency, here's my rough TODO list: <EDIT: Moved into PR body>

@michaelpj
Contributor

Or, as much as you might dislike the implicit version, would you rather have just the implicit version than have two different ways to interpolate?

Personally I'm unsure. I would probably use it, just explicitly. I would mostly just be sad. But it's not really my opinion that matters, I think we should ask the community. I mostly wanted to make the case that at least some people would prefer an Explicit design, and it was worth considering as a serious alternative.

@endgame

endgame commented Sep 2, 2024

I mostly wanted to make the case that at least some people would prefer an Explicit design, and it was worth considering as a serious alternative.

Agreed. In my experience, the amount of implicit magic that is tolerable in a large codebase is much much smaller than people initially think. The low takeup of -XOverloadedLists compared to -XOverloadedStrings shows this, I think.

@konsumlamm
Contributor

I still don't see how this is any more implicit magic than print? When you see print x, you know that it uses the Show instance for x to convert it to a string, just like when you'd see "${x}", you'd know that it uses the Interpolate instance for x.

IMO string interpolation would be pretty unergonomic without "implicit" conversions, to the point where I don't see much advantage over using <> and show.

I also don't know any other language where you can only interpolate strings and I've never had any problems with string interpolation in other languages.

@sgraf812
Contributor

sgraf812 commented Sep 9, 2024

I appreciate @Profpatsch's case for a more explicit design. If I understand him correctly, the Explicit 3/Implicit 2 design in #570 (comment) would be a viable compromise, because it advocates for explicit calls to showS.

First, do note that the proposal can only prescribe instances for String. It has no bearing on library instances for ByteString, Html, Sql, although it sets precedent for a design.

In that, the overlapping instance Show a => Interpolate a String is controversial. So I suggest we simply move it into a module separate from Interpolate! That is the essence of Implicit 2 above (playground). People like @konsumlamm may import, e.g., Data.String.Interpolate.Show to get the overlapping instance for Show, but it is not imported by default. It should be good practice for other libraries to follow this pattern.
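To make the trade-off concrete, here is a minimal single-file sketch of the instance layout being discussed. The two-parameter `Interpolate` class, its `interpolate` method, and the desugaring shown in `greeting` are assumptions for illustration, not the proposal's final interface. In the layout suggested above, the `OVERLAPPABLE` instance would sit in its own module (e.g. Data.String.Interpolate.Show) rather than next to the class.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

module Main where

-- Hypothetical class: converts an interpolated value of type 'a' into the
-- target string type 's'. The real proposal may define this differently.
class Interpolate a s where
  interpolate :: a -> s

-- Base instance: a String interpolates into a String unchanged.
instance Interpolate String String where
  interpolate = id

-- The controversial fallback. Under "Implicit 2" this would live in a
-- separate module and only be in scope for code that imports it.
instance {-# OVERLAPPABLE #-} Show a => Interpolate a String where
  interpolate = show

-- Roughly what "Hi ${name}! You're ${age} years old." might desugar to
-- (assumed desugaring, written out by hand here).
greeting :: String -> Int -> String
greeting name age =
  "Hi " <> interpolate name <> "! You're " <> interpolate age <> " years old."

main :: IO ()
main = putStrLn (greeting "Ada" 36)
-- prints: Hi Ada! You're 36 years old.
```

With the fallback instance out of scope, `interpolate age` would be a type error and the caller would have to convert the `Int` explicitly.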

A community poll might be worthwhile, however IMO it should be a binary choice or otherwise use some kind of ranked voting mechanism. I imagine it will be challenging to pre-select designs to vote on and properly characterise their pros and cons!

@ChickenProp

I suggest we simply move it into a module separate from Interpolate!

Is there precedent for this? What I imagine happening if we do this, is that some library authors import that module, exposing the instance from some of their modules; and then app authors find the instance is available in some but not others of their modules.

@googleson78
Contributor

Is there precedent for this?

Yeah, generic-lens does this with its orphan IsLabel instance. I'm not sure what negative impact it has had, though, and if it's comparable to what is being discussed here.

@sgraf812
Contributor

sgraf812 commented Sep 9, 2024

some library authors import that module, exposing the instance from some of their modules

I would consider a library that does this (i.e., importing the orphan instances) simply a bad library.
At any rate I do not expect many uses of string interpolation in libraries. In executables, sure!
There are a number of orphan instances in the wild and I have never heard of them causing trouble through a widely used library importing them.

The generic-lens example is great; similar to string interpolation, most uses of lenses are in executables, much less libraries on hackage.

@endgame

endgame commented Sep 9, 2024

On orphan instances and their acceptability: I don't think IsLabel is a great example, because it is tied into a compiler extension and -XOverloadedLabels is specifically designed for people to extend and overload. The majority of orphan instances IME primarily come from libraries providing "orphan instance" packages (e.g. quickcheck-instances). These have been consistently controversial over the years, though the discussion has settled down more recently. This idiom is at least semi-discoverable and predictable.

I think that is the key: every less-controversial use of orphans is predictable in some way: foo-bar-instances provides instances from foo for bar, avoiding a big dependency footprint in foo (although even this is not universally liked; ISTR @ekmett not being a fan). generic-lens provides instances specifically about getting generic lenses. effectful has a defined idiom for where you put your orphan instance when defining an effect for an MTL-style typeclass.

(Personally, I am -1 on orphans unless there is no other option, but I wanted to lay out the wider context as I have seen it.)

@akhra

akhra commented Sep 10, 2024

I don't think IsLabel is a great example, because it is tied into a compiler extension and -XOverloadedLabels is specifically designed for people to extend and overload.

all of that is true here as well though?

@ChickenProp

Fwiw I do expect libraries to use string interpolation, e.g. for error messages.

@michaelpj
Contributor

generic-lens is less problematic because IsLabel instances are per-target-type, so there is less chance of them clashing in general. generic-lens is on dangerous ground in principle since it is providing instances for van-Laarhoven lenses, which are just functions, but it turns out that nobody else wants to make IsLabel instances for functions so it hasn't been a problem in practice.

On the other hand, I expect a lot of people to use string interpolation, and there is no chance of them not hitting the overlapping instance if it's in scope.

(IMO orphan instances are just not a good way of doing this kind of configurability, since what you really want is something like a module-level switch.)

@Jashweii

Taking this in another direction, C++'s std::format was at some point updated to take a compile-time format string by default.
All the text in an interpolated literal outside of the interpolations is known at compile time, so is it worth getting Symbol proxies instead?

data Uninterpolated (a :: Symbol) = Uninterpolated
-- "an ${example}" = "${Uninterpolated @"an "}${example}"

This would allow extra constraints on what can go in beyond interpolated values - you could even rule them out entirely.
This does differ from IsString.
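As a rough sketch of how the `Uninterpolated` idea could type-check: the literal fragment is carried at the type level and only turned back into a value through an instance. The `Interpolate` class shape and the hand-written desugaring in `example` are assumptions, not something the proposal specifies.

```haskell
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE KindSignatures        #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeApplications      #-}

module Main where

import Data.Proxy   (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Hypothetical class, assumed for illustration.
class Interpolate a s where
  interpolate :: a -> s

-- A literal fragment whose text is known at compile time.
data Uninterpolated (a :: Symbol) = Uninterpolated

-- For plain String targets, a literal fragment is just its text.
instance KnownSymbol a => Interpolate (Uninterpolated a) String where
  interpolate _ = symbolVal (Proxy :: Proxy a)

instance Interpolate String String where
  interpolate = id

-- Hand-written form of: "an ${x}" = "${Uninterpolated @"an "}${x}"
example :: String -> String
example x = interpolate (Uninterpolated @"an ") <> interpolate x

main :: IO ()
main = putStrLn (example "example")
-- prints: an example
```

A target type that wanted to restrict or reject literal fragments would do so by constraining, or omitting, its `Uninterpolated` instance.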

@sgraf812
Contributor

orphan, overlappable Interpolate instances in a separate module are problematic

Fair enough, then it is perhaps less controversial to not provide these orphan instances with Implicit 2. Users are free to define them nonetheless. Perhaps in the future, a scoped library solution like usingShow from #570 (comment) will work:

usingShow s"Hi ${name}! You're ${age} years old."

Needs someone to make it happen, but IMO it's a good trade-off.

compile time string

I don't see what use case an interpolatable Symbol enables. How would you interpolate runtime values into it?

Do note that the C++ motivation is compile-time checking of format specifiers (e.g., no %d on a string). It is a strength of interpolated strings that format specifiers and the thing to interpolate are not separated. Specialised formatting directives can be implemented as a library (open all comments in this thread and search for fmt). All code within ${_} is already type-checked. If I wanted to format strings at compile-time, I would simply use Template Haskell.

@Jashweii

#570 (comment)
I'm not suggesting it for the result, just for the parts of the literal outside of the interpolation i.e. "Hello " in "Hello $world".
I wouldn't use Symbol but an equivalent new type, because people probably will want to interpolate Symbol in and it should be distinct from the parts outside of $. It enables compile time restrictions on the rest of the string for e.g. queries. I suppose GHC can already evaluate constants so that's not really an added benefit.

I still think ultimately it would be better if one class received the whole input at once (e.g. to do a single allocation or restrict combinations of arguments).

@ChickenProp

ChickenProp commented Sep 18, 2024

@sgraf812 For usingShow, would something like this work?

newtype UsingShow s = UsingShow { usingShow :: s }
instance (Show a, Interpolate String s) => Interpolate a (UsingShow s) where ...

I could easily be missing something, but it feels like it solves the same problem as you're trying to, but with less magic.

But I think a problem with both of our versions is that everything now uses Show, even if they have their own instance. IIUC there's no way to have an Interpolate a s => Interpolate a (UsingShow s) that would take priority when available.
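A minimal sketch of the `UsingShow` newtype with one possible instance body filled in, to show the problem described above. The `Interpolate` class shape, the `Semigroup` instance, the `lit` helper, and the hand-written desugaring are all assumptions for illustration.

```haskell
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances  #-}

module Main where

-- Hypothetical class, assumed for illustration.
class Interpolate a s where
  interpolate :: a -> s

instance Interpolate String String where
  interpolate = id

newtype UsingShow s = UsingShow { usingShow :: s }

-- One possible body for the elided instance: show the value, then
-- interpolate the resulting String into the underlying target type.
instance (Show a, Interpolate String s) => Interpolate a (UsingShow s) where
  interpolate = UsingShow . interpolate . show

-- Lets fragments be concatenated while still wrapped in UsingShow.
instance Semigroup s => Semigroup (UsingShow s) where
  UsingShow a <> UsingShow b = UsingShow (a <> b)

-- Hand-written stand-in for: usingShow s"Hi ${name}! You're ${age} years old."
greeting :: String -> Int -> String
greeting name age = usingShow $
  lit "Hi " <> interpolate name <> lit "! You're " <> interpolate age
    <> lit " years old."
  where
    lit = UsingShow  -- literal fragments bypass Show

main :: IO ()
main = putStrLn (greeting "Ada" 36)
-- prints: Hi "Ada"! You're 36 years old.
```

Note the quotes around the name: because every interpolated value goes through `show`, even a `String` that has its own `Interpolate String String` instance is rendered with `Show`.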

Hm, I previously suggested a fromFixedInterp that's related, but not quite a mirror. Here are three things a user might want available:

  1. "I only want to be able to interpolate values of the same type as the result." fromFixedInterp provides this if the default is (2) or (3).
  2. "I want to be able to interpolate arbitrary values, but only if there's an explicit instance for them." I don't know how to get this if the default is (1) or (3).
  3. "I want to be able to interpolate arbitrary values, falling back to show if there's no explicit instance." usingShow is trying to provide this if the default is (2), but per above doesn't quite work.

@L0neGamer

I want to throw my hat into the ring in terms of opinions.

To me, string interpolation is a quick and easy way to embed string and non-string values into a stringy value.

This means that explicit proposals are somewhat defeating the purpose here; I can see the strong arguments for them, but if I have to do s"Index {show ind}: {show value}" instead of s"Index {ind}: {value}", then I may as well write "Index " <> show ind <> ": " <> show value.

What I am more comfortable with is some kind of extensible string interpolator, like the si suggestion. We can then have strong inference with explicit converting to textual values as a base, and a separate way to infer the conversions.

Something specific I want to bring to the table is to introduce a Display typeclass for use specifically with showing values in a good way. Show has always been more of a debug serialisation option, and with a new typeclass the community can pick and choose how types are serialised without having to use Show. For example, integral values can likely just use Show underneath, while Rational should be converted to a Float or similar before being Displayed. A newtype wrapper for deriving Show-using instances can be easy to provide.

Potential Display definition:

class Display a where
  display :: (IsString s, Monoid s) => a -> s

newtype DisplayViaShow a = MkDVS a

instance Show a => Display (DisplayViaShow a) where
  display (MkDVS a) = fromString (show a)
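
For a sense of how this might be used, here is a self-contained sketch that adds two instances along the lines described above: `Int` reusing `Show`, and `Rational` going through `Double` first. The specific instances and the choice of `Double` are assumptions for illustration only.

```haskell
{-# LANGUAGE FlexibleInstances #-}

module Main where

import Data.String (IsString (..))

class Display a where
  display :: (IsString s, Monoid s) => a -> s

-- Wrapper for types whose Show output is already presentable.
newtype DisplayViaShow a = MkDVS a

instance Show a => Display (DisplayViaShow a) where
  display (MkDVS a) = fromString (show a)

-- Integral values can reuse Show underneath.
instance Display Int where
  display = display . MkDVS

-- Rationals are converted to a floating-point value before display
-- (Double is an arbitrary choice for this sketch).
instance Display Rational where
  display r = fromString (show (fromRational r :: Double))

main :: IO ()
main = putStrLn (display (42 :: Int) <> " " <> display (3 / 4 :: Rational))
-- prints: 42 0.75
```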

Another option I hadn't considered before now is a mix and match approach for the interpolation itself. For example, in s"Hello {ss} world, numerically ${n}", we could require the entire string to be of type s, and that ss is that same type, but we try to display whatever n is due to the special combination of ${. Similar to the extensible string interpolation above, this can give more options, and can be a little stricter on types. We could even just drop Display and use Show, since you're now making the choice to use Show when you interpolate in this way.

Labels
Needs revision The proposal needs changes in response to shepherd or committee review feedback