Symbolic strings (Nix string contexts-like) #948
Comments
It looks sensible. A thing to note is that symbolic strings will probably not support contracts in a meaningful way in the cases (like Nix, if I understand correctly) where the symbolic chunks are meant to be evaluated outside of Nickel.
+1 to everything @aspiwack said. In addition:
True, if we have interpolation, we do have concatenation.
Oh for sure, I didn't want to think too hard about it and just write the issue down, but it's an awful name.
Nickel already has multiline strings. You do want a different delimiter than just
You're totally right, I've updated the issue.
This is a good question. I don't know. The proposed approach is more general, but I don't have an obvious example right now that could make use of that. Your proposition also enforces that
Also, if the delimiter is more than one character, do you repeat it at the end, or do you reverse it? 😈
Oh, of course. Good call.
Lol, yeah. I think just
Yeah, that's what I was thinking. I don't know exactly what the right solution is. More general vs. enforcing that constraint. How easy would it be to change it out from under people once it's implemented? There would end up being a bunch of code depending on whatever format we choose. But also, it wouldn't be that hard to convert from one to the other.
This is OK; you can do as in Rust raw strings, or C++ I think, which is to repeat
I'm pro including tags in the resulting Nickel values. It seems feasible to me that library authors might want to handle a literal value and the equivalent interpolated string differently, and this enables that at a very small overall cost. Removing the tags makes the feature less flexible for, as far as I can see, no real benefit. If the desired behaviour of a specific implementation is for
For the opening identifier: I agree that

As a related point in the design space: I like how Scala 3 exposes its string interpolators.

Imagining a hypothetical future where something like this was possible, and we could write
It's a small detail, but someone suggested that we allow special strings for code highlighting in editors, a bit like in Markdown. In this case we want to disambiguate, e.g. if

A second point is that we may, in a distant future, want to mix both Terraform and Nix interpolation in the same string, like:
I imagine in any case you need a combined parsing function, but I wonder about the prefix for such a string. Maybe it's fine to have

Besides that, I like the idea. I just wonder how to select the parsing function, because there's currently absolutely no notion of program-wide declaration (somehow, all bindings are local) or name resolution in the language. Maybe the contract could be in charge of specifying that, and fail if the prefix doesn't match the declared function. In that case the parser would interpret anything like

Doing so isn't fully satisfying though:
I don't have a strong opinion, but I think the question is: should we enforce that the law
(a) I think the possibility of someone accidentally violating that law is non-trivial.
I spent some more time thinking about this today. I'm broadly still of the opinion that differentiating between literal strings and interpolated values (regardless of their type) on a purely syntactic basis is my preferred solution. The mental model is obvious and it exactly matches the string as it's written. It's also how the

I agree this opens the feature up to potentially unexpected implementations, but I think that's the responsibility of library authors using this feature to handle. (Much in the same way that in Haskell you can implement

Whatever we go with, it will be possible to build safer APIs on top too, e.g. a stdlib contract like

Another thing I think is worth mentioning is that the question of how to represent strings is somewhat coupled to the question of whether or not to tag chunks. For example:
If we do tag values, but we also want to treat strings differently to enforce that interpolating a string is always the same as writing the literal, then we need to evaluate

This problem goes away if we just return

I'm not quite at the point of implementation where this decision needs to be made, but I'm also not far off. I'll probably just go with whatever's simplest for my initial implementation, but it shouldn't be too hard to change it if we decide to do so.
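To make the trade-off concrete, here is a minimal sketch, with hypothetical types (not Nickel's actual AST), of the tagged vs. untagged representations being discussed: erasing the tags makes an interpolated string indistinguishable from the equivalent literal.

```rust
// Illustrative sketch (hypothetical types, not the real Nickel
// interpreter) of tagged vs. untagged chunk representations.
#[derive(Debug, PartialEq)]
enum TaggedChunk {
    Literal(String),      // written directly in the string
    Interpolated(String), // spliced in via %{..} (modelled as a string here)
}

// Erasing the tags: a literal and an interpolated value that happen to
// be equal strings become indistinguishable to library code.
fn untag(chunk: &TaggedChunk) -> &str {
    match chunk {
        TaggedChunk::Literal(s) | TaggedChunk::Interpolated(s) => s,
    }
}

fn main() {
    let lit = TaggedChunk::Literal("x".to_string());
    let interp = TaggedChunk::Interpolated("x".to_string());
    // With tags, a library can still treat the two cases differently...
    assert_ne!(lit, interp);
    // ...without tags, it cannot.
    assert_eq!(untag(&lit), untag(&interp));
}
```

Under the tagged design, enforcing "interpolating a string equals writing the literal" would mean collapsing `Interpolated` string values into `Literal` eagerly, which is exactly the evaluation problem described above.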
I'm not sure this is a feature, as much as it's just that enforcing monad laws is undecidable (to do automatically) and impractical to prove in today's Haskell. If Haskell had full-fledged dependent types, I think it wouldn't be out of the question for monad instances to be required to come with a proof of the monad laws. An unlawful monad is probably never what you want (you may want something Monad-like that doesn't respect the monad laws, but then you should probably not call it Monad). Once again, I think the question is really: should such a law hold, because it's natural, it's what people would expect, or for good theoretical reasons (such as that breaking this law would break important properties of the language)? If we decide so, I think we can enforce it easily.
True. I think the suggestion of @Radvendii is to not tag at all, meaning that the library function has no way to differentiate between literal strings, interpolated literal strings and computed interpolated strings (we can still perform "effects" on pure strings, though; it's just that we have to treat them all the same). Then everything would probably have the type
I'm going to close this issue, as a version of symbolic strings has already been merged. There are definitely further discussions to be had about the shape of the feature, but it seems to make sense for those to happen in their own issues.
Is your feature request related to a problem? Please describe.
Working on Nickel-nix and on Nix integration in general (#693), we've been needing something like Nix string contexts.
String context is a way of implicitly and automatically attaching and combining metadata to string values (in the case of Nix, the dependencies that must be built before the paths present inside the string become valid). When interpolating strings with context inside another string, all the dependencies (the contexts) are combined. This feature is really useful to avoid specifying obvious dependencies explicitly (e.g. source files).
However, we don't want to implement Nix string contexts as-is, because they're pretty ad hoc and Nix-specific. We would rather have a more general mechanism, of which string contexts would just be an instance, that may be used for other domains (Terraform, Kubernetes, etc.) or for different use cases within Nix (IFD/recursive-Nix-like).
Fundamentally, Nix string contexts are an overloading of string interpolation (and other string operations) to work on richer values than just strings. Very schematically, Nix strings are rather `{ctxt : Array Deps, value: Str}`.

We've discussed the possibilities many times. Having a general ad-hoc overloading mechanism would be possible but pretty heavy (think traits/typeclasses, or even a very restricted form just for strings), with the usual problems of coherence, complexity for new users, etc.
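Schematically, the context-merging behaviour of such a record can be sketched as follows; `CtxString` and its methods are hypothetical illustrations, not Nix's or Nickel's actual implementation.

```rust
// A hedged sketch (hypothetical names) of a string carrying a context
// of dependencies, where interpolation concatenates the values and
// takes the union of the contexts.
#[derive(Debug, Clone, PartialEq)]
struct CtxString {
    ctxt: Vec<String>, // dependencies attached implicitly (e.g. store paths)
    value: String,
}

impl CtxString {
    fn literal(s: &str) -> Self {
        CtxString { ctxt: Vec::new(), value: s.to_string() }
    }

    // Interpolating `other` into `self`: concatenate the values and
    // merge the contexts, keeping each dependency once.
    fn interpolate(mut self, other: &CtxString) -> Self {
        self.value.push_str(&other.value);
        for dep in &other.ctxt {
            if !self.ctxt.contains(dep) {
                self.ctxt.push(dep.clone());
            }
        }
        self
    }
}

fn main() {
    let src = CtxString {
        ctxt: vec!["/nix/store/abc-src".to_string()],
        value: "/nix/store/abc-src".to_string(),
    };
    // The dependency on the source path travels with the string,
    // without the user ever listing it explicitly.
    let cmd = CtxString::literal("cp -r ")
        .interpolate(&src)
        .interpolate(&CtxString::literal(" $out"));
    assert_eq!(cmd.value, "cp -r /nix/store/abc-src $out");
    assert_eq!(cmd.ctxt, vec!["/nix/store/abc-src".to_string()]);
}
```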
In some way, Nix string contexts might be implemented with effects (#85), e.g. if we allowed performing effects at string interpolation. However, such an effect system is still to be properly designed for Nickel, and effect handlers would be implemented in Rust, as interpreter plugins, which makes them rather heavy to implement and distribute. For something like Nix string contexts, that could be OK, as we would have to do it once and for all per target tool. It's still a long way to get there.
This issue makes a simpler and lighter proposal that could achieve the same effect, but relies only on one (very small) language feature and otherwise pure Nickel library code. It also seems to be forward-compatible with performing effects at string interpolation.
Describe the solution you'd like
We propose to introduce a new form of strings, let's call them symbolic strings, written using the delimiters `s%"` and `"%s`. Normal strings with interpolation are parsed as a list of chunks, where a chunk is either a string literal or an interpolated expression. For example, `"foo %{bar} baz"` is represented as (something like) `[Chunk::Literal("foo "), Chunk::Expr(..), Chunk::Literal(" baz")]`. String chunks are then evaluated at runtime, when first encountered, and turned into an actual string.

Symbolic strings would be almost the same, but they would return the chunks as a normal Nickel expression, and wouldn't evaluate them further. For example, `s%"foo %{bar} baz"%s` would just be equivalent to writing the corresponding array of chunks (the shape of chunks is just an example, and up to discussion).
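The chunk representation described above can be sketched as follows. `Chunk` and `parse_chunks` are illustrative names, not the actual types of the Nickel interpreter, and the interpolated expression is modelled as a plain string rather than a full term.

```rust
// Illustrative sketch of the chunk representation (assumed names; the
// real Nickel AST differs).
#[derive(Debug, PartialEq)]
enum Chunk {
    Literal(String),
    // Placeholder for an unevaluated interpolated expression; in the
    // interpreter this would be a full Nickel term, not a string.
    Expr(String),
}

// Split a string with %{..} interpolation into chunks, the way
// "foo %{bar} baz" becomes [Literal("foo "), Expr(bar), Literal(" baz")].
fn parse_chunks(s: &str) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut rest = s;
    while let Some(start) = rest.find("%{") {
        if start > 0 {
            chunks.push(Chunk::Literal(rest[..start].to_string()));
        }
        let after = &rest[start + 2..];
        let end = after.find('}').expect("unterminated interpolation");
        chunks.push(Chunk::Expr(after[..end].to_string()));
        rest = &after[end + 1..];
    }
    if !rest.is_empty() {
        chunks.push(Chunk::Literal(rest.to_string()));
    }
    chunks
}

fn main() {
    assert_eq!(
        parse_chunks("foo %{bar} baz"),
        vec![
            Chunk::Literal("foo ".to_string()),
            Chunk::Expr("bar".to_string()),
            Chunk::Literal(" baz".to_string()),
        ]
    );
}
```

A normal string would evaluate each `Expr` chunk and concatenate; a symbolic string would hand this array, unevaluated, straight to library code.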
Then, the library consuming such a string, or even just the contract attached to the field, would be in charge of doing whatever it wants with it. Typically, in Nickel-nix, there already is a `nix_string_hack` function that can process this kind of list and produce an AST that is re-interpreted on the Nix side, reconstructing the contexts, thus giving the same automatic and implicit dependency management as in Nix. But it uses normal function calls and arrays, which is arguably not very nice to read. A symbolic string would just be an alternative, better syntax for this expression, really not different from what you would write in Nix today.
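As a rough illustration of the library-side processing, here is a hedged sketch of a fold over a symbolic string's chunks that concatenates literal pieces and collects the dependencies implied by interpolated store paths. The types and names are assumptions for illustration, not the actual `nix_string_hack` API.

```rust
// Hypothetical shapes for a symbolic string's chunks and the values
// that can be interpolated into it (assumed, not the nickel-nix API).
enum Value {
    Str(String),
    Path { store_path: String },
}

enum Chunk {
    Lit(String),
    Interp(Value),
}

// Library-side "interpreter": fold the chunks into a rendered string,
// collecting the dependencies carried by interpolated paths along the
// way, in the spirit of Nix's string contexts.
fn interpret(chunks: &[Chunk]) -> (String, Vec<String>) {
    let mut out = String::new();
    let mut deps = Vec::new();
    for chunk in chunks {
        match chunk {
            Chunk::Lit(s) => out.push_str(s),
            Chunk::Interp(Value::Str(s)) => out.push_str(s),
            Chunk::Interp(Value::Path { store_path }) => {
                out.push_str(store_path);
                deps.push(store_path.clone());
            }
        }
    }
    (out, deps)
}

fn main() {
    let chunks = vec![
        Chunk::Lit("cp -r ".to_string()),
        Chunk::Interp(Value::Path { store_path: "/nix/store/abc-src".to_string() }),
        Chunk::Lit(" $out".to_string()),
    ];
    let (rendered, deps) = interpret(&chunks);
    assert_eq!(rendered, "cp -r /nix/store/abc-src $out");
    assert_eq!(deps, vec!["/nix/store/abc-src".to_string()]);
}
```

In the real integration the fold would build an AST handed back to Nix rather than a flat string, but the shape of the traversal is the same.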
The change on the language side is really minimal (interpolated strings are already parsed as chunks; we just need to transform them into a Nickel value). Because symbolic strings are just composite Nickel values, the only operation that is natively supported is interpolation (for example, you can't call `string.length` or `++` on them). That being said, interpolation seems to be what you use 99% of the time, and string operations don't even make sense in some cases (such as knowing the length of a Terraform computed value like an IP). The library writers providing an "interpreter" for those strings may then export additional string manipulation functions if they make sense (in the case of Nix, we can know the path at evaluation time, so we may define and export more string primitives in the library).

Related approaches
In fact, this idea is very close to the `quasiquote`/`unquote`/`unquote-slice` mechanism of Lisp. Or, even more specifically, to the G-expressions of Guix, but with a more idiomatic Nickel string syntax (and probably a few unimportant differences: in this proposal, interpolating would probably be more like `unquote` than `unquote-slice`, that is, we wouldn't automatically "flatten" the AST but would leave that to the library code).