-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Symbolic strings #994
Symbolic strings #994
Conversation
b4d5879
to
eefac48
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good! I think we should merge this PR and see if the semantics of symbolic strings need chaning after experimenting with them a bit. Similarly, the string delimiters can be bikeshed later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM. I just wonder what we should do about the "multiline" semantics of symbolic strings. The first question is: should we have symbolic strings be multiline by default, or offer a separate version for multiline strings?
Another solution would be to have s"foo"
strings as well. I think C++ has those. Then there is a clear delimitation between the multiline axis and the "special" axis.
The second question, which is independent, is what to do with interpolated blocks inside mutliline strings. Currently, we carry around inside string chunks an extra ident
arguments to handle those in the natural way (see #994 (comment)). One possibility would be to pass this argument to the parsing function, but it can quickly become cumbersome to pass this information around.
Another solution is just to abandon this semantics for symbolic strings (and do not interpolate other strings as "blocks"). It would be inconsistent though, that the behavior differs between multiline strings and those symbolic string.
@@ -490,26 +495,35 @@ StrChunks: RichTerm = { | |||
.chain(lasts.into_iter()) | |||
.collect(); | |||
|
|||
let mut chunks = if start == StringKind::Multiline { | |||
let chunks = if start.needs_strip_indent() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had this in mind but forget to ask. Should we behave like multiline string with respect to indentation, or not? It seems like you answer "no", which is very reasonable. That being said, I wonder if we then would like to combine the semantics of symbolic strings and multiline strings, e.g. in Nix often we currently use multiline strings to write builder commands, where symbolic strings would be used. This may not be trivial with respect to indenting interpolated values, though. I guess we don't have to answer this question right now.
pub fn needs_strip_indent(&self) -> bool { | ||
match self { | ||
StringStartDelimiter::Standard => false, | ||
StringStartDelimiter::Multiline | StringStartDelimiter::Symbolic => true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so what I wrote before is wrong, multiline strings and symbolic strings actually behave the same. I'm not sure how this works with interpolated values though. There is some logic at runtime such that:
let bar = "alpha\nbeta" in
m%"
foo
%{bar}
baz
"%
evaluates to
foo
alpha
beta
baz
I don't think we can easily keep this logic for symbolic strings, unless we delegate this to custom parsing functions (but that start to be cumbersome).
Since both multiline and symbolic strings will end with `"%`, it will no longer be true that the start & end delimiters of a string must match. It will also no longer be the case that we can infer whether or not indentation must be stripped from the starting delimiter. This commit separates the opening and closing delimeter into two different types, and provides a more explicit API for working with them, which will be resilient to the addition of a new `StringStartDelimiter` in a later commit.
This commit implements the symbolic string syntax `s%"..."%`. These strings using the same parsing rules as multiline strings, but are then desugared to `Term::Array`s rather than `Term::StrChunks`.
I think starting off by only supporting multiline symbolic strings makes sense. Especially as it's possible to just write mutliline symbolic strings on a single line. I can't currently think of a use case for a syntax like
In the current iteration I just ignore the To look at your example from the comment above:
currently evaluates to
This feels like how I'd expect this to behave (though obviously that's completely unscientific 😅) - it seems to make sense to give the exact value that was interpolated here. Though I suppose one place it could matter is if the symbolic string is intended to be evaluated into a language with meaningful whitespace... 🤔 Ultimately, I don't have particularly strong feelings & am happy with whatever you think makes sense. I do think what we have now is probably "good enough" for a first pass - what do you think about opening a separate issue for how to handle multiline interpolated values? |
eefac48
to
e062346
Compare
I think @matthew-healy's example evaluation is the most reasonable. I wouldn't expect Nickel to process a string that's interpolated into a symbolic string at all. That's also in line with how normal string interpolation in indented multiline strings behaves. |
Precisely, I don't think this is the case:
We carry explicitly the whole indentation of an interpolation (here e.g. 4 spaces for the
Sure, this is the way to go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
@yannham That's interesting! I tested with the following in
and
I think the indent behaviour only kicks in if the multiline string starts with whitespace? |
@vkleen I think it kicks in if the line the interpolation is on starts with whitespace, so e.g.
results in
But because your example has the text |
Yes 🙂 you can find more context in #197. Maybe this information should be somewhere else, in the documentation? |
This PR introduces a simple form of "symbolic strings" (#948), whereby the syntax
s%"..."%
is just de-sugared to an array.