Hi Scott! Looks like a fun project. As with most things, "it depends" :-) ... You could try making a (hand-built) tokenizer that can determine when to break a string into pattern elements handled by the top-level parser, turning the subtraction call into a more complex operation with supporting data structures, or use a completely separate tokenizer and parser to figure out your text patterns within the subtraction operator itself. Which you choose probably depends on whether you want patterns to evolve as part of the bigger language (akin to string interpolation in C#), or to be a feature built in (more like Regex in C#). YMMV, just what comes immediately to mind, but hope this helps!
-
    using System.Linq;
    using Superpower;
    using Superpower.Model;
    using Superpower.Parsers;
    using Superpower.Tokenizers;

    public static class StringSubtractionTokenizer
    {
        // Matchers are tried in order, so the symbols and numbers come before the catch-all character rule.
        private static Tokenizer<SyntaxToken> Tokenizer { get; } = new TokenizerBuilder<SyntaxToken>()
            .Match(Character.EqualTo('{'), SyntaxToken.OpenCurlyBrace)
            .Match(Character.EqualTo('}'), SyntaxToken.CloseCurlyBrace)
            .Match(Character.EqualTo(':'), SyntaxToken.Colon)
            .Match(Character.EqualTo('*'), SyntaxToken.Asterisk)
            .Match(Numerics.Natural, SyntaxToken.Number)
            .Match(Character.AnyChar, SyntaxToken.Character)
            .Build();

        public static Result<TokenList<SyntaxToken>> TryTokenize(string source) => Tokenizer.TryTokenize(source);
    }

    public class StringSubtractionValue : Expression
    {
        public StringSubtractionValue(char[] values, int quantity)
        {
            Values = values;
            Quantity = quantity;
        }

        public char[] Values { get; }

        // A quantity of 0 means "remove all occurrences" (the "*" quantifier).
        public int Quantity { get; }
    }

    public class StringSubtractionParser
    {
        // A single character with no quantifier, e.g. "l" (implies "l:*").
        private static TokenListParser<SyntaxToken, StringSubtractionValue> Character { get; } =
            Token.EqualTo(SyntaxToken.Character)
                .Select(c => new StringSubtractionValue(c.Span.ToString().ToCharArray(), 0));

        // A single character followed by ":" and a number or "*", e.g. "l:1".
        private static TokenListParser<SyntaxToken, StringSubtractionValue> QuantifiedCharacter { get; } =
            Token.EqualTo(SyntaxToken.Character)
                .Then(character => Token.EqualTo(SyntaxToken.Colon)
                    .Then(_ => Token.EqualTo(SyntaxToken.Number).Or(Token.EqualTo(SyntaxToken.Asterisk))
                        .Select(quantity => new StringSubtractionValue(
                            character.Span.ToString().ToCharArray(),
                            quantity.Span.ToString() == "*" ? 0 : int.Parse(quantity.Span.ToString())))));

        // A braced group with no quantifier, e.g. "{el}" (implies "{el}:*").
        private static TokenListParser<SyntaxToken, StringSubtractionValue> CharacterGroup { get; } =
            Token.EqualTo(SyntaxToken.OpenCurlyBrace)
                .Then(_ => Token.EqualTo(SyntaxToken.Character).AtLeastOnce()
                    .Then(characters => Token.EqualTo(SyntaxToken.CloseCurlyBrace)
                        .Select(_ => new StringSubtractionValue(
                            characters.Select(c => c.Span.ToString().ToCharArray().FirstOrDefault()).ToArray(),
                            0))));

        // A braced group followed by ":" and a number or "*", e.g. "{He}:1".
        private static TokenListParser<SyntaxToken, StringSubtractionValue> QuantifiedCharacterGroup { get; } =
            Token.EqualTo(SyntaxToken.OpenCurlyBrace)
                .Then(_ => Token.EqualTo(SyntaxToken.Character).AtLeastOnce()
                    .Then(characters => Token.EqualTo(SyntaxToken.CloseCurlyBrace)
                        .Then(_ => Token.EqualTo(SyntaxToken.Colon)
                            .Then(_ => Token.EqualTo(SyntaxToken.Number).Or(Token.EqualTo(SyntaxToken.Asterisk))
                                .Select(quantity => new StringSubtractionValue(
                                    characters.Select(c => c.Span.ToString().ToCharArray().FirstOrDefault()).ToArray(),
                                    quantity.Span.ToString() == "*" ? 0 : int.Parse(quantity.Span.ToString())))))));

        private static TokenListParser<SyntaxToken, StringSubtractionValue[]> GroupedValue { get; } =
            QuantifiedCharacterGroup.Many().Try().Or(CharacterGroup.Many().Try());

        private static TokenListParser<SyntaxToken, StringSubtractionValue[]> SingularValue { get; } =
            QuantifiedCharacter.Many().Try().Or(Character.Many().Try());

        private static TokenListParser<SyntaxToken, StringSubtractionValue[]> Source { get; } =
            SingularValue.AtEnd().Or(GroupedValue.AtEnd());

        public static bool TryParse(TokenList<SyntaxToken> tokens, out StringSubtractionValue[] stringSubtractions, out string error, out Position errorPosition)
        {
            var result = Source(tokens);
            if (!result.HasValue)
            {
                stringSubtractions = null;
                error = result.ToString();
                errorPosition = result.ErrorPosition;
                return false;
            }

            stringSubtractions = result.Value;
            error = null;
            errorPosition = Position.Empty;
            return true;
        }
    }

The above parser is doing exactly what I want in that I get the desired output from
-
Hi,
I'm really enjoying trying to write a language using Superpower's awesome combinators. It is proving a little difficult for me to get my head around some things I would know how to achieve using recursive descent parsing. I've managed to get a small tokenized language which supports numerics and strings.
I'm working on a feature to allow string subtraction like so: "Hello" - "l" = "Heo". I want to be able to quantify the removals like so: "Hello" - "l:1" = "Helo". So my thought is that each individual character without a quantifier implies all (e.g. "l" implies "l:*"). Characters can also be grouped: "Hello" - "{el}" = "Hlo", where {el} becomes {el}:*, so you can use a quantifier on a grouping too. This means I need to be able to accurately parse "Hello" - "elo" (e:* l:* o:*) or "Hello" - "{He}:1l:1o" ({He}:1 l:1 o:*).
I thought maybe I should create a separate StringSubtractionParser to return all of the characters/character groups and their quantities. Should I have another Tokenizer too to handle only the tokens contained within the string on the right?
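For reference, the removal semantics described above can be sketched without any parsing at all. This is only a minimal illustration under assumptions: the names StringSubtractionSketch, Remove and Subtract are hypothetical, a group such as {el} is read as the contiguous sequence "el" (which matches the "Hlo" example), and a quantity of 0 stands for "*".

```csharp
using System;
using System.Collections.Generic;

public static class StringSubtractionSketch
{
    // Hypothetical helper: removes up to `quantity` occurrences of `pattern`
    // from `source`, scanning left to right. A quantity of 0 stands for "*" (all).
    // The scan resumes at the removal point and does not rescan earlier text.
    public static string Remove(string source, string pattern, int quantity)
    {
        if (string.IsNullOrEmpty(pattern)) return source;

        var result = source;
        var removed = 0;
        var index = result.IndexOf(pattern, StringComparison.Ordinal);
        while (index >= 0 && (quantity == 0 || removed < quantity))
        {
            result = result.Remove(index, pattern.Length);
            removed++;
            index = result.IndexOf(pattern, index, StringComparison.Ordinal);
        }
        return result;
    }

    // Applies each (pattern, quantity) pair in order, feeding the result forward.
    public static string Subtract(string source, IEnumerable<(string Pattern, int Quantity)> removals)
    {
        foreach (var (pattern, quantity) in removals)
            source = Remove(source, pattern, quantity);
        return source;
    }
}
```

With this reading, Remove("Hello", "l", 0) gives "Heo", Remove("Hello", "l", 1) gives "Helo", and Remove("Hello", "el", 0) gives "Hlo", in line with the examples in the post.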
I'm a little unsure of how I could have a character or grouping of characters optionally followed by a colon and then number.
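One possible way to express "optionally followed by a colon and then a number" is Superpower's OptionalOrDefault combinator, which falls back to a default value when the quantifier is absent. The sketch below is an illustration, not the library's prescribed approach: the SyntaxToken enum is an assumed shape (the thread does not show its definition), the names OptionalQuantifierSketch, Quantifier, Single and Group are hypothetical, and the result is a plain tuple rather than the thread's StringSubtractionValue.

```csharp
using System.Linq;
using Superpower;
using Superpower.Parsers;

// Assumed shape of the token enum; the thread does not show its definition.
public enum SyntaxToken { OpenCurlyBrace, CloseCurlyBrace, Colon, Asterisk, Number, Character }

public static class OptionalQuantifierSketch
{
    // ":" followed by a number or "*"; 0 stands for "*" (all).
    static readonly TokenListParser<SyntaxToken, int> Quantifier =
        Token.EqualTo(SyntaxToken.Colon)
            .IgnoreThen(
                Token.EqualTo(SyntaxToken.Number).Select(n => int.Parse(n.ToStringValue()))
                    .Or(Token.EqualTo(SyntaxToken.Asterisk).Select(_ => 0)));

    // A single character, e.g. "l".
    static readonly TokenListParser<SyntaxToken, string> Single =
        Token.EqualTo(SyntaxToken.Character).Select(c => c.ToStringValue());

    // A braced group, e.g. "{He}", returned as the string "He".
    static readonly TokenListParser<SyntaxToken, string> Group =
        from open in Token.EqualTo(SyntaxToken.OpenCurlyBrace)
        from characters in Token.EqualTo(SyntaxToken.Character).AtLeastOnce()
        from close in Token.EqualTo(SyntaxToken.CloseCurlyBrace)
        select string.Concat(characters.Select(c => c.ToStringValue()));

    // Any mix of groups and single characters, each with an optional quantifier.
    public static readonly TokenListParser<SyntaxToken, (string Pattern, int Quantity)[]> Source =
        (from pattern in Group.Or(Single)
         from quantity in Quantifier.OptionalOrDefault(0)
         select (Pattern: pattern, Quantity: quantity))
        .Many()
        .AtEnd();
}
```

Because the quantifier is optional on every element, one rule covers both the plain and quantified forms, and groups and single characters can be mixed in one input. Fed the tokens for "{He}:1l:1o", Source is intended to yield ({He}, 1), (l, 1), (o, 0).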
The current plan was to execute this parser from the - operator on my StringValue class.