Rule templates? #30

osa1 · 2021-10-29T17:32:29Z

Here are rules I'm using to lex Rust decimal, binary, octal, and hexadecimal numbers:

rule DecInt {
    $dec_digit,
    '_',

    $int_suffix | $ => |lexer| {
        let match_ = lexer.match_();
        lexer.switch_and_return(LexerRule::Init, Token::Lit(Lit::Int(match_)))
    },

    $whitespace => |lexer| {
        let match_ = lexer.match_();
        // TODO: Rust whitespace characters 1, 2, or 3 bytes long
        lexer.switch_and_return(
            LexerRule::Init,
            Token::Lit(Lit::Int(&match_[..match_.len() - match_.chars().last().unwrap().len_utf8()]))
        )
    },
}

rule BinInt {
    $bin_digit,
    '_',

    $int_suffix | $ => |lexer| {
        let match_ = lexer.match_();
        lexer.switch_and_return(LexerRule::Init, Token::Lit(Lit::Int(match_)))
    },

    $whitespace => |lexer| {
        let match_ = lexer.match_();
        // TODO: Rust whitespace characters 1, 2, or 3 bytes long
        lexer.switch_and_return(
            LexerRule::Init,
            Token::Lit(Lit::Int(&match_[..match_.len() - match_.chars().last().unwrap().len_utf8()]))
        )
    },
}

rule OctInt {
    $oct_digit,
    '_',

    $int_suffix | $ => |lexer| {
        let match_ = lexer.match_();
        lexer.switch_and_return(LexerRule::Init, Token::Lit(Lit::Int(match_)))
    },

    $whitespace => |lexer| {
        let match_ = lexer.match_();
        // TODO: Rust whitespace characters 1, 2, or 3 bytes long
        lexer.switch_and_return(
            LexerRule::Init,
            Token::Lit(Lit::Int(&match_[..match_.len() - match_.chars().last().unwrap().len_utf8()]))
        )
    },
}

rule HexInt {
    $hex_digit,
    '_',

    $int_suffix | $ => |lexer| {
        let match_ = lexer.match_();
        lexer.switch_and_return(LexerRule::Init, Token::Lit(Lit::Int(match_)))
    },

    $whitespace => |lexer| {
        let match_ = lexer.match_();
        // TODO: Rust whitespace characters 1, 2, or 3 bytes long
        lexer.switch_and_return(
            LexerRule::Init,
            Token::Lit(Lit::Int(&match_[..match_.len() - match_.chars().last().unwrap().len_utf8()]))
        )
    },
}

These rules are all the same except for the "digit" part: for binary numbers I'm using the $bin_digit regex for the digits, for hex I'm using $hex_digit, and similarly for the other rules.

If we could implement "rule templates" that take regexes as arguments, we could have one template with a "digit" parameter, pass $hex_digit, $oct_digit, etc. to it, and avoid the duplication.
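To make the idea concrete, here is a rough sketch in plain Rust rather than lexgen syntax. It is only an analogy, not a proposed design: `lex_int` and the `lex_*` wrappers are hypothetical names, the digit predicates stand in for `$bin_digit`, `$hex_digit`, and so on, and suffix and whitespace handling are left out. The point is just that one parameterized definition can replace four near-identical ones.

```rust
/// Hypothetical stand-in for a "rule template": a single lexing routine
/// parameterized by the digit class (the role `$dec_digit`, `$hex_digit`,
/// etc. play in the rules above).
///
/// Returns the matched literal and the remaining input, or `None` if the
/// input does not start with a digit of the given class.
fn lex_int<'a>(input: &'a str, is_digit: impl Fn(char) -> bool) -> Option<(&'a str, &'a str)> {
    // Require at least one leading digit.
    match input.chars().next() {
        Some(c) if is_digit(c) => {}
        _ => return None,
    }

    // Consume digits and '_' separators; stop at the first other character.
    let end = input
        .char_indices()
        .find(|&(_, c)| !(is_digit(c) || c == '_'))
        .map_or(input.len(), |(i, _)| i);

    Some(input.split_at(end))
}

// Each "instantiation" only supplies the digit class.
fn lex_bin(input: &str) -> Option<(&str, &str)> {
    lex_int(input, |c| matches!(c, '0' | '1'))
}

fn lex_oct(input: &str) -> Option<(&str, &str)> {
    lex_int(input, |c| matches!(c, '0'..='7'))
}

fn lex_dec(input: &str) -> Option<(&str, &str)> {
    lex_int(input, |c| c.is_ascii_digit())
}

fn lex_hex(input: &str) -> Option<(&str, &str)> {
    lex_int(input, |c| c.is_ascii_hexdigit())
}
```

A rule template in the lexer generator would play the role of `lex_int` here, with the four rules reduced to one-line instantiations like `lex_hex`.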


osa1 · 2021-10-30T01:03:08Z

Note that the rules above are not correct. For example, this won't be lexed correctly: [1].

osa1 added the "feature" and "design" labels on Oct 29, 2021
