
Raw canonical ast separation #10

Open · wants to merge 17 commits into base: master

Conversation


@zypeh (Member) commented Feb 25, 2020

In this branch:

  • Lexer that tokenizes the text stream
  • Parser that returns the raw AST

@zypeh zypeh requested review from hch12907 and wongjiahau February 25, 2020 14:14
@@ -154,21 +154,50 @@ impl Parser {
}
}

-pub fn build_pair(expressions: Vec<Sexp>) -> Sexp {
+pub fn build_pair(mut expressions: Vec<Sexp>) -> Sexp {
+    expressions.reverse();
@zypeh (Member Author):

Hacky, yeah, because Rust does not have foldr.


Yep.
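For context, the reverse-then-fold trick being discussed can be sketched like this, using a simplified, hypothetical Sexp (the real type lives in this PR's sexpr_parser module):

```rust
// Minimal stand-in for the PR's Sexp type.
#[derive(Debug, PartialEq)]
enum Sexp {
    Atom(String),
    Pair(Box<Sexp>, Box<Sexp>),
    Nil,
}

// Right-fold a list of expressions into nested pairs:
// [a, b, c]  =>  (a . (b . (c . Nil)))
fn build_pair(mut expressions: Vec<Sexp>) -> Sexp {
    // Reversing first lets a left fold behave like foldr.
    expressions.reverse();
    expressions
        .into_iter()
        .fold(Sexp::Nil, |acc, e| Sexp::Pair(Box::new(e), Box::new(acc)))
}
```

Worth noting: Vec's iterator is double-ended, so `expressions.into_iter().rfold(...)` gives a right fold without the reverse.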

/// An s-expression is either an atom or a list of s-expressions. This is
/// similar to the data format used by Lisp.
///
/// TODO: I don't know whether I need to add those seven Lisp primitives to


I think there's no need at the moment, as we want to keep the scope small.

@@ -1,3 +1,4 @@
 pub mod node;
 pub mod sexpr_parser;
 pub mod sexpr_tokenizer;
+pub mod sexpr_tokenizer_2;


Why not put these modules into a separate directory?

None => vec![token]
}
})
});


Minor nit: return { .. }; vs { .. }
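For readers unfamiliar with the nit: in Rust the final expression of a block is its value, so a trailing `return { .. };` can simply be `{ .. }`. A tiny illustrative example:

```rust
// The block's last expression is its value: no `return`, no semicolon.
fn classify(n: i32) -> &'static str {
    if n % 2 == 0 { "even" } else { "odd" } // expression position
}
```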

match tokens.last() {
Some(lastToken) => match (lastToken, token) {
(Token::Whitespace, Token::Whitespace) =>
[&tokens[0..lineNumber]].concat(),


I think the concat is not needed here.

/// All strings must be valid utf-8.
#[derive(PartialEq, Clone, PartialOrd, Debug)]
pub enum Atom {
/// N stands for node


I get that those are the short forms, but why short forms in the first place?

}

pub struct Parser {
tokens: Vec<Token>,


Pure nit, can be ignored: Box<[Token]>, because tokens will never grow under Parser
(tokens.into_boxed_slice() will do the conversion)
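A sketch of the suggested change, with a stand-in Token type (the real one is in sexpr_tokenizer):

```rust
// Hypothetical stand-in for the PR's Token type.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Whitespace,
    Word(String),
}

struct Parser {
    // Box<[Token]> drops Vec's capacity field and signals that the
    // token list never grows after construction.
    tokens: Box<[Token]>,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser {
            tokens: tokens.into_boxed_slice(),
        }
    }
}
```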

#[derive(Debug)]
pub struct TokenizerError {
/// The error message.
pub message: &'static str,


I feel like an enum TokenizerErrorKind is appropriate here, but I digress...
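A sketch of what that could look like; the variants here are hypothetical, not taken from the PR:

```rust
// Hypothetical error kinds for the tokenizer; a structured enum is
// easier to match on than a bare &'static str message.
#[derive(Debug, PartialEq)]
enum TokenizerErrorKind {
    UnexpectedChar(char),
    UnterminatedString,
}

#[derive(Debug)]
struct TokenizerError {
    kind: TokenizerErrorKind,
    /// 1-based position of the error in the input.
    line: u64,
    col: u64,
}
```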

src/sexpr_tokenizer.rs (resolved thread)
/// This will consume text stream and produces tokens one by one
fn next_token(&self, chars: &mut Peekable<Chars<'_>>) -> Result<Option<Token>, TokenizerError> {
match chars.peek() {
Some(&ch) => match ch {


Missing \n case (only \r\n is handled)

@zypeh (Member Author):

😨
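One way to cover the missing case is to treat a bare `\n` and the `\r\n` pair as the same newline token. A hedged sketch with a stand-in Token type:

```rust
use std::iter::Peekable;
use std::str::Chars;

// Hypothetical stand-in for the PR's token type.
#[derive(Debug, PartialEq)]
enum Token {
    Newline,
    Other(char),
}

fn next_token(chars: &mut Peekable<Chars<'_>>) -> Option<Token> {
    match chars.next()? {
        '\r' => {
            // Consume the '\n' of a "\r\n" pair if present.
            if chars.peek() == Some(&'\n') {
                chars.next();
            }
            Some(Token::Newline)
        }
        // Bare LF: the previously missing case.
        '\n' => Some(Token::Newline),
        ch => Some(Token::Other(ch)),
    }
}
```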

src/sexpr_tokenizer.rs (resolved thread)
/// Read from `chars` until `predicate` returns `false` or EOF is hit.
/// Return the characters read as String, and keep the first non-matching
/// char available as `chars.next()`.
fn peeking_take_while(


I was thinking about take_while() but Peekable has disappointed me.
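For reference, a peek-based take_while can be sketched as follows; unlike Iterator::take_while, it does not consume the first non-matching character:

```rust
use std::iter::Peekable;
use std::str::Chars;

// Read from `chars` while `predicate` holds; the first non-matching
// char stays in the stream, available via a later chars.next().
fn peeking_take_while(
    chars: &mut Peekable<Chars<'_>>,
    mut predicate: impl FnMut(char) -> bool,
) -> String {
    let mut s = String::new();
    while let Some(&ch) = chars.peek() {
        if !predicate(ch) {
            break; // ch remains un-consumed
        }
        s.push(ch);
        chars.next();
    }
    s
}
```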
