C compiler bug when running this code #23312

Closed
RpxdYTX opened this issue Dec 30, 2024 · 10 comments · Fixed by #23321
Labels: Bug, Status: Confirmed, Unit: markused

Comments

RpxdYTX commented Dec 30, 2024

Describe the bug

I'm a total V beginner, so bear with me. I was trying to make a language analysis tool for a simple project, but compiling it just spits out a C error.

Reproduction Steps

Here's the (simplified) code:

// main.v
module main

import os

fn main() {
    file := os.read_file(...) or { panic("Couldn't open") }
    
    for node in Parser.new(file) { dump(node) }
}

pub fn todo() { panic("todo") }

pub fn first[T, U](iter T, f fn(usize, U) bool) ?U {
    mut ret := ?U(none)
    for i, e in iter { if f(usize(i), e) {
        ret = e
        break
    } }
    return ret
}

pub fn take_while(s string, f fn(usize, rune) bool) string {
    mut runes := [] rune {}
    for i, r in s.runes() {
        if !f(usize(i), r) { break }
        runes << r
    }
    
    return runes.string()
}

// lexer.v
module main

import encoding.utf8

pub type Token = TokenType1 | TokenType2...

pub struct Lexer {
    str    string
    mut: i usize
}

pub fn Lexer.new(str string) Lexer { return Lexer { str, 0 } }

pub fn (self Lexer) peek() ?Token {
    tkn, _ := self.lex()?
    return tkn
}

pub fn (mut self Lexer) next() ?Token {
    tkn, i := self.lex()?
    self.i += i
    return tkn
}

fn (self Lexer) lex() ?(Token, usize) { ... }

// parser.v
module main

pub type Node = NodeTypes...

pub fn (self Node) is_expr() bool { ... }

pub struct Parser { mut: lexer Lexer }

pub fn Parser.new(str string) Parser { return Parser { Lexer.new(str) } }

pub fn (mut self Parser) next() ?Node { ... }

Expected Behavior

It should just work, or at least not cause C errors.

Current Behavior

================== C compilation error (from cc): ==============
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGAX0GMQ7AY9BBNGWHZ5Y2EK.tmp.c:8745:29: error: call to undeclared function 'main__Lexer_next'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
cc:  8745 |                 _option_main__Token _t2 = main__Lexer_next(&_t1);
cc:       |                                           ^
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGAX0GMQ7AY9BBNGWHZ5Y2EK.tmp.c:8745:29: note: did you mean 'main__Lexer_lex'?
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGAX0GMQ7AY9BBNGWHZ5Y2EK.tmp.c:8658:56: note: 'main__Lexer_lex' declared here
cc:  8658 | VV_LOCAL_SYMBOL _option_multi_return_main__Token_usize main__Lexer_lex(main__Lexer self) {
cc:       |                                                        ^
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGAX0GMQ7AY9BBNGWHZ5Y2EK.tmp.c:8745:23: error: initializing '_option_main__Token' (aka 'struct _option_main__Token') with an expression of incompatible type 'int'
cc:  8745 |                 _option_main__Token _t2 = main__Lexer_next(&_t1);
cc:       |                                     ^     ~~~~~~~~~~~~~~~~~~~~~~
cc: 2 errors generated.
================================================================
(You can pass `-cg`, or `-show-c-output` as well, to print all the C error messages).
builder error:
==================
C error found. It should never happen, when compiling pure V code.

Possible Solution

No response

Additional Information/Context

In the function first, the original code just returned e if f(...) was true, but that yielded a V error, so I did this instead. It is a helper function that gets the first element matching some criteria, if any. I use it in the Parser.next function to ignore skippable tokens.

V version

V 0.4.9 4225a34

Environment details (OS name and version, etc.)

V full version: V 0.4.9 b487986.4225a34
OS: termux, 3.18.140-SIMPLE-KERNEL_V1.1, #1 SMP PREEMPT Sun Apr 18 03:22:59 -03 2021
Processor: 4 cpus, 64bit, little endian

getwd: /storage/emulated/0/Documents/hex
vexe: /data/data/com.termux/files/home/.v/v
vexe mtime: 2024-12-30 04:25:37

vroot: OK, value: /data/data/com.termux/files/home/.v
VMODULES: OK, value: /data/data/com.termux/files/home/.vmodules
VTMP: OK, value: /data/data/com.termux/files/usr/tmp/v_10246

Git version: git version 2.47.1
Git vroot status: 4225a34
.git/config present: true

CC version: clang version 19.1.6
emcc version: N/A
thirdparty/tcc status: thirdparty-unknown-unknown de82a13

RpxdYTX added the Bug label Dec 30, 2024

RpxdYTX commented Dec 30, 2024

Making first a function that works solely on Lexer made it work, though it's still weird that this bug happened.
Should I close the issue?

RpxdYTX commented Dec 30, 2024

I used -d callstack with the v.debug module in the todo function, and another C error popped up.

spytheman (Member) commented

Try compiling with -no-skip-unused or provide complete source code, not just snippets, so that we can reproduce the problem locally 1:1.

RpxdYTX commented Dec 30, 2024

-no-skip-unused didn't solve the todo issue, but it did solve the old error.
This is the old source code, the one that yielded the error:

// main.v
module main

import os
import v.debug

fn main() {
    file := os.read_file('main.hex') or { panic("Couldn't open main.hex") }
    
    for node in Parser.new(file) { dump(node) }
}

pub fn todo() {
    debug.dump_callstack()
    panic("TODO")
}

pub fn first[T, U](mut iter T, f fn(usize, U) bool) ?U {
    mut ret := ?U(none)
    for i, e in iter { if f(usize(i), e) {
        ret = ?U(e)
        break
    } }
    return ret
}

pub fn take_while(s string, f fn(usize, rune) bool) string {
    mut runes := [] rune {}
    for i, r in s.runes() {
        if !f(usize(i), r) { break }
        runes << r
    }
    
    return runes.string()
}

// lexer.v
module main

import encoding.utf8

pub type Token = WsTkn      |

    StrTkn     | IdentTkn   |
    
    FnTkn                   |
    
    ColonTkn                |
    LParenTkn  | RParenTkn  |
    LCBraceTkn | RCBraceTkn |
    
    UnknownTkn

pub fn (self Token) skip() bool { return self is WsTkn }
//pub fn (self Token) sym()  bool { return self is  }
//pub fn (self Token) kw()   bool { return self is FnTkn }

pub struct Lexer {
    str    string
    mut: i usize
}

pub fn Lexer.new(str string) Lexer { return Lexer { str, 0 } }

pub fn (self Lexer) peek() ?Token {
    tkn, _ := self.lex()?
    return tkn
}

pub fn (mut self Lexer) next() ?Token {
    tkn, i := self.lex()?
    self.i += i
    return tkn
}

pub fn (mut self Lexer) rwnd() { self.i = 0 }

pub struct WsTkn {}

pub struct StrTkn { pub: str string }
pub struct IdentTkn { pub: ident string }

pub struct FnTkn {}

pub struct ColonTkn {}
pub struct LParenTkn {}
pub struct RParenTkn {}
pub struct LCBraceTkn {}
pub struct RCBraceTkn {}

pub struct UnknownTkn { token string }

fn (self Lexer) lex() ?(Token, usize) {
    str := self.str[self.i..]
    if str.len == 0 { return none }
    
    ws := take_while(str, |_, c| utf8.is_space(c))
    if ws.len > 0 { return Token(WsTkn {}), usize(ws.len) }
    
    mut str_lit := take_while(str, |i, c|
        (i == 0 && c.bytes()[0] == `"`)
        || (i != 0 && c.bytes()[0] != `"`)
    )
    if str_lit.len > 0 {
        if str[str_lit.len] != `"` { todo() }
        
        str_lit = str_lit[1..]
        return Token(StrTkn { str_lit }), usize(str_lit.len + 2)
    }
    
    ident := take_while(str, |i, c|
        (i == 0 && (c.bytes()[0] == `_` || c.bytes()[0].is_letter()))
        || (i != 0 && (c.bytes()[0] == `_` || c.bytes()[0].is_alnum()))
    )
    if ident.len > 0 {
        tkn := match ident {
            "fn" { Token(FnTkn {}) }
            else { Token(IdentTkn { ident }) }
        }
        
        return tkn, usize(ident.len)
    }
    
    sym := str.runes()[0]
    tkn := match sym {
        `:` { Token(ColonTkn {}) }
        `(` { Token(LParenTkn {}) }
        `)` { Token(RParenTkn {}) }
        `{` { Token(LCBraceTkn {}) }
        `}` { Token(RCBraceTkn {}) }
        
        else { Token(UnknownTkn { sym.str() }) }
    }
    
    return tkn, usize(sym.length_in_bytes())
}

// parser.v
module main

pub type Node =
    StrNode    |
    BlockNode  |
    FnCallNode |
    
    FnDeclNode |
    
    UnknownNode

pub fn (self Node) is_expr() bool { return match self {
    StrNode, BlockNode, FnCallNode { true }
    else { false }
} }

pub struct Parser { mut: lexer Lexer }

pub fn Parser.new(str string) Parser { return Parser { Lexer.new(str) } }

pub fn (mut self Parser) rwnd() { self.lexer.rwnd() }

pub struct StrNode { pub: str string }
pub struct BlockNode { pub: exprs [] Node }
pub struct FnCallNode {
    pub:
    name string
    args [] Node
}

pub struct FnDeclNode {
    pub:
    name   string
    params [] string
    expr   Node
}

pub struct UnknownNode { pub: tkn Token }

fn non_skip_tkn(_ usize, tkn Token) bool { return !tkn.skip() }
pub fn (mut self Parser) next() ?Node {
    tkn := first(mut self.lexer, non_skip_tkn)?
    
    return match tkn {
        StrTkn { Node(StrNode { tkn.str }) }
        LCBraceTkn {
            mut exprs := [] Node {}
            mut token := self.lexer.peek()?
            for token !is RCBraceTkn {
                exprs << self.next()?
                token = self.lexer.peek()?
            }
            
            Node(BlockNode { exprs })
        }
        IdentTkn {
            name := tkn.ident
            mut token := first(mut self.lexer, non_skip_tkn)?
            
            if token !is LParenTkn { todo() }
            mut args := [] Node {} // todo
            
            token = first(mut self.lexer, non_skip_tkn)?
            if token !is RParenTkn { todo() }
            
            Node(FnCallNode { name, args })
        }
        
        FnTkn {
            ident := first(mut self.lexer, non_skip_tkn)?
            if ident !is IdentTkn { todo() }
            
            name := (ident as IdentTkn).ident
            mut token := first(mut self.lexer, non_skip_tkn)?
            if token !is LParenTkn { todo() }
            
            params := [] string {} // todo
            token = first(mut self.lexer, non_skip_tkn)?
            if token !is RParenTkn { todo() }
            
            token = first(mut self.lexer, non_skip_tkn)?
            if token !is ColonTkn { todo() }
            
            node := self.next()?
            if !node.is_expr() { todo() }
            
            Node(FnDeclNode { name, params, node })
        }
        
        else { Node(UnknownNode { tkn }) }
    }
}

RpxdYTX commented Dec 30, 2024

Using debug.dump_callstack() with -d callstack errors with

================== C compilation error (from cc): ==============
cc:        |                                       ~~                                                                            ^~~~~~~~~
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGC68SV7BHYMXC3HWJVJ2FB0.tmp.c:26377:15: error: expected expression
cc:  26377 |         bool ret = f(, );
cc:        |                      ^
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGC68SV7BHYMXC3HWJVJ2FB0.tmp.c:26377:17: error: expected expression
cc:  26377 |         bool ret = f(, );
cc:        |                        ^
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGC68SV7BHYMXC3HWJVJ2FB0.tmp.c:26408:15: error: expected expression
cc:  26408 |         bool ret = f(, );
cc:        |                      ^
cc: /data/data/com.termux/files/usr/tmp/v_10246/hex.01JGC68SV7BHYMXC3HWJVJ2FB0.tmp.c:26408:17: error: expected expression
cc:  26408 |         bool ret = f(, );
... (the original output was 26 lines long, and was truncated to 12 lines)
================================================================
(You can pass `-cg`, or `-show-c-output` as well, to print all the C error messages).

This happens with both the old and the new source code. Running without -d callstack makes the code run, but then todo is useless.

JalonSolov (Contributor) commented

Can you supply a main.hex file to use with this code? A random file may/may not show the same problems.

RpxdYTX commented Dec 30, 2024 via email

felipensp self-assigned this Dec 30, 2024
felipensp added the Unit: markused label Dec 30, 2024

spytheman (Member) commented

It is better to open another issue for the second problem than to discuss several issues in the same thread.

The first issue is going to get solved in #23321.

RpxdYTX commented Dec 30, 2024

Should I open another issue then?

felipensp (Member) commented

> Should I open another issue then?

Sure, for the callstack issue, please.

felipensp added the Status: Confirmed label Dec 30, 2024