Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEP is actually unsafe #33

Open
Redrield opened this issue Feb 18, 2018 · 9 comments
Open

GEP is actually unsafe #33

Redrield opened this issue Feb 18, 2018 · 9 comments
Labels
Milestone

Comments

@Redrield
Copy link

In an attempt to get around #32, I've been trying to get a program working that can call printf with a string array. Looking at working IR, I see that to convert from an array to a pointer, an inbounds GEP instruction is used. This is the code I came up with to replicate that.

extern crate inkwell;
extern crate pest;

use inkwell::context::Context;
use inkwell::targets::{InitializationConfig, Target};
use inkwell::AddressSpace;
use inkwell::module::Linkage;
use inkwell::values::IntValue;

use std::path::Path;

fn main() {
    Target::initialize_native(&InitializationConfig::default()).unwrap();
    let ctx = Context::create();
    let module = ctx.create_module("program");
    let builder = ctx.create_builder();

    let str_type = ctx.i8_type().ptr_type(AddressSpace::Generic);
    let i32_type = ctx.i32_type();

    let my_message = "Hello, World!\n\0";
    let arr_type = ctx.i8_type().array_type(my_message.len() as u32);

    let mut chars = Vec::with_capacity(my_message.len());

    for ch in my_message.chars() {
        chars.push(ctx.i8_type().const_int(ch as u64, false));
    }

    let global = module.add_global(&arr_type, Some(AddressSpace::Generic), "message");
    let oof: Vec<&IntValue> = chars.iter().map(|x| x).collect();

    let str_array = arr_type.const_array(&oof[..]);

    global.set_initializer(&str_array);

    let printf_type = i32_type.fn_type(&[&str_type], true);
    let printf = module.add_function("printf", &printf_type, Some(&Linkage::ExternalLinkage));

    let main_fn_type = i32_type.fn_type(&[&i32_type, &str_type.ptr_type(AddressSpace::Generic)], false);

    let main_fn = module.add_function("main", &main_fn_type, None);
    let block = ctx.append_basic_block(&main_fn, "entry");
    builder.position_at_end(&block);

    let value = builder.build_in_bounds_gep(&global.as_pointer_value(), &oof[..], "");

    builder.build_call(&printf, &[&value], "", false);
    builder.build_return(Some(&i32_type.const_int(0, false)));

//    println!("{:?}", module.print_to_string());
    module.print_to_file(Path::new("test.ll"));
}

At the line where value is bound, the program exits with a SIGSEGV

Environment

Operating system: Arch Linux
Kernel version: 4.15.3-1-ARCH
LLVM version: 5.0.1-2
Inkwell version: 0.1.0#eafca272
rustc -vV output:

rustc 1.25.0-nightly (3ec5a99aa 2018-02-14)
binary: rustc
commit-hash: 3ec5a99aaa0084d97a9e845b34fdf03d1462c475
commit-date: 2018-02-14
host: x86_64-unknown-linux-gnu
release: 1.25.0-nightly
LLVM version: 6.0
@TheDan64 TheDan64 added the bug label Feb 19, 2018
@TheDan64 TheDan64 added this to the 0.1.0 milestone Feb 19, 2018
@Redrield
Copy link
Author

Playing around more I found that it was just me not knowing how GEP works. Equivalent code using stuff from librustc_trans segfaults as well.

@nicokoch
Copy link

nicokoch commented Mar 5, 2018

This is still a bug though... segfaults are not allowed to happen in safe rust.
So maybe inkwell is missing some sanity checks?

@71
Copy link
Contributor

71 commented Mar 5, 2018

Unfortunately, I've only scratched the surface of what LLVM has to offer but I've learned it will segfault for ANYTHING, which is why most methods do not return status codes I suppose. While I agree that Inkwell should indeed provide a completely safe wrapper around LLVM, it has to be known that this will be very hard, or maybe impossible in some cases.

@TheDan64
Copy link
Owner

TheDan64 commented Mar 5, 2018

Yep, I agree with @6a's assessment. There are some things due to LLVM's nature that will just be too difficult to wrap and should either remain unsafe or safe with a documentation note (ie if it's just a rare edge case). I will reopen the ticket so we can at least triage and research what are our reasonable options to deal with this issue

@TheDan64 TheDan64 reopened this Mar 5, 2018
@TheDan64
Copy link
Owner

TheDan64 commented Mar 5, 2018

Thank you @nicokoch for bringing it up!

@71
Copy link
Contributor

71 commented Mar 5, 2018

In the meantime, we could start writing down somewhere all those "gotchas". I'll start: building instructions in a builder when no block has been created in that function will most likely result in a segfault.

@TheDan64
Copy link
Owner

TheDan64 commented Mar 5, 2018

@6a if you can get a concrete example of this segfault, feel free to open a separate issue. I think that's the best way to track them. Maybe I will start using a "gotcha" or "LLVM gotcha" label to these type of bug issues

@TheDan64 TheDan64 added the requires triage Currently assigned milestone may change pending additional research label May 9, 2018
@TheDan64
Copy link
Owner

Revisited this today and I'm not sure what can be done here... Maybe if we can encode pointer lengths into the type system, it could be smart enough to validate indexes that are OOB using something like the typenum crate or const generics might help. Going to push this until 0.2.0 since there's not much we can do about it at the moment.

@TheDan64 TheDan64 removed the requires triage Currently assigned milestone may change pending additional research label Jun 16, 2018
@TheDan64 TheDan64 modified the milestones: 0.1.0, 0.2.0 Jun 16, 2018
@TheDan64 TheDan64 changed the title Segmentation fault when calling Builder::build_in_bounds_gep Make GEP safe to use Jun 16, 2018
@TheDan64 TheDan64 changed the title Make GEP safe to use GEP is actually unsafe Jun 16, 2018
@Cypher1
Copy link

Cypher1 commented Mar 3, 2023

Does anyone have an example of safely constructing a global string with inkwell? I'm still unsure what the 'safe' way to do it is (and some of these examples no longer compile due to changes in the AddressSpace struct/enum, so I'm unsure what I'm doing wrong and if I'm making things worse as I 'fix' things)

@TheDan64 TheDan64 modified the milestones: 0.6.0, 0.7.0 Jul 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants