View function disassembly or raw instructions? #184
Comments
After more research, it seems horizontal ops are a bit weird in general, and even Rust's …

LLVM has the … I suggest writing the natural code with target-neutral LLVM IR, including the experimental vector reduce intrinsics, and once you get the resulting assembly, see if you can beat it. Previously I'd have recommended the Intel Architecture Code Analyzer for analyzing assembly performance, but their webpage now redirects to llvm-mca: https://llvm.org/docs/CommandGuide/llvm-mca.html
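To make the suggestion concrete, here is a minimal sketch of a target-neutral horizontal sum written with a vector reduce intrinsic. The function name `@hsum` is mine, not from this thread, and note the intrinsic spelling has varied across LLVM releases (`llvm.experimental.vector.reduce.fadd` in older versions, `llvm.vector.reduce.fadd` in newer ones):

```llvm
; Horizontal sum of a <4 x float> via the vector-reduce intrinsic.
define float @hsum(<4 x float> %v) {
entry:
  ; The first operand is the start/accumulator value for the fadd reduction.
  %sum = call float @llvm.vector.reduce.fadd.v4f32(float 0.0, <4 x float> %v)
  ret float %sum
}

declare float @llvm.vector.reduce.fadd.v4f32(float, <4 x float>)
```

Once you have the assembly the backend emits for this, you can feed it to llvm-mca (e.g. `llvm-mca -mcpu=skylake out.asm`) to estimate throughput and see whether a hand-written sequence would actually beat it.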
Well, the original point of the issue still stands: how would I go about viewing the generated machine code from Inkwell? I'm also not sure how to inject arbitrary LLVM IR other than by creating an entirely new module out of it.
I'm not certain you can do the former at the moment. For the latter, maybe …
I'll have to experiment with that. Perhaps I'll handwrite a few modules for common ops and rely on …
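For the "inject arbitrary LLVM IR" part, one route (a sketch, assuming the inkwell APIs available around this time; the IR, module names, and `@double` function are all illustrative) is to parse handwritten textual IR into a throwaway module and then link it into your main module:

```rust
use inkwell::context::Context;
use inkwell::memory_buffer::MemoryBuffer;

fn main() {
    // Hand-written IR to inject; everything here is illustrative.
    let ir = r#"
        define i32 @double(i32 %x) {
        entry:
          %r = add i32 %x, %x
          ret i32 %r
        }
    "#;

    let context = Context::create();

    // Parse the textual IR into a standalone module...
    let buffer = MemoryBuffer::create_from_memory_range_copy(ir.as_bytes(), "handwritten");
    let extra = context
        .create_module_from_ir(buffer)
        .expect("IR failed to parse");

    // ...then splice it into the main module rather than juggling two modules.
    let main_module = context.create_module("main");
    main_module.link_in_module(extra).expect("link failed");

    assert!(main_module.get_function("double").is_some());
}
```

After linking, the injected functions are ordinary members of the main module and can be JIT-compiled alongside everything else.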
Oh. Of course, this is already available:

```rust
Target::initialize_native(&InitializationConfig::default())
    .expect("Failed to initialize native target");
let triple = TargetMachine::get_default_triple();
let cpu = TargetMachine::get_host_cpu_name().to_string();
let features = TargetMachine::get_host_cpu_features().to_string();
let target = Target::from_triple(&triple).unwrap();
let machine = target
    .create_target_machine(
        &triple,
        &cpu,
        &features,
        OptimizationLevel::Aggressive,
        RelocMode::Default,
        CodeModel::Default,
    )
    .unwrap();

// create a module and do JIT stuff

machine
    .write_to_file(&module, FileType::Assembly, "out.asm".as_ref())
    .unwrap();
```

So yeah, it took me a while to find out how, but it does indeed save the whole assembly with labels, attributes, and so forth. It also confirms that it's producing highly optimized machine code, just like I hoped. However, some better documentation around target machines would be very helpful. Is it stateful? Does it actually affect codegen? Other than exporting that module, it doesn't touch the JIT code, so its effect is unknown. You're welcome to close this if this solution is acceptable, though my questions still stand.
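As a small aside (an untested sketch, reusing the `machine` and `module` bindings from the snippet above): if you'd rather inspect the assembly in memory than round-trip through a file, `TargetMachine` also offers `write_to_memory_buffer`:

```rust
use inkwell::targets::FileType;

// `machine` and `module` as constructed above.
let buf = machine
    .write_to_memory_buffer(&module, FileType::Assembly)
    .expect("codegen failed");

// The buffer's bytes are the textual assembly listing.
let asm = std::str::from_utf8(buf.as_slice()).expect("assembly should be UTF-8");
println!("{asm}");
```

Passing `FileType::Object` instead yields the raw object-file bytes, which is closer to "a pointer and length to where the code is" from the original question.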
I'm about to start using Inkwell for a highly optimized JIT system, and it would be great if there were a way to view the resulting compiled code, or even just to get a pointer and length to where the code lives so I can read it directly.
I'm aware of the `print_to_string`/`print_to_stderr` methods on `FunctionValue`, but those only seem to print the raw LLVM IR. Without access to horizontal vector ops, I'm hoping LLVM will be able to autovectorize vector sums and products well enough, but without a way to see the resulting instructions I can't know.
Please let me know if I'm missing something obvious! Also if you have any ideas for autovectorization or horizontal vector ops, I'd love to hear them.
Here is the kind of thing I plan to do:
which results in this LLVM IR:
which was printed after running the optimization passes shown in the Kaleidoscope demo; they didn't seem to change much. Adding the two "vectorize" passes didn't seem to do anything either.