-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracing API #32
Tracing API #32
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has a serious lack of comments :)
include/evmc/evmc.h
Outdated
size_t code_offset, | ||
enum evmc_status_code status_code, | ||
int64_t gas_left, | ||
const struct evmc_uint256be* top_stack_item, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this have a stack count? Does depth
above mean stack height or call depth?
include/evmc/evmc.h
Outdated
|
||
typedef void (*evmc_trace_callback)(struct evmc_tracer_context* context, | ||
int depth, | ||
int step, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be called pc
?
include/evmc/evmc.h
Outdated
typedef void (*evmc_trace_callback)(struct evmc_tracer_context* context, | ||
int depth, | ||
int step, | ||
size_t code_offset, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this pc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with "pc" name I have is not everyone knows exactly that it is exactly. I think code_offest
=== pc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what is code_offset
and what is step
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The step
is the number of executed instructions so far (only increases).
The code_offset
is the position in the code where we are currently. It's different than the step
because of jump and push instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
execution_step
perhaps? Not sure, step
, without any documentation gave me the wrong indication about what it is.
Why don't you want to have a documentation above it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also remove this because this can be implemented easily on the client side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add documentation soon. I just need to finish other PRs first.
Intentionally. I expect this to change a lot in the first phase. And I want this to be clear enough without comments. |
include/evmc/evmc.h
Outdated
size_t changed_memory_size, | ||
const uint8_t* changed_memory); | ||
|
||
typedef void (*evmc_set_tracer)(struct evmc_instance* instance, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this return a pointer to previously set callback?
Why do you need to declare a pointer to function (evmc_set_tracer
) if it's just to declare one function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can return previous callback, but it would be hard to also return the context, so you will not be able to restore previous state anyway.
(If I understand the question) The evmc_instance is a set of function pointers. It is easier to read the spec when function pointer types are declared separately. I forgot the _fn
suffix in this type name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, I didn't notice that it's a part of evmc_instance
, sorry
ce65d2d
to
99f8c03
Compare
The PoC is here: ethereum/aleth#5065 I believe the API is good enough to be merged as is. That would allow me to focus on geth support for EVMC. |
include/evmc/evmc.h
Outdated
* @param context The pointer to the Client-side tracing context. This allows to | ||
* implement the tracer in OOP manner. | ||
* @param step The instruction counter for the current message. | ||
* @param code_offset The instruction possition in the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: position
include/evmc/evmc.h
Outdated
* | ||
* @param context The pointer to the Client-side tracing context. This allows to | ||
* implement the tracer in OOP manner. | ||
* @param step The instruction counter for the current message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe to clarify it instead of "current message" call it " The instruction counter after the instruction execution."
And also below add "after the instruction execution" for code_offset, stack_num_items, memory_size
I changed the documentation. |
include/evmc/evmc.h
Outdated
* The callback to trace instructions execution in an EVM. | ||
* | ||
* This function informs the Client what instruction has been executed in the EVM implementation | ||
* and what are the results of the execution this particular instruction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
execution of this
or of executing this
* @param status_code The status code of the instruction execution. | ||
* @param gas_left The amount of the gas left after the instruction execution. | ||
* @param stack_num_items The current EVM stack height after the instruction execution. | ||
* @param pushed_stack_item The top EVM stack item pushed as the result of the instruction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So based on these two the client needs to apply the stack changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tracer want to know what's on stack it has to track the changes. This is even more complicated when you consider SWAP and DUP instructions - the tracer would have to reapply SWAP and DUP semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure there aren't ever going to be instructions pushing more than one item?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory we have them now in form of DUP and SWAP.
In practice, not likely needed. We can also change the API in future. I will check WASM instructions.
include/evmc/evmc.h
Outdated
* the instruction execution. | ||
* @param changed_memory_size The size of the memory area modified as the result of | ||
* the instruction execution. | ||
* @param changed_memory The pointer to the memory area modified as the result of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed if one can calculate that via changed_memory_offset
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot calculate it via changed_memory_offset
without having the base memory pointer. We can return the base memory pointer instead of the pointer to the changed area. However that would put higher requirements on EVM memory layout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably best (esp for other implementations) to directly pass back the changed memory. Not all language give you access to memory via pointers, so having an explicit argument present here will make it easier to use as a guide for implementing traces in other languages.
My question is how will you handle it if there are multiple disjoint changed regions of memory? What if they are very spaced out (as is often the case when using the "hash of variable name" scheme for deciding where to place things)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
directly pass back the changed memory
I'm not sure what you exactly mean by this.
My question is how will you handle it if there are multiple disjoint changed regions of memory?
You don't. There should be a way to report without information about memory. Something like changed_memory_size == -1
. I'm leaving it unanswered because I don't have good solution at the moment. I prefer do defer the decision until we have the use case for it.
include/evmc/evmc.h
Outdated
/** | ||
* Sets the EVM instruction tracer. | ||
* | ||
* When the tracer is set in the EVM instance, the EVM SHOULD callback the tracer with information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
callback
-> call
or call back
* | ||
* If the EVM does not support this feature the pointer can be NULL. | ||
*/ | ||
evmc_set_tracer_fn set_tracer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be unset after its being set?
I don't fully remember (and seemingly our documentation isn't clear enough): can an evmc_instance
reused for multiple executions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Answered in the latest commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That commit looks good.
I pushed additional commit with improved documentation. I will squeeze them later (easier to review changes now). |
include/evmc/evmc.h
Outdated
* This counter starts from 0 for every message passed to | ||
* ::evmc_execute_fn. | ||
* @param code_offset The current instruction position in the code. | ||
* @param status_code The status code of the instruction execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is useful (99% of the time the value is 0). And the Client can get it from evm_result in the end.
include/evmc/evmc.h
Outdated
* implement the tracer in OOP manner. | ||
* @param step The instruction counter: number of instructions executed. | ||
* This counter starts from 0 for every message passed to | ||
* ::evmc_execute_fn. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why call this step
instead of the more conventional programCounter
? Is this to facilitate ewasm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #32 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the program counter. Program counter points the instruction being executed, so it's code_offset
in this case. The step
is the trace entry index.
I'm ok with renaming any of those if you have better ideas for names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about tracing_step
or trace_index
? Why is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Precisely I wanted the "instruction counter" per current message (i.e. EVM call). This is not not needed (because the Host can count tracing callback calls on its own) UNLESS the VM will not report every instruction.
I propose to rename it to instruction_counter
or remove it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's why trace_counter
/trace_step
/trace_index
is the appropriate name.
I'm not sure this is a useful field though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean this case where VM only traces storage instructions:
PUSH 0
SLOAD
PUSH 9
SSTORE
When counting the number of traces you should get:
(0, SLOAD, ...)
(1, SSTORE, ...)
When counting number of instructions you should get:
(1, SLOAD, ...)
(3, SSTORE, ...)
In the second case the Host can figure out that some executed instructions are missing in the trace. In the first case the counter is redundant as the Host can count it itself.
I'm going to remove the param.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean this case where VM only traces storage instructions:
You mean a specific VM won't support tracing every instruction?
I have no problem keeping it, but step
seems to be rather confusing. Consider some of the suggestions above.
Rebased. |
@@ -9,6 +9,8 @@ struct examplevm | |||
{ | |||
struct evmc_instance instance; | |||
int verbose; | |||
evmc_trace_callback trace_callback; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be useful to have another callback to allow the EVM to log text back to the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of a separate callback, actually it could just be added as an additional parameter to evmc_trace_callback
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, EVM can just do this via stderr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The C++ std::cerr
works. We could add a logger method to capture VM logs nicely on the Host side. But that is different feature and should be done independently.
95d1f23
to
925b714
Compare
The tracing API has been added.
* The pointer MUST NOT be stored by the Client. | ||
* The Client MUST NOT assume that | ||
* `changed_memory - changed_memory_offset` is a valid base pointer | ||
* of the VM memory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad we discussed these on Gitter, much more cleaner now.
Fixes #22.