From d1a62c2188d08d04e6b9126b97f395fd5fde1220 Mon Sep 17 00:00:00 2001 From: CF Bolz-Tereick Date: Thu, 1 Aug 2024 17:15:10 +0200 Subject: [PATCH 1/8] start working on the tracing post --- posts/2024/08/state-of-tracing.md | 182 ++++++++++++++++++++++++++++++ 1 file changed, 182 insertions(+) create mode 100644 posts/2024/08/state-of-tracing.md diff --git a/posts/2024/08/state-of-tracing.md b/posts/2024/08/state-of-tracing.md new file mode 100644 index 000000000..9042016da --- /dev/null +++ b/posts/2024/08/state-of-tracing.md @@ -0,0 +1,182 @@ + + + + +"I'm curious what the current state of tracing JITs is. They used to be all the +rage for a while, then I though I heard they weren't so effective, then I +haven't heard of them at all. Is the latter because they are ubiquitous, or +because they proved to not work so well?" + +https://twitter.com/ShriramKMurthi/status/1818009884484583459 + +my opinion on tracing. this is such a complicated question, kind of too large for twitter. here's a thread that should be a blog post, with sections: + + +## Meta-tracing + +personal context: been working on pypy since ~20 years. pypy has a meta-JIT, +which allows it to re-use jit infrastructure for the various Python versions, +and also for some experimental different languages like Prolog, Racket, also an +ARM and RISC-V emulator + +PyPy gives itself the goal to try to be extremely compatible with all the +quirks of the Python language. so changing the language to make things easier +to compile is a no no. we try hard to have no opinions on language design, they +come up with the semantics, we somehow deal. + +PyPy started using a tracing JIT approach *not* because we thought method jits +are bad. but because we had failed to do a method-based meta-JIT that was using +partial evaluation (we wrote three or four method-based prototypes that all +weren't as good as we hoped). + +In the meta-JIT context tracing is nice, because tracing has relatively +understandable behavior and its easy(ish) to tweak how things work with extra +annotations in the interpreter source. + +meta-tracing often works well for us/pypy. It can often slice through the +complicated layers of Python quite effectively and remove a lot of overhead +(Python is more complicated than JS, imo. it's big and complex and growing) + +### Truffle + +later truffle came along and made a method-based meta-JIT using partial +evaluation work, but with a lot more people/resources and at first requiring a +quite specific style of interpreters + +it's still my impression that getting similar results with truffle is a lot +more work than with rpython and the warmup of truffle can often pretty bad. but +both are questions more for @smarr again + +## Tracing, the good + +the aggressive partial inlining of tracing, following just the hot path through +lots of layers of abstraction, is obviously often really useful for generating +fast code + +it should be possible to achieve the same effect in a method-based context with +path splitting. but it's not trivial, because the path execution counts of +inlined functions can often be very call-site dependent, and tracing gives you +call-site dependent path splitting + +(the aggressive partial inlining and path splitting is even more important in +the meta-tracing context of pypy, where some of layers are part of the runtime, +and where rare corner cases are basically absolutely everywhere) + +tracing makes a whole bunch of optimizations really easy to implement, because +there are (to first approximation) no control flow merges. This allows us to do +optimizations in exactly one forwards and one backwards pass. Eg our allocation +removal/partial escape analysis is simple + +in a tracing jit it can therefore be quite easy to get some pretty decent +optimizations. Our optimization of temporary allocations, the way we can reason +about the heap, about dictionary accesses, about properties of functions of the +runtime is all quite decent + + +## Tracing, the bad + +downsides of tracing: in my experience it tends to have big performance cliffs. +The 'good' cases are really good, but if something goes wrong you are annoyed +and performance can become a lot slower. with a simple method jit perf is more +"even" + +there are a bunch of strange corner cases that tracing has (when do you stop +inlining, what about tracing recursion, what happens if your traces are too +long, stuff like that) + +I agree with this too (and @samth and I have discussed it a few times when +working on Pycket): if you trace the bytecode dispatch loop of a bytecode +interpreter (or other interpreter-like control flow), you will get not great +results + +https://twitter.com/pnguyen0112/status/1818100321652199456 + +this is because the core assumption of the tracing jit "loops take similar +control flow paths" is just really wrong in the case of interpreters + + +this is because the core assumption of the tracing jit "loops take similar +control flow paths" is just really wrong in the case of interpreters + + + +## Discussion + +"This is a really great summary. Meta-tracing is probably the one biggest +success story. I think it has to do with how big and branchy the bytecode +implementations are for typical dynamic languages; the trace captures latent +type feedback naturally. + +There is an upper limit, tho." + +https://twitter.com/TitzerBL/status/1818385622203298265 + +Exactly, the complexity of py bytecodes is a big factor for why meta tracing +works well for us. But also in python there are many builtin types (collection +types, types that form the mop, stdlib modules implemented in C/rpython) and +tracing operations on them is important too + +Stefan Marr: +"I think Mozilla had a blog post talking more about the difficulty with +TraceMonkey, could only find this one: +https://blog.mozilla.org/nnethercote/category/jagermonkey/" +https://twitter.com/smarr/status/1818600052752797990 + +imo doing tracing for JS is really hard mode, because the browser is so +incredibly warmup-sensitive. IIRC tracemonkey used a really low loop trip count +(single-digit?) to decide when to start tracing (pypy uses >1000). the JS +interpreters of the time were also quite slow. + +Max Bernstein: +"What about basic block versioning?" +https://twitter.com/tekknolagi/status/1818368411157905482 + +It's another point in the phase space ;-). I like it a lot, and maybe it could +be pushed really far to give the best of both cfg-based and tracing approaches. +I'd be curious to see a BBV-based meta-JIT (but unfortunately writing meta-JITs +is super expensive in terms of time). + +Maxime Chevalier: +"There are a number of corner cases you have to deal with in a tracing JIT. It's +unfortunately not as simple and easy as the initial papers would have you +believe. One example is how would you deal with a loop inside a loop? Is your +tracing now recursive? + +There's been some research work on trace stitching to deal with trace explosion +but it does add complexity. With the increase in complexity, I think most +industrial VM developers would rather pick tried-and-true method-based JITs +that are well understood." + +https://twitter.com/Love2Code/status/1818292516753383644 + +## Conclusion + +In a non-meta-jit it's very unclear to me that you should use tracing. Rather +spend effort on a solid cfg-based baseline and then try to get some of the good +properties of tracing on top (path splitting, partial inlining, etc) + +in the meta-jit of pypy context I still think it's a relatively pragmatic +choice, and in the cases where it works well the performance of pypy is quite +hard to beat (particularly with the constraint of not being "allowed" to change +the language) + +this is all purely based on the data point of a single project, of course, +albeit one that has implemented a whole bunch of different languages. please +everyone tell me if you disagree with me. + +a side point: nobody in the current thread did this, but people who haven't +worked on python tend to underestimate its complexity. A pet peeve of mine is +C++ compiler devs/static analysis people/other well-meaning communities coming +with statements like "why don't you just..." 🤷‍♀️ + + From 09aadff88ed37eee2299cdc4d515f847af1a9c65 Mon Sep 17 00:00:00 2001 From: CF Bolz-Tereick Date: Fri, 23 Aug 2024 16:04:22 +0200 Subject: [PATCH 2/8] prosify --- posts/2024/08/state-of-tracing.md | 226 ++++++++++++++++-------------- 1 file changed, 124 insertions(+), 102 deletions(-) diff --git a/posts/2024/08/state-of-tracing.md b/posts/2024/08/state-of-tracing.md index 9042016da..617fa0207 100644 --- a/posts/2024/08/state-of-tracing.md +++ b/posts/2024/08/state-of-tracing.md @@ -10,143 +10,167 @@ .. author: CF Bolz-Tereick --> - +A few weeks ago, [Shriram Krishnamurthi](https://cs.brown.edu/~sk/) [asked on +Twitter](https://twitter.com/ShriramKMurthi/status/1818009884484583459): "I'm curious what the current state of tracing JITs is. They used to be all the rage for a while, then I though I heard they weren't so effective, then I haven't heard of them at all. Is the latter because they are ubiquitous, or because they proved to not work so well?" -https://twitter.com/ShriramKMurthi/status/1818009884484583459 - -my opinion on tracing. this is such a complicated question, kind of too large for twitter. here's a thread that should be a blog post, with sections: +I replied with my personal (partly pretty subjective) opinions about the +question in a lengthy Twitter thread (which also spawned an even lengthier +discussion). I wanted to turn what I wrote there into a blog post to make it +more widely available. The blog post i still somewhat terse, I've tried to at +least add links to further information. Please ask in the comments if something +is particularly unclear. ## Meta-tracing -personal context: been working on pypy since ~20 years. pypy has a meta-JIT, -which allows it to re-use jit infrastructure for the various Python versions, -and also for some experimental different languages like Prolog, Racket, also an -ARM and RISC-V emulator +First some personal context: my perspective is informed by nearly two decades +of work on PyPy. PyPy's implementation language, RPython, has support a +meta-JIT, which allows it to re-use its JIT infrastructure for the various +Python versions that we support (currently we do releases of PyPy2.7 and +PyPy3.10 together). We have also used the meta-JIT infrastructure for some +experimental different languages like Prolog, Racket, a database (those +implementations had various degrees of maturity and most of them are research +software and aren't maintained any more), but also some more surprising things +like an ARM and RISC-V emulator. PyPy gives itself the goal to try to be extremely compatible with all the -quirks of the Python language. so changing the language to make things easier -to compile is a no no. we try hard to have no opinions on language design, they -come up with the semantics, we somehow deal. - -PyPy started using a tracing JIT approach *not* because we thought method jits -are bad. but because we had failed to do a method-based meta-JIT that was using -partial evaluation (we wrote three or four method-based prototypes that all -weren't as good as we hoped). - -In the meta-JIT context tracing is nice, because tracing has relatively -understandable behavior and its easy(ish) to tweak how things work with extra -annotations in the interpreter source. +quirks of the Python language. We don't change the Python language to make +things easier to compile. We try very hard to have no opinions on language +design. The CPython core developers come up with the semantics, we somehow deal +with them. + +PyPy started using a tracing JIT approach *not* because we thought method-based +just-in-time compilers are bad. Historically we had tried to implemend a +method-based meta-JIT that was partial evaluation (we wrote three or four +method-based prototypes that all weren't as good as we hoped). After all those +experiments failed we switched to the tracing approach, and only at this point +did our meta-JIT start producing interesting performance. + +In the meta-JIT context tracing has good propreties, because tracing has +relatively understandable behavior and its easy(ish) to tweak how things work +with extra annotations in the interpreter source. + +Another reason why meta-tracing often works well for PyPy is that it can often +slice through the complicated layers of Python quite effectively and remove a +lot of overhead. Python is often described as simple, but I think that's +actually a misconception. On the implementation level it's a very big and +complicated language, and it is also continuously getting new features every +year (the language is quite a bit more complicated than Javascript, for +example). -meta-tracing often works well for us/pypy. It can often slice through the -complicated layers of Python quite effectively and remove a lot of overhead -(Python is more complicated than JS, imo. it's big and complex and growing) ### Truffle -later truffle came along and made a method-based meta-JIT using partial -evaluation work, but with a lot more people/resources and at first requiring a -quite specific style of interpreters - -it's still my impression that getting similar results with truffle is a lot -more work than with rpython and the warmup of truffle can often pretty bad. but -both are questions more for @smarr again - -## Tracing, the good - -the aggressive partial inlining of tracing, following just the hot path through -lots of layers of abstraction, is obviously often really useful for generating -fast code +Later Truffle came along and made a method-based meta-JIT using partial +evaluation work. However Truffle (and Graal) has had significantly more people +working on it and much more money invested. In addition, it at first required a +quite specific style of AST-based interpreters (in the last few years they have +also started supporting bytecode-based interpreters). -it should be possible to achieve the same effect in a method-based context with -path splitting. but it's not trivial, because the path execution counts of -inlined functions can often be very call-site dependent, and tracing gives you -call-site dependent path splitting +It's still my impression that getting similar results with Truffle is a lot +more work for language implementers than with RPython, and the warmup of +Truffle can often pretty bad. But Truffle is definitely an existence proof that +meta-JITs don't *have* to be based on tracing. -(the aggressive partial inlining and path splitting is even more important in -the meta-tracing context of pypy, where some of layers are part of the runtime, -and where rare corner cases are basically absolutely everywhere) -tracing makes a whole bunch of optimizations really easy to implement, because -there are (to first approximation) no control flow merges. This allows us to do -optimizations in exactly one forwards and one backwards pass. Eg our allocation -removal/partial escape analysis is simple +## Tracing, the good -in a tracing jit it can therefore be quite easy to get some pretty decent -optimizations. Our optimization of temporary allocations, the way we can reason -about the heap, about dictionary accesses, about properties of functions of the -runtime is all quite decent +Let's now discuss some of the advantages of tracing that go beyond the ease of +using tracing for a meta-JIT. +Tracing allows for doing very aggressive partial inlining, following just the +hot path through lots of layers of abstraction, is obviously often really +useful for generating fast code -## Tracing, the bad +It's definitely possible to achieve the same effect in a method-based context +with path splitting. But it requires a lot more implementation work and is not +trivial, because the path execution counts of inlined functions can often be +very call-site dependent, and tracing gives you call-site dependent path +splitting "for free". -downsides of tracing: in my experience it tends to have big performance cliffs. -The 'good' cases are really good, but if something goes wrong you are annoyed -and performance can become a lot slower. with a simple method jit perf is more -"even" +(The aggressive partial inlining and path splitting is even more important in +the meta-tracing context of PyPy, where some of inlined layers are a part of +the language runtime, and where rare corner cases that are never executed in +practice are basically absolutely everywhere.) -there are a bunch of strange corner cases that tracing has (when do you stop -inlining, what about tracing recursion, what happens if your traces are too -long, stuff like that) +Another advantage of tracing is that it makes a whole bunch of optimizations +really easy to implement, because there are (to first approximation) no control +flow merges. This makes all the optimizations that we do (super-)local +optimizations, that operate on a single (very long) basic block. This the JIT +to do the optimizations in exactly one forwards and one backwards pass. Eg our +allocation removal/partial escape analysis is simple. -I agree with this too (and @samth and I have discussed it a few times when -working on Pycket): if you trace the bytecode dispatch loop of a bytecode -interpreter (or other interpreter-like control flow), you will get not great -results +This ease of implementation of optimizations allowed us to implement some +pretty decent optimizations. Our optimization of temporary allocations, the way +we can reason about the heap, about dictionary accesses, about properties of +functions of the runtime, about the range and known bits of integer variables +is all quite solid. -https://twitter.com/pnguyen0112/status/1818100321652199456 -this is because the core assumption of the tracing jit "loops take similar -control flow paths" is just really wrong in the case of interpreters +## Tracing, the bad +Tracing also comes with a significant number of downsides. Probably the biggest +one is that it tends to have big performance cliffs (PyPy certainly has them, +and other tracing JITs such as TraceMonkey had them too). The 'good' cases are +really good, but if something goes wrong you are annoyed and performance can +become a lot slower. With a simple method jit the performance is often much +more "even". -this is because the core assumption of the tracing jit "loops take similar -control flow paths" is just really wrong in the case of interpreters +Another set of downsides is that tracing has a number of corner cases and +"weird" behaviour in certain situations. Questions such as: +- When do you stop inlining? +- What happens when you trace recursion? +- What happens if your traces are consistently too long, even without inling? +- and so on... +There are also some classes of programs that tend to perform quite poorly when +they are executed by a tracing JIT, bytecode interpreters in particularly, and +other extremely unpredictably branchy code. This is because the core assumption +of the tracing jit "loops take similar control flow paths" is just really wrong +in the case of interpreters. ## Discussion +The Twitter thread spawned quite a bit of discussion, please look at the +original thread. Here are three that I wanted to highlight: + "This is a really great summary. Meta-tracing is probably the one biggest success story. I think it has to do with how big and branchy the bytecode implementations are for typical dynamic languages; the trace captures latent type feedback naturally. -There is an upper limit, tho." +There is an upper limit, tho." [Ben Titzer](https://twitter.com/TitzerBL/status/1818385622203298265) -https://twitter.com/TitzerBL/status/1818385622203298265 +I agree with this completely, the complexity of Python bytecodes is a big +factor for why meta tracing works well for us. But also in Python there are +many builtin types (collection types, types that form the meta-object protocol +of Python, standard library modules implemented in C/RPython) and tracing +operations on them is very important too, for good performance. -Exactly, the complexity of py bytecodes is a big factor for why meta tracing -works well for us. But also in python there are many builtin types (collection -types, types that form the mop, stdlib modules implemented in C/rpython) and -tracing operations on them is important too -Stefan Marr: +---- + "I think Mozilla had a blog post talking more about the difficulty with TraceMonkey, could only find this one: https://blog.mozilla.org/nnethercote/category/jagermonkey/" -https://twitter.com/smarr/status/1818600052752797990 -imo doing tracing for JS is really hard mode, because the browser is so +[Stefan Marr](https://twitter.com/smarr/status/1818600052752797990) + +"imo doing tracing for JS is really hard mode, because the browser is so incredibly warmup-sensitive. IIRC tracemonkey used a really low loop trip count (single-digit?) to decide when to start tracing (pypy uses >1000). the JS -interpreters of the time were also quite slow. +interpreters of the time were also quite slow." -Max Bernstein: -"What about basic block versioning?" -https://twitter.com/tekknolagi/status/1818368411157905482 +[me](https://twitter.com/cfbolz/status/1818609594219811245) -It's another point in the phase space ;-). I like it a lot, and maybe it could -be pushed really far to give the best of both cfg-based and tracing approaches. -I'd be curious to see a BBV-based meta-JIT (but unfortunately writing meta-JITs -is super expensive in terms of time). +---- -Maxime Chevalier: "There are a number of corner cases you have to deal with in a tracing JIT. It's unfortunately not as simple and easy as the initial papers would have you believe. One example is how would you deal with a loop inside a loop? Is your @@ -157,26 +181,24 @@ but it does add complexity. With the increase in complexity, I think most industrial VM developers would rather pick tried-and-true method-based JITs that are well understood." -https://twitter.com/Love2Code/status/1818292516753383644 +[Maxime Chevalier](https://twitter.com/Love2Code/status/1818292516753383644) ## Conclusion -In a non-meta-jit it's very unclear to me that you should use tracing. Rather -spend effort on a solid cfg-based baseline and then try to get some of the good -properties of tracing on top (path splitting, partial inlining, etc) +In a non-meta-jit it's very unclear to me that you should use tracing. It makes +more sense to rather spend effort on a solid control-flow-graph-based baseline +and then try to get some of the good properties of tracing on top (path +splitting, partial inlining, etc). -in the meta-jit of pypy context I still think it's a relatively pragmatic -choice, and in the cases where it works well the performance of pypy is quite +For PyPy with it's meta-JIT I still think tracing is a relatively pragmatic +choice, and in the cases where it works well the performance of PyPy is quite hard to beat (particularly with the constraint of not being "allowed" to change -the language) - -this is all purely based on the data point of a single project, of course, -albeit one that has implemented a whole bunch of different languages. please -everyone tell me if you disagree with me. - -a side point: nobody in the current thread did this, but people who haven't -worked on python tend to underestimate its complexity. A pet peeve of mine is -C++ compiler devs/static analysis people/other well-meaning communities coming -with statements like "why don't you just..." 🤷‍♀️ +the language). +All of the above is all purely based on the data point of a single project, of +course, but one that has implemented a number of different languages. +(A side point: people who haven't worked on Python tend to underestimate +its complexity. A pet peeve of mine is C++ compiler devs/static +analysis/Javascript people/other well-meaning communities coming with +statements like "why don't you just..." 🤷‍♀️) From 16418a935b38c417511aa5963b2d093fe76c5fe5 Mon Sep 17 00:00:00 2001 From: CF Bolz-Tereick Date: Fri, 23 Aug 2024 16:10:53 +0200 Subject: [PATCH 3/8] move --- posts/2024/{08 => 09}/state-of-tracing.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename posts/2024/{08 => 09}/state-of-tracing.md (100%) diff --git a/posts/2024/08/state-of-tracing.md b/posts/2024/09/state-of-tracing.md similarity index 100% rename from posts/2024/08/state-of-tracing.md rename to posts/2024/09/state-of-tracing.md From c38f34643ecb645e995d0f8b893c28d36b582743 Mon Sep 17 00:00:00 2001 From: CF Bolz-Tereick Date: Fri, 23 Aug 2024 17:48:03 +0200 Subject: [PATCH 4/8] more edits --- posts/2024/09/state-of-tracing.md | 245 ++++++++++++++++++------------ 1 file changed, 149 insertions(+), 96 deletions(-) diff --git a/posts/2024/09/state-of-tracing.md b/posts/2024/09/state-of-tracing.md index 617fa0207..b0f3f70c9 100644 --- a/posts/2024/09/state-of-tracing.md +++ b/posts/2024/09/state-of-tracing.md @@ -1,7 +1,7 @@ -A few weeks ago, [Shriram Krishnamurthi](https://cs.brown.edu/~sk/) [asked on +Last summer, [Shriram Krishnamurthi](https://cs.brown.edu/~sk/) [asked on Twitter](https://twitter.com/ShriramKMurthi/status/1818009884484583459): > "I'm curious what the current state of tracing JITs is. They used to be all the @@ -18,19 +18,31 @@ Twitter](https://twitter.com/ShriramKMurthi/status/1818009884484583459): > haven't heard of them at all. Is the latter because they are ubiquitous, or > because they proved to not work so well?" -I replied with my personal (partly pretty subjective) opinions about the +I replied with my personal (pretty subjective) opinions about the question in a lengthy Twitter thread (which also spawned an even lengthier discussion). I wanted to turn what I wrote there into a blog post to make it -more widely available. The blog post i still somewhat terse, I've written a -small background section and tried to at least add links to further -information. Please ask in the comments if something is particularly unclear. +more widely available (Twitter is no longer easily consumable without an +account), and also because I'm mostly not using Twitter anymore. The blog post +i still somewhat terse, I've written a small background section and tried to at +least add links to further information. Please ask in the comments if something +is particularly unclear. ## Background -I'll explain a few of the central terms of the rest of the post. JIT compilers +I'll explain a few of the central terms of the rest of the post. *JIT compilers* are compilers that do their work at runtime, interleaved (or concurrent with) -the execution of the program. +the execution of the program. There are (at least) two common general styles of +JIT compiler architectures. The most common one is that of a method-based JIT, +which will compile one method or function at a time. Then there are tracing JIT +compilers, which generate code by tracing the execution of the user's program. +They often focus on loops as stheir main unit of compilation. + +Then there is the distinction between a "regular" JIT compiler and that of a +*meta-JIT*. A regular JIT is built to compile one specific source language to +machine code. A meta-JIT is a framework for building JIT compilers for a +variety of different languages, re-using as much machinery as possible between +the different implementation. ## Personal and Project Context @@ -70,7 +82,7 @@ JIT](https://en.wikipedia.org/wiki/Tracing_just-in-time_compilation) approach *not* because we thought method-based just-in-time compilers are bad. Historically we [had tried](https://foss.heptapod.net/pypy/extradoc/-/blob/branch/extradoc/eu-report/D08.2_JIT_Compiler_Architecture-2007-05-01.pdf) -to implemend a method-based meta-JIT that was partial evaluation (we wrote +to implemend a method-based meta-JIT that was using partial evaluation (we wrote three or four method-based prototypes that all weren't as good as we hoped). After all those [experiments failed](https://pypy.org/posts/2008/10/sprint-discussions-jit-generator-3301578822967655604.html) @@ -78,7 +90,7 @@ we switched to the [tracing approach](https://dl.acm.org/doi/10.1145/1565824.1565827), and only at this point did our meta-JIT start producing interesting performance. -In the meta-JIT context tracing has good propreties, because tracing has +In the meta-JIT context tracing has good properties, because tracing has relatively understandable behavior and its easy(ish) to tweak how things work [with extra annotations in the interpreter source](https://dl.acm.org/doi/10.1145/2069172.2069181). @@ -101,10 +113,10 @@ example[^help]). Later [Truffle](https://dl.acm.org/doi/abs/10.1145/2509578.2509581) came along and made a method-based meta-JIT using partial evaluation work. However Truffle -(and [Graal]()) has had significantly more people working on it and much more +(and [Graal](https://www.oracle.com/java/graalvm/)) has had significantly more people working on it and much more money invested. In addition, it at first required a quite specific style of [AST-based interpreters](https://dl.acm.org/doi/10.1145/2384577.2384587) (in -the last few years they have also started supporting bytecode-based +the last few years they have also added support for bytecode-based interpreters). It's still my impression that getting similar results with Truffle is [more @@ -117,27 +129,28 @@ meta-JITs don't *have* to be based on tracing. ## Tracing, the good -Let's now discuss some of the advantages of tracing that go beyond the ease of -using tracing for a meta-JIT. +Let's now actually get to he heart of Shriram's question and discuss some of +the advantages of tracing that go beyond the ease of using tracing for a +meta-JIT. Tracing allows for doing very aggressive [partial inlining](https://www.cs.fsu.edu/~xyuan/INTERACT-15/papers/paper11.pdf), -following just the hot path through lots of layers of abstraction, is obviously -often really useful for generating fast code +Following just the hot path through lots of layers of abstraction is obviously +often really useful for generating fast code. It's definitely possible to achieve the same effect in a method-based context with [path splitting](https://dl.acm.org/doi/pdf/10.1145/117954.117955). But it requires a lot more implementation work and is not trivial, because the path [execution counts](https://dl.acm.org/doi/10.1145/504282.504295) of inlined -functions can often be very call-site dependent, and tracing gives you -call-site dependent path splitting "for free". +functions can often be very call-site dependent. Tracing, on the other hand, +gives you call-site dependent path splitting "for free". (The aggressive partial inlining and path splitting is even more important in the meta-tracing context of PyPy, where some of inlined layers are a part of the language runtime, and where rare corner cases that are never executed in -practice are basically absolutely everywhere.) +practice are everywhere.) -Another advantage of tracing is that it makes a whole bunch of optimizations +Another advantage of tracing is that it makes a number of optimizations really easy to implement, because there are (to first approximation) no control flow merges. This makes all the optimizations that we do (super-)[local optimizations](https://en.wikipedia.org/wiki/Optimizing_compiler#Local_vs._global_scope), @@ -164,7 +177,7 @@ Tracing also comes with a significant number of downsides. Probably the biggest one is that it tends to have big performance cliffs (PyPy certainly has them, and other tracing JITs such as TraceMonkey had them too). In my experience the 'good' cases of tracing are really good, but if something goes wrong you are -annoyed and performance can become a lot slower. With a simple method jit the +annoyed and performance can become a lot slower. With a simple method JIT the performance is often much more "even". Another set of downsides is that tracing has a number of corner cases and @@ -199,7 +212,7 @@ highlight: [Ben Titzer](https://twitter.com/TitzerBL/status/1818385622203298265) -I agree with this completely, the complexity of Python bytecodes is a big +I agree with this completely. The complexity of Python bytecodes is a big factor for why meta tracing works well for us. But also in Python there are many builtin types (collection types, types that form the [meta-object protocol](https://en.wikipedia.org/wiki/Metaobject#Metaobject_protocol) of @@ -222,6 +235,11 @@ operations on them is very important too, for good performance. [me](https://twitter.com/cfbolz/status/1818609594219811245) +In the meantime there were some more reminiscences about tracing in Javascript +by [Shu-Yu Guo in a panel +discussion](https://www.youtube.com/live/_VF3pISRYRc?t=24797s) and by [Jason +Orendorff on Mastodon](https://kfogel.org/notice/AngH0uqyJl231yLLOa). + ---- > "There are a number of corner cases you have to deal with in a tracing JIT. It's @@ -262,7 +280,7 @@ meta-JIT approach. Instagram is running on [Cinder](https://github.com/facebookincubator/cinder/) and also CPython has [grown a JIT recently](https://tonybaloney.github.io/posts/python-gets-a-jit.html) which -will be part of the upcoming [3.13 release, but only as an off-by-default build +was part of the recent [3.13 release, but only as an off-by-default build option](https://docs.python.org/3.13/whatsnew/3.13.html#an-experimental-just-in-time-jit-compiler), so I'm very excited about how Python's performance will develop in the next years! From a7377b7d56e88f3d0c7bcfd5c1a0f43b2cd65e52 Mon Sep 17 00:00:00 2001 From: CF Bolz-Tereick Date: Sat, 4 Jan 2025 11:10:23 +0100 Subject: [PATCH 8/8] fix typos --- posts/2025/01/state-of-tracing.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/posts/2025/01/state-of-tracing.md b/posts/2025/01/state-of-tracing.md index 615567efb..dafccf412 100644 --- a/posts/2025/01/state-of-tracing.md +++ b/posts/2025/01/state-of-tracing.md @@ -49,7 +49,7 @@ the different implementation. Some personal context: my perspective is informed by nearly [two decades](https://mail.python.org/archives/list/pypy-dev@python.org/thread/TZM37YJ733G445R6JGTV26333RQEPLRX/) of work on PyPy. PyPy's implementation language, [RPython](https://rpython.readthedocs.io/), has support for a -meta-JIT, which allows it to re-use its JIT infrastructure for the various +meta-JIT, which allows it to reuse its JIT infrastructure for the various Python versions that we support (currently we do releases of PyPy2.7 and PyPy3.10 together). Our meta-JIT infrastructure has been used for some experimental different languages like: @@ -82,7 +82,7 @@ JIT](https://en.wikipedia.org/wiki/Tracing_just-in-time_compilation) approach *not* because we thought method-based just-in-time compilers are bad. Historically we [had tried](https://foss.heptapod.net/pypy/extradoc/-/blob/branch/extradoc/eu-report/D08.2_JIT_Compiler_Architecture-2007-05-01.pdf) -to implemend a method-based meta-JIT that was using partial evaluation (we wrote +to implement a method-based meta-JIT that was using partial evaluation (we wrote three or four method-based prototypes that all weren't as good as we hoped). After all those [experiments failed](https://pypy.org/posts/2008/10/sprint-discussions-jit-generator-3301578822967655604.html)