[OPTIMIZATION] Lazy Python => JS transcription ? #2318
Comments
My $0.02 on idea 'A': my use case may be a bit unorthodox. Basically, I'm making stand-alone programs (games) that are distributed as .html files and also meant to be run locally (locally as in file:/// not http://localhost:8000/). Since Chrome (and therefore most of the internet) treats file:/// as one domain for sharing localStorage, space there is at a premium: there's 2-5MB(?) to share between all games and programs. For example, saves from Twine (a popular HTML game engine) living in localStorage are known to take up 0.5-1.5MB for just one game. One script dumping some data into localStorage is probably not a problem, but if that data is not cleared when the page is closed, localStorage will fill up pretty quickly. I'm not sure one can even clear the data when the page is unloaded: I've always been told that one can't reliably catch the 'unload' event (especially on mobile), and 'beforeunload' is its own can of worms. For me, this would be a game-breaking change (unless it's opt-in, like the indexedDB thing). |
I'm not talking about putting the script into [...] If we assume each module name takes ~40 characters on average, then 1kB can store ~25 module names. |
Oh, my bad. |
Using |
Hi, sorry for the late answer. Some thoughts here:
A - the indexedDB API is not used to detect if a module is stored in the database, because it is asynchronous. The detection is made from [...]
B - it's hard to tell whether trading less translation time for more execution time is worth the extra complexity. If someone wants to experiment, that's fine with me.
C - that would be great! I am starting to play with ESLint; it seems to be able to detect at least part of such "dead code", for instance names that are declared but never used. |
I don't get it. You do need to store data in a persistent, but dynamic, way in order to detect if a module is stored in the dataset.
I guess the first step would be to make some benchmark on translation time.
For Python code, I guess using sys.settrace could help identify the lines that are never executed, but that would be a little harder for JS code. Well, this isn't the priority, but the idea is there. |
For the parser cost:
Parsing is a little too slow. A big chunk of its cost seems to come from [...] I think we'll have some work to do on the parsing. Maybe we could also provide some good practices for performance:
But I guess the priority is features first, then execution speed, then parsing speed, and then documentation/good practices.
Raw data:
<!DOCTYPE html>
<html>
<head>
<!-- Required meta tags-->
<meta charset="utf-8">
<title>X</title>
<!-- Brython -->
<script src="https://raw.githack.com/brython-dev/brython/master/www/src/brython.js"></script>
<!--<script src="https://raw.githack.com/brython-dev/brython/master/www/src/brython_stdlib.js"></script>-->
</head>
<body>
</body>
<script>
let data = "";
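// Python source to translate; the function body must be indented.
const pycode = `
def foo(i):
    return i+i
print( foo(3) )
`;
// Translate the same snippet repeatedly and time the whole loop (in ms).
let d = Date.now();
for(let i = 0; i < 1000000; ++i) {
    data += __BRYTHON__.py2js(pycode, "exec").to_js();
}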
let e = Date.now();
document.body.textContent = data.length + ' in ' + (e-d);
</script>
</html> |
This is explained in section "indexedDB cache for standard library modules" of the page How Brython works. Probably as hard to read as it was hard to write ;-) |
I found out that the JS parser is very, very slow, e.g. using [...] This means that, very often, it might be better to call a function at runtime instead of generating more JS code during translation. For example:
fct.$infos = {
...
... // LOTS of lines
...
...
};
Then doing something like:
setFctInfos(fct, "self, i, *arg"); // set the $infos
should be way quicker for JS to parse, and some operations are also moved from Brython parsing time to Brython execution time. This means the code will start executing sooner and should be faster overall. I see that Brython is calling |
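As a rough illustration of the runtime-helper idea above, here is a minimal sketch of what a `setFctInfos` helper could look like. The helper body and the field names (`argNames`, `vararg`, `kwarg`) are assumptions made for this example; they are not Brython's actual `$infos` layout.

```javascript
// Hypothetical helper: build the metadata object at runtime from a compact
// signature string, so the generated JS only contains one short call per
// function instead of a large object literal.
// The field names below are illustrative, not Brython's real $infos fields.
function setFctInfos(fct, signature) {
    const infos = { argNames: [], vararg: null, kwarg: null };
    for (const raw of signature.split(",")) {
        const name = raw.trim();
        if (name === "") {
            continue;                       // ignore empty entries
        }
        if (name.startsWith("**")) {
            infos.kwarg = name.slice(2);    // **kwargs style parameter
        } else if (name.startsWith("*")) {
            infos.vararg = name.slice(1);   // *args style parameter
        } else {
            infos.argNames.push(name);      // plain positional parameter
        }
    }
    fct.$infos = infos;
    return fct;
}
```

The trade-off is the one described in the comment: the emitted code gets shorter (less work for the JS parser), while the string parsing is paid once per function at execution time.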
Hi,
I thought of some ideas, some maybe ridiculous, but I think they are interesting to talk about.
@PierreQuentel what do you think?
A. Use localStorage to store the names of the known modules?
localStorage is synchronous and faster than indexedDB.
Using it like:
... would maybe enable quicker checks of whether a module is present in indexedDB?
Maybe also storing e.g. a hash of the script (e.g. myscript:hash,myscript2:hash2) to know whether we have to fetch it from indexedDB or transpile it, or some kind of cache policy?
Note: maybe we should compare the cost of indexedDB with the transcription cost for small Python scripts; maybe using indexedDB is not always worth it?
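A minimal sketch of idea A, assuming the `name:hash,name2:hash2` format mentioned above. The key name `brython_module_index` and all function names are hypothetical, nothing here is part of Brython. The functions take any object with `getItem`/`setItem` (such as `window.localStorage`) so the logic can be tried outside a browser.

```javascript
// Hypothetical storage key; not something Brython defines.
const INDEX_KEY = "brython_module_index";

// Parse "name:hash,name2:hash2" into a Map of module name -> hash.
// Assumes names contain no comma and hashes contain no colon.
function readModuleIndex(storage) {
    const raw = storage.getItem(INDEX_KEY) || "";
    const index = new Map();
    for (const entry of raw.split(",")) {
        const sep = entry.lastIndexOf(":");
        if (sep > 0) {
            index.set(entry.slice(0, sep), entry.slice(sep + 1));
        }
    }
    return index;
}

// Serialize the Map back into the compact string format.
function writeModuleIndex(storage, index) {
    const raw = [...index].map(([name, hash]) => `${name}:${hash}`).join(",");
    storage.setItem(INDEX_KEY, raw);
}

// Synchronous check: true only if the module is recorded with the same hash,
// i.e. the cached translation can be fetched instead of re-translating.
function isCached(storage, name, hash) {
    return readModuleIndex(storage).get(name) === hash;
}

// Record (or update) a module after its translation has been stored.
function recordModule(storage, name, hash) {
    const index = readModuleIndex(storage);
    index.set(name, hash);
    writeModuleIndex(storage, index);
}
```

In a page this would be called as `isCached(window.localStorage, "myscript", hash)`. Only names and hashes are stored, so the footprint stays small compared with storing the translated scripts themselves.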
B. JIT transcription
Currently, Brython does AOT transcription during page loading, storing the generated JS in indexedDB.
But there is no need to parse a function/class until its first use/call, even more so when it is never used/called by the script.
Maybe something like :
The goal would be to reduce transcription/loading time, knowing that a lot of functions are often never used/called.
This would allow execution to start sooner, at a small cost in runtime speed, which should be smaller than the total gain in transcription time.
The advantage is that this is adaptive.
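To make the lazy approach concrete, here is a minimal sketch of a wrapper that defers translation until the first call. `lazyFunction` and `translate` are hypothetical names; `translate` stands in for the Python => JS step and is faked here with `new Function`, so this is not how Brython generates code.

```javascript
// Wrap a function so that its (expensive) translation only runs on first call.
// `translate` is a hypothetical callback returning the compiled JS function.
function lazyFunction(translate) {
    let compiled = null;
    return function (...args) {
        if (compiled === null) {
            compiled = translate();        // cost is paid here, once
        }
        return compiled.apply(this, args); // later calls reuse the result
    };
}

// Counter used only to show how many times translation actually happens.
let translations = 0;

// Stand-in for "def foo(i): return i+i"; the body is built on demand.
const foo = lazyFunction(() => {
    translations += 1;
    return new Function("i", "return i + i;");
});
```

A function that is never called never triggers `translate`, which is where the loading-time gain would come from. The cost is one extra check per call.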
If we can detect that a function will be used, we can directly generate the JS code inside run().
C. Tree shaking / detection of dead code.
When generating a Brython bundle/module, being able to run a static analysis, or to run the script once, to detect the functions / parts of code that are never called, so that we can choose to remove them (by hand or automatically).
Cordially,