Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Suggestions to improve imports #2276

Closed
Tracked by #2250
denis-migdal opened this issue Oct 12, 2023 · 23 comments
Closed
Tracked by #2250

[FEATURE] Suggestions to improve imports #2276

denis-migdal opened this issue Oct 12, 2023 · 23 comments

Comments

@denis-migdal
Copy link
Contributor

denis-migdal commented Oct 12, 2023

Hi,

Here some suggestions to improve imports.

A. When searching for a module, search for all directories at once

To find a module, Python search them in different directories one by one.
This isn't an issue on Desktop, but on a website, it means sending a query, waiting for an answer, if not found sending another query, etc.

A most efficient way could be to send several async queries to search for the module in different dir at the same time.

  1. Search if module already known, if not :
  2. Send multiple queries [Q1,Q2,Q3,Q4...] at once.
  3. We know the correct module query responses Qn when there are no correct answer for all Qi (i<n) : i.e. we don't need to wait for Qn+1 if Qn has a correct answer.

B. Allows to define module path aliases

One even more efficient ways would be to define aliases to precise where to find a given module path, e.g. something like :

{
       "brython.{module_name}": "{$BRYTHON_DIR}/Lib/{module_path}.py",
       "myapp.{module_name}": "/src/brython/{module_name}/index.py",
       "mylib.{module_name}": "https://www.mylib.fr:8080/{module_path}",
}

The behavior would then be :

  1. Search if module already known, if not :
  2. Test the first alias, if it matches the key, try to import the module :
    - e.g. "import brython.browser.console" => "/Brython-3.19/Lib/browser/console.py"
    - e.g. "import myapp.browser.console" => "/src/brython/browser.console/index.py"
  3. If not found, try the next alias in the list until the list is over.
  4. If not found fallback into searching in the directories.

This would have the side-effect of allowing the generate kind of "maps", listing all modules and their path, we could give to Brython for an efficient look up. Such aliases could also be "scoped" in Brython module by the "init.py" files ?
i.e. they could define their own aliases to apply inside their files (and not outside).

Such paths could be given either by an HTML tag or by a file.

C. Improve support of JS import

Could be nice to be able to import ES6 module like we import Brython modules, i.e. with from js.myapp import foo
This could be made possible thanks to the previous suggestion, simply by adding an aliases resolving a module to a .js file.

There are 5 types of JS scripts :

  1. simple files : we can load thanks to the load() function, and that we shouldn't import.
  2. files imported by a "script" tag. Then it could be nice to be able, e.g. if the <script> tag has an id= attribute, to add them as a module. But I'm not sure how it could be made.
  3. files having a Brython module format (I belied (function($B){})(__BRYTHON) ) or is this something like 1. we'd use with load() ?
  4. files that exposes a "$module" variable.
  5. ES6 files.

I'll advise the following behavior:

  1. replace the 4th type of JS scripts by an ES6 compatible format, "$module =" would becomes "module.exports =".
  2. if the filename ends with ".js" in the aliases, use JS let module = await import(path) to import the module.

D. Exporting Brython modules as an ES6 module

Of course, trying to import a Brython module inside a JS ES6 module wouldn't work as JS modules have no idea on how to handle Brython files.

But, we could have a brython-es6.js file, to which we could give in its GET parameters brython module e.g. brython-es6.js?module=XXXX that would do :

let module = .... ; // dunno how to get that

async function loadBrythonModule(module) {
   let brythonObj = ...; // load + transpile Brython
   
   return ....
}

module.exports = await loadBrythonModule(module);
@PierreQuentel
Copy link
Contributor

I'm afraid option A is not possible, because imports are synchronous in Python. They are implemented in Brython with blocking Ajax calls - those have been marked as deprecated for ages, but fortunately are still working. If they are removed some day, I will be in trouble...

For option B (or A bis) :

  • there is already a mapping for the modules in the standard library (only used of course if brython_stdlib.js is not loaded)
  • for user-defined or 3rd party modules and packages, there are already 2 ways to optimize imports: create a bundle with brython-cli make_modules or a Brython package with brython-cli make_package

I am reluctant to adding another feature to optimize imports, which would be less efficient than these bundles anyway (both eliminate the need for Ajax calls).

More on the other items later.

Merci !

@denis-migdal
Copy link
Contributor Author

I'm afraid option A is not possible, because imports are synchronous in Python. They are implemented in Brython with blocking Ajax calls - those have been marked as deprecated for ages, but fortunately are still working. If they are removed some day, I will be in trouble...

For "standard" import the solution is quite simple.
When building the AST, get the list of the import, and fetch them, build their AST too, fetch their dependencies, etc. before really starting to execute the Brython file.

But indeed for some import functions of importlib, that'd be hard/impossible...
E.g. for SourceFileLoader(module_name, str(file) ).load_module().

With WebWorker, it seems we can pause them... the issue is that we can't put Brython in a WebWorker as it doesn't have DOM access...

That'd need an active wait it seems...
Fortunately, downloading is quick so it'll be good (and I hopes SourceFileLoader(module_name, str(file) ).load_module(). wouldn't be called too often. And throw an exception "FileNotFound" if active wait takes too much time.

For the A. that should indeed be B. I'll answer later too.

@denis-migdal
Copy link
Contributor Author

Okay, found a game changing idea for brython-cli make_package for a simple but perfect integration with all modern WebDev tools :)

I have to finish some work, then I'll explain it here ;)

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 13, 2023

Here my proposal :

Introduce 2 new brython-cli commands:

brython-cli build [OPTIONS...] # execute all brython-cli commands required for your application
brython-cli watch [OPTIONS...] # like brython-cli build, but in watch mode.

Watch mode is a very useful thing for large project and is often used in web development tools. It runs in the background, listening to the file changes (e.g. with ionotify-wait) and recreate the output for the modified files. It caches the results for all files in order to speed up the process and only recompute what needs to be recomputed.

The first option would be --config $PATH. By default, its value would be $ROOT_DIR/brython.json. It is very common as it enables to write the options of the build and watch commands into a file, which is very helpful when the command line becomes long, and avoid having to remember and write all the necessary options.

This configuration could be, in development mode, be included in an HTML Tag, either within as a JSON, or with a src= attribute.

The file would be a JSON with the following structure:

{
    "root-dir": [PATH], // CLI option "--root-dir", by default is the directory containing this file.
    // if no brython.json, would be the current working directory.
    
   "brython-dir": [PATH],// CLI option "--brython-dir", by default "."
   // the path where we can find brython.js files (relative to "root-dir").
   "output-dir": [PATH], // CLI option "--output-dir" by default "./build/brython", relative to "root-dir"
    // this enables to easily use WebDev tools like "webpack watch"/"Live Server"/"browser-sync"
    // users would make them watch "output-dir", and exclude the sources dir / .py files from the watch.
          
    "modules": {
         "build-target": ["brython-legacy"|"es6"], // CLI option "--modules.build-target", by default the "brython-legacy" ?
         //   - brython-legacy : current behavior
         //   - es6 : output es6 JS modules, one JS file for one brython file.
         // the advantage of the "es6" option is be perfectly integrate with current webdev tools like WebPack.
         // without the need to write special plugins/loaders for theses tools.
         
          "paths": {
               " [module_prefix]": [path], // CLI option "--modules.path [prefix]=[glob path]", can have many
               // e.g. => "my.app.": "./src/brython/myapp/"
               // module names would be relative to "./src/brython/myapp/", and be prefixed by "my.app".
               // this allows to select the modules we want to include (or not) to the build.
          }

          // I don't know how we should handle Python dependencies.
          // Maybe automatically adding the relevant files in the output-dir ?
    }
}

Ofc, this would need a tutorial to use them, both in development and production mode.
But that'd ensure easy and full compatibility with modern web development tools, which would likely helps issues :

Now remains the issue about import, I'll answer that in my next message.

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

Okay, so my idea to handle imports...

There are 4 types of imports :

  • async imports : no issues with them
  • sync imports (Brython) : the idea is to detect them during the AST building, and to start pre-fetching the ressources (and their ressources) before executing the produced JS code.
  • sync imports (ES6 JS) : at the beginning of the code : no issues. Inside block of codes, need to pre-import it (so might have some stuff executed sooner than expected). I'd suggest to raise an exception "unsupported import of JS module inside block of code".
  • sync dynamics imports : I'll suggest to require the user to prefetch it explicitly, else raising an exception "unfound module: if it exists, use async imports or prefetch it".

If we could produce ES6 module from Brython files :

  • sync imports (ES6 JS) : written as import {} from XXX. No further issues.
  • sync imports (Brython) : not possible, should be converted to ES6 module to be importable.

So the real issue is for sync dynamics imports, but that should not be quite frequent and would be bad practices. Though we'd offers way to prefetch it to somehow support it.

Else, for brython.js, it should initialize __BRYTHON__ variables and builtins modules synchronously so that we could access them from JS scripts when testing stuff or doing other crazy things :

  • Store the essential builtins in localstorage (this is sync, but only 10M available).
  • Store the builtins as JS inside brython.js ? instead of storing them as strings of Python code ?
  • Add a function : __BRYTHON__.whenImported(module_name) returning a Promise, that we could use in JS code (and that you could also use internally to solve dependencies) ?

Brython then would start executing Brython code found in the page, in an async ways (after DOMContentLoaded I guess ?).

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

@PierreQuentel JS has a new live-saving feature that'd save your life :)

A life saving future JS feature that'll solve all our issues : Top Level Await - which seems already implemented.

AND IT WORKS !

<script type="module">
async function foo() {
    return 4
}

await foo();

console.log('ok')
</script>

This won't help solving sync dynamic import in sync bloc code, but, hey, that's quite good enough ^^.

Another interesting feature worth looking at it : import maps.

@PierreQuentel
Copy link
Contributor

For "standard" import the solution is quite simple.
When building the AST, get the list of the import, and fetch them, build their AST too, fetch their dependencies, etc. before really starting to execute the Brython file.

It is quite simple in theory, but problematic to implement in practice.

If the module has import datetime, because of all the dependencies between standard library modules, you would have dozens of scripts to process before you can start running the Brython program, including modules that don't have much to do with dates and times such as pdb, zipimport, and even modules that are not supported by Brython such as socketserver or xml (you can use the stdlib modulefinder to check).

The strategy you suggest is used by brython_cli make_modules and results in brython_modules.js scripts that are sometimes surprisingly huge. If your directory only has a html file with the one-line Brython script import datetime, with Brython 3.12.0 brython_modules.js has no less that 3,2MB...

@PierreQuentel
Copy link
Contributor

A life saving future JS feature that'll solve all our issues : Top Level Await - which seems already implemented.

I have read about this feature, but I have not found how to take advantage of it to implement import.

If we have this Brython script

import some_module
print(some_module.x)

where some_module.py is a Python module in the same directory, how do you see it translated into Javascript using top-level await ?

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

If we have this Brython script

import some_module
print(some_module.x)

where some_module.py is a Python module in the same directory, how do you see it translated into Javascript using top-level await ?

If we handle it as a standalone file i.e. not evaluated by brython.js, i.e. pre-converted and saved to a file.
(I'll do the eval version later).

// To potentially wait Brython to be ready :
await ___BRYTHON__.whenReady();
// or
await new Promise( (resolve, error) => {
     document.addEventListener('BRYTHON_READY', resolve );
});
// or
import * as $B from 'brython.js'; // I think it requires <script type="module"/> or to be imported by another script ?
// or
const $B = await import('brython.js');


let some_module = await __BRYTHON__.fetchModule("some_module") // if "some_module" is Python module (ofc)
// if "from some_module import ...."
let {....} = await __BRYTHON__.fetchModule("some_module");

// do your print and other stuff here.

function foo() {}

// if you want it to handle ES6 export (which could be a good thing) ?
// but not authorized if not imported by a script, or not in <script type=module> ?
export foo;

With, in brython.js :

// returns null if not found.
async fetchModuleFromIndexDB(name) {
    // do the stuff here with lot of awaits.
}

async downloadModule(name) {

     let response = await fetch(....);
     while(response.status === 404) { // ?
           // try another one
            response = await fetch(....);
     }
     
     // OR for parallel candidate search :
     let candidates = [.........].
     let answers = await Promise.all(candidates.map( c => fetch(c) ) );
     response = .... // the first good answer.
     
     return await response.getText
}

async fetchModule(name) {

       if( name in __BRYTHON__.imported )
             return __BRYTHON__.imported[name];

       let module = await fetchModuleFromIndexDB(name);
       if( module === null ) {
             let str = await downloadModule('name');
             
             // do the JS conversion
             // eval the JS
             
             // put it in indexDB
             // put it in __BRYTHON__.imported ?
             
             return {} // an object containing all top-level symbol of the file (functions, variables, etc.)
       }
}

window.__BRYTHON__.fetchModule = fetchModule;
// or using ES6 export
export fetchModule

I did it from memory, this is more pseudo-code, but the idea is here.

@denis-migdal
Copy link
Contributor Author

If the module has import datetime, because of all the dependencies between standard library modules, you would have dozens of scripts to process before you can start running the Brython program

You could have, when building the package, all dependencies saved at the top of the package ?
So not needing to process them again ?

But with top-level import, that is less an issue I guess (I'll have to develop on that).

including modules that don't have much to do with dates and times such as pdb, zipimport, and even modules that are not supported by Brython such as socketserver or xml (you can use the stdlib modulefinder to check).

Humm... so you'll need like a blacklist ?

Another solution, is converting it to ES6 JS files, and let dedicated tool do the tree cutting (e.g. Webpack) ?
But it seems that Python modules are quite dirty ?

Why the fuck is datetime having a socketserver dependency ???

The strategy you suggest is used by brython_cli make_modules and results in brython_modules.js scripts that are sometimes surprisingly huge. If your directory only has a html file with the one-line Brython script import datetime, with Brython 3.12.0 brython_modules.js has no less that 3,2MB...

I'll think more about it, with the async in eval.

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

Okay, in eval, you can wrap your code into an (async function() {})(), then await eval(yourcode).
This allows top level await in your eval, that you can then wait.

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

For dynamic synchronous import that are not at top level... just require them to be preloaded by the user (the ones he needs), or ask them to use async ? Maybe enabling it to define a blacklist for some ? Or a white list for some sync dependencies that won't be prefetched ?

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

Instead of the eval, you can insert a <script> tag, that'd be likely cleaner ?

Which seems very interresting...

Capture d’écran_2023-10-14_16-55-59

@denis-migdal
Copy link
Contributor Author

var script_tag = document.createElement('script');
script_tag.type = 'module';
script_tag.text = 'console.log("ok"); async function foo(){ console.log("ddd") }; await foo();'
//script_tag.text = 'alert("hello world")';
document.body.appendChild(script_tag);
console.log('loaded');
ok
ddd
loaded

Have to check if order is still good if we do an operation like fetch...

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

Okay, I'm sad.

<script type="text/javascript">
  
  var script_tag = document.createElement('script');
  script_tag.type = 'module';
  script_tag.text = 'console.log("ok"); async function foo(){ let r = await fetch("toto"); console.log(r.status) }; await foo();'
  //script_tag.text = 'alert("hello world")';
  document.body.appendChild(script_tag);
  console.log('loaded');
</script>
ok
loaded
404

But still, using a script tag instead of eval() could be cleaner, and would produce JS code we could save in file and use it as a normal JS file.

And would likely enables to use ES6 module import/export in the produced JS code.

EDIT: with an import or a src attribute, "loaded" is printed first...

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

One solution :

  • start using fetch when it is possible, and await fetch all top-level static sync imports.
  • for dynamic or non-top level sync imports use synchronous AJAX and prints a warning ("[depreciated] this type of import may no longer be possible due to the evolution of Web standards.") urging users to :
    • explicitly asking Brython to prefetch it.
    • explicitly asking Brython to put it in a blacklist.

Not sure if there are other ways around non-top level sync import...

@denis-migdal
Copy link
Contributor Author

denis-migdal commented Oct 14, 2023

Another solution, is converting it to ES6 JS files, and let dedicated tool do the tree cutting (e.g. Webpack) ?

But still, using a script tag instead of eval() could be cleaner, and would produce JS code we could save in file and use it as a normal JS file.

And would likely enables to use ES6 module import/export in the produced JS code.

I'll develop on that.

1. It enables to copy/paste from the Editor.

Imagine, at execution time, Brython putting all generated JS code into an ES6-compatible module <script type=module>.
This would mean that you could also put this code inside files, and to import such modules normally, like you would with ES6 modules, without having to do any special operations.

So e.g. copy-paste the JavaScript generated from the Editor to a file and use it directly, e.g. to test the code in JSPerf, in its own project, with some profiling tools, etc.

2. It enables better JS <=> Brython interactions

You could import JS ES6 modules in Brython files, and Brython modules in ES6 modules.
Dunno how the import would work for an export inside a non-src <script>.

If it isn't possible, maybe would ask Brython is the module exists, then try to load them in the ES6 way, and lastly do the Brython path search ?
Or the strategy would depend on the build mode ? I dunno, I'll have to check on that.

EDIT: it is not possible.

Then doing, if generated during runtime and ... not known by Brython :

let x;

try {
     x = await import( '....' ) // try fetching ES6 module
} catch() {
    x = await __BRYTHON__.fetchModule(...) // else try other paths.
}

let {......} = x;

If generated during runtime or building time, and ... known by Brython as a JS ES6 module file :

// + wait for brython to be ready if generated at building time...
import {....} from '.....';

If generated during runtime or building time, and ... known by Brython as a .py module file that will not be converted to an ES6 module :

// + wait for brython to be ready if generated at building time...
let {....} = await __BRYTHON__.fetchModule(...);

If generated during runtime and ... known by Brython as an already loaded Brython .py file :

x = await __BRYTHON__.getModule(...)

3. Without doing anything, it solves the issue of big dependencies (in number of files or size).

This means that for big dependencies like brython_sdtlib.js, you could instead have many .js ES6 modules files that will be synchronously imported by the Brython code that needs then, when needed. Thus considerably reducing what is downloaded (4MB? for brython_sdtlib.js), without doing anything, having to guess/search the .py path, or to add 50 <script src='...'> tags.

When debugging a watch mode to brython-cli could also make your life easier, regenerating the .js files when the source is modified.

4. It enables to use dedicated WebDev tools to generate bundles, do tree shaking, etc.

Currently, you can create packages with brython-cli commands, as well as re-building a lighter brython_stdlib.js.

But it'll becomes harder for a project that'd have many files, with potentially many dependencies. Thus requiring to add features and to maintain this tool. Which is a lot of work.

If, when building the production version of your website, your generated script use ES6 imports to import its JS dependencies (likely other generated scripts), you would then use any bundler you like e.g. Parcel, WebPack, etc. Tools that have lot of features, lot of possible configurations, plugins, etc. that you can't expect brython-cli to be able to do all that.

Therefore, you'll be able to rely on existing WebDev tool, and integrate with them. Moreover if brython-cli has a watch mode, you'd have 0 adaptations to use them and be able to perfectly integrate it with WebDev pipelines and tools.

@PierreQuentel
Copy link
Contributor

I have to admit I am a little lost with all these suggestions ;-)

I did a quick try with dynamic insertion of a script of type "module" with top-level awaits, and it actually worked. Making it a replacement for imports is not an easy task, and as you noticed it wouldn't work in Web Workers anyway. I keep it in mind in case blocking Ajax calls are removed some day...

The chain that leads from datetime to socketserver is hard to follow (you need tools for that...) but one of the issues is that some imports are inside functions or class methods; with your proposal to scan all imports recursively before running the script, you have to process these imports, even if your code never calls the functions or methods that import them.

For the other suggestions, could you try to implement one or several ones in branches and submit Pull Requests ? I realize it can be sometimes hard to understand how the current code works and I can help for that.

@denis-migdal
Copy link
Contributor Author

I have to admit I am a little lost with all these suggestions ;-)

I'll try to make a summary of this issue someday.
I keep thinking about that and try to find possible solutions. Sometimes I get ideas minutes after posting xD.

So yeah, I understand that this isn't clear.

I did a quick try with dynamic insertion of a script of type "module" with top-level awaits, and it actually worked. Making it a replacement for imports is not an easy task, and as you noticed it wouldn't work in Web Workers anyway. I keep it in mind in case blocking Ajax calls are removed some day...

It seems that you can declare them like that :
new Worker("worker.js", { type: "module" });

Cf https://stackoverflow.com/questions/44118600/web-workers-how-to-import-modules

I think top level await import would then be possible inside Workers if they are declared as a module.

For Web Workers, you also have importScript() that is synchronous, if you don't declare it to be a module.

The chain that leads from datetime to socketserver is hard to follow (you need tools for that...) but one of the issues is that some imports are inside functions or class methods; with your proposal to scan all imports recursively before running the script, you have to process these imports, even if your code never calls the functions or methods that import them.

Mmm.... Python is really doing dirty things...

For the other suggestions, could you try to implement one or several ones in branches and submit Pull Requests ? I realize it can be sometimes hard to understand how the current code works and I can help for that.

I am not familiar enough with the AST.
I understand now the generated JS code, and some of Brython stuff, but not enough about the AST.

I can write some mocks on my free time, but touching at a core part of Brython would be a little too hard.
I'll also have quite some works in the following days.

Currently, I have 19 opened issues which is a little too much at the same time.
Maybe I can open a "[Project]" issue, where I'll focus on ONE thing, the thing I find the most important, proposing a roadmap, steps by steps, trying to suggest little changes at each steps, while not bringing too much changes in current Brython ?

First project could be about ES6 module generation for Brython, aiming for better integration with WebDev tools / JS ES6 modules ?

I'd then suggest some steps for an alternative code generation (e.g. through an option in brython-cli), as simple as possible, potentially not enabling 100% of Brython features, and steps by steps bringing more features until a satisfactory state, even if not optimal ?
Then we could discuss e.g. through a visio about the steps, i'd revise the steps following our discussion, and i'd try to provide you with mocks ?

@PierreQuentel
Copy link
Contributor

Web workers is just one of the issues if imports are implemented as await $B.import(module_name).

If there is an import inside a function, the function must be made asynchronous; the programs that call this function must await it, but static code analysis won't tell you which programs might call functions that must be awaited because they import modules.

@denis-migdal
Copy link
Contributor Author

If there is an import inside a function, the function must be made asynchronous; the programs that call this function must await it, but static code analysis won't tell you which programs might call functions that must be awaited because they import modules.

I have a solution for that.

I'll be taking my time into writing a document to explain everything steps by steps, while verifying 2-3 things. I'll post a new issue with a [Project] tag once done. Maybe I'll wait until this week-end to get more step back.

@denis-migdal
Copy link
Contributor Author

See #2292

@denis-migdal
Copy link
Contributor Author

Closing this issue in favor of #2292.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants