Yao Xiao edited this page Apr 26, 2024 · 9 revisions

Group 1 COR by Yao Xiao and Yuchen Zhou

We use the Chain of Responsibility (COR) design pattern in the bundler.

By definition, "Chain of Responsibility is a behavioral design pattern that lets you pass requests along a chain of handlers. Upon receiving a request, each handler decides either to process the request or to pass it to the next handler in the chain." In our project, we build on the Bundler struct from swc_bundler, whose execution follows the chain: path resolver --> path loader --> bundler --> AST transformers --> emitter. Following this pattern, we implement the parts described in the numbered sections below.
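
As a generic illustration of the quoted definition (not the internals of swc_bundler), a chain of handlers could be sketched in Rust as follows. The `Handler` trait, the two handler types, and the `dispatch` function are hypothetical names introduced only for this sketch.

```rust
/// A handler either produces a response or declines, so the next handler
/// in the chain gets a chance. (Hypothetical trait for illustration.)
trait Handler {
    fn handle(&self, request: &str) -> Option<String>;
}

/// Handles specifiers that start with `./` or `../`.
struct RelativeImport;

/// Handles anything that is not an absolute path.
struct BareImport;

impl Handler for RelativeImport {
    fn handle(&self, request: &str) -> Option<String> {
        (request.starts_with("./") || request.starts_with("../"))
            .then(|| format!("relative: {request}"))
    }
}

impl Handler for BareImport {
    fn handle(&self, request: &str) -> Option<String> {
        (!request.starts_with('/')).then(|| format!("bare: {request}"))
    }
}

/// Builds the chain; the order of the handlers is the order of responsibility.
fn default_chain() -> Vec<Box<dyn Handler>> {
    vec![Box::new(RelativeImport), Box::new(BareImport)]
}

/// Passes the request along the chain until one handler processes it.
fn dispatch(chain: &[Box<dyn Handler>], request: &str) -> Option<String> {
    chain.iter().find_map(|handler| handler.handle(request))
}
```

A request that no handler accepts falls off the end of the chain and yields `None`; the caller decides how to report that.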

1. Path Resolver

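PathResolver is in charge of resolving the module specifiers in import statements. Only relative imports are supported. If the path refers to a file, it is returned directly. Otherwise, the path is tried with each implicitly supported extension in order, and then as a directory containing an index file. A resolved path that falls outside the root is rejected. The implementation is as follows: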

struct PathResolver {
    root: PathBuf,
}

impl PathResolver {
    /// Helper function for resolving a path by treating it as a file.
    ///
    /// If `path` refers to a file then it is directly returned. Otherwise, `path` with
    /// each extension in [`EXTENSIONS`] is tried in order.
    fn resolve_as_file(&self, path: &Path) -> Result<PathBuf, Error> {
        if path.is_file() {
            // Early return if `path` is directly a file
            return Ok(path.to_path_buf());
        }

        if let Some(name) = path.file_name() {
            let mut ext_path = path.to_path_buf();
            let name = name.to_string_lossy();

            // Try all extensions we support for importing
            for ext in EXTENSIONS {
                ext_path.set_file_name(format!("{name}.{ext}"));
                if ext_path.is_file() {
                    return Ok(ext_path);
                }
            }
        }
        bail!("File resolution failed")
    }

    /// Helper function for the [`Resolve`] trait.
    ///
    /// Note that errors emitted here do not need to provide information about `base`
    /// and `module_specifier` because the call to this function should have already
    /// been wrapped in an SWC context that provides this information.
    fn resolve_filename(
        &self,
        base: &FileName,
        module_specifier: &str,
    ) -> Result<FileName, Error> {
        let base = match base {
            FileName::Real(v) => v,
            _ => bail!("Invalid base for resolution: '{base}'"),
        };

        // Determine the base directory (or `base` itself if already a directory)
        let base_dir = if base.is_file() {
            // If failed to get the parent directory then use the cwd
            base.parent().unwrap_or_else(|| Path::new("."))
        } else {
            base
        };

        let spec_path = Path::new(module_specifier);
        // Absolute paths are not supported
        if spec_path.is_absolute() {
            bail!("Absolute imports are not supported; use relative imports instead");
        }

        // If not absolute, then it should be either relative, a node module, or a URL;
        // we support only relative import among these types
        let mut components = spec_path.components();
        if let Some(Component::CurDir | Component::ParentDir) = components.next() {
            let path = base_dir.join(module_specifier).clean();

            // Try to resolve by treating `path` as a file first, otherwise try by
            // looking for an `index` file under `path` as a directory
            let resolved_path = self
                .resolve_as_file(&path)
                .or_else(|_| self.resolve_as_file(&path.join("index")))?;

            // Reject if the resolved path goes beyond the root
            if !resolved_path.starts_with(&self.root) {
                bail!(
                    "Relative imports should not go beyond the root '{}'",
                    self.root.display(),
                );
            }
            return Ok(FileName::Real(resolved_path));
        }

        bail!(
            "node_modules imports should be explicitly included in package.json to \
            avoid being bundled at runtime; URL imports are not supported, one should \
            vendor its source to local and use a relative import instead"
        )
    }
}

impl Resolve for PathResolver {
    fn resolve(
        &self,
        base: &FileName,
        module_specifier: &str,
    ) -> Result<Resolution, Error> {
        self.resolve_filename(base, module_specifier)
            .map(|filename| Resolution { filename, slug: None })
    }
}
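
To make the probing order and the root check above easier to follow, here is a minimal sketch that uses only the standard library and never touches the filesystem. The extension list is an assumption (the real `EXTENSIONS` constant is not shown above), and `normalize` is a simplified stand-in for the `clean()` call, not its actual implementation.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical extension list; the real `EXTENSIONS` is not shown above.
const EXTENSIONS: &[&str] = &["js", "jsx"];

/// Lists the paths that would be probed for `path`, in order: the path itself,
/// the path with each extension appended, then the same for `path/index`.
fn candidates(path: &Path) -> Vec<PathBuf> {
    let mut out = Vec::new();
    for base in [path.to_path_buf(), path.join("index")] {
        out.push(base.clone());
        if let Some(name) = base.file_name() {
            let name = name.to_string_lossy();
            for ext in EXTENSIONS {
                let mut with_ext = base.clone();
                // Appends the extension to the full file name instead of
                // replacing an existing one
                with_ext.set_file_name(format!("{name}.{ext}"));
                out.push(with_ext);
            }
        }
    }
    out
}

/// Simplified lexical normalization: drops `.` and resolves `..` by popping
/// the previous component (a leading `..` is silently dropped).
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {},
            Component::ParentDir => {
                out.pop();
            },
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Returns whether `specifier`, joined onto `base_dir`, stays under `root`.
fn within_root(root: &Path, base_dir: &Path, specifier: &str) -> bool {
    normalize(&base_dir.join(specifier)).starts_with(root)
}
```

Because `Path::starts_with` compares whole components, a sibling directory such as `/widgets-evil` is not treated as being inside `/widgets`.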

2. Path Loader

We use PathLoader to parse the source file into a module AST. The implementation is as follows:

struct PathLoader(Lrc<SourceMap>);

impl Load for PathLoader {
    fn load(&self, file: &FileName) -> Result<ModuleData, Error> {
        let fm = match file {
            FileName::Real(path) => self.0.load_file(path)?,
            _ => unreachable!(),
        };

        let syntax = Syntax::Es(EsConfig { jsx: true, ..Default::default() });

        // Parse the file as a module; note that transformations are not applied here,
        // because per-file transformations may lead to unexpected results when bundled
        // together; instead, transformations are postponed until the bundling phase
        match parse_file_as_module(&fm, syntax, Default::default(), None, &mut vec![]) {
            Ok(module) => Ok(ModuleData { fm, module, helpers: Default::default() }),
            Err(err) => {
                // The error handler requires a destination for the emitter writer that
                // implements `Write`; a buffer implements `Write` but its borrowed
                // reference does not, causing the handler to take ownership of the
                // buffer, making us unable to read from it later (and the buffer is
                // made private in the handler); the workaround is to use a temporary
                // file and access its content later by its path (we require the file to
                // live only for a short time so this is relatively safe)
                let mut err_msg = String::new();
                {
                    let context = format!(
                        "Parsing error occurred but failed to emit the formatted error \
                        analysis; falling back to raw version: {err:?}"
                    );
                    let buffer = NamedTempFile::new().context(context.clone())?;
                    let buffer_path = buffer.path().to_path_buf();
                    let handler = Handler::with_emitter_writer(
                        Box::new(buffer),
                        Some(self.0.clone()),
                    );
                    err.into_diagnostic(&handler).emit();
                    File::open(buffer_path)
                        .context(context.clone())?
                        .read_to_string(&mut err_msg)
                        .context(context.clone())?;
                }
                bail!(err_msg);
            },
        }
    }
}

3. Bundler

Bundler from swc_bundler puts PathResolver and PathLoader together and runs the chain internally. The implementation is as follows:

let mut bundler = Bundler::new(
    &globals,
    cm.clone(),
    PathLoader(cm.clone()),
    PathResolver { root: root.canonicalize()?.to_path_buf() },
    // We must disable the default tree-shaking by the SWC bundler, otherwise it
    // will remove unused `React` variables, which are required by the JSX transform
    swc_bundler::Config {
        external_modules,
        disable_dce: true,
        ..Default::default()
    },
    Box::new(NoopHook),
);

4. AST transforms

We apply a sequence of AST transforms to the bundled AST. Each of them is responsible for a single type of transformation (use case). Currently, we apply DCE (dead-code elimination), the JSX transform (which compiles React's JSX syntax), and the import renamer transform (our custom transformer), in that order. The implementation is as follows:

    let code = GLOBALS.set(&globals, || {
        // Tree-shaking optimization in the bundler is disabled, so we need to manually
        // apply the transform; we need to retain the top level mark `React` because it
        // is needed by the JSX transform even if not explicitly used in the code
        let mut tree_shaking = Repeat::new(dce::dce(
            dce::Config { top_retain: vec![Atom::from("React")], ..Default::default() },
            Mark::new(),
        ));

        // There are two types of JSX transforms ("classic" and "automatic"), see
        // https://legacy.reactjs.org/blog/2020/09/22/introducing-the-new-jsx-transform.html
        //
        // The "automatic" transform automatically imports from "react/jsx-runtime", but
        // this module is not available when running the bundled code in the browser,
        // so we have to use the "classic" transform instead. The "classic" transform
        // requires `React` to be in scope, which we can require users to bring into
        // scope by assigning `const React = window.__DESKULPT__.React`.
        let mut jsx_transform = jsx::<SingleThreadedComments>(
            cm.clone(),
            None,
            Default::default(), // options, where runtime defaults to "classic"
            Mark::new(),        // top level mark
            Mark::new(),        // unresolved mark
        );

        // We need to rename the imports of `@deskulpt-test/apis` to the blob URL which
        // wraps the widget APIs to avoid exposing the raw APIs that allow specifying
        // widget IDs; note that this transform should be done last to avoid
        // interfering with import resolution
        let mut wrap_apis = as_folder(ImportRenamer(
            [("@deskulpt-test/apis".to_string(), apis_blob_url)].into(),
        ));

        // Apply the module transformations
        let module = module
            .fold_with(&mut tree_shaking)
            .fold_with(&mut jsx_transform)
            .fold_with(&mut wrap_apis);

        // -- snip emitter logic --
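
The internals of ImportRenamer are not shown here. As a rough sketch of the idea only, the renaming can be thought of as a lookup from an import source to its replacement, leaving all other sources untouched. The function name `rename_import` and the blob URL value in the usage note are hypothetical.

```rust
use std::collections::HashMap;

/// Returns the replacement for `source` if one is registered, otherwise
/// returns `source` unchanged. (Hypothetical helper; the real transform
/// operates on import declarations in the AST, not on plain strings.)
fn rename_import<'a>(source: &'a str, renames: &'a HashMap<String, String>) -> &'a str {
    renames.get(source).map(String::as_str).unwrap_or(source)
}
```

With a map containing `"@deskulpt-test/apis"` mapped to a placeholder such as `"blob:hypothetical"`, only that import source is rewritten; a relative import like `"./local"` passes through unchanged.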

5. Emitter

Finally we emit the transformed AST into ECMAScript code as a string. The implementation is as follows:

    let code = GLOBALS.set(&globals, || {
        // -- snip AST transforms logic

        // Emit the bundled module as string into a buffer
        let mut buf = vec![];
        {
            let wr = JsWriter::new(cm.clone(), "\n", &mut buf, None);
            let mut emitter = Emitter {
                cfg: swc_ecma_codegen::Config::default().with_minify(true),
                cm: cm.clone(),
                comments: None,
                wr: Box::new(wr) as Box<dyn WriteJs>,
            };
            emitter.emit_module(&module).unwrap();
        }
        String::from_utf8_lossy(&buf).to_string()
    });

Group 2 Proxy by Xinyu Li and Frank Zhu

To implement the frontend wrapper of widget APIs, we use the proxy design pattern.

Motivation

The goal is to enable widget developers to write their source code like this:

// A widget called `writer`
import apis from "@deskulpt/apis"

const content = await apis.fs.readFile("documents/poem.txt");    
console.log(`File content: ${content}`);

The challenge is that the backend Rust APIs require the widget ID as an argument. In the backend, we have a corresponding API like this:

fn read_file(widget_id: String, path: String) -> Result<String> {
    // Check if the widget has access to `path`
    // If so, read the file and return its content as a string
    todo!()
}

Per this backend API, the writer widget needs to call the API by explicitly passing in its widget ID:

const content = await apis.fs.readFile("writer", "documents/poem.txt");    

This is undesirable since it requires widget developers to actively maintain their widget ID. If they pass in an incorrect widget ID, the API call would have unintended behavior (reading another widget's file, or failing to read the file).

Failed Attempt: Class Component

A natural solution is to treat each widget as a class and implement a class method readFile:

class Widget {
    private widgetId: string;

    constructor(widgetId: string) {
        this.widgetId = widgetId;
    }

    // Forward to the raw API with the stored widget ID
    readFile(path: string): Promise<string> {
        return rawApis.readFile(this.widgetId, path);
    }
}

However, this is incompatible with React because it requires us to extend the React class component, whose usage is generally not recommended. Modern React recommends function components over class components.

Proxy to the rescue

Another solution is to use a proxy for API calls that passes the widget ID to the APIs on behalf of the widgets.

Conceptually, we need to create another entity, apiProxy:

                           ___________________
                          |     apiProxy      |                        |
widgetA -readFile(path)-> |                   | -readFile("A", path)-> |
                          |  apply widget id  |                        |  Rust
                          |  for widget       |                        | backend
widgetB -readFile(path)-> |                   | -readFile("B", path)-> |
                          |___________________|                        |

This is hard to implement in our app because each widget is essentially an HTML element in a canvas window. It is impossible to identify the caller of an API at widget runtime if all widgets import from the same JS module.

Since this can't be done at widget runtime, we can instead do it at widget compile time. We redirect the widget's import to different JS modules when building the widget.

Specifically, here is what we do:

  1. The source code of the writer widget imports from @deskulpt/apis as usual.

    import apis from "@deskulpt/apis"

  2. When Deskulpt compiles this widget, it creates a JS module that imports the raw APIs and partially applies the widget ID to each of them. A sketch of this module looks like this:

    import rawApis from "@deskulpt/rawApis";

    // supported by Deskulpt, see below
    const widgetId = __DESKULPT_WIDGET_ID__;

    const apis = {};

    for (const [rawApiName, rawApi] of Object.entries(rawApis)) {
        // partially apply widgetId
        apis[rawApiName] = (...args) => rawApi(widgetId, ...args);
    }

    // re-export the wrapped APIs with the same structure as the raw APIs
    export default apis;

    The __DESKULPT_WIDGET_ID__ placeholder is replaced with the actual widget ID by directly manipulating the abstract syntax tree (AST) when building the widget.

  3. Deskulpt turns this widget-specific module into a Blob object, creates a URL for it, and replaces all imports from @deskulpt/apis with imports from this URL. After this stage, the widget source code looks like this:

    // import apis from "@deskulpt/apis"
    import apis from <widgetUrl>

Now, when the widget calls the imported APIs, the widget ID is automatically partially applied.
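
The actual wrapper is a JavaScript module generated at build time. Purely to illustrate partial application itself, the same idea can be sketched with Rust closures. The `RawApi` type, `raw_read_file`, and `bind_apis` are hypothetical stand-ins, and the raw API here only formats a string instead of reading a file.

```rust
use std::collections::HashMap;

/// Stand-in for a raw API that requires the caller's widget ID first.
type RawApi = fn(&str, &str) -> String;

/// Hypothetical raw API: echoes which widget asked for which path.
fn raw_read_file(widget_id: &str, path: &str) -> String {
    format!("[{widget_id}] contents of {path}")
}

/// Builds a per-widget API table by partially applying `widget_id` to every
/// raw API, so callers no longer pass (or even know) their widget ID.
fn bind_apis(
    widget_id: &str,
    raw_apis: &HashMap<&'static str, RawApi>,
) -> HashMap<&'static str, Box<dyn Fn(&str) -> String>> {
    raw_apis
        .iter()
        .map(|(&name, &raw)| {
            let id = widget_id.to_owned();
            let bound: Box<dyn Fn(&str) -> String> =
                Box::new(move |arg: &str| raw(&id, arg));
            (name, bound)
        })
        .collect()
}
```

Each widget gets its own table, so two widgets calling the same `readFile` entry reach the raw API with different widget IDs without either of them supplying one.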