-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(compartment-mapper): Support dynamic require/importNow #2310
Conversation
2135d35
to
8df8629
Compare
44ea8db
to
54ee067
Compare
|
compartmentDescriptor.modules = moduleDescriptors; | ||
|
||
let { policy } = compartmentDescriptor; | ||
policy = policy || Object.create(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is for typescript
if ('packages' in policy && typeof policy.packages === 'object') { | ||
for (const [pkgName, policyItem] of entries(policy.packages)) { | ||
if ( | ||
!(pkgName in compartmentDescriptor.modules) && | ||
pkgName in compartmentDescriptor.scopes && | ||
policyItem | ||
) { | ||
compartmentDescriptor.modules[pkgName] = | ||
compartmentDescriptor.scopes[pkgName]; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unsure if this is correct, but it creates links between compartment descriptors based only on policy--where Endo would not have detected them otherwise.
Checking if dynamic: true
is in policy at this point causes many tests to fail, because this ends up being a code path which many tests take. Why? Because the behavior is opt-out.
With regards to that, in addition to the policy, perhaps we should add a dynamic
option to ImportLocationOptions
or whathaveyou, so you can explicitly opt-in to using importNowHook
. Right now, it's an opt-out based on the lack of a moduleTransforms
(async module transforms) option and the shape of readPowers
. It's not until later that we take the policy into account. Another way to put it: we're creating an importNowHook
because a) we have the tools to do so, and b) something might dynamically require something else.
If we did that, I'd be able to delete some LoC, and we'd reduce the risk of introducing backwards-incompatible changes. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
less code and better compatibility. Sounds good.
Also, explicit opt-in would allow an early failure if ReadPowers or transforms don't match the intent to enable dynamic require.
// Collate candidate locations for the moduleSpecifier, | ||
// to support Node.js conventions and similar. | ||
const candidates = [moduleSpecifier]; | ||
for (const candidateSuffix of searchSuffixes) { | ||
candidates.push(`${moduleSpecifier}${candidateSuffix}`); | ||
} | ||
|
||
for (const candidateSpecifier of candidates) { | ||
const candidateModuleDescriptor = moduleDescriptors[candidateSpecifier]; | ||
if (candidateModuleDescriptor !== undefined) { | ||
const { compartment: candidateCompartmentName = packageLocation } = | ||
candidateModuleDescriptor; | ||
const candidateCompartment = compartments[candidateCompartmentName]; | ||
if (candidateCompartment === undefined) { | ||
throw Error( | ||
`compartment missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`, | ||
); | ||
} | ||
// modify compartmentMap to include this redirect | ||
const candidateCompartmentDescriptor = | ||
compartmentDescriptors[candidateCompartmentName]; | ||
if (candidateCompartmentDescriptor === undefined) { | ||
throw Error( | ||
`compartmentDescriptor missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`, | ||
); | ||
} | ||
candidateCompartmentDescriptor.modules[moduleSpecifier] = | ||
candidateModuleDescriptor; | ||
// return a redirect | ||
/** @type {RedirectStaticModuleInterface} */ | ||
const record = { | ||
specifier: candidateSpecifier, | ||
compartment: candidateCompartment, | ||
}; | ||
return record; | ||
} | ||
|
||
// Using a specifier as a location. | ||
// This is not always valid. | ||
// But, for Node.js, when the specifier is relative and not a directory | ||
// name, they are usable as URL's. | ||
const moduleLocation = resolveLocation( | ||
candidateSpecifier, | ||
packageLocation, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copypasta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot of duplication. I think this is worth trying to bounce on a trampoline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. This is a good idea, but is also unfortunate because it will take me longer to land this.
let moduleBytes; | ||
try { | ||
moduleBytes = readSync(moduleLocation); | ||
} catch (err) { | ||
if (err && err.code === 'ENOENT') { | ||
// might be an exit module. use the fallback `exitModuleImportNowHook` to import it | ||
// eslint-disable-next-line no-continue | ||
continue; | ||
} | ||
throw err; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do we know something is an exit module? maybe we should check the compartment descriptor, and only use the fallback if the thing doesn't exist there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not arrived at a hard decision for this question. Falling through the bottom of the import hook should be enough, but might not be. An option we did not have when I first wrote this: we can identify “host modules” by the presence of a prefix:
like node:
or endo:
. These would be guaranteed to escape the Node.js style mappings.
In the interest of keeping coupling to a specific specifier resolution strategy low, I am leaning heavily toward “pass through if nothing in the compartment map matches.”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The prefix option would be really unfortunate for ecosystem compatibility. I don't think we'll ever get all packages to switch form fs
to node:fs
.
For the sake of completion: if importHook for exit modules is not provided or returns nothing we still need to fail with an import error.
/** | ||
* @typedef FsPromisesApi | ||
* @property {(filepath: string) => Promise<string>} realpath | ||
* @property {WriteFn} writeFile | ||
* @property {ReadFn} readFile | ||
*/ | ||
|
||
/** | ||
* @typedef FsAPI | ||
* @property {FsPromisesApi} promises | ||
* @property {ReadSyncFn} readFileSync | ||
*/ | ||
|
||
/** | ||
* @typedef UrlAPI | ||
* @property {(location: string | URL) => string} fileURLToPath | ||
* @property {(path: string) => URL} pathToFileURL | ||
*/ | ||
|
||
/** | ||
* @typedef CryptoAPI | ||
* @property {typeof import('crypto').createHash} createHash | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not strictly necessary, but I found it helpful. YMMV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preliminary feedback.
// Collate candidate locations for the moduleSpecifier, | ||
// to support Node.js conventions and similar. | ||
const candidates = [moduleSpecifier]; | ||
for (const candidateSuffix of searchSuffixes) { | ||
candidates.push(`${moduleSpecifier}${candidateSuffix}`); | ||
} | ||
|
||
for (const candidateSpecifier of candidates) { | ||
const candidateModuleDescriptor = moduleDescriptors[candidateSpecifier]; | ||
if (candidateModuleDescriptor !== undefined) { | ||
const { compartment: candidateCompartmentName = packageLocation } = | ||
candidateModuleDescriptor; | ||
const candidateCompartment = compartments[candidateCompartmentName]; | ||
if (candidateCompartment === undefined) { | ||
throw Error( | ||
`compartment missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`, | ||
); | ||
} | ||
// modify compartmentMap to include this redirect | ||
const candidateCompartmentDescriptor = | ||
compartmentDescriptors[candidateCompartmentName]; | ||
if (candidateCompartmentDescriptor === undefined) { | ||
throw Error( | ||
`compartmentDescriptor missing for candidate ${candidateSpecifier} in ${candidateCompartmentName}`, | ||
); | ||
} | ||
candidateCompartmentDescriptor.modules[moduleSpecifier] = | ||
candidateModuleDescriptor; | ||
// return a redirect | ||
/** @type {RedirectStaticModuleInterface} */ | ||
const record = { | ||
specifier: candidateSpecifier, | ||
compartment: candidateCompartment, | ||
}; | ||
return record; | ||
} | ||
|
||
// Using a specifier as a location. | ||
// This is not always valid. | ||
// But, for Node.js, when the specifier is relative and not a directory | ||
// name, they are usable as URL's. | ||
const moduleLocation = resolveLocation( | ||
candidateSpecifier, | ||
packageLocation, | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a lot of duplication. I think this is worth trying to bounce on a trampoline.
506d990
to
45c147e
Compare
Going to extract the changes to |
Ref: #2332 cc @kriskowal |
Once #2332 is merged, I can rebase this onto |
3861722
to
0e5fb1b
Compare
0e5fb1b
to
0963553
Compare
04f003b
to
b09a0cd
Compare
note on I don't see where the source-map-js is added. The documentation says it's a fork of source-map but with some optimizations added. One of the optimizations listed in the article they cite as source of the optimizations is using a |
for (const [someDescriptorName, someDescriptor] of entries( | ||
compartmentDescriptors, | ||
)) { | ||
if (someDescriptorName !== ATTENUATORS_COMPARTMENT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This exclusion makes a lot of sense.
9e16892
to
7386c8f
Compare
84b4513
to
633c304
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been a journey and I commend you, and apologize profusely for taking too long to review. That you have a green CI speaks volumes.
This is in the right shape. I have mostly annoying nits about naming and idioms below, most of which can justifiably be ignored at your discretion.
I only request you consider seriously the whether we can bear O(compartments) lookup and think about what it would take to do a more efficient prefix match. I do not want you to build a prefix trie, but if iteratively popping components off a module specifier until you find a matching compartment location works, please consider that, or explain why that won’t work or isn’t necessary.
There is a breaking change here now that parsers are part of the public API, but I am not worried about that since all usage of them is local to Endojs/Endo and is loosely coupled to the specific interface. It might be worth noting. It might not.
edit: and please look into the weak correspondence between package and dependency name, at least explain why we can’t use the compartment keys to establish that correlation. More book-keeping is fine if it’s more precise.
/** | ||
* A resolution of `undefined` indicates `ENOENT` or the equivalent. | ||
* | ||
* @callback MaybeReadSyncFn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a mild preference to grow outward from the precedent of Now
to mean synchronous in names, instead of the Node.js precedent of Sync
suffix. There’s a reasonable chance we will follow the lead of Moddable XS and be proposing corresponding import.now
dynamic synchronous import syntax. I’ve seeded the idea it in TC39 channels in any case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a blocker? There's a lot of stuff to rename, if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but it will be harder to change later.
@@ -68,10 +91,68 @@ const nodejsConventionSearchSuffixes = [ | |||
'/index.node', | |||
]; | |||
|
|||
/** | |||
* Given a module specifier which is an absolute path, attempt to match it with | |||
* an existing compartment; return a {@link RedirectStaticModuleInterface} if found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some work toward converging with XS for the shape of module descriptors. RedirectStaticModuleInterface
is deprecated in favor of a SourceModuleDescriptor
or NamespaceModuleDescriptor
, but both work still.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither SourceModuleDescriptor
nor NamespaceModuleDescriptor
is valid, in this case. The former because we do not have a source
property, and the latter because we do not have a namespace
property; all we have is {string} specifier
and {Compartment} compartment
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference is only that the the Moddable folks preferred to rename source
overspecifier
so that it is clear that the referent (still a string) means that this compartment adopts the other compartment’s ModuleSource, vs namespace
when it shares the other compartment’s module instance. The name specifier
left that ambiguous.
Here’s an Agoric SDK integration testing PR Agoric/agoric-sdk#10159 The current CI run failed because trampoline is not in npm yet, and at least one version needs to go up for that to sync properly. I’ll take that on. |
@kriskowal I've applied all of the naming-related changes you've requested. |
} | ||
} | ||
|
||
// for (const [someDescriptorName, someDescriptor] of entries( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover from debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous implementation. Good catch; let’s delete this dead code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in the right shape. I will not need another review. Please review prior comments and delete dead code. Naming nits are optional.
Please add an entry to NEWS.md for the new feature.
You should be able to merge when you are ready. Please rebase on master
before merging, ensure that CI is green, and either squash with a merge commit or manually merge fixups and edits into the prior feature commits so that history appears linear. Also, review your merge commit description, which is generated from the PR description.
I would like to get this proven in Agoric SDK CI, so I’m going to produce a release for @endo/trampoline
today and kick that CI job again to see what happens. I’ll let you know.
} | ||
} | ||
|
||
// for (const [someDescriptorName, someDescriptor] of entries( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous implementation. Good catch; let’s delete this dead code.
sourceMapHook, | ||
strictlyRequiredForCompartment, | ||
}, | ||
{ maybeRead, parse, shouldDeferError = () => false }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be addressed in a refactor, so does not need to be made consistent in this change.
if (isAbsolute(moduleSpecifier)) { | ||
let shouldCallExitModuleImportNowHook = true; | ||
for (const [someDescriptorName, someDescriptor] of entries( | ||
compartmentDescriptors, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@naugtur Please take a look at Chris’s new algorithm with the precomputed compartments
set. I think the order of complexity is at least right now, as well as some correctness concerns when one package is nested in the node_modules of another.
/** | ||
* A resolution of `undefined` indicates `ENOENT` or the equivalent. | ||
* | ||
* @callback MaybeReadSyncFn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but it will be harder to change later.
@@ -724,6 +734,7 @@ const translateGraph = ( | |||
parsers, | |||
types, | |||
policy: /** @type {SomePackagePolicy} */ (packagePolicy), | |||
compartments: compartmentNames, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found the money right here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note for later: this could be narrowed down to what's allowed by policy. TODO: create an issue
40efea3
to
26c4bf4
Compare
a740027
to
25724bf
Compare
This change adds support for _dynamic requires_, where a Node.js script does something like this: ```js const someModuleSpecifier = getSpecifier(); const idk = require(someModuleSpecifier); ``` This behavior is not enabled by default; specific options and Powers are required. See the signature of `loadFromMap()` in `import-lite.js` for more information. Dynamic loading of exit modules (e.g., if `someModuleSpecifier` was `node:fs`) is handled by a user-provided "exit module import hook". Policy applies. Dynamic requires work as you'd probably expected _except_: - A compartment may dynamically require from its *dependee*, since this is a common pattern in the ecosystem (see [node-gyp-build](https://npm.im/node-gyp-build)). - The special "attenuators" compartment may not be dynamically required. Horsefeathers! Some relevant bits, if you're mining history: - All internal parsers are now _synchronous_, which was made possible by the introduction of `evadeCensorSync` in `@endo/evasive-transform`. - Async parsers are still supported, but they would only be user-defined. Async parsers are incompatible with dynamic requires. - Added property `{Set<string>} compartments` to `CompartmentDescriptor`. This is essentially a view into the compartment names within `CompartmentDescriptor.scopes`. - The `mapParsers()` function has moved into `map-parser.js`. - `@endo/trampoline` is leveraged in a couple places (`import-hook.js`, `map-parser.js`) to avoid code duplication. - Introduced `FsInterface`, `UrlInterface`, `CryptoInterface` and `PathInterface` (and their ilk) for use with `makeReadPowers()` and the new `makeReadNowPowers()`.
25724bf
to
572589c
Compare
#2310 (comment) |
Description
This PR adds support for dynamic requires via
loadFromMap()
andlink()
(inimport-lite.js
andlink.js
, respectively).importLocation()
's signature has also been modified for support.To use this feature in either function, the following must be true:
moduleTransforms
option (in the appropriateoptions
parameter) must not be present. These are asynchronous module transforms, which cannot be used by dynamic require.ReadPowers
param must be a properReadPowers
object (not just aReadFn
) and must contain both the newmaybeReadSync
, newisAbsolute
, andfileURLToPath
functions. The new type representing this isSyncReadPowers
.If all of the above are true, then a compartment will be allowed to dynamically require something. If that thing still cannot be found in the compartment map, the sync "fallback" exit hook (see next item) will be executed.
The new
importHookNow
property can be provided via options, which is a synchronous exit module import hook.About
SyncReadPowers
:SyncReadPowers.maybeReadSync()
is necessary to read files synchronously which is necessary to load them synchronously. This fails gracefully if the file is not found; the implementation is platform-dependent.SyncReadPowers.isAbsolute()
is necessary to determine if the module specifier of a dynamic require is absolute. If it is, it could be just about anything, and must be loaded via the user-providedimportNowHook
(the sync exit module import hook).SyncReadPowers.fileURLToPath()
is needed for__filename
and__dirname
, which is often used to create absolute paths for requiring dynamically.As an alternative to
moduleTransforms
, synchronous module transforms may be provided via the newsyncModuleTransforms
object. In a non-dynamic-require use-case, if present,syncModuleTransforms
are combined with themoduleTransforms
option; all sync module transforms are module transforms, but not all module transforms are sync module transforms.All builtin parsers are now synchronous. User-defined parsers can be async, but this will disable dynamic require support.
@endo/evasive-transform
now exportsevadeCensorSync()
in addition toevadeCensor()
. This is possible because I've swapped the async-only source-map with source-map-js, which is a fork of the former before it went async-only.source-map-js
claims comparable performance.Security Considerations
Dynamically requiring an exit module (e.g., a Node.js builtin) requires a user-defined hook, which has the same security considerations as a user-defined exit module hook.
Swapping out a dependency (
source-map-js
forsource-map
) incurs risk.Scaling Considerations
n/a
Documentation Considerations
Should be announced as a user-facing feature
Testing Considerations
I've added some fixtures and tested around the conditionals I've added, but am open to any suggestions for additional coverage.
Compatibility Considerations
This increases ecosystem compatibility considerably; use of dynamic require in the ecosystem is not rare.
For example, most packages which ship a native module will be using dynamic require, because the filepath of the build artifact is dependent upon the platform and architecture.
Upgrade Considerations
Everything else should be backwards-compatible, as long as
source-map-js
does as it says on the tin.Users of
@endo/evasive-transform
may note that native modules are neither downloaded/compiled (due to the switch fromsource-map
tosource-map-js
).UPDATE: Aug 18 2024
This PR now targets the
boneskull/supertramp
branch, which is PR #2263