Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[move-vm][closures] Type and Value representation and serialization #15670

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

wrwg
Copy link
Contributor

@wrwg wrwg commented Jan 4, 2025

Description

[PR 4/n vm closures]

This implements types and values for functions and closures, as well as serialization. For a detailed discussion, see the design doc. It does not yet implement the interpreter and runtime verification.

Overview:

  • MoveValue and MoveTypeLayout are extended to represent closures and functions. A closure carries the function name, the closure mask, the layout of the captured arguments, and their values. The layout enables serialization of the captured arguments without any context.
  • Runtime Type is extended by functions.
  • ValueImpl is extended by a closure variant. This representation uses trait AbstractFunction to represent the function value and implement core functionality like comparison and printing. The existing trait FunctionValueExtension is used to create AbstractFunction from the serialized representation.
  • Similar as with MoveValue, the serialized representation of ValueImpl::Closure contains the captured parameter layouts, enabling to deserialize the captured arguments context independently. This design has been chosen for robustness, avoiding a dependency of serialization from loaded functions, and to enable fully 'lazy' deserialization of closures. The later allows a contract to store hundreds of function values and on loading them from storage, avoiding loading the associated code until the moment the closure is actually executed.
  • AbstractFunction is implemented in the loader such that it can be either in 'unresolved' or in `resolved' state.

How Has This Been Tested?

Testing postponed until e2e wiring is up

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Copy link

trunk-io bot commented Jan 4, 2025

⏱️ 1h 6m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-tests 13m 🟩
rust-move-tests 13m 🟩
rust-cargo-deny 9m 🟩🟩🟩🟩 (+1 more)
rust-move-tests 8m 🟥
rust-move-tests 8m 🟥
check-dynamic-deps 6m 🟩🟩🟩🟩🟩 (+1 more)
general-lints 3m 🟩🟩🟩🟩🟩 (+1 more)
rust-move-tests 2m 🟥
semgrep/ci 2m 🟩🟩🟩🟩🟩 (+1 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+1 more)
rust-move-tests 29s
permission-check 17s 🟩🟩🟩🟩🟩 (+1 more)
permission-check 12s 🟩🟩🟩🟩🟩 (+1 more)
check-branch-prefix 1s 🟩

settingsfeedbackdocs ⋅ learn more about trunk.io

@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 4972a5c to 2fed681 Compare January 5, 2025 05:50
@wrwg wrwg force-pushed the wrwg/clos_values branch from c1ff3d8 to 4cf0964 Compare January 5, 2025 05:50
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 2fed681 to 9c0814c Compare January 6, 2025 02:47
@wrwg wrwg force-pushed the wrwg/clos_values branch from 4cf0964 to 60546b9 Compare January 6, 2025 02:47
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 9c0814c to 8b11451 Compare January 6, 2025 02:47
@wrwg wrwg force-pushed the wrwg/clos_values branch from 60546b9 to f305df7 Compare January 6, 2025 02:48
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 8b11451 to dcb6bc5 Compare January 6, 2025 03:25
@wrwg wrwg force-pushed the wrwg/clos_values branch 2 times, most recently from 95728c9 to 6eea07c Compare January 7, 2025 04:04
@wrwg wrwg mentioned this pull request Jan 7, 2025
16 tasks
@wrwg wrwg marked this pull request as ready for review January 8, 2025 06:20
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from dcb6bc5 to bfc6f4c Compare January 13, 2025 07:08
@wrwg wrwg force-pushed the wrwg/clos_values branch from 6eea07c to 06363e0 Compare January 13, 2025 07:08
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from bfc6f4c to 355525d Compare January 14, 2025 02:32
@wrwg wrwg force-pushed the wrwg/clos_values branch from 31df3f3 to 7e049a7 Compare January 22, 2025 03:19
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from d4d5803 to 892bc51 Compare January 22, 2025 06:23
@wrwg wrwg force-pushed the wrwg/clos_values branch from 7e049a7 to 9359cfa Compare January 22, 2025 06:23
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 892bc51 to bf07d31 Compare January 22, 2025 06:52
@wrwg wrwg force-pushed the wrwg/clos_values branch 2 times, most recently from 129ffe7 to 474593c Compare January 22, 2025 07:34
Copy link
Contributor

@georgemitenkov georgemitenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First almost full pass, need to look more, but there are already plenty of comments

api/types/src/convert.rs Show resolved Hide resolved
let state = self.0.borrow();
match &*state {
LazyLoadedFunctionState::Resolved { mask, .. } => *mask,
LazyLoadedFunctionState::Unresolved {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we have a resolution_error: Some(..) stored here? Do we just ignore it?

.iter()
.map(|t| module_storage.fetch_ty(t))
.collect::<PartialVMResult<Vec<_>>>()?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should call

Type::verify_ty_arg_abilities(function.ty_param_abilities(), &ty_args)
            .map_err(|e| e.finish(Location::Module(module_id.clone())))?;

here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, Loader has load_function(...) -> VMResult<LoadedFunction>, can we reuse it? Just move this as default impl to ModuleStorage (I did not d it because I hoped to remove V1 faster, but it can go to module storage easily).

} else {
Err(
PartialVMError::new(StatusCode::FUNCTION_RESOLUTION_FAILURE).with_message(
"inconsistency between types of serialized captured \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add more info to debug?

if ok {
for (ty, layout) in captured_arg_types.into_iter().zip(captured_layouts) {
// Convert layout into TypeTag
let serialized_tag: TypeTag = layout.try_into().map_err(|_| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this even works:

impl TryInto<StructTag> for &MoveStructLayout {
    type Error = anyhow::Error;

    fn try_into(self) -> Result<StructTag, Self::Error> {
        use MoveStructLayout::*;
        match self {
            Runtime(..) | RuntimeVariants(..) | WithFields(..) | WithVariants(..) => bail!(
                "Invalid MoveTypeLayout -> StructTag conversion--needed MoveLayoutType::WithTypes"
            ),
            WithTypes { type_, .. } => Ok(type_.clone()),
        }
    }
}

because here we have runtime layouts?

"unexpected type tag conversion failure",
)
})?;
let arg_tag = converter.ty_to_ty_tag(ty)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a generic type parameter in the function, this fails:

// from ty_to_ty_tag implementation

// References and type parameters cannot be converted to tags.
            Type::Reference(_) | Type::MutableReference(_) | Type::TyParam(_) => {
                return Err(
                    PartialVMError::new(StatusCode::UNKNOWN_INVARIANT_VIOLATION_ERROR)
                        .with_message(format!("No type tag for {:?}", ty)),
                );
            },

@@ -127,6 +127,7 @@ impl<'r, 'l> SessionExt<'r, 'l> {
module_storage: &impl ModuleStorage,
) -> VMResult<(VMChangeSet, ModuleWriteSet)> {
let move_vm = self.inner.get_move_vm();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded change

@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from bf07d31 to 1513955 Compare January 23, 2025 03:59
@wrwg wrwg force-pushed the wrwg/clos_values branch from 474593c to bb8390f Compare January 23, 2025 03:59
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 1513955 to 4022451 Compare January 23, 2025 04:06
@wrwg wrwg force-pushed the wrwg/clos_values branch from bb8390f to 24a5460 Compare January 23, 2025 04:06
/// `LoadedFunction`. This is wrapped into a Rc so one can clone the
/// function while sharing the loading state.
#[derive(Clone, Tid)]
pub(crate) struct LazyLoadedFunction(pub(crate) Rc<RefCell<LazyLoadedFunctionState>>);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should more like lazy loaded closure?

@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 4022451 to 6f6afe3 Compare January 24, 2025 06:03
@wrwg wrwg force-pushed the wrwg/clos_values branch from 24a5460 to e6560c5 Compare January 24, 2025 06:03
mask,
captured,
} = self;
let mut s = serializer.serialize_seq(Some(4 + captured.len()))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to double the length here as you would be serializing tuples

result
},
Err(e) => {
*resolution_error = Some(e.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was wondering why the interior mutability is needed here for error. If the resolution failed this function would fail and you don't need to store the error msg here?s

wrwg added 9 commits January 24, 2025 17:06
This PR implements the extensions needed in the file format for representing closures:

- The type (SignatureToken) has a new variant `Function(args, result, abilities)` to represent a function type
- The opcodes are extendeed by operations `ClosPack`, `ClosPackGeneric`, and `ClosEval`

This supports bit masks for specifyinng which arguments of a function are captured when creating a closure.

Bytecode verification is extended to support the new types and opcodes. The implementation of consistency, type, and reference safety should be complete. What is missing are:

- File format serialization
- Interpreter and value representation
- Value serialization
- Connection of the new types with the various other type representations
…re types

[PR 2/n vm closures]

The type `AbilitySet` is required in `MoveTypeLayout`, and the type `ClosureMask` in `MoveValue`. This PRs moves those both types from file_format into the core types crate.
…nto a trait

Type conversions from runtime types to `MoveTypeLayout` and `TypeTag` currently are associated with the `Loader` type. However, they are needed for the `FunctionValueExtension` trait which needs to be constructed in contexts where no loader but only `ModuleStorage` exists.

This PR moves the conversion functions into a new trait `TypeConverter`. The trait is then implemented two times based on `ModuleStorage` only and based on the existing `Loader`, for downwards compatibility.
[PR 3/n vm closures]

This implements types and values for functions and closures, as well as serialization. For a detailed discussion, see the design doc. It does not yet implement the interpreter and runtime verification.

Overview:

- `MoveValue` and `MoveTypeLayout` are extended to represent closures and functions. A closure carries the function name, the closure mask, the function layout, and the captured arguments. The function layout enables serialization of the captured arguments without any context.
- Runtime `Type` is extended by functions.
- `ValueImpl` is extended by a closure variant. This representation uses `trait AbstractFunction` to represent the function value and implement core functionality like comparison and printing. The existing trait `FunctionValueExtension` is used to create `AbstracFunction` from the serialized representation.
- Similar as with `MoveValue`, the serialized representation of `ValueImpl::Closure` contains the full function layout, enabling to deserialize the captured arguments context independently. This design has been chosen for robustness, avoiding a dependency of serialization from loaded functions, and to enable fully 'lazy' deserialization of closures. The later allows a contract to store hundreds of function values and on loading them from storage, avoiding loading the associated code until the moment the closure is actually executed.
- `AbstractFunction` is implemented in the loader such that it can be either in 'unresolved' or in `resolved' state.
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from 6f6afe3 to f178823 Compare January 25, 2025 01:07
@wrwg wrwg force-pushed the wrwg/clos_values branch from e6560c5 to b3492b5 Compare January 25, 2025 01:07
@wrwg wrwg force-pushed the wrwg/clos_type_conv branch from f178823 to 2707a22 Compare January 25, 2025 06:39
Base automatically changed from wrwg/clos_type_conv to main January 25, 2025 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants