Add initial layout selection framework #1264

Merged: 1 commit merged into google:main from the layout-attr branch on Feb 8, 2025

Conversation

@j2kun (Collaborator) commented Jan 14, 2025

This PR adds an initial ciphertext layout selection pass.

This involves the following:

  • Assigning to each secret tensor-typed SSA value a layout attribute as an AffineMap (see the design doc https://docs.google.com/document/d/1TTj0NDZkn0T9NNQj6idbTk2g65tAb7VxsYpt1hKjK_Y/edit?tab=t.0#heading=h.hcusv1ynbv9n)
  • Assuming each function input has a row-major layout, as done by Fhelipe.
  • Forward-propagating the layout attributes through the IR, with transformations when passing through ops that affect the layout.
  • Inserting layout_conversion ops when the operands of an op have incompatible layouts.
  • Persisting layout information as attributes on the IR:
    • func.func persists using the func op's "argAttrs" dictionary, with a new tensor_ext.layout dialect attribute as the key
    • secret.generic persists using our own copy of the func's argAttrs to mark the layouts of operands and block arguments.
    • All other ops persist the layout attributes as an ArrayAttr for the op results (usually just one).

This pass does not optimize the selected layouts. Instead, I expect to define additional passes that do the optimization, while this pass can double as an analysis pass (persisting layout information to ops as attributes) and as an initial layout selection pass. As such, the logic in this pass simply converts all secret tensor operands of an op to the layout of the op's first operand.
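To make that policy concrete, here is a rough sketch of what the annotated IR could look like after the pass. This is hypothetical: the attribute and op syntax is illustrative only (not the exact forms added in this PR), the real IR wraps the computation in secret.generic (elided here), and function arguments actually default to row-major; a column-major layout is shown on the second argument only to force a conversion.

```mlir
// Layouts are AffineMaps from tensor index space to ciphertext slot space.
#row_major = affine_map<(d0, d1) -> (d0 * 32 + d1)>
#col_major = affine_map<(d0, d1) -> (d1 * 32 + d0)>

// Argument layouts are persisted via the func op's argAttrs dictionary.
func.func @add(%a: tensor<32x32xi16> {tensor_ext.layout = #row_major},
               %b: tensor<32x32xi16> {tensor_ext.layout = #col_major})
    -> tensor<32x32xi16> {
  // %b disagrees with the layout of the op's first operand, so a layout
  // conversion is inserted before the add.
  %b_rm = tensor_ext.convert_layout %b {from = #col_major, to = #row_major}
      : tensor<32x32xi16>
  // Result layouts are persisted as an ArrayAttr on the op.
  %sum = arith.addi %a, %b_rm {tensor_ext.layout = [#row_major]}
      : tensor<32x32xi16>
  return %sum : tensor<32x32xi16>
}
```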

This PR also does not yet lower vecmat/matvec using Halevi-Shoup, nor does it lower convert_layout (which will require its own effort to implement a shift network optimization algorithm). I wanted to get feedback from the community on the direction as it relates to the layout attribute and propagation before doing too much additional work.

Example commands

An IR that requires layout conversion ops to be inserted

bazel run --noallow_analysis_cache_discard //tools:heir-opt -- --layout-propagation $PWD/tests/Transforms/layout_propagation/insert_conversion.mlir 

Result: https://gist.github.com/j2kun/eef84273a83893a992694c219b4dde3f

Layout assignments for Halevi-Shoup matvec mul:

bazel run //tools:heir-opt -- --secretize --wrap-generic --drop-unit-dims --canonicalize --cse --linalg-canonicalizations --layout-propagation $PWD/tests/Examples/openfhe/halevi_shoup_matmul.mlir

Result: https://gist.github.com/j2kun/75b676fe90d5ba1f2c67ad1a2409406b

Open questions

  • Should we have the layout persisted on the secret type? I think we may be able to get away with running the pass to persist attributes per op, and use the op-aware type conversion established in Hongren's mgmt pass work. I don't actually use the added secret.layout annotation in this PR at all.
  • There will be cases where the layout propagation cannot be determined without assuming how the op will be lowered. For example, the tensor_ext.sum op assumes rotate-and-reduce is performed in the obvious way, but it could be done differently. A more complicated example is linalg.matmul with both inputs encrypted, for which a dozen or so academic papers propose varying methods. While I can reproduce Fhelipe or Orion by assuming fixed implementations of linalg ops, the big open research question is: how can we jointly optimize the layout and the lowering kernel of an op? I suspect that simply won't be able to use layout propagation as defined in this PR, and will have to do a global optimization from scratch.

Notes

  • Currently I added a tensor_ext.sum op, which I think I will remove in favor of the linalg.reduce op. I wanted to have at least one op in this PR that actually modifies the layout attribute, and sum is probably the easiest one to understand while doing something meaningful (see the sketch after this list).
  • Currently a matvec/vecmat op doesn't change the layout of the input vector, since Halevi-Shoup/squat packing methods maintain layout invariance of the vector. However, if the input vector has a non-row-major encoding (e.g., strided), then we would need to lower the operation in a special way, either by customizing the diagonal packing of the plaintext matrix to match, or else by inserting a compactification op (layout conversion) on the vector.
  • I wanted to rework the Halevi-Shoup matmul example, but noticed it's using the linalg.matmul op instead of matvec/vecmat. I looked into it and found it was very easy to add a drop-unit-dims transform that converts matmul to matvec/vecmat when the appropriate dimensions are 1. I think this would be better long-term, and I will port the Halevi-Shoup lowerings to use matvec/vecmat instead of matmul.
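Since a reduction is the motivating layout-modifying op above, here is a hedged sketch (my own illustration, not output of the pass) of how summing a row-major 32x32 tensor along each dimension could rewrite the layout map, assuming the obvious rotate-and-reduce kernel:

```mlir
// Row-major layout of the 32x32 input: element (i, j) lives in slot 32*i + j.
#input_layout = affine_map<(d0, d1) -> (d0 * 32 + d1)>

// Summing along dim 0 (adding rows together) uses rotations by multiples of
// 32; the 32 column sums land in slots 0..31, i.e. a dense result layout:
//   affine_map<(d0) -> (d0)>
// Summing along dim 1 (within each row) uses rotations by 1, 2, 4, 8, 16;
// each row's total lands in that row's first slot, i.e. a strided layout:
//   affine_map<(d0) -> (d0 * 32)>
```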

@j2kun j2kun marked this pull request as draft January 14, 2025 21:40
@j2kun j2kun force-pushed the layout-attr branch 3 times, most recently from c942df4 to f796e2d on January 17, 2025 05:51
@j2kun j2kun force-pushed the layout-attr branch 3 times, most recently from 818383c to dd5197a on January 21, 2025 16:44
@j2kun j2kun force-pushed the layout-attr branch 2 times, most recently from af8ea16 to 4f040e1 on January 21, 2025 21:37
Comment on lines 28 to 29
// result of sum has row-major layout, i.e., with don't cares at the end
// (1, 2, ..., 32, x, x, x, x, ..., x)
Collaborator:

Related to #1258

Rotate-and-reduce code (for example, summing a tensor<32x32xi16> to a tensor<32xi16>, then summing to i16) would assume cyclic repetition for the intermediate layout (d0 -> d0); that is, the tail is not "don't care" but meaningful data. Could we describe that?

Collaborator Author:

This is a good question. affine_map can't natively represent the semantics of what lies outside the image of the layout map. It seems reasonable to me that we could require cyclic repetition by default for all layouts that don't fill up their ciphertexts.

I can think of a few obstacles to doing this:

  • It requires the data to have dimensions divisible by a power of 2 (so that we can get at least one full repetition; a partial repetition would be invalid), though I think most compilers pad non-power-of-two-sized inputs anyway.
  • Some op may have a kernel that doesn't produce cyclic repetition in its output layout, in which case we would have to insert an op (likely during lowering) to restore the invariant. The overhead could add up to unacceptable performance.

But I'm willing to forge ahead with this assumption anyway, since if/when it becomes a problem, we can wrap the layout attribute in a larger attribute that includes an enum specifying the semantics.
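To illustrate the distinction being discussed (a made-up example, not from the PR): consider a tensor<4xi16> holding [a, b, c, d], packed into an 8-slot ciphertext with layout affine_map<(d0) -> (d0)>.

```mlir
// Two possible readings of the slots outside the image of the layout map:
//   "don't care" tail:   [a, b, c, d, ?, ?, ?, ?]   (? = arbitrary junk)
//   cyclic repetition:   [a, b, c, d, a, b, c, d]
// A downstream kernel whose rotations reach past index 3 (e.g., a
// whole-ciphertext rotate-and-reduce) sees junk in the first case and
// meaningful replicas in the second, which is why the semantics needs to be
// fixed by convention or recorded alongside the layout.
```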

@j2kun j2kun force-pushed the layout-attr branch 5 times, most recently from 159def7 to 9d6144b on February 1, 2025 00:17
@j2kun j2kun requested a review from asraa February 1, 2025 00:17
@j2kun (Collaborator Author) commented Feb 1, 2025

Did a bit of cleanup and left some TODOs so we can get the first big piece of this PR in.

New stuff since last comment:

  • Added a tensor_ext.assign_layout op (see the sketch after this list).
  • Removed tensor_ext.sum and replaced it with linalg.reduce.
  • Completed the layout propagation pass for linalg.reduce.
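For context, here is a rough sketch of where assign_layout fits as I read it; the op and attribute syntax below is illustrative only, and the purpose ascribed to the op is my assumption — see the PR diff for the actual definition.

```mlir
#row_major = affine_map<(d0) -> (d0)>

// A cleartext value entering the secret computation has no layout assigned
// yet; assign_layout materializes one so that later ops see compatible
// layouts on all of their operands.
%ones = arith.constant dense<1> : tensor<32xi16>
%ones_rm = tensor_ext.assign_layout %ones {layout = #row_major} : tensor<32xi16>
```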

Followup work:

  • Finish the parts of the layout propagation pass required for matvec/vecmat
  • Enable drop-unit-dims and lower the newly-emitted tensor ops like collapse_shape and expand_shape
  • Add a new lowering that converts data tensor types to "ciphertext tensor" types that align with the image of the layout map
  • Lower convert_layout (Lower tensor_ext.permute via Vos-Vos-Erkin shift networks #1286)
  • Add back-propagation pass to lift layout conversions earlier in the pipeline

@j2kun j2kun marked this pull request as ready for review February 1, 2025 00:20
@ZenithalHourlyRate (Collaborator) left a comment:

Happy to see this! Just some comments on the MLIR side, others LGTM!

lib/Dialect/TensorExt/IR/TensorExtAttributes.td (outdated review thread; resolved)
lib/Transforms/LayoutPropagation/LayoutPropagation.td (outdated review thread; resolved)
@j2kun j2kun force-pushed the layout-attr branch 4 times, most recently from c2e9461 to 609cbf3 on February 6, 2025 17:57
@asraa (Collaborator) left a comment:

Thanks! Just a few very quick questions and I'll mark this as approved for you to submit.

@@ -0,0 +1,114 @@
#ifndef LIB_TRANSFORMS_LAYOUTPROPAGATION_LAYOUTPROPAGATION_TD_
Collaborator:

Could you place this transform under lib/Dialect/TensorExt/Transforms? If a transform is pretty dialect-specific, I think we'd like to put it there, to avoid making lib/Transforms/ such a mixed bag (@AlexanderViand-Intel and I still had leftover uncategorized transforms there, but this one seems pretty clearly under TensorExt).

Collaborator Author:

I actually had it there and decided, because of the dependence on ops from tensor, tensor_ext and linalg, that it should be considered a general transform not specific to one dialect. There will definitely be more linalg ops coming (convolution, matmul, matvec) and I could also see it supporting layout assignment for higher-level ML ops if they are present.

lib/Transforms/LayoutPropagation/LayoutPropagation.td (outdated review thread; resolved)
lib/Transforms/LayoutPropagation/LayoutPropagation.cpp (outdated review thread; resolved)
@j2kun j2kun force-pushed the layout-attr branch 2 times, most recently from 9cb2bc6 to b488f00 on February 6, 2025 19:07
@j2kun j2kun requested a review from asraa February 6, 2025 20:14
@asraa (Collaborator) left a comment:

thanks!

@asraa asraa added the pull_ready label Feb 7, 2025
@copybara-service copybara-service bot merged commit bcd2fec into google:main Feb 8, 2025
9 checks passed