Add initial layout selection framework #1264
Conversation
Force-pushed from c942df4 to f796e2d
Force-pushed from 818383c to dd5197a
Force-pushed from af8ea16 to 4f040e1
// result of sum has row-major layout, i.e., with don't cares at the end
// (1, 2, ..., 32, x, x, x, x, ..., x)
Related to #1258
rotate-and-reduce code (for example, summing tensor<32x32xi16> to tensor<32xi16>, then summing that to i16) would assume cyclic repetition for the middle layout (d0 -> d0), i.e., the tail is not "don't care" but meaningful data. Could we describe that?
This is a good question. affine_map can't natively represent the semantics of what lies outside the image of the layout map. It seems reasonable to me that we could require cyclic repetition by default for all layouts that don't fill up their ciphertexts.
I can think of a few obstacles to doing this:
- It requires the data to have dimensions divisible by a power of 2 (so that we can get at least one full repetition; a partial repetition would be invalid), though I think most compilers pad non-power-of-two-sized inputs anyway.
- Some ops may have a kernel that doesn't produce cyclic repetition in its output layout, in which case we would have to insert an op (likely during the lowering) to restore this invariant. The overhead could add up to an unacceptable performance cost.
But I'm willing to forge ahead with this assumption anyway, since if/when it becomes a problem, we can wrap the layout attribute in a larger attribute that includes an enum specifying the semantics.
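For concreteness, here is a rough plaintext sketch of the two tail semantics being discussed: "don't care" padding versus cyclic repetition of a 32-element vector packed into a larger slot vector. The slot count and helper names are hypothetical illustrations, not HEIR APIs.

```python
# Illustrative sketch only: the two possible tail semantics for a
# 32-element vector packed into a ciphertext with more slots.
# NUM_SLOTS and the helper names are hypothetical, not HEIR APIs.

NUM_SLOTS = 1024  # assumed ciphertext slot count

def pack_with_dont_cares(data, num_slots=NUM_SLOTS):
    """Row-major layout: data in the leading slots, unspecified tail.

    Here the tail is filled with zeros, but under "don't care" semantics
    any value is permitted there.
    """
    return data + [0] * (num_slots - len(data))

def pack_with_cyclic_repetition(data, num_slots=NUM_SLOTS):
    """Cyclic repetition: the data repeats to fill every slot.

    Requires len(data) to divide num_slots so every copy is complete.
    """
    assert num_slots % len(data) == 0, "need a whole number of repetitions"
    return data * (num_slots // len(data))

if __name__ == "__main__":
    vec = list(range(1, 33))                      # (1, 2, ..., 32)
    print(pack_with_dont_cares(vec)[:40])          # 1..32 then zeros ("x"s)
    print(pack_with_cyclic_repetition(vec)[:40])   # 1..32, then 1, 2, ... again
```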
Force-pushed from 159def7 to 9d6144b
Did a bit of cleanup and left some TODOs so we can get the first big piece of this PR in. New stuff since last comment:
Followup work:
Happy to see this! Just some comments on the MLIR side; the rest LGTM!
Force-pushed from c2e9461 to 609cbf3
Thanks! Just a few quick questions and I'll mark as approved for you to submit.
@@ -0,0 +1,114 @@
#ifndef LIB_TRANSFORMS_LAYOUTPROPAGATION_LAYOUTPROPAGATION_TD_ |
Could you place this transform under lib/Dialect/TensorExt/Transforms? If a transform is pretty dialect-specific I think we'd like to put it there, to avoid making lib/Transforms/ such a mixed bag (@AlexanderViand-Intel and I still have some leftover uncategorized transforms there, but this one seems to belong pretty clearly under TensorExt).
I actually had it there and decided, because of the dependence on ops from tensor, tensor_ext, and linalg, that it should be considered a general transform not specific to one dialect. There will definitely be more linalg ops coming (convolution, matmul, matvec), and I could also see it supporting layout assignment for higher-level ML ops if they are present.
Force-pushed from 9cb2bc6 to b488f00
thanks!
This PR adds an initial ciphertext layout selection pass.
This involves the following:
This pass does not optimize for the selected layouts. Instead, I expect to define additional passes that optimize, while allowing this pass to double as an analysis pass (persisting layout information to ops as attributes) as well as an initial layout selection pass. As such, the logic in this pass is to convert all secret tensor operands of an op to have the layout of the op's first operand.
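A rough sketch of that rule in Python pseudocode (the Op class, value names, and layout strings are hypothetical stand-ins for the real MLIR structures, not the actual pass):

```python
# Hypothetical sketch of the propagation rule described above: every secret
# tensor operand is converted to the layout of the op's first operand.
# Op and the layout map are made-up stand-ins, not HEIR APIs.
from dataclasses import dataclass, field

@dataclass
class Op:
    name: str
    operands: list                      # SSA value names (strings) in this sketch
    results: list = field(default_factory=list)

def propagate_layouts(op, layouts):
    """Make all operands share the first operand's layout; tag the results.

    `layouts` maps value names to layout descriptors and is updated in place.
    Returns the (value, target_layout) pairs that would need a convert_layout
    op materialized during lowering.
    """
    conversions = []
    if not op.operands:
        return conversions
    target = layouts[op.operands[0]]
    for operand in op.operands[1:]:
        if layouts.get(operand) != target:
            conversions.append((operand, target))
    for result in op.results:
        layouts[result] = target
    return conversions

if __name__ == "__main__":
    layouts = {"%a": "row-major", "%b": "diagonal"}
    add = Op("arith.addi", operands=["%a", "%b"], results=["%c"])
    print(propagate_layouts(add, layouts))  # [('%b', 'row-major')]
    print(layouts["%c"])                    # row-major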
This PR also does not yet lower vecmat/matvec using Halevi-Shoup, nor does it lower convert_layout (which will require its own effort to implement a shift network optimization algorithm). I wanted to get feedback from the community on the direction as it relates to the layout attribute and propagation before doing too much additional work.
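For context, the Halevi-Shoup approach referenced above packs the matrix by generalized diagonals so that a matrix-vector product becomes a sequence of rotations, pointwise multiplications, and additions. A plaintext Python sketch of that arithmetic (modeling the kernel, not the HEIR lowering):

```python
# Plaintext sketch of the Halevi-Shoup diagonal method for a square
# matrix-vector product; it models the arithmetic a ciphertext kernel
# would perform, not the actual lowering in this PR.

def rotate(vec, k):
    """Cyclic left rotation by k slots."""
    k %= len(vec)
    return vec[k:] + vec[:k]

def halevi_shoup_matvec(matrix, vec):
    """Compute matrix @ vec using the diagonal layout.

    matrix is an n x n list of lists; vec has length n. The i-th
    generalized diagonal is diag_i[j] = matrix[j][(j + i) % n].
    """
    n = len(vec)
    result = [0] * n
    for i in range(n):
        diag = [matrix[j][(j + i) % n] for j in range(n)]
        rotated = rotate(vec, i)
        result = [r + d * v for r, d, v in zip(result, diag, rotated)]
    return result

if __name__ == "__main__":
    m = [[1, 2], [3, 4]]
    v = [5, 6]
    assert halevi_shoup_matvec(m, v) == [17, 39]  # matches the plain matvec
```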
Example commands
An IR that requires layout conversion ops to be inserted
bazel run --noallow_analysis_cache_discard //tools:heir-opt -- --layout-propagation $PWD/tests/Transforms/layout_propagation/insert_conversion.mlir
Result: https://gist.github.com/j2kun/eef84273a83893a992694c219b4dde3f
Layout assignments for Halevi-Shoup matvec mul:
bazel run //tools:heir-opt -- --secretize --wrap-generic --drop-unit-dims --canonicalize --cse --linalg-canonicalizations --layout-propagation $PWD/tests/Examples/openfhe/halevi_shoup_matmul.mlir
Result: https://gist.github.com/j2kun/75b676fe90d5ba1f2c67ad1a2409406b
Open questions
layout-propagation as defined in this PR, and will have to do a global optimization from scratch.
Notes