Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise SparsePauliOp.from_operator #11557

Merged
merged 4 commits into from
Oct 31, 2024

Conversation

jakelishman
Copy link
Member

@jakelishman jakelishman commented Jan 13, 2024

Summary

This rewrites the from_operator handling (again!) from the initial Rust implementation of the recursive matrix-addition form into an iterative approach that re-uses the same scratch memory all the way down. This is significantly faster, and allocates far less often, although in practice the peak heap memory usage will be not dissimilar.

The algorithm is rewritten to be a manual stack-based iteration, rather than a functional recursion. The size of a single stack entry in the iteration is one usize, which is drastically smaller than whatever per-function-call stack will have been used before.

Details and comments

This is the rewrite alluded to in #11133. Using the same timing script, the new graph looks like:

Figure_1

where the absolute timing of the the 10q operator is ~40ms compared to ~227ms, so it's a 5-6x speedup at that scale (for a fully dense operator). Decompositions of up to 4q operators are now entirely lost in the noise of the construction time of SparsePauliOp (which tbh probably says more about the overheads of the quantum_info module than anything else...).

Still no parallelism here, but honestly, I'm not even sure it's worth putting in the time to do that other than casual interest.

@jakelishman jakelishman added performance Changelog: New Feature Include in the "Added" section of the changelog mod: quantum info Related to the Quantum Info module (States & Operators) Rust This PR or issue is related to Rust code in the repository labels Jan 13, 2024
@jakelishman jakelishman added this to the 1.0.0 milestone Jan 13, 2024
@jakelishman jakelishman requested a review from a team as a code owner January 13, 2024 02:42
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

@coveralls
Copy link

coveralls commented Jan 13, 2024

Pull Request Test Coverage Report for Build 11597689008

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 316 of 338 (93.49%) changed or added relevant lines in 1 file are covered.
  • 69 unchanged lines in 8 files lost coverage.
  • Overall coverage increased (+0.05%) to 88.721%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/accelerate/src/sparse_pauli_op.rs 316 338 93.49%
Files with Coverage Reduction New Missed Lines %
qiskit/circuit/library/iqp.py 1 96.15%
crates/accelerate/src/two_qubit_decompose.rs 1 92.09%
crates/accelerate/src/unitary_synthesis.rs 1 92.2%
crates/qasm2/src/lex.rs 4 92.73%
qiskit/compiler/assembler.py 5 96.32%
qiskit/compiler/transpiler.py 7 92.93%
qiskit/transpiler/preset_passmanagers/generate_preset_pass_manager.py 14 92.0%
qiskit/providers/basic_provider/basic_simulator.py 36 87.76%
Totals Coverage Status
Change from base Build 11591703186: 0.05%
Covered Lines: 75712
Relevant Lines: 85337

💛 - Coveralls

@jakelishman jakelishman removed the request for review from nonhermitian January 13, 2024 15:30
@mtreinish mtreinish modified the milestones: 1.0.0, 1.1.0 Jan 23, 2024
@mtreinish mtreinish self-assigned this Mar 22, 2024
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also a merge conflict now that #11388 merged

@jakelishman
Copy link
Member Author

Now rebased over main. Apparently I touched a couple of unrelated comments at some point that I mistakenly thought were part of this PR not a previous one, but oh well.

@mtreinish mtreinish assigned jlapeyre and unassigned mtreinish May 1, 2024
@mtreinish mtreinish modified the milestones: 1.1.0, 1.2.0 May 2, 2024
@mtreinish mtreinish modified the milestones: 1.2.0, 1.3.0 Jul 31, 2024
@ShellyGarion ShellyGarion self-assigned this Sep 23, 2024
@ShellyGarion
Copy link
Member

ShellyGarion commented Sep 25, 2024

Before looking into this very smart, efficient and complicated code, I looked into the tests, and found only one very simple test for the SparsePauliOp.from_operator method (which tests only operators which are 2-qubit Paulis with coefficient=1):

Since this PR (as well the one before it #11133) replaced a simple code of about ~10 lines in Python by a quite complicated and much longer code in Rust, I would suggest to enhance the test code by some tests of the following form:

labels = ["XI", "YZ", "YY", "ZZ"]  # it's better to provide a longer list of labels, preferably 3-4 qubits  
coeffs = [-3, 4.4j, 0.2 - 0.1j, 66.12] # provide a random list of coefficients, include also complex numbers
target = np.zeros((4, 4), dtype=complex)
for coeff, label in zip(coeffs, labels):
    target += coeff * pauli_mat(label)
op = SparsePauliOp.from_operator(Operator(target))
# assert that op and target are the same (up to a permutation)

There are also several methods that tests the to_matrix and to_operator methods - it's also possible to add a from_operator method to these tests (although all of them use the same parameters given above):


Copy link
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very optimized version of the original code (~10 lines in Python) as well as the code in the paper of HBG (<100 lines in Python).
The code is of high quality, and so my main concern is that these optimizations should be better explained and documented, and some further tests should be added (see my previous comment).

crates/accelerate/src/sparse_pauli_op.rs Outdated Show resolved Hide resolved
out_list.push(PauliLocation::begin(num_qubits));
}
_ => {
unsafe { scratch.set_len(side * side) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to test that there is no memory leakage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can write a Rust-space test, since that'll cause all this code to be run under Miri, which checks this kind of thing. In the mean time I made this scratch.set_len(scratch.capacity()) so it's clear that it's not setting it wider than is allocated.

crates/accelerate/src/sparse_pauli_op.rs Show resolved Hide resolved
// This set of `extend` calls is effectively an 8-fold unrolling of the "natural" loop through
// each bit, where the initial `if` statements are handling the remainder (the up-to 7
// least-significant bits). In practice, it's probably unlikely that people are decomposing
// 16q+ operators, since that's a pretty huge matrix already.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that an operator will have many qubits but with a few Pauli terms (sparse)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, very much so. Everything here should handle those cases fine, though.

crates/accelerate/src/sparse_pauli_op.rs Outdated Show resolved Hide resolved
This rewrites the `from_operator` handling (again!) from the initial
Rust implementation of the recursive matrix-addition form into an
iterative approach that re-uses the same scratch memory all the way
down.  This is significantly faster, and allocates far less often,
although in practice the peak heap memory usage will be not dissimilar.

The algorithm is rewritten to be a manual stack-based iteration, rather
than a functional recursion.  The size of a single stack entry in the
iteration is one `usize`, which is drastically smaller than whatever
per-function-call stack will have been used before.
@jakelishman
Copy link
Member Author

I've updated all the documentation - I'll add some more tests tomorrow.

@jakelishman
Copy link
Member Author

Ok, I've pushed up a handful more tests in 80a3e87, which should substantially increase the mathematical coverage, and refactored a small amount to make it possible to write a couple of simple Rust-space tests so all the unsafe code gets run under Miri as well.

Copy link
Member

@ShellyGarion ShellyGarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@jakelishman jakelishman added this pull request to the merge queue Oct 31, 2024
Merged via the queue into Qiskit:main with commit 4e6fd36 Oct 31, 2024
17 checks passed
@jakelishman jakelishman deleted the faster-pauli-decomposition branch October 31, 2024 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: New Feature Include in the "Added" section of the changelog mod: quantum info Related to the Quantum Info module (States & Operators) performance Rust This PR or issue is related to Rust code in the repository
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants