---
title: "Quantum computing"
filename: "lec_26_quantum_computing"
chapternum: "22"
---
- See main aspects in which quantum mechanics differs from local deterministic theories.
- The model of quantum circuits, or equivalently QNAND-CIRC programs.
- The complexity class $\mathbf{BQP}$ and what we know about its relation to other classes.
- Ideas behind Shor's Algorithm and the Quantum Fourier Transform.
"We always have had (secret, secret, close the doors!) ... a great deal of difficulty in understanding the world view that quantum mechanics represents ... It has not yet become obvious to me that there's no real problem. ... Can I learn anything from asking this question about computers--about this may or may not be mystery as to what the world view of quantum mechanics is?" , Richard Feynman, 1981
"The only difference between a probabilistic classical world and the equations of the quantum world is that somehow or other it appears as if the probabilities would have to go negative", Richard Feynman, 1981
There were two schools of natural philosophy in ancient Greece. Aristotle believed that objects have an essence that explains their behavior, and a theory of the natural world has to refer to the reasons (or "final cause" to use Aristotle's language) as to why they exhibit certain phenomena. Democritus believed in a purely mechanistic explanation of the world. In his view, the universe was ultimately composed of elementary particles (or Atoms) and our observed phenomena arise from the interactions between these particles according to some local rules. Modern science (arguably starting with Newton) has embraced Democritus' point of view, of a mechanistic or "clockwork" universe of particles and forces acting upon them.
While the classification of particles and forces evolved with time, to a large extent the "big picture" has not changed from Newton till Einstein.
In particular it was held as an axiom that if we knew fully the current state of the universe (i.e., the particles and their properties such as location and velocity) then we could predict its future state at any point in time.
In computational language, in all these theories the state of a system with $n$ particles can be encoded by an array of roughly $n$ numbers, and predicting the system's future state amounts to running a deterministic computation on this array.
Alas, in the beginning of the 20th century, several experimental results were calling into question this "clockwork" or "billiard ball" theory of the world. One such experiment is the famous double slit experiment. Here is one way to describe it. Suppose that we buy one of those baseball pitching machines, and aim it at a soft plastic wall, but put a metal barrier with a single slit between the machine and the plastic wall (see doublebaseballfig{.ref}). If we shoot baseballs at the plastic wall, then some of the baseballs would bounce off the metal barrier, while some would make it through the slit and dent the wall. If we now carve out an additional slit in the metal barrier then more balls would get through, and so the plastic wall would be even more dented.
So far this is pure common sense, and it is indeed (to my knowledge) an accurate description of what happens when we shoot baseballs at a plastic wall. However, this is not the same when we shoot photons. Amazingly, if we shoot with a "photon gun" (i.e., a laser) at a wall equipped with photon detectors through some barrier, then (as shown in doubleslitfig{.ref}) in some positions of the wall we will see fewer hits when the two slits are open than when only one of them is! In particular, there are positions in the wall that are hit when the first slit is open, hit when the second slit is open, but are not hit at all when both slits are open!
It seems as if each photon coming out of the gun is aware of the global setup of the experiment, and behaves differently if two slits are open than if only one is. If we try to "catch the photon in the act" and place a detector right next to each slit so we can see exactly the path each photon takes then something even more bizarre happens. The mere fact that we measure the path changes the photon's behavior, and now this "destructive interference" pattern is gone and the number of times a position is hit when two slits are open is the sum of the number of times it is hit when each slit is open.
You should read the paragraphs above more than once and make sure you appreciate how truly mind boggling these results are.
The double slit and other experiments ultimately forced scientists to accept a very counterintuitive picture of the world. It is not merely about nature being randomized, but rather it is about the probabilities in some sense "going negative" and cancelling each other!
To see what we mean by this, let us go back to the baseball experiment.
Suppose that the probability a ball passes through the left slit is $p_L$ and the probability that it passes through the right slit is $p_R$. Then, when both slits are open, we expect the wall to be hit with probability $p_L + p_R$.
To understand the way we model this in quantum mechanics, it is helpful to think of a "lazy evaluation" approach to probability. We can think of a probabilistic experiment such as shooting a baseball through two slits in two different ways:
- When a ball is shot, "nature" tosses a coin and decides if it will go through the left slit (which happens with probability $p_L$), the right slit (which happens with probability $p_R$), or bounce back. If it passes through one of the slits then it will hit the wall. Later we can look at the wall and find out whether or not this event happened, but the fact that the event happened or not is determined independently of whether or not we look at the wall.

- The other viewpoint is that when a ball is shot, "nature" computes the probabilities $p_L$ and $p_R$ as before, but does not yet "toss the coin" and determine what happened. Only when we actually look at the wall does nature toss a coin, and with probability $p_L+p_R$ ensure we see a dent. That is, nature uses "lazy evaluation", and only determines the result of a probabilistic experiment when we decide to measure it.
While the first scenario seems much more natural, the end result in both is the same (the wall is hit with probability $p_L + p_R$), and so the question of which scenario "really" occurs might seem like a purely philosophical one.
However, when we want to describe the double slit experiment with photons rather than baseballs, it is the second scenario that lends itself better to a quantum generalization.
Quantum mechanics associates a number known as an amplitude with each possible outcome of an experiment. Unlike a probability, an amplitude can be negative (and in fact even complex), and the probability of observing an outcome is obtained by squaring its amplitude.
Specifically, consider an event that can either occur or not (e.g. "detector number 17 was hit by a photon").
In classical probability, we model this by a probability distribution over the two outcomes: a pair of non-negative numbers $p$ and $q$ with $p+q=1$, where $p$ is the probability the event occurs and $q$ is the probability it does not. In quantum mechanics, we instead model it by a pair of (possibly negative) numbers $\alpha$ and $\beta$ with $\alpha^2 + \beta^2 = 1$: the event occurs with probability $\alpha^2$ and does not occur with probability $\beta^2$.
::: { .pause } If you don't find the above description confusing and unintuitive, you probably didn't get it. Please make sure to re-read the above paragraphs until you are thoroughly confused. :::
Quantum mechanics is a mathematical theory that allows us to calculate and predict the results of the double-slit and many other experiments.
If you think of quantum mechanics as an explanation as to what "really" goes on in the world, it can be rather confusing.
However, if you simply "shut up and calculate" then it works amazingly well at predicting experimental results.
In particular, in the double slit experiment, for any position in the wall, we can compute numbers $\alpha$ and $\beta$ such that if only the first or only the second slit is open, then the position is hit with probability $\alpha^2$ or $\beta^2$ respectively, while if both slits are open it is hit with probability $(\alpha+\beta)^2$. If $\alpha$ and $\beta$ have opposite signs, this last quantity can be smaller than both $\alpha^2$ and $\beta^2$, and can even equal zero.
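As a small numerical illustration of this kind of cancellation (a sketch with made-up amplitude values, not ones derived from any actual experiment), consider:

```python
# Illustrative amplitudes for a photon reaching one fixed position on the
# wall via slit 1 and via slit 2. The values are made up for illustration.
alpha = 0.6    # amplitude via slit 1
beta = -0.6    # amplitude via slit 2: amplitudes, unlike probabilities,
               # can be negative

p_slit1_only = alpha ** 2          # hit probability with only slit 1 open
p_slit2_only = beta ** 2           # hit probability with only slit 2 open
p_both_open = (alpha + beta) ** 2  # hit probability with both slits open

print(p_slit1_only, p_slit2_only, p_both_open)
# Opening the second slit *reduces* the hits at this position to zero,
# something that cannot happen if probabilities simply add up.
```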
::: {.remark title="Complex vs real, other simplifications" #complexrem} If (like the author) you are a bit intimidated by complex numbers, don't worry: you can think of all amplitudes as real (though potentially negative) numbers without loss of understanding. All the "magic" of quantum computing already arises in this case, and so we will often restrict attention to real amplitudes in this chapter.
We will also only discuss so-called pure quantum states, and not the more general notion of mixed states. Pure states turn out to be sufficient for understanding the algorithmic aspects of quantum computing.
More generally, this chapter is not meant to be a complete description of quantum mechanics, quantum information theory, or quantum computing, but rather illustrate the main points where these differ from classical computing. :::
Linear algebra underlies much of quantum mechanics, and so you would do well to review some of the basic notions such as vectors, matrices, and linear subspaces. The operations in quantum mechanics can be represented as linear functions over the complex numbers, but we stick to the real numbers in this chapter. This does not cause much loss in understanding but does allow us to simplify our notation and eliminate the use of the complex conjugate.
The main notions we use are:
- A function $F:\R^N \rightarrow \R^N$ is linear if $F(\alpha u + \beta v) = \alpha F(u) + \beta F(v)$ for every $\alpha,\beta \in \R$ and $u,v \in \R^N$.

- The inner product of two vectors $u,v \in \R^N$ can be defined as $\langle u,v \rangle = \sum_{i\in [N]} u_i v_i$. (There can be different inner products but we stick to this one.) The norm of a vector $u \in \R^N$ is defined as $|u| = \sqrt{\langle u,u \rangle} = \sqrt{\sum_{i\in [N]}u_i^2}$. We say that $u$ is a unit vector if $|u|=1$.

- Two vectors $u,v \in \R^N$ are orthogonal if $\langle u,v\rangle = 0$. An orthonormal basis for $\R^N$ is a set of $N$ vectors $v_0,v_1,\ldots, v_{N-1}$ such that $|v_i|=1$ for every $i\in [N]$ and $\langle v_i,v_j \rangle=0$ for every $i\neq j$. A canonical example is the standard basis $e_0,\ldots,e_{N-1}$, where $e_i$ is the vector that has zeroes in all coordinates except the $i$-th coordinate, in which its value is $1$. A quirk of the quantum mechanics literature is that $e_i$ is often denoted by $|i \rangle$. We also often look at the case $N=2^n$, in which case we identify $[N]$ with $\{0,1\}^n$, and for every $x\in \{0,1\}^n$, we denote the standard basis element corresponding to the $x$-th coordinate by $|x \rangle$.

- If $u$ is a vector in $\R^N$ and $v_0,\ldots,v_{N-1}$ is an orthonormal basis for $\R^N$, then there are coefficients $\alpha_0,\ldots,\alpha_{N-1}$ such that $u = \alpha_0 v_0 + \cdots + \alpha_{N-1}v_{N-1}$. Consequently, for linear $F$, the value $F(u)$ is determined by the values $F(v_0),\ldots,F(v_{N-1})$. Moreover, $|u| = \sqrt{\sum_{i\in [N]} \alpha_i^2}$.

- We can represent a linear function $F:\R^N \rightarrow \R^N$ as an $N\times N$ matrix $M(F)$, where the coordinate in the $i$-th row and $j$-th column of $M(F)$ (that is, $M(F)_{i,j}$) is equal to $\langle e_i , F(e_j) \rangle$, or equivalently the $i$-th coordinate of $F(e_j)$.

- A linear function $F:\R^N \rightarrow \R^N$ such that $|F(u)| = |u|$ for every $u$ is called unitary. It can be shown that a function $F$ is unitary if and only if $M(F) M(F)^\top = I$, where $\top$ is the transpose operator (in the complex case, the conjugate transpose) and $I$ is the $N\times N$ identity matrix that has $1$'s on the diagonal and zeroes everywhere else. (For every two matrices $A,B$, we use $AB$ to denote the matrix product of $A$ and $B$.) Another equivalent characterization of this condition is that $M(F)^\top = M(F)^{-1}$, and yet another is that both the rows and columns of $M(F)$ form an orthonormal basis.
There is something weird about quantum mechanics. In 1935 Einstein, Podolsky and Rosen (EPR) tried to pinpoint this issue by highlighting a previously unrealized corollary of this theory. They showed that the idea that nature does not determine the results of an experiment until it is measured results in so-called "spooky action at a distance". Namely, making a measurement of one object may instantaneously affect the state (i.e., the vector of amplitudes) of another object at the other end of the universe.
Since the vector of amplitudes is just a mathematical abstraction, the EPR paper was considered to be merely a thought experiment for philosophers to be concerned about, without bearing on experiments. This changed when in 1964 John Bell proposed an actual experiment to test the predictions of EPR and hence pit intuitive common sense against quantum mechanics. Quantum mechanics won: it turns out that it is in fact possible to use measurements to create correlations between the states of objects far removed from one another that cannot be explained by any prior theory. Nonetheless, since the results of these experiments are so obviously wrong to anyone who has ever sat in an armchair, there are still a number of Bell denialists arguing that this can't be true and quantum mechanics is wrong.
So, what is this Bell's Inequality?
Suppose that Alice and Bob try to convince you they have telepathic ability, and they aim to prove it via the following experiment.
Alice and Bob will be in separate closed rooms.^[If you are extremely paranoid about Alice and Bob communicating with one another, you can coordinate with your assistant to perform the experiment exactly at the same time, and make sure that the rooms are sufficiently far apart (e.g., are on two different continents, or maybe even one is on the moon and another is on earth) so that Alice and Bob couldn't communicate to each other in time the results of their respective coins even if they do so at the speed of light.]
You will interrogate Alice and your associate will interrogate Bob.
You choose a random bit $x\in \{0,1\}$ and your associate chooses a random bit $y\in \{0,1\}$. You ask Alice for a bit $a$, and your associate asks Bob for a bit $b$ (Alice sees only $x$, and Bob sees only $y$). Alice and Bob win the experiment if $a \oplus b = x \wedge y$: that is, their outputs should disagree if $x=y=1$ and agree otherwise.
Now if Alice and Bob are not telepathic, then they need to agree in advance on some strategy.
It's not hard for Alice and Bob to succeed with probability $3/4$: for example, they can simply always output $a=b=0$, which wins in the three cases where $x \wedge y = 0$. It turns out this is the best they can do:
::: {.theorem title="Bell's Inequality" #bellthm}
For every two functions $f,g:\{0,1\} \rightarrow \{0,1\}$, $\Pr_{x,y \in \{0,1\}}[f(x) \oplus g(y) = x \wedge y] \leq 3/4$.
:::
::: {.proof data-ref="bellthm"}
Since the probability is taken over all four choices of $x,y \in \{0,1\}$, the only way Alice and Bob could win with probability higher than $3/4$ is if they win in all four cases, which would mean that $f(x) \oplus g(y) = x \wedge y$
for all the four choices of $(x,y)$. In other words, $f(0) \oplus g(0) = 0$, $f(0) \oplus g(1) = 0$, $f(1) \oplus g(0) = 0$, and $f(1) \oplus g(1) = 1$.
If we XOR together the first and second equalities we get $g(0) \oplus g(1) = 0$, while if we XOR together the third and fourth equalities we get $g(0) \oplus g(1) = 1$, which is a contradiction. Hence Alice and Bob cannot win in all four cases.
:::
::: {.remark title="Randomized strategies" #randomizedstrategies}
bellthm{.ref} above assumes that Alice and Bob use deterministic strategies $f$ and $g$. More generally, Alice and Bob could use randomized strategies; however, since the success probability of a randomized strategy is a weighted average of the success probabilities of deterministic strategies, randomization cannot help them win with probability larger than $3/4$.
:::
An amazing experimentally verified fact is that quantum mechanics allows for "telepathy".
Specifically, it has been shown that using the weirdness of quantum mechanics, there is in fact a strategy for Alice and Bob to succeed in this game with probability larger than $3/4$ (in fact, with probability about $0.8$).
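The classical bound of bellthm{.ref} is small enough to verify by brute force: the following sketch enumerates all $16$ deterministic strategies $f,g:\{0,1\} \rightarrow \{0,1\}$ and confirms that none wins with probability above $3/4$.

```python
from itertools import product

# f[x] is Alice's answer on input x; g[y] is Bob's answer on input y.
# The pair wins on (x, y) when f(x) XOR g(y) equals x AND y.
best = 0
for f in product([0, 1], repeat=2):
    for g in product([0, 1], repeat=2):
        wins = sum(1 for x in (0, 1) for y in (0, 1)
                   if f[x] ^ g[y] == x & y)
        best = max(best, wins / 4)
print(best)  # 0.75: no deterministic strategy beats 3/4
```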
Some of the counterintuitive properties that arise from quantum mechanics include:
- Interference - As we've seen, quantum amplitudes can "cancel each other out".

- Measurement - The idea that amplitudes are negative as long as "no one is looking" and "collapse" (by squaring them) to positive probabilities when they are measured is deeply disturbing. Indeed, as shown by EPR and Bell, this leads to various strange outcomes such as "spooky actions at a distance", where we can create correlations between the results of measurements in places far removed. Unfortunately (or fortunately?) these strange outcomes have been confirmed experimentally.

- Entanglement - The notion that two parts of the system could be connected in this weird way where measuring one will affect the other is known as quantum entanglement.
As counter-intuitive as these concepts are, they have been experimentally confirmed, so we just have to live with them.
The discussion in this chapter of quantum mechanics in general and quantum computing in particular is quite brief and superficial; the "bibliographical notes" section (quantumbibnotessec{.ref}) contains references and links to many other resources that cover this material in more depth.
One of the strange aspects of the quantum-mechanical picture of the world is that unlike in the billiard ball example, there is no obvious algorithm to simulate the evolution of $n$ particles over $t$ time periods in $poly(n,t)$ steps. The natural approach requires keeping track of the $2^n$ amplitudes of the system's state, and hence exponential time and space.
In 1981, physicist Richard Feynman proposed to "turn this lemon to lemonade" by making the following almost tautological observation:
If a physical system cannot be simulated by a computer in
$T$ steps, the system can be considered as performing a computation that would take more than$T$ steps.
So, he asked whether one could design a quantum system such that its outcome $y$ based on the initial condition $x$ would be some function $y=f(x)$ such that (a) we don't know how to efficiently compute $f$ in any other way, and (b) $f$ is actually useful for something.
For a while these hypothetical quantum computers seemed useful for one of two things. First, to provide a general-purpose mechanism to simulate a variety of the real quantum systems that people care about, such as various interactions inside molecules in quantum chemistry. Second, as a challenge to the Physical Extended Church Turing Thesis which says that every physically realizable computation device can be modeled (up to polynomial overhead) by Turing machines (or equivalently, NAND-TM / NAND-RAM programs).
Quantum chemistry is important (and in particular understanding it can be a bottleneck for designing new materials, drugs, and more), but it is still a rather niche area within the broader context of computing (and even scientific computing) applications. Hence for a while most researchers (to the extent they were aware of it) thought of quantum computers as a theoretical curiosity that has little bearing on practice, given that this theoretical "extra power" of quantum computers seemed to offer little advantage in the majority of the problems people want to solve in areas such as combinatorial optimization, machine learning, data structures, etc.
To some extent this is still true today. As far as we know, quantum computers, if built, will not provide exponential speedups for 95% of the applications of computing.
In particular, as far as we know, quantum computers will not help us solve $\mathbf{NP}$-complete problems in polynomial time.
However, there is one cryptography-sized exception: In 1994 Peter Shor showed that quantum computers can solve the integer factoring and discrete logarithm problems in polynomial time. This result has captured the imagination of a great many people, and completely energized research into quantum computing. This is both because the hardness of these particular problems provides the foundations for securing such a huge part of our communications (and these days, our economy), and because it was a powerful demonstration that quantum computers could turn out to be useful for problems that a priori seemed to have nothing to do with quantum physics.
As we'll discuss later, at the moment there are several intensive efforts to construct large scale quantum computers.
It seems safe to say that, as far as we know, in the next five years or so there will not be a quantum computer large enough to factor, say, a $1024$-bit number.
::: {.remark title="Quantum computing and $\mathbf{NP}$" #quantumnp}
As noted above, as far as we know quantum computers do not offer a polynomial-time algorithm for $\mathbf{NP}$-complete problems; popular descriptions of quantum computers as "trying all possible solutions in parallel" are misleading.
:::
::: { .bigidea #quantumcomp}
Quantum computers are not a panacea and are unlikely to solve $\mathbf{NP}$-complete problems, but they can provide exponential speedups to certain structured problems.
:::
Before we talk about quantum computing, let us recall how we physically realize "vanilla" or classical computing.
We model a logical bit that can equal $0$ or $1$ by some physical system that can be in one of two states, such as a wire carrying high or low voltage.
In the probabilistic setting, we would model the state of the system by a distribution.
For an individual bit, we could model it by a pair of non-negative numbers $p,q$ such that $p+q=1$, where $p$ is the probability that the bit equals $0$ and $q$ is the probability that it equals $1$.
If we think of the state of this bit as the two-dimensional vector $\binom{p}{q}$, then operations on the bit correspond to linear maps on $\R^2$. For example, the $NOT$ operation (i.e., negating the bit) corresponds to the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.
Applying the $NOT$ operation takes a system in state $\binom{p}{q}$ to a system in state $\binom{q}{p}$, since the probability that $NOT(\sigma)$ equals $1$ is exactly the probability that $\sigma$ equals $0$, and vice versa.
::: { .pause }
Please make sure you understand why performing the $NOT$ operation will take a system in state $\binom{p}{q}$ to a system in state $\binom{q}{p}$.
If your linear algebra is a bit rusty, now would be a good time to review it, and in particular make sure you are comfortable with the notions of matrices, vectors, (orthogonal and orthonormal) bases, and norms. :::
In the quantum setting, the state of an individual bit (or "qubit", to use quantum parlance) is modeled by a pair of numbers $(\alpha,\beta)$ such that $\alpha^2 + \beta^2 = 1$. If we measure the qubit, we observe the value $0$ with probability $\alpha^2$ and the value $1$ with probability $\beta^2$. The numbers $\alpha$ and $\beta$ are known as the amplitudes of the state; crucially, and unlike probabilities, amplitudes can be negative.
Following quantum tradition, instead of using $e_0$ and $e_1$ as we did before, from now on we denote the vector $\binom{1}{0}$ by $|0\rangle$ and the vector $\binom{0}{1}$ by $|1\rangle$, and so we write the state of a qubit as $\alpha|0\rangle + \beta|1\rangle$.
In classical computation, we typically think that there are only two operations that we can do on a single bit: keep it the same or negate it.
In the quantum setting, a single bit operation corresponds to any linear map $OP:\R^2 \rightarrow \R^2$ that is norm preserving (i.e., unitary). A famous example of such a map is the Hadamard operation, corresponding to the matrix $HAD = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$.
In fact it turns out that Hadamard is all that we need to add to a classical universal basis to achieve the full power of quantum computing.
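As a small sketch of the Hadamard operation in action, the following applies the matrix $\tfrac{1}{\sqrt{2}}\begin{pmatrix}1 & 1 \\ 1 & -1\end{pmatrix}$ to the state $|0\rangle$ and then applies it again; in the second application the $|1\rangle$ amplitudes cancel, returning the qubit to $|0\rangle$ with certainty.

```python
import math

s = 1 / math.sqrt(2)
HAD = [[s, s], [s, -s]]

def apply(M, v):
    # Apply a 2x2 matrix to a 2-dimensional state vector.
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

zero = [1.0, 0.0]                             # the state |0>
plus = apply(HAD, zero)                       # (1/sqrt 2)|0> + (1/sqrt 2)|1>
print([round(a ** 2, 6) for a in plus])       # measurement probs: [0.5, 0.5]

back = apply(HAD, plus)                       # apply HAD a second time
print([round(a, 6) for a in back])            # [1.0, 0.0]: back to |0>
```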
If you ignore the physics and philosophy, for the purposes of understanding the model of quantum computers, all you need to know about quantum systems is the following.
The state of a quantum system of $n$ qubits is modeled by a $2^n$-dimensional vector $v$ of unit norm (i.e., the squares of all coordinates sum up to $1$), which we write as $v = \sum_{x\in \{0,1\}^n} v_x |x\rangle$, where $|x\rangle$ is the standard basis vector corresponding to the coordinate $x$ and $v_x$ is the amplitude of the outcome $x$.
A quantum operation on such a system is modeled by a $2^n \times 2^n$ unitary matrix $U$ (i.e., a matrix satisfying $U U^\top = I$). If the system is in state $v$ and we apply to it the operation $U$, then the new state of the system is $Uv$.
When we measure an $n$-qubit system in state $v$, then we observe the value $x\in \{0,1\}^n$ with probability $v_x^2$, and the system collapses to the state $|x\rangle$.
Now that we have the notation in place, we can show a strategy for Alice and Bob to display "quantum telepathy" in Bell's Game.
Recall that in the classical case, Alice and Bob can succeed in the "Bell Game" with probability at most $3/4$. We now show that quantum mechanics allows them to do better:
::: {.theorem title="Bell's Inequality violation" #bellstrategy}
There is a 2-qubit quantum state $s\in \R^4$ so that if Alice has access to the first qubit of $s$ and can manipulate and measure it to output $a$, and Bob has access to the second qubit of $s$ and can manipulate and measure it to output $b$, then $\Pr[a \oplus b = x \wedge y] \geq 0.8$.
:::
::: {.proof data-ref="bellstrategy"} Alice and Bob will start by preparing a 2-qubit quantum system in the state $s = \tfrac{1}{\sqrt{2}}|00\rangle + \tfrac{1}{\sqrt{2}}|11\rangle$
(this state is known as an EPR pair).
Alice takes the first qubit of the system to her room, and Bob takes the second qubit to his room.
Now, when Alice receives $x$, if $x=0$ she does nothing, and if $x=1$ she applies to her qubit the rotation $R_{\pi/8}$, where $R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$. When Bob receives $y$, if $y=0$ he does nothing, and if $y=1$ he applies the rotation $R_{-\pi/8}$ to his qubit. Then each of them measures their qubit and outputs the result.
Recall that to win the game Bob and Alice want their outputs to be more likely to differ if $x=y=1$ and to be more likely to agree otherwise. We will show that this is indeed the case by considering the four possible choices of $x$ and $y$.
Case 1: $x=0$ and $y=0$. In this case neither Alice nor Bob rotates their qubit, and the state remains $\tfrac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. When they measure it, the outcome is either $00$ or $11$ (each with probability $1/2$), and so they agree with probability $1$.
Case 2: $x=0$ and $y=1$. In this case Alice does nothing while Bob rotates his qubit by an angle of $-\pi/8$. One can calculate that in the resulting state, Alice's and Bob's measurements agree with probability $\cos^2(\pi/8) \geq 0.85$.
The analysis for Case 3, where $x=1$ and $y=0$, is identical by symmetry: Alice rotates her qubit by $\pi/8$ while Bob does nothing, and again they agree with probability $\cos^2(\pi/8)$.
Case 4: $x=1$ and $y=1$. In this case Alice rotates her qubit by $\pi/8$ and Bob rotates his qubit by $-\pi/8$.
Intuitively, since we rotate one state by 45 degrees and the other state by -45 degrees, they will become orthogonal to each other, and the measurements will behave like independent coin tosses that agree with probability 1/2. However, for the sake of completeness, we now show the full calculation.
Opening up the coefficients and using the identities $\cos(-\theta)=\cos(\theta)$ and $\sin(-\theta)=-\sin(\theta)$, we can see that the resulting state is proportional to
$$
\begin{aligned}
\cos^2(\pi/8)|00 \rangle &+ \cos(\pi/8)\sin(\pi/8)|01 \rangle \\
- \sin(\pi/8)\cos(\pi/8)|10\rangle &+ \sin^2(\pi/8)|11 \rangle \\
- \sin^2(\pi/8)|00 \rangle &+ \sin(\pi/8)\cos(\pi/8)|01 \rangle \\
- \cos(\pi/8)\sin(\pi/8)|10\rangle &+ \cos^2(\pi/8)|11 \rangle \;.
\end{aligned}
$$
Using the trigonometric identities $2\sin\alpha\cos\alpha = \sin(2\alpha)$ and $\cos^2\alpha - \sin^2\alpha = \cos(2\alpha)$, together with $\cos(\pi/4)=\sin(\pi/4)=\tfrac{1}{\sqrt{2}}$, one can verify that in the resulting state each of the four outcomes $00,01,10,11$ is measured with probability $1/4$, and hence Alice's and Bob's outputs disagree with probability $1/2$.
Taking all the four cases together, the overall probability of winning the game is at least $\tfrac{1}{4}\cdot 1 + \tfrac{1}{2}\cdot \cos^2(\pi/8) + \tfrac{1}{4}\cdot \tfrac{1}{2} \geq 0.8$.
:::
It is instructive to understand what it is about quantum mechanics that enables this gain in Bell's Inequality. For this, consider the following analogous probabilistic strategy for Alice and Bob. They agree that each one of them outputs $0$ if he or she gets $0$ as input, and outputs $1$ with probability $p$ if they get $1$ as input. In this case their success probability is $\tfrac{1}{4}\cdot 1 + \tfrac{1}{2}(1-p) + \tfrac{1}{4}\cdot 2p(1-p)$, which is maximized at $p=0$, where it equals $3/4$. The quantum strategy can do better because amplitudes, unlike probabilities, can be negative and cancel each other.
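The following sketch makes this comparison concrete: it computes the success probability of a probabilistic strategy of the above form for every $p$, and compares it to the quantum strategy, using the (standard) fact that measuring the two qubits of an EPR pair in bases rotated by angles $a$ and $b$ yields agreeing outcomes with probability $\cos^2(a-b)$.

```python
import math

def classical_success(p):
    # Each party outputs 0 on input 0, and 1 with probability p on input 1.
    # Cases weighted 1/4 each: (0,0) always agree; (0,1) and (1,0) agree
    # when the party with input 1 outputs 0; (1,1) should disagree.
    return 0.25 * 1 + 0.5 * (1 - p) + 0.25 * (2 * p * (1 - p))

def quantum_success():
    # Agreement probability when measuring an EPR pair in bases rotated
    # by angles a and b is cos^2(a - b).
    agree = lambda a, b: math.cos(a - b) ** 2
    t = math.pi / 8
    return 0.25 * (agree(0, 0)            # x=y=0: want agreement
                   + agree(t, 0)          # x=1, y=0: want agreement
                   + agree(0, -t)         # x=0, y=1: want agreement
                   + (1 - agree(t, -t)))  # x=y=1: want disagreement

best_classical = max(classical_success(p / 100) for p in range(101))
print(best_classical, quantum_success())  # 0.75 vs roughly 0.8018
```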
Recall that in the classical setting, we modeled computation as obtained by a sequence of basic operations. We had two types of computational models:
- Non uniform models of computation such as Boolean circuits and NAND-CIRC programs, where a finite function $f:\{0,1\}^n \rightarrow \{0,1\}$ is computable in size $T$ if it can be expressed as a combination of $T$ basic operations (gates in a circuit or lines in a NAND-CIRC program).

- Uniform models of computation such as Turing machines and NAND-TM programs, where an infinite function $F:\{0,1\}^* \rightarrow \{0,1\}$ is computable in time $T(n)$ if there is a single algorithm that on input $x\in \{0,1\}^n$ evaluates $F(x)$ using at most $T(n)$ basic steps.
When considering efficient computation, we defined the class $\mathbf{P}$ to consist of the Boolean functions computable in polynomial time, and the class $\mathbf{BPP}$ of functions computable in probabilistic polynomial time.
We will do the same for quantum computation, focusing mostly on the non uniform setting of quantum circuits, since that is simpler, and already illustrates the important differences with classical computing.
A quantum circuit is analogous to a Boolean circuit, and can be described as a directed acyclic graph.
One crucial difference is that the out-degree of every vertex in a quantum circuit is at most one.
This is because we cannot "reuse" quantum states without measuring them (which collapses their "probabilities").
Therefore, we cannot use the same qubit as input for two different gates.
(This is known as the No Cloning Theorem.)
Another more technical difference is that to express our operations as unitary matrices, we will need to make sure all our gates are reversible.
This is not hard to ensure.
For example, in the quantum context, instead of thinking of $NAND$ as a (non reversible) map from $\{0,1\}^2$ to $\{0,1\}$, we think of the reversible function $U_{NAND}:\{0,1\}^3 \rightarrow \{0,1\}^3$ that maps $(a,b,c)$ to $(a,b,c \oplus NAND(a,b))$.
If we have an $m$-qubit system, then for every $i,j,k \in [m]$, we denote by $U_{NAND}^{i,j,k}$ the unitary operation that applies this map to the qubits $i,j,k$ and leaves all other qubits intact.
As mentioned above, we will also use the Hadamard or $HAD$ operation: for every $i\in [m]$, $HAD^i$ applies the matrix $\tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ to the $i$-th qubit and leaves the other qubits intact.
A quantum circuit is obtained by composing these basic operations on some $m \geq n$ qubits; it computes a function $f:\{0,1\}^n \rightarrow \{0,1\}$ based on the following process:
- On input $x$, we initialize the system to hold $x_0,\ldots,x_{n-1}$ in the first $n$ qubits, and initialize all remaining $m-n$ qubits to zero.

- We execute each elementary operation one by one: at every step we apply to the current state either an operation of the form $U_{NAND}^{i,j,k}$ or an operation of the form $HAD^i$ for $i,j,k\in [m]$.

- At the end of the computation, we measure the system, and output the result of the last qubit (i.e., the qubit in location $m-1$). (For simplicity we restrict attention to functions with a single bit of output, though the definition of quantum circuits naturally extends to circuits with multiple outputs.)

- We say that the circuit computes the function $f$ if the probability that this output equals $f(x)$ is at least $2/3$. Note that this probability is obtained by summing up the squares of the amplitudes of all coordinates in the final state of the system corresponding to vectors $|y \rangle$ where $y_{m-1}=f(x)$.
Formally we define quantum circuits as follows:
::: {.definition title="Quantum circuit" #quantumcircuitdef}
Let $n \leq m$ be natural numbers. A quantum circuit of $n$ inputs and $m$ qubits is a sequence of $T$ operations $U_1,\ldots,U_T$, where each operation is either of the form $U_{NAND}^{i,j,k}$ for $i,j,k \in [m]$ or of the form $HAD^i$ for $i\in [m]$.

A quantum circuit computes a function $f:\{0,1\}^n \rightarrow \{0,1\}$ if the following holds for every $x\in \{0,1\}^n$: if we initialize the system to the state $|x 0^{m-n}\rangle$, apply the operations $U_1,\ldots,U_T$ one after the other, and then measure the resulting state, then the probability that the $(m-1)$-th bit of the measured value equals $f(x)$ is at least $2/3$.
:::
::: { .pause } Please stop here and see that this definition makes sense to you. :::
::: { .bigidea #quantumdefine} Just as we did with classical computation, we can define mathematical models for quantum computation, and represent quantum algorithms as binary strings. :::
Once we have the notion of quantum circuits, we can define the quantum analog of the class $\mathbf{P_{/poly}}$ (that is, the class of functions computable by polynomial-sized quantum circuits) as follows:

::: {.definition title="$\mathbf{BQP_{/poly}}$" #BQPpoly}
Let $F:\{0,1\}^* \rightarrow \{0,1\}$. We say that $F \in \mathbf{BQP_{/poly}}$ if there exists some polynomial $p:\N \rightarrow \N$ such that for every $n\in \N$, there is a quantum circuit of size at most $p(n)$ that computes the restriction of $F$ to inputs of length $n$.
:::
::: {.remark title="The obviously exponential fallacy" #exponential}
A priori it might seem "obvious" that quantum computing is exponentially powerful, since to perform a quantum computation on $n$ bits we need to keep track of the $2^n$-dimensional vector of amplitudes, and apply $2^n \times 2^n$ matrices to it.
Depending on how you interpret it, this description is either false or would apply equally well to probabilistic computation, even though we've already seen that every randomized algorithm can be simulated by a similar-sized circuit, and in fact we conjecture that $\mathbf{BPP}=\mathbf{P}$.
Moreover, this "obvious" approach for simulating a quantum computation will take not just exponential time but exponential space as well, while it can be shown that using a simple recursive formula one can calculate the final quantum state using polynomial space (in physics this is known as "Feynman path integrals"). So, the exponentially long vector description by itself does not imply that quantum computers are exponentially powerful. Indeed, we cannot prove that they are (i.e., we have not been able to rule out the possibility that every QNAND-CIRC program could be simulated by a NAND-CIRC program / Boolean circuit with polynomial overhead), but we do have some problems (integer factoring most prominently) for which they do provide exponential speedup over the currently best known classical (deterministic or probabilistic) algorithms. :::
Just like in the classical case, there is an equivalence between circuits and straight-line programs, and so we can define the programming language QNAND-CIRC that is the quantum analog of our NAND-CIRC programming language.
To do so, we only add a single operation: `HAD(foo)`, which applies the single-qubit operation $H$ to the variable `foo`.
We also use the following interpretation to make `NAND` reversible: `foo = NAND(bar, blah)` means that we modify `foo` to be the XOR of its original value and the NAND of `bar` and `blah`.
(In other words, we apply the $U_{NAND}$ operation described above to the three qubits corresponding to `foo`, `bar` and `blah`.)
If `foo` is initialized to zero then this makes no difference.
If $P$ is a QNAND-CIRC program with $n$ input variables and $m$ output variables, then running it on input $x\in \{0,1\}^n$ corresponds to the following process:
- We initialize the input variables `X[0]`, $\ldots$, `X[n-1]` to $x_0,\ldots,x_{n-1}$ and all other variables to $0$.

- We execute the program line by line, applying the corresponding physical operation $H$ or $U_{NAND}$ to the qubits that are referred to by the line.

- We measure the output variables `Y[0]`, $\ldots$, `Y[m-1]` and output the result (if there is more than one output then we measure more variables).
Just as in the classical case, we can define uniform computational models for quantum computing as well.
We will let $\mathbf{BQP}$ denote the class of infinite functions that can be computed by quantum algorithms in polynomial time, defined as follows:

::: {.definition title="The class $\mathbf{BQP}$" #BQPdef}
Let $F:\{0,1\}^* \rightarrow \{0,1\}$. We say that $F\in \mathbf{BQP}$ if there exists a polynomial-time classical algorithm that on input $1^n$ outputs a description of a quantum circuit $C_n$ that computes the restriction of $F$ to inputs of length $n$.
:::
BQPdef{.ref} is the quantum analog of the alternative characterization of $\mathbf{P}$ via uniformly generated circuit families.
The relation between $\mathbf{BQP}$ and the classical complexity classes we have seen is still not fully understood.
It can be shown that $\mathbf{BPP} \subseteq \mathbf{BQP} \subseteq \mathbf{PSPACE} \subseteq \mathbf{EXP}$, but we do not know whether any of these containments is strict; in particular, we cannot rule out that $\mathbf{BPP}=\mathbf{BQP}$, nor do we know how $\mathbf{BQP}$ compares with $\mathbf{NP}$.
::: {.remark title="Restricting attention to circuits" #quantumnonuniformrem} Because the non uniform model is a little cleaner to work with, in the rest of this chapter we mostly restrict attention to this model, though all the algorithms we discuss can be implemented using uniform algorithms as well. :::
To realize quantum computation one needs to create a system with $n$ independent controllable qubits, on which one can apply unitary operations and measurements, while keeping the system isolated enough from its environment that its state does not "decohere" into a classical one.
There have been several proposals to build quantum computers:
- Superconducting quantum computers use superconducting electric circuits to do quantum computation. This is the direction where there has been the most recent progress towards "beating" classical computers.

- Trapped ion quantum computers use the states of an ion to simulate a qubit. People have made some recent advances on these computers too. While it's not at all clear that's the right measuring yard, the current best implementation of Shor's algorithm (for factoring 15) was done using an ion-trap computer.

- Topological quantum computers use a different technology. Topological qubits are more stable by design and hence error correction is less of an issue, but constructing them is extremely challenging.
These approaches are not mutually exclusive and it could be that ultimately quantum computers are built by combining all of them together. In the near future, it seems that we will not be able to achieve full-fledged large scale universal quantum computers, but rather more restricted machines, sometimes called "Noisy Intermediate-Scale Quantum Computers" or "NISQ". See this article by John Preskill for some of the progress and applications of such more restricted devices.
Bell's Inequality is a powerful demonstration that there is something very strange going on with quantum mechanics. But could this "strangeness" be of any use to solve computational problems not directly related to quantum systems? A priori, one could guess the answer is no. In 1994 Peter Shor showed that one would be wrong:
::: {.theorem title="Shor's Algorithm" #shorthm}
There is a polynomial-time quantum algorithm that on input an integer $M$ (represented in binary) outputs the prime factorization of $M$.
:::
Another way to state shorthm{.ref} is that if we define $FACTORING:\{0,1\}^* \rightarrow \{0,1\}$ to be the function that on input a pair of numbers $(M,X)$ outputs $1$ if and only if $M$ has a factor $P$ with $2 \leq P \leq X$, then $FACTORING \in \mathbf{BQP}$.
At the heart of Shor's Theorem is an efficient quantum algorithm for finding periods of a given function.
For example, a function $f:\Z \rightarrow \R$ is periodic with period $p$ if $f(x+p)=f(x)$ for every $x$.
Musical notes yield one type of periodic function.
When you pull on a string on a musical instrument, it vibrates in a repeating pattern.
Hence, if we plot the speed of the string (and so also the speed of the air around it) as a function of time, it will correspond to some periodic function.
The length of the period is known as the wave length of the note.
The frequency is the number of times the function repeats itself within a unit of time.
For example, the "Middle C" note has a frequency of about $261.63$ Hertz.
If we play a chord by playing several notes at once, we get a more complex periodic function obtained by combining the functions of the individual notes (see timefreqfig{.ref}). The human ear contains many small hairs, each of which is sensitive to a narrow band of frequencies. Hence when we hear the sound corresponding to a chord, the hairs in our ears actually separate it out to the components corresponding to each frequency.
It turns out that (essentially) every periodic function can be decomposed into a sum of simple periodic functions (namely sines and cosines) of various frequencies; this decomposition is known as the Fourier transform of the function.
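We can illustrate this decomposition with a small sketch: a "chord" built from two sine waves, whose two frequencies we then recover from the (naively computed) Fourier coefficients. The sample count and the two frequencies are arbitrary choices for the example.

```python
import math

# A "chord": a strong component at frequency 5 plus a weaker one at 12.
N = 256
samples = [math.sin(2 * math.pi * 5 * t / N)
           + 0.5 * math.sin(2 * math.pi * 12 * t / N)
           for t in range(N)]

def coefficient(j):
    # Magnitude of the j-th Fourier coefficient, computed directly from
    # the definition (a "slow" discrete Fourier transform).
    re = sum(samples[t] * math.cos(2 * math.pi * j * t / N) for t in range(N))
    im = sum(samples[t] * math.sin(2 * math.pi * j * t / N) for t in range(N))
    return math.hypot(re, im) / N

magnitudes = [coefficient(j) for j in range(N // 2)]
peaks = sorted(range(N // 2), key=lambda j: -magnitudes[j])[:2]
print(sorted(peaks))  # the two "notes" of the chord: [5, 12]
```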
On input an integer $M$, Shor's algorithm finds its prime factorization in time polynomial in $\log M$ (i.e., polynomial in the number of digits of $M$).
Step 1: Reduce to period finding. The first step in the algorithm is to pick a random $A \in \{1,\ldots,M-1\}$ and define the function $f_A(x) = A^x \bmod M$.
Some not-too-hard (though somewhat technical) calculations show that: (1) the function $f_A$ is periodic, and (2) if we can find the period of $f_A$ for a few randomly chosen $A$'s, then with high probability we can recover the prime factorization of $M$.
Step 2: Period finding via the Quantum Fourier Transform.
Using a simple trick known as "repeated squaring", it is possible to compute the map $x \mapsto f_A(x)$ in time polynomial in the bit length of $x$, and hence also to prepare, using a polynomial number of quantum gates, a state proportional to $\sum_x |x\rangle|f_A(x)\rangle$.
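The repeated squaring trick itself is purely classical, and can be sketched in a few lines (Python's built-in `pow(A, x, M)` uses the same idea):

```python
def power_mod(A, x, M):
    # Compute A**x mod M with about log2(x) multiplications by repeatedly
    # squaring, instead of the x - 1 multiplications of the naive method.
    result = 1
    base = A % M
    while x > 0:
        if x & 1:                 # current low-order bit of the exponent
            result = (result * base) % M
        base = (base * base) % M  # square the base for the next bit
        x >>= 1
    return result

print(power_mod(7, 2 ** 20 + 3, 15), pow(7, 2 ** 20 + 3, 15))  # both agree
```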
In particular, if we were to measure the state $\sum_x |x\rangle|f_A(x)\rangle$, we would simply obtain a random pair $(x,f_A(x))$; this by itself is not very useful, since we could have obtained such a pair classically by choosing $x$ at random.
Another way to describe the state $\sum_x |x\rangle|f_A(x)\rangle$ is that for every $y$ in the image of $f_A$, the amplitudes of the components whose second register equals $y$ encode the function $f_{A,y}$ that equals $1$ on exactly those $x$'s with $f_A(x)=y$; crucially, each such $f_{A,y}$ is periodic with the same period as $f_A$.
The magic of Shor's algorithm comes from a procedure known as the Quantum Fourier Transform. It allows us to change the state of the system so that the amplitudes encode the Fourier transform of the original function; measuring the transformed state then yields (with good probability) a value corresponding to one of the significant Fourier coefficients, from which the period can be recovered.
As mentioned above, once we have found the periods of $f_A$ for a few random $A$'s, we can recover the factorization of $M$ using classical number theory.
The resulting algorithm can be described in a high (and somewhat inaccurate) level as follows:
::: {.quote} Shor's Algorithm: (sketch)
Input: Number $M$ to factor.
Output: Prime factorization of
Operations:
- Repeat the following $k=poly(\log M)$ number of times:

  a. Choose $A \in \{0,\ldots,M-1\}$ at random, and let $f_A:\Z_M \rightarrow \Z_M$ be the map $x \mapsto A^x \bmod M$.

  b. For $t=poly(\log M)$, repeat $t$ times the following step: use the Quantum Fourier Transform to create a quantum state $| \psi \rangle$ over $poly(\log M)$ qubits, such that if we measure $| \psi \rangle$ we obtain a pair of strings $(j,y)$ with probability proportional to the square of the coefficient corresponding to the wave function $x \mapsto \cos(x \pi j/M)$ or $x \mapsto \sin(x \pi j/M)$ in the Fourier transform of the function $f_{A,y}:\Z_M \rightarrow \{0,1\}$ defined as $f_{A,y}(x)=1$ iff $f_A(x)=y$.

  c. If $j_1,\ldots,j_t$ are the coefficients we obtained in the previous step, then the least common multiple of $M/j_1,\ldots,M/j_t$ is likely to be the period of the function $f_A$.

- If we let $A_0,\ldots,A_{k-1}$ and $p_0,\ldots,p_{k-1}$ be the numbers we chose in the previous step and the corresponding periods of the functions $f_{A_0},\ldots,f_{A_{k-1}}$, then we can use classical results in number theory to obtain from these a non-trivial prime factor $Q$ of $M$ (if such exists). We can now run the algorithm again with the (smaller) input $M/Q$ to obtain all other factors.
:::
Reducing factoring to period finding is cumbersome, but can be done in polynomial time using a classical computer. The key quantum ingredient in Shor's algorithm is the Quantum Fourier Transform.
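To make the classical part of this reduction concrete, here is a toy Python sketch (the function names are our own, and brute-force period finding stands in for the quantum step, so this is only feasible for tiny $M$): given the period $r$ of $x \mapsto A^x \bmod M$, when $r$ is even and $A^{r/2} \not\equiv -1 \pmod{M}$, the value $\gcd(A^{r/2}-1, M)$ is a non-trivial factor of $M$.

```python
from math import gcd

def find_period(A, M):
    """Brute-force period of x -> A^x mod M; this is the step the quantum
    algorithm performs fast (assumes gcd(A, M) = 1 so a period exists)."""
    x, val = 1, A % M
    while val != 1:
        val = (val * A) % M
        x += 1
    return x

def factor_from_period(A, M):
    """Try to extract a non-trivial factor of M from the period of A mod M."""
    r = find_period(A, M)
    if r % 2 == 1:
        return None                # odd period: this choice of A fails, pick another
    y = pow(A, r // 2, M)
    if y == M - 1:
        return None                # A^(r/2) = -1 mod M: also a failure case
    f = gcd(y - 1, M)
    return f if 1 < f < M else None
```

For example, for $M=15$ and $A=7$ the period is $4$, and $\gcd(7^2 - 1, 15) = \gcd(48, 15) = 3$, a non-trivial factor.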
::: {.remark title="Quantum Fourier Transform" #QFT}
Despite its name, the Quantum Fourier Transform does not actually give a way to compute the Fourier Transform of a function
The above description of Shor's algorithm skipped over the implementation of the main quantum ingredient: the Quantum Fourier Transform algorithm. In this section we discuss the ideas behind this algorithm. We will be rather brief and imprecise. quantumsources{.ref} and quantumbibnotessec{.ref} contain references to sources of more information about this topic.
To understand the Quantum Fourier Transform, we need to better understand the Fourier Transform itself. In particular, we will need to understand how it applies not just to functions whose input is a real number but to functions whose domain can be any arbitrary commutative group. Therefore we now take a short detour to (very basic) group theory, and define the notion of periodic functions over groups.
::: {.remark title="Group theory" #grouptheorem}
While we define the concepts we use, some background in group or number theory might be quite helpful for fully understanding this section.
We will not use anything more than the basic properties of finite Abelian groups. Specifically we use the following notions:

- A finite group $\mathbb{G}$ can be thought of as simply a set of elements and some binary operation $\star$ on these elements (i.e., if $g,h \in \mathbb{G}$ then $g \star h$ is an element of $\mathbb{G}$ as well).

- The operation $\star$ satisfies the sort of properties that a product operation does, namely, it is associative (i.e., $(g \star h)\star f = g \star (h \star f)$) and there is some element $1$ such that $g \star 1 = g$ for all $g$, and for every $g\in \mathbb{G}$ there exists an element $g^{-1}$ such that $g \star g^{-1} = 1$.

- A group is called commutative (also known as Abelian) if $g \star h = h \star g$ for all $g,h \in \mathbb{G}$.
:::
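As a minimal concrete example, the following Python snippet checks that $\mathbb{Z}_6$ with addition modulo $6$ satisfies all of the axioms above (with $0$ playing the role of the identity element in additive notation):

```python
m = 6
G = list(range(m))
star = lambda g, h: (g + h) % m   # the group operation on Z_6

# associativity and commutativity
assert all(star(star(g, h), f) == star(g, star(h, f))
           for g in G for h in G for f in G)
assert all(star(g, h) == star(h, g) for g in G for h in G)

# identity element (0 in additive notation) and existence of inverses
assert all(star(g, 0) == g for g in G)
assert all(any(star(g, h) == 0 for h in G) for g in G)
```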
The Fourier transform is a deep and vast topic, which we will barely touch upon here.
Over the real numbers, the Fourier transform of a function
where the
We now describe the simplest setting of the Quantum Fourier Transform: the group
where
The Quantum Fourier Transform over
::: {.theorem title="QFT Over the Boolean Cube" #QFTcube}
Let
where
The idea behind the proof is that the Hadamard operation corresponds to the Fourier transform over the group
::: {.proof data-ref="QFTcube"}
We can express the Hadamard operation
We are given the state
Now suppose that we apply the
We can now use the distributive law and open up a term of the form
to the following sum over
(If you find the above confusing, try to work out explicitly this calculation for
By changing the order of summations, we see that the final state is
which exactly corresponds to
:::
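The computation above can also be sanity-checked numerically for small $n$. The following sketch (using NumPy; a verification, not part of the proof) checks that the $n$-fold tensor product of Hadamard matrices has entries $(-1)^{x \cdot y}/2^{n/2}$, where $x \cdot y$ denotes the inner product modulo $2$ of the bit representations of $x$ and $y$:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # single-qubit Hadamard

n = 3
Hn = np.array([[1.0]])
for _ in range(n):
    Hn = np.kron(Hn, H)                        # H tensored with itself n times

def dot_mod2(x, y):
    """Inner product mod 2 of the bit representations of x and y."""
    return bin(x & y).count("1") % 2

expected = np.array([[(-1) ** dot_mod2(x, y) for x in range(2 ** n)]
                     for y in range(2 ** n)]) / 2 ** (n / 2)
assert np.allclose(Hn, expected)
```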
Using QFTcube{.ref} it is not hard to get an algorithm that can recover a string
So, by measuring the state, we can obtain a sample of a random
This result is known as Simon's Algorithm, and it preceded and inspired Shor's algorithm.
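The classical post-processing in Simon's algorithm boils down to linear algebra modulo 2: each measurement yields a random $y$ orthogonal to the hidden string $h$, and enough samples pin $h$ down. Here is a toy sketch (brute force for clarity; the real algorithm uses Gaussian elimination over $GF(2)$, and the sample values below are hypothetical):

```python
def find_hidden_shift(samples, n):
    """Return a nonzero n-bit h orthogonal (mod 2) to every sample, if one exists.

    Brute force over all of {0,1}^n for clarity; Gaussian elimination over
    GF(2) does the same job in polynomial time.
    """
    for h in range(1, 2 ** n):
        if all(bin(y & h).count("1") % 2 == 0 for y in samples):
            return h
    return None

# Hypothetical measurement outcomes for hidden string h = 0b101 with n = 3:
# every measured y must satisfy y . h = 0 (mod 2).
samples = [0b010, 0b101, 0b111]
```

Running `find_hidden_shift(samples, 3)` recovers `0b101`, the only nonzero string orthogonal to all three samples.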
QFTcube{.ref} seemed to really use the special bit-wise structure of the group
The key step in Shor's algorithm is to implement the Fourier transform for the group
The key to implementing the Quantum Fourier Transform for such groups is to use the same recursive equations that enable the classical Fast Fourier Transform (FFT) algorithm.
Specifically, consider the case that
which reduces computing the Fourier transform of
Specifically, the Fourier characters of the group
This observation is usually used to obtain a fast (e.g., $O(L \log L)$ time) classical algorithm for computing the Fourier transform, but it can also be used to obtain a quantum circuit of
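For comparison, the classical even/odd recursion that the FFT exploits (and that the quantum circuit mirrors) can be sketched as follows, for lengths $L$ that are powers of two:

```python
import cmath

def fft(coeffs):
    """Recursive FFT: evaluate the polynomial with these coefficients at the
    L-th roots of unity, by splitting into even- and odd-index coefficients."""
    L = len(coeffs)
    if L == 1:
        return coeffs[:]
    even = fft(coeffs[0::2])      # transform of the even-index coefficients
    odd = fft(coeffs[1::2])       # transform of the odd-index coefficients
    out = [0] * L
    for j in range(L // 2):
        w = cmath.exp(-2j * cmath.pi * j / L)   # "twiddle factor"
        out[j] = even[j] + w * odd[j]           # combine the two half-size
        out[j + L // 2] = even[j] - w * odd[j]  # transforms into the full one
    return out
```

The two recursive calls on half-size inputs plus $O(L)$ combination work give the $O(L \log L)$ running time.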
The case that
- The state of an $n$-qubit quantum system can be modeled as a $2^n$ dimensional vector.

- An operation on the state corresponds to applying a unitary matrix to this vector.

- Quantum circuits are obtained by composing basic operations such as $HAD$ and $U_{NAND}$.

- We can use quantum circuits to define the classes $\mathbf{BQP_{/poly}}$ and $\mathbf{BQP}$, which are the quantum analogs of $\mathbf{P_{/poly}}$ and $\mathbf{BPP}$ respectively.

- There are some problems for which the best known quantum algorithm is exponentially faster than the best known classical algorithm, but quantum computing is not a panacea. In particular, as far as we know, quantum computers could still require exponential time to solve $\mathbf{NP}$-complete problems such as $SAT$.
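The first three points above can be illustrated directly in code. Below is a minimal (and deliberately inefficient) NumPy sketch, with function names of our own choosing, that stores the full $2^n$-dimensional state vector and applies a single-qubit unitary such as $HAD$ to a chosen qubit:

```python
import numpy as np

def apply_one_qubit(state, U, i, n):
    """Apply the 2x2 unitary U to qubit i of an n-qubit state vector of length 2^n."""
    psi = state.reshape([2] * n)                  # view as an n-dimensional 2x...x2 array
    psi = np.tensordot(U, psi, axes=([1], [i]))   # contract U against axis i
    psi = np.moveaxis(psi, 0, i)                  # restore the original axis order
    return psi.reshape(-1)

HAD = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

n = 2
state = np.zeros(2 ** n)
state[0] = 1.0                     # the basis state |00>
for i in range(n):
    state = apply_one_qubit(state, HAD, i, n)
# applying HAD to every qubit of |00> yields the uniform superposition
```

Note that the simulator manipulates a vector of length $2^n$, which is exactly why such classical simulation takes exponential time in $n$.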
::: {.exercise title="Quantum and classical complexity class relations" #BQPcontainements}
Prove the following relations between quantum complexity classes and classical ones:

1. $\mathbf{P_{/poly}} \subseteq \mathbf{BQP_{/poly}}$. See footnote for hint.^[You can use $U_{NAND}$ to simulate NAND gates.]

2. $\mathbf{P} \subseteq \mathbf{BQP}$. See footnote for hint.^[Use the alternative characterization of $\mathbf{P}$ as in Palternativeex{.ref}.]

3. $\mathbf{BPP} \subseteq \mathbf{BQP}$. See footnote for hint.^[You can use the $HAD$ gate to simulate a coin toss.]

4. $\mathbf{BQP} \subseteq \mathbf{EXP}$. See footnote for hint.^[In exponential time, simulating quantum computation boils down to matrix multiplication.]

5. If $SAT \in \mathbf{BQP}$ then $\mathbf{NP} \subseteq \mathbf{BQP}$. See footnote for hint.^[If a reduction can be implemented in $\mathbf{P}$, it can be implemented in $\mathbf{BQP}$ as well.]
:::
::: {.exercise title="Discrete logarithm from order finding" #dlogfromorder}
Show a probabilistic polynomial time classical algorithm that given an Abelian finite group
An excellent gentle introduction to quantum computation is given in Mermin's book [@mermin2007quantum]. In particular the first 100 pages (Chapters 1 to 4) of [@mermin2007quantum] cover all the material of this chapter in a much more comprehensive way. This material is also covered in the first 5 chapters of de Wolf's online lecture notes. For a more condensed exposition, the chapter on quantum computation in my book with Arora (see draft here) is one relatively short source that contains full descriptions of Grover's, Simon's and Shor's algorithms. This blog post of Aaronson contains a high level explanation of Shor's algorithm which ends with links to several more detailed expositions. Chapters 9 and 10 in Aaronson's book [@Aaronson13democritus] give an informal but highly informative introduction to the topics of this chapter and much more. Chapter 10 in Avi Wigderson's book also provides a high level overview of quantum computing. Other recommended resources include Andrew Childs' lecture notes on quantum algorithms, as well as the lecture notes of Umesh Vazirani, John Preskill, and John Watrous.
There are many excellent videos available online covering some of these materials. The videos of Umesh Vazirani's EdX course are an accessible and recommended introduction to quantum computing. Regarding quantum mechanics in general, this video illustrates the double slit experiment, and this Scientific American video is a nice exposition of Bell's Theorem. This talk and panel moderated by Brian Greene discusses some of the philosophical and technical issues around quantum mechanics and its so-called "measurement problem". The Feynman lectures on the Fourier Transform and on quantum mechanics in general are very much worth reading. The Fourier transform is covered in these videos of Dr. Chris Geoscience, Clare Zhang and Vi Hart. See also Kelsey Houston-Edwards's video on Shor's Algorithm.
The form of Bell's game we discuss in bellineqsec{.ref} was given by Clauser, Horne, Shimony, and Holt.
The Fast Fourier Transform, used as a component in Shor's algorithm, is one of the most widely used algorithms invented. The stories of its discovery by Gauss in trying to calculate asteroid orbits and rediscovery by Tukey during the cold war are fascinating as well.
The image in doubleslitfig{.ref} is taken from Wikipedia.
Thanks to Scott Aaronson for many helpful comments about this chapter.
Footnotes

1. More accurately, one either has to give up on a "billiard ball type" theory of the universe or believe in telepathy (believe it or not, some scientists went for the latter option).

2. As its title suggests, Feynman's lecture was actually focused on the other side of simulating physics with a computer. However, he mentioned that as a "side remark" one could wonder if it's possible to simulate physics with a new kind of computer - a "quantum computer" which would "not [be] a Turing machine, but a machine of a different kind". As far as I know, Feynman did not suggest that such a computer could be useful for computations completely outside the domain of quantum simulation. Indeed, he was more interested in the question of whether quantum mechanics could be simulated by a classical computer.

3. This "95 percent" is a figure of speech, but not completely so. At the time of this writing, cryptocurrency mining is estimated to consume at least 70TWh of electricity, or 0.3 percent of the world's production, which amounts to about 2 to 5 percent of the total energy usage of the computing industry. All current cryptocurrencies would be broken by quantum computers. Also, for many web servers the TLS protocol (which is based on current non-lattice-based systems that would be completely broken by quantum computing) is responsible for about 1 percent of the CPU usage.

4. Of course, given that we're still hearing of attacks exploiting "export grade" cryptography that was supposed to disappear in the 1990's, I imagine that we'll still have products running 1024 bit RSA when everyone has a quantum laptop.