title | filename | chapternum |
---|---|---|
Code as Data, Data as Code | lec_04_code_and_data | 5 |
- See one of the most important concepts in computing: the duality between code and data. \
- Build up comfort in moving between different representations of programs. \
- Follow the construction of a "universal circuit evaluator" that can evaluate other circuits given their representation. \
- See a major result that complements the result of the last chapter: some functions require an exponential number of gates to compute. \
- Discuss the physical extended Church-Turing thesis, which states that Boolean circuits capture all feasible computation in the physical world, and its physical and philosophical implications.
"The term code script is, of course, too narrow. The chromosomal structures are at the same time instrumental in bringing about the development they foreshadow. They are law-code and executive power - or, to use another simile, they are architect's plan and builder's craft - in one.", Erwin Schrödinger, 1944.
"A mathematician would hardly call a correspondence between the set of 64 triples of four units and a set of twenty other units "universal", while such correspondence is, probably, the most fundamental general feature of life on Earth.", Misha Gromov, 2013.
A program is simply a sequence of symbols, each of which can be encoded as a binary string (for example, using the ASCII encoding). Hence we can think of a program as simply a piece of data.
::: { .bigidea #programisinput } A program is a piece of text, and so it can be fed as input to other programs. :::
This correspondence between code and data is one of the most fundamental aspects of computing. It underlies the notion of general-purpose computers, which are not pre-wired to compute only one task, and also forms the basis of our hope for obtaining general artificial intelligence. This concept finds immense use in all areas of computing, from scripting languages to machine learning, but it is fair to say that we haven't yet fully mastered it. Many security exploits involve cases such as "buffer overflows", where attackers manage to inject code where the system expected only "passive" data (see XKCDmomexploitsfig{.ref}). The relation between code and data reaches beyond the realm of electronic computers. For example, DNA can be thought of as both a program and data (in the words of Schrödinger, who wrote before DNA's discovery a book that inspired Watson and Crick, DNA is both "architect's plan and builder's craft").
{#XKCDmomexploitsfig .margin }
In this chapter, we will begin to explore some of the applications of this connection.
We start by using the representation of programs/circuits as strings to count the number of programs/circuits up to a certain size, and use that to obtain a counterpart to the result we proved in finiteuniversalchap{.ref}.
There we proved that every function can be computed by some circuit of sufficiently large size; here we will see that some functions can only be computed by circuits of exponential size.
We can represent programs or circuits as strings in a myriad of ways.
For example, since Boolean circuits are labeled directed acyclic graphs, we can use the adjacency matrix or adjacency list representations for them.
However, since the code of a program is ultimately just a sequence of letters and symbols, arguably the conceptually simplest representation of a program is as such a sequence.
For example, the following NAND-CIRC program
temp_0 = NAND(X[0],X[1])
temp_1 = NAND(X[0],temp_0)
temp_2 = NAND(X[1],temp_0)
Y[0] = NAND(temp_1,temp_2)
is simply a string of 107 symbols, which include lower and upper case letters, digits, the underscore character _, the equality sign =, punctuation marks such as "(", ")", and ",", spaces, and "new line" markers (often denoted as "\n" or "↵").
Each such symbol can be encoded as a string of seven bits using the ASCII encoding, and hence the program above can be represented as a binary string.
Nothing in the above discussion was specific to this particular program. Since an $s$-line program involves at most $3s$ distinct variables, we can always rename its variables to have the form temp_0, temp_1, temp_2, etc., with labels of $O(\log s)$ digits. Hence every line of the form foo = NAND(bar,blah) can be represented using $O(\log s)$ symbols, and every $s$-line NAND-CIRC program can be represented using $O(s \log s)$ characters (and hence $O(s \log s)$ bits).
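To make the code-as-data point concrete, here is a small Python sketch (Python is also the language used later in this chapter) that treats the XOR program above as an ordinary string; the variable names are my own:

```python
# The NAND-CIRC program above is just a piece of text: a Python string.
xor_code = """temp_0 = NAND(X[0],X[1])
temp_1 = NAND(X[0],temp_0)
temp_2 = NAND(X[1],temp_0)
Y[0] = NAND(temp_1,temp_2)"""

lines = xor_code.splitlines()       # four lines, one per NAND operation
alphabet = sorted(set(xor_code))    # the symbols used: letters, digits, ( ) [ ] , = _ etc.
```

Being a string, xor_code can be stored in a file, sent over a network, or, as we will see shortly, fed as input to another program.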
::: {.theorem title="Representing programs as strings" #asciirepprogramthm}
There is a constant $c$ such that every NAND-CIRC program of at most $s$ lines can be represented by a binary string of at most $c \cdot s \log s$ bits.
:::
::: { .pause } We omit the formal proof of asciirepprogramthm{.ref} but please make sure that you understand why it follows from the reasoning above. :::
One consequence of the representation of programs as strings is that the number of programs of certain length is bounded by the number of strings that represent them.
This has consequences for the sets $SIZE_{n,m}(s)$ that we defined in the last chapter.

::: {.theorem title="Counting programs" #program-count}
For every $n,m,s$, $|SIZE_{n,m}(s)| \leq 2^{c \cdot s \log s}$, where $c$ is some absolute constant.
:::
::: {.proof data-ref="program-count"}
We will show a one-to-one map $E$ from $SIZE_{n,m}(s)$ to the set of binary strings of length at most $c \cdot s \log s$. This implies the theorem, since the number of such strings is at most $2^{c \cdot s \log s + 1}$ (and we can absorb the factor of $2$ into the constant $c$).
The map $E$ assigns to every function $f \in SIZE_{n,m}(s)$ the string representing some canonical program computing $f$ (say, the first such program in lexicographic order), using the representation of asciirepprogramthm{.ref}. Since a program computes a single function, the map $E$ is one-to-one, completing the proof.
:::
A function mapping $\{0,1\}^n$ to $\{0,1\}$ can be identified with its truth table, i.e., with a string of $2^n$ bits, and hence the number of such functions is $2^{2^n}$. Combining this with program-count{.ref} yields the following counting-based lower bound:

::: {.theorem title="Counting lower bound" #counting-lb}
There is a constant $\delta > 0$ such that for every sufficiently large $n$, there is a function $f:\{0,1\}^n \rightarrow \{0,1\}$ such that $f \not\in SIZE_{n}(\delta \cdot 2^n / n)$.
:::
::: {.proof data-ref="counting-lb"}
The proof is simple. If we let $s = \delta \cdot 2^n / n$, then $s \log s \leq \delta \cdot 2^n$ (since $\log s \leq n$), and hence by program-count{.ref}, $|SIZE_n(s)| \leq 2^{c \delta 2^n}$, which is smaller than $2^{2^n}$ as long as $\delta < 1/c$. Since the number of functions mapping $\{0,1\}^n$ to $\{0,1\}$ is exactly $2^{2^n}$, there must be some such function outside of $SIZE_n(s)$.
:::
We have seen before that every function mapping $\{0,1\}^n$ to $\{0,1\}$ can be computed by an $O(2^n/n)$-line program. counting-lb{.ref} shows that this is tight in the sense that some functions do require such an astronomical number of lines to compute.
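To get a feel for the numbers behind the counting argument, here is a quick sanity check in Python, under the simplifying (and generous) assumption that an $s$-line program is determined by a list of $s$ triples of numbers in $[3s]$, so that there are at most $(3s)^{3s}$ such programs:

```python
# Counting: programs of s = 0.1*2^n/n lines vs. all functions on n bits.
n = 16
s = 2 ** n // (10 * n)                # s = 0.1 * 2^n / n (rounded down)
max_programs = (3 * s) ** (3 * s)     # at most (3s)^3 choices per line, s lines
num_functions = 2 ** (2 ** n)         # one function per 2^n-bit truth table
assert max_programs < num_functions   # so most functions have no s-line program
```

Even this crude overcount of the number of programs is dwarfed by the number of functions, which is the heart of the counting lower bound.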
::: { .bigidea #countinglb }
Some functions $f:\{0,1\}^n \rightarrow \{0,1\}$ cannot be computed by a Boolean circuit using fewer than an exponential (in $n$) number of gates.
:::
In fact, as we explore in the exercises, this is the case for most functions.
Hence functions that can be computed in a small number of lines (such as addition, multiplication, finding short paths in graphs, or even the $EVAL$ function we will see in this chapter) are the exception, not the rule.
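We can also see the counting phenomenon directly for $n=2$ by brute force. The sketch below (my own illustrative code, not from the book's repository) enumerates all straight-line NAND programs of a given number of lines, taking the last assigned variable as the output, and counts how many distinct functions they compute:

```python
from itertools import product

def nand(a, b):
    return 1 - a * b

def count_functions(num_lines):
    """Count the distinct functions f:{0,1}^2 -> {0,1} computable by
    straight-line NAND programs of `num_lines` lines, where each line
    applies NAND to two previously-defined variables and the output
    is the last variable assigned."""
    tables = set()
    inputs = [list(x) for x in product([0, 1], repeat=2)]

    def extend(vals, line):
        # vals[i] holds the values of all variables so far on the i-th input
        if line == num_lines:
            tables.add(tuple(v[-1] for v in vals))
            return
        t = len(vals[0])  # number of variables defined so far
        for j in range(t):
            for k in range(t):
                extend([v + [nand(v[j], v[k])] for v in vals], line + 1)

    extend(inputs, 0)
    return len(tables)
```

With one line, only 3 of the $2^{2^2}=16$ functions on two bits are computable, and even two lines do not reach all 16 (XOR, for instance, requires four NAND gates).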
The ASCII representation is not the shortest representation for NAND-CIRC programs.
NAND-CIRC programs are equivalent to circuits with NAND gates, which means that a NAND-CIRC program of $s$ lines corresponds to a circuit with $s$ gates, and such a circuit can be represented using only $O(s \log s)$ bits (for example, by listing for each gate the indices of its two in-neighbors, each of which takes $O(\log s)$ bits).
By NAND-univ-thm-improved{.ref} the class $SIZE_n(s)$ contains all functions mapping $\{0,1\}^n$ to $\{0,1\}$ once $s$ is on the order of $2^n/n$, while counting-lb{.ref} shows that some functions are outside this class when $s$ is a small constant times $2^n/n$. The size hierarchy theorem below shows that, between these extremes, the power of Boolean circuits grows strictly with the size bound:

::: {.theorem title="Size hierarchy theorem" #sizehiearchythm}
For every sufficiently large $n$ and $10n \leq s \leq 0.1 \cdot 2^n / n$, $SIZE_n(s + 10n) \supsetneq SIZE_n(s)$.
:::
To prove the theorem we need to find, for every $s$ in the range above, a function $f$ that is in $SIZE_n(s + 10n)$ but not in $SIZE_n(s)$.
::: {.proof data-ref="sizehiearchythm"}
Let $f^*: \{0,1\}^n \rightarrow \{0,1\}$ be the function (whose existence we are guaranteed by counting-lb{.ref}) such that $f^* \not\in SIZE_n(0.1 \cdot 2^n / n)$.
We define the functions $f_0, f_1, \ldots, f_{2^n}$ mapping $\{0,1\}^n$ to $\{0,1\}$ as follows: for every $i \in \{0,1,\ldots,2^n\}$, $f_i(x) = f^*(x)$ if $x$ is among the first $i$ strings of $\{0,1\}^n$ in lexicographic order, and $f_i(x) = 0$ otherwise.
The function $f_0$ is the constant zero function, and hence is a member of $SIZE_n(s)$ for $s \geq 10n$, while $f_{2^n} = f^*$ is not in $SIZE_n(s)$ since $s \leq 0.1 \cdot 2^n / n$. Moreover, for every $i$, the functions $f_i$ and $f_{i+1}$ differ on at most one input, and one can verify that this implies that if $f_i \in SIZE_n(s)$ then $f_{i+1} \in SIZE_n(s + 10n)$.
By our choice of $f_0$ and $f_{2^n}$, there must be some smallest $i$ such that $f_i \not\in SIZE_n(s)$; since $i > 0$, $f_{i-1} \in SIZE_n(s)$, and hence $f_i \in SIZE_n(s + 10n) \setminus SIZE_n(s)$, completing the proof.
:::
::: {.remark title="Explicit functions" #explicitfunc}
While the size hierarchy theorem guarantees that there exists some function that can be computed using, for example, $n^2$ gates but not using $100n$ gates, we do not know of a way to point to an explicit example of such a function. Proving lower bounds of this form for explicit functions turns out to be extremely difficult, and is one of the major open questions of computational complexity.
:::
ASCII is a fine representation of programs, but for some applications it is useful to have a more concrete representation of NAND-CIRC programs. In this section we describe a particular choice that will be convenient for us later on. A NAND-CIRC program is simply a sequence of lines of the form
blah = NAND(baz,boo)
There is of course nothing special about the particular names we use for variables. Although they would be harder to read, we could write all our programs using only working variables such as temp_0, temp_1, etc. Therefore, our representation of NAND-CIRC programs ignores the actual names of the variables, and just associates a number with each variable. We encode a line of the program as a triple of numbers: if the line has the form foo = NAND(bar,blah), then we encode it with the triple $(i,j,k)$, where $i$, $j$, and $k$ are the numbers corresponding to the variables foo, bar, and blah respectively.
More concretely, we will associate every variable with a number in the set $[t] = \{0,1,\ldots,t-1\}$, where $t$ is the total number of distinct variables in the program.
::: {.definition title="List of tuples representation" #nandtuplesdef}
Let $P$ be a NAND-CIRC program of $n$ inputs, $m$ outputs, and $s$ lines, and let $t$ denote the number of distinct variables used by $P$. The list of tuples representation of $P$ is the triple $(n,m,L)$ where $L$ is a list of triples of the form $(i,j,k)$ for $i,j,k \in [t]$, one triple for each line of $P$.
We assign a number to each variable of $P$ as follows:
- For every $i \in [n]$, the variable X[$i$] is assigned the number $i$.
- For every $j \in [m]$, the variable Y[$j$] is assigned the number $t-m+j$.
- Every other variable is assigned a number in $\{n, n+1, \ldots, t-m-1\}$ in the order in which it appears in the program $P$.
:::
The list of tuples representation is our default choice for representing NAND-CIRC programs.
Since "list of tuples representation" is a bit of a mouthful, we will often call it simply "the representation" of a program $P$.
::: {.example title="Representing the XOR program" #representXOR} Our favorite NAND-CIRC program, the program
u = NAND(X[0],X[1])
v = NAND(X[0],u)
w = NAND(X[1],u)
Y[0] = NAND(v,w)
computing the XOR function is represented as the tuple $(2,1,L)$ where $L=((2,0,1),(3,0,2),(4,1,2),(5,3,4))$. That is, the variables X[0] and X[1] are given the indices $0$ and $1$ respectively, the variables u, v, w are given the indices $2$, $3$, $4$ respectively, and the variable Y[0] is given the index $5$.
:::
Transforming a NAND-CIRC program from its representation as code to the representation as a list of tuples is a fairly straightforward programming exercise, and in particular can be done in a few lines of Python.^[If you're curious what these few lines are, see our GitHub repository.] The list-of-tuples representation loses information such as the particular names we used for the variables, but this is OK since these names do not make a difference to the functionality of the program.
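For concreteness, here is one way such a transformation could look in Python (a sketch of my own, not the code from the book's repository; the regular expressions and helper names are illustrative):

```python
import re

def program_to_tuples(code):
    """Sketch: convert NAND-CIRC source code into its list-of-tuples
    representation (n, m, L)."""
    n = 1 + max(int(i) for i in re.findall(r'X\[(\d+)\]', code))  # number of inputs
    m = 1 + max(int(j) for j in re.findall(r'Y\[(\d+)\]', code))  # number of outputs
    ids = {f'X[{i}]': i for i in range(n)}  # variable name -> number; inputs come first
    L = []
    for line in code.strip().splitlines():
        match = re.match(r'(.*?)=\s*NAND\((.*?),(.*?)\)', line)
        foo, bar, blah = (name.strip() for name in match.groups())
        for name in (foo, bar, blah):
            if name not in ids:
                ids[name] = len(ids)        # assign numbers in order of appearance
        L.append((ids[foo], ids[bar], ids[blah]))
    # renumber so outputs Y[j] get t-m+j and workspace variables get n..t-m-1
    t = len(ids)
    remap = {ids[f'Y[{j}]']: t - m + j for j in range(m)}
    workspace = sorted(v for k, v in ids.items() if not re.fullmatch(r'[XY]\[\d+\]', k))
    remap.update({old: new for new, old in enumerate(workspace, start=n)})
    return n, m, [tuple(remap.get(v, v) for v in triple) for triple in L]
```

On the XOR program above this outputs the representation $(2,1,L)$ with $L=((2,0,1),(3,0,2),(4,1,2),(5,3,4))$, matching the example.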
If a NAND-CIRC program has $s$ lines, then it involves at most $3s$ distinct variables (since every line mentions at most three variables), and hence $t \leq 3s$.
We can represent every number in $[t]$ using $O(\log s)$ bits, and so we can represent the list $L$ (and hence the program itself) using $O(s \log s)$ bits.
Since we can represent programs as strings, we can also think of a program as an input to a function.
In particular, for every natural numbers $s,n,m$, we define the function $EVAL_{s,n,m}:\{0,1\}^{S+n} \rightarrow \{0,1\}^m$ (where $S$ denotes the number of bits needed to represent an $s$-line program with $n$ inputs and $m$ outputs) as follows: for every $p \in \{0,1\}^S$ and $x \in \{0,1\}^n$, $EVAL_{s,n,m}(px) = P(x)$ if $p$ represents some $s$-line NAND-CIRC program $P$ with $n$ inputs and $m$ outputs, and $EVAL_{s,n,m}(px) = 0^m$ otherwise.
That is, $EVAL_{s,n,m}$ takes as input the concatenation of the description of a program and a string $x$, and outputs the result of evaluating that program on $x$.
Take-away points. The fine details of $EVAL_{s,n,m}$'s definition are not very crucial. What you should remember is that:

- $EVAL_{s,n,m}$ is a finite function taking a string of fixed length as input and outputting a string of fixed length as output.
- $EVAL_{s,n,m}$ is a single function, such that computing $EVAL_{s,n,m}$ allows one to evaluate arbitrary NAND-CIRC programs of a certain length on arbitrary inputs of the appropriate length.
- $EVAL_{s,n,m}$ is a function, not a program (recall the discussion in specvsimplrem{.ref}). That is, $EVAL_{s,n,m}$ is a specification of what output is associated with what input. The existence of a program that computes $EVAL_{s,n,m}$ (i.e., an implementation for $EVAL_{s,n,m}$) is a separate fact, which needs to be established (and which we will do in bounded-univ{.ref}, with a more efficient program shown in eff-bounded-univ{.ref}).
One of the first examples of the circularity between code and data that we will see in this book is the following theorem, which we can think of as showing a "NAND-CIRC interpreter in NAND-CIRC":
::: {.theorem title="Bounded Universality of NAND-CIRC programs" #bounded-univ}
For every $s,n,m$, there is a NAND-CIRC program $U_{s,n,m}$ that computes the function $EVAL_{s,n,m}$.
:::

That is, the NAND-CIRC program $U_{s,n,m}$ takes the description of any other NAND-CIRC program $P$ (of the appropriate size) and any input $x$, and computes the result of evaluating $P$ on $x$.
::: {.proof data-ref="bounded-univ"}
bounded-univ{.ref} is an important result, but it is actually not hard to prove.
Specifically, since $EVAL_{s,n,m}$ is a finite function, NAND-univ-thm{.ref} guarantees that there is some NAND-CIRC program that computes it.
:::
bounded-univ{.ref} is simple but important. Make sure you understand what this theorem means, and why it is a corollary of NAND-univ-thm{.ref}.
bounded-univ{.ref} establishes the existence of a NAND-CIRC program for computing $EVAL_{s,n,m}$, but it gives no explicit bound on the size of this program. The following theorem does:

::: {.theorem title="Efficient bounded universality of NAND-CIRC programs" #eff-bounded-univ}
For every $s,n,m$, there is a NAND-CIRC program of at most $O(s^2 \log s)$ lines that computes the function $EVAL_{s,n,m}$.
:::
::: { .pause }
If you haven't done so already, now might be a good time to review the representation of programs as lists of tuples (nandtuplesdef{.ref}).
:::
Unlike bounded-univ{.ref}, eff-bounded-univ{.ref} is not a trivial corollary of the fact that every finite function can be computed by some circuit.
Proving eff-bounded-univ{.ref} requires us to present a concrete NAND-CIRC program for computing the function $EVAL_{s,n,m}$. We will do so in three stages:
- First, we will describe the algorithm to evaluate $EVAL_{s,n,m}$ in "pseudocode".
- Then, we will show how we can write a program to compute $EVAL_{s,n,m}$ in Python. We will not use much about Python, and a reader that has familiarity with programming in any language should be able to follow along.
- Finally, we will show how we can transform this Python program into a NAND-CIRC program.
This approach yields much more than just proving eff-bounded-univ{.ref}: we will see that it is in fact always possible to transform (loop free) code in high level languages such as Python to NAND-CIRC programs (and hence to Boolean circuits as well).
To prove eff-bounded-univ{.ref} it suffices to give a NAND-CIRC program of at most $O(s^2 \log s)$ lines that computes the function $EVAL_{s,n,m}$.
It would be highly worthwhile for you to stop here and try to solve this problem yourself.
For example, you can try thinking how you would write a program NANDEVAL(n,m,s,L,x)
that computes this function in the programming language of your choice.
We will now describe such an algorithm.
We assume that we have access to a bit array data structure that can store $t$ bits. If Table is a variable holding this data structure, then we assume we can perform the operations:

- GET(Table,i), which retrieves the bit corresponding to i in Table. The value of i is assumed to be an integer in $[t]$.
- Table = UPDATE(Table,i,b), which updates Table so that the bit corresponding to i is now set to b. The value of i is assumed to be an integer in $[t]$ and b is a bit in $\{0,1\}$.
Input: Numbers $n,m,s$ and $t\leq 3s$, as well as a list $L$ of $s$ triples of numbers in $[t]$, and a string $x\in \{0,1\}^n$.
Output: Evaluation of the program represented by $(n,m,L)$ on the input $x\in \{0,1\}^n$.
Let `Vartable` be table of size $t$
For{$i$ in $[n]$}
`Vartable = UPDATE(Vartable,`$i$`,`$x_i$`)`
Endfor
For{$(i,j,k)$ in $L$}
$a \leftarrow$ `GET(Vartable,`$j$`)`
$b \leftarrow$ `GET(Vartable,`$k$`)`
`Vartable = UPDATE(Vartable,`$i$,`NAND(`$a$`,`$b$`))`
Endfor
For{$j$ in $[m]$}
$y_j \leftarrow$ `GET(Vartable,`$t-m+j$`)`
Endfor
Return $y_0,\ldots,y_{m-1}$
evalnandcircalg{.ref} evaluates the program given to it as input one line at a time, updating the Vartable table to contain the value of each variable.
At the end of the execution it outputs the variables at positions $t-m, t-m+1, \ldots, t-1$, which by our representation correspond to the output variables Y[$0$], $\ldots$, Y[$m-1$].
To make things more concrete, let us see how we implement evalnandcircalg{.ref} in the Python programming language.
(There is nothing special about Python. We could have easily presented a corresponding function in JavaScript, C, OCaml, or any other programming language.)
We will construct a function NANDEVAL that on input $n,m,L,x$ outputs the result of evaluating the program represented by $(n,m,L)$ on the input $x$.
def NAND(a,b): return 1-a*b  # basic NAND gate (added so the snippet is self-contained)

def NANDEVAL(n,m,L,X):
    # Evaluate a NAND-CIRC program from its list of tuples representation.
    s = len(L) # num of lines
    t = max(max(a,b,c) for (a,b,c) in L)+1 # max index in L + 1
    Vartable = [0] * t # initialize array

    # helper functions
    def GET(V,i): return V[i]
    def UPDATE(V,i,b):
        V[i]=b
        return V

    # load input values to Vartable:
    for i in range(n):
        Vartable = UPDATE(Vartable,i,X[i])

    # Run the program
    for (i,j,k) in L:
        a = GET(Vartable,j)
        b = GET(Vartable,k)
        c = NAND(a,b)
        Vartable = UPDATE(Vartable,i,c)

    # Return outputs Vartable[t-m], Vartable[t-m+1],....,Vartable[t-1]
    return [GET(Vartable,t-m+j) for j in range(m)]

# Test on XOR (2 inputs, 1 output)
L = ((2, 0, 1), (3, 0, 2), (4, 1, 2), (5, 3, 4))
print(NANDEVAL(2,1,L,(0,1))) # XOR(0,1)
# [1]
print(NANDEVAL(2,1,L,(1,1))) # XOR(1,1)
# [0]
Accessing an element of the array Vartable at a given index takes a constant number of basic operations.
Hence (since $n,m \leq s$ and $t \leq 3s$), the Python program above evaluates a given $s$-line program using $O(s)$ basic operations.
We now turn to describing the proof of eff-bounded-univ{.ref}.
To prove the theorem it is not enough to give a Python program.
Rather, we need to show how we can compute the function $EVAL_{s,n,m}$ by a NAND-CIRC program.
Before reading further, try to think how you could give a "constructive proof" of eff-bounded-univ{.ref}.
That is, think of how you would write, in the programming language of your choice, a function universal(s,n,m) that on input $s,n,m$ outputs the code of a NAND-CIRC program that computes $EVAL_{s,n,m}$, along the lines of the NANDEVAL program described above.
Note the difference: rather than actually evaluating a given program on a given input, universal should output the code of a NAND-CIRC program that computes the map $(P,x) \mapsto P(x)$.
Our construction will follow very closely the Python implementation of EVAL above.
We will use variables Vartable[$0$], $\ldots$, Vartable[$2^\ell - 1$], where $\ell = \lceil \log t \rceil$, to hold the contents of the table.
Unlike in Python, in NAND-CIRC we cannot write an expression such as Vartable[i] where the index is given by some variable i, since indices in NAND-CIRC must be fixed constants.
However, we can implement the function GET(Vartable,i) that outputs the i-th bit of the array Vartable.
Indeed, this is nothing but the function $LOOKUP_\ell$ that we have seen before!
Please make sure that you understand why GET and $LOOKUP_\ell$ are the same function.
We saw that we can compute $LOOKUP_\ell$ using $O(2^\ell)$ lines.
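Concretely, here is a Python sketch of GET/LOOKUP built from a NAND primitive alone (illustrative code of my own; in the actual proof this is a NAND-CIRC program, not Python):

```python
def NAND(a, b):
    return 1 - a * b

def IF(a, b, c):
    # MUX: returns b if a == 1 and c if a == 0, composed only of NAND gates
    return NAND(NAND(a, b), NAND(NAND(a, a), c))

def LOOKUP(V, i_bits):
    """Return V[i], where i is given by its bits, least-significant first.
    Mirrors the O(2^ell)-line NAND-CIRC implementation of GET."""
    if not i_bits:
        return V[0]
    # even positions have least-significant bit 0, odd positions have bit 1
    return IF(i_bits[0], LOOKUP(V[1::2], i_bits[1:]), LOOKUP(V[0::2], i_bits[1:]))
```

Unrolling the recursion yields one IF (a constant number of NAND gates) per array position, for $O(2^\ell)$ gates in total.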
For every UPDATE
function for arrays of length
-
For every
$j\in [2^\ell]$ , there is an$O(\ell)$ line NAND-CIRC program to compute the function$EQUALS_j: {0,1}^\ell \rightarrow {0,1}$ that on input$i$ outputs$1$ if and only if$i$ is equal to (the binary representation of)$j$ . (We leave verifying this as equals{.ref} and equalstwo{.ref}.) -
We have seen that we can compute the function
$IF:{0,1}^3 \rightarrow {0,1}$ such that$IF(a,b,c)$ equals$b$ if$a=1$ and$c$ if$a=0$ .
Together, this means that we can compute UPDATE
(using some "syntactic sugar" for bounded length loops) as follows:
def UPDATE_ell(V,i,b):
    # Get V[0]...V[2^ell-1], i in {0,1}^ell, b in {0,1}
    # Return NewV[0],...,NewV[2^ell-1]
    # updated array with NewV[i]=b and all
    # else same as V
    NewV = [0] * (2**ell) # initialize the output array
    for j in range(2**ell): # j = 0,1,2,....,2^ell - 1
        a = EQUALS_j(i)
        NewV[j] = IF(a,b,V[j])
    return NewV
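The sketch above can be made runnable by spelling out IF and EQUALS_j in terms of NAND (my own illustrative helpers; EQUALS below takes $j$ as a parameter rather than being a separate function per $j$, which is what the generated circuit would use):

```python
def NAND(a, b):
    return 1 - a * b

def NOT(a):
    return NAND(a, a)

def AND(a, b):
    return NOT(NAND(a, b))

def IF(a, b, c):
    # returns b if a == 1 and c if a == 0
    return NAND(NAND(a, b), NAND(NOT(a), c))

def EQUALS(j, i_bits):
    # 1 iff the bits i_bits (least-significant first) encode the integer j;
    # since j is a fixed constant, this unrolls into O(ell) NAND gates
    res = 1
    for p, bit in enumerate(i_bits):
        jp = (j >> p) & 1
        res = AND(res, bit if jp == 1 else NOT(bit))
    return res

def UPDATE(V, i_bits, b):
    # returns a copy of V with position i set to b, everything else unchanged
    return [IF(EQUALS(j, i_bits), b, V[j]) for j in range(len(V))]
```

Each of the $2^\ell$ output positions uses one $EQUALS_j$ ($O(\ell)$ gates) and one IF, matching the $O(\ell \cdot 2^\ell)$ gate count discussed below.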
Since the loop over j in UPDATE is run $2^\ell$ times, and computing EQUALS_j takes $O(\ell)$ lines, the total number of lines to compute UPDATE is $O(\ell \cdot 2^\ell)$.
Once we can compute GET and UPDATE, the rest of the implementation amounts to "book keeping" that needs to be done carefully, but is not too insightful, and hence we omit the full details.
Since we run GET and UPDATE once for each of the $s$ lines of the input program, and each invocation costs $O(\ell \cdot 2^\ell) = O(s \log s)$ lines (using $2^\ell = O(s)$ and $\ell = O(\log s)$), the total number of lines in our program for $EVAL_{s,n,m}$ is $O(s^2 \log s)$, completing the proof of eff-bounded-univ{.ref}.
::: {.remark title="Improving to quasilinear overhead (advanced optional note)" #quasilinearevalrem}
The NAND-CIRC program above is less efficient than its Python counterpart, since NAND does not offer arrays with efficient random access. Hence, for example, the LOOKUP operation on an array of $s$ bits takes $\Omega(s)$ lines in NAND even though it takes a constant number of steps in Python.
It turns out that it is possible to improve the bound of eff-bounded-univ{.ref}, and evaluate $s$-line NAND-CIRC programs using a NAND-CIRC program of $O(s \cdot polylog(s))$ lines; doing so requires more sophisticated techniques that we will not need in this book.
:::
To prove eff-bounded-univ{.ref} we essentially translated every line of the Python program for EVAL into an equivalent NAND-CIRC snippet.
However, none of our reasoning was specific to the particular function $EVAL$: it is possible to translate every (loop free) Python program into an equivalent NAND-CIRC program in a similar fashion.
For starters, one can use CPython (the reference implementation for Python) to evaluate every Python program using a C program.
We can combine this with a C compiler to transform a Python program to various flavors of "machine language".
So, to transform a Python program into an equivalent NAND-CIRC program, it is enough to show how to transform a machine language program into an equivalent NAND-CIRC program.
One minimalistic (and hence convenient) family of machine languages is known as the ARM architecture which powers many mobile devices including essentially all Android devices.^[ARM stands for "Advanced RISC Machine" where RISC in turn stands for "Reduced instruction set computer".]
There are even simpler machine languages, such as the LEG architecture for which a backend for the LLVM compiler was implemented (and hence can be the target of compiling any of large and growing list of languages that this compiler supports).
Other examples include the TinyRAM architecture (motivated by interactive proof systems that we will discuss in chapproofs{.ref}) and the teaching-oriented Ridiculously Simple Computer architecture.
Going one by one over the instruction sets of such computers and translating them to NAND snippets is no fun, but it is a feasible thing to do.
In fact, ultimately this is very similar to the transformation that takes place in converting our high level code to actual silicon gates that are not so different from the operations of a NAND-CIRC program.
Indeed, tools such as MyHDL that transform "Python to Silicon" can be used to convert a Python program to a NAND-CIRC program.
The NAND-CIRC programming language is just a teaching tool, and by no means do I suggest that writing NAND-CIRC programs, or compilers to NAND-CIRC, is a practical, useful, or enjoyable activity. What I do want is to make sure you understand why it can be done, and to have the confidence that if your life (or at least your grade) depended on it, then you would be able to do this. Understanding how programs in high level languages such as Python are eventually transformed into concrete low-level representation such as NAND is fundamental to computer science.
The astute reader might notice that the above paragraphs only outlined why it should be possible to find, for every particular Python-computable function $f$, some NAND-CIRC program (or circuit) that computes $f$; we did not give a single general transformation from Python programs to circuits. Giving such a uniform transformation requires more care, and we will see the ingredients for it later in this book.
What we are seeing time and again is the notion of universality or self reference of computation, which is the sense that all reasonably rich models of computation are expressive enough that they can "simulate themselves". The importance of this phenomenon to both the theory and practice of computing, as well as far beyond it, including the foundations of mathematics and basic questions in science, cannot be overstated.
We've seen that NAND gates (and other Boolean operations) can be implemented using very different systems in the physical world. What about the reverse direction? Can NAND-CIRC programs simulate any physical computer?
We can take a leap of faith and stipulate that Boolean circuits (or equivalently NAND-CIRC programs) do actually encapsulate every computation that we can think of. Such a statement (in the realm of infinite functions, which we'll encounter in chaploops{.ref}) is typically attributed to Alonzo Church and Alan Turing, and in that context is known as the Church-Turing Thesis. As we will discuss in future lectures, the Church-Turing Thesis is not a mathematical theorem or conjecture. Rather, like theories in physics, the Church-Turing Thesis is about mathematically modeling the real world. In the context of finite functions, we can make the following informal hypothesis or prediction:
"Physical Extended Church-Turing Thesis" (PECTT): If a function $F:\{0,1\}^n \rightarrow \{0,1\}^m$ can be computed in the physical world using $s$ amount of "physical resources" then it can be computed by a Boolean circuit of roughly $s$ gates.
A priori it might seem rather extreme to hypothesize that our meager model of NAND-CIRC programs or Boolean circuits captures all possible physical computation. But yet, in more than a century of computing technologies, no one has yet built any scalable computing device that challenges this hypothesis.
We now discuss the "fine print" of the PECTT in more detail, as well as the (so far unsuccessful) challenges that have been raised against it.
There is no single universally-agreed-upon formalization of "roughly $s$ physical resources", but we can approximate this notion by considering the size of any physical computing device and the time it takes to compute the output.
In other words, we can phrase the PECTT as stipulating that any function that can be computed by a device that takes a certain volume $V$ of space and a certain time $t$ can also be computed by a Boolean circuit with a number of gates that is polynomial in $V$ and $t$.
The exact form of this polynomial is not crucial: what matters is ruling out physical devices whose resources are dramatically smaller than the circuit size required to compute a function.
::: {.remark title="Advanced note: making PECTT concrete (advanced, optional)" #concretepectt}
We can attempt a more exact phrasing of the PECTT as follows.
Suppose that
One can then phrase the PECTT as stipulating that if there exists such a system
To fully make the PECTT concrete, we need to decide on the units for measuring time and volume, and the normalization constant
There are of course several hurdles to refuting the PECTT in this way, one of which is that we can't actually test the system on all possible inputs. However, it turns out that we can get around this issue using notions such as interactive proofs and program checking that we might encounter later in this book. Another, perhaps more salient problem, is that while we know many hard functions exist, at the moment there is no single explicit function
One of the admirable traits of mankind is the refusal to accept limitations. In the best case this is manifested by people achieving longstanding "impossible" challenges such as heavier-than-air flight, putting a person on the moon, circumnavigating the globe, or even resolving Fermat's Last Theorem. In the worst case it is manifested by people continually following the footsteps of previous failures to try to do proven-impossible tasks such as build a perpetual motion machine, trisect an angle with a compass and straightedge, or refute Bell's inequality. The Physical Extended Church Turing thesis (in its various forms) has attracted both types of people. Here are some physical devices that have been speculated to achieve computational tasks that cannot be done by not-too-large NAND-CIRC programs:
- Spaghetti sort: One of the first lower bounds that Computer Science students encounter is that sorting
$n$ numbers requires making$\Omega(n \log n)$ comparisons. The "spaghetti sort" is a description of a proposed "mechanical computer" that would do this faster. The idea is that to sort$n$ numbers$x_1,\ldots,x_n$ , we could cut$n$ spaghetti noodles into lengths$x_1,\ldots,x_n$ , and then if we simply hold them together in our hand and bring them down to a flat surface, they will emerge in sorted order. There are a great many reasons why this is not truly a challenge to the PECTT hypothesis, and I will not ruin the reader's fun in finding them out by her or himself. -
Soap bubbles: One function
$F:{0,1}^n \rightarrow {0,1}$ that is conjectured to require a large number of NAND lines to solve is the Euclidean Steiner Tree problem. This is the problem where one is given$m$ points in the plane$(x_1,y_1),\ldots,(x_m,y_m)$ (say with integer coordinates ranging from$1$ till$m$ , and hence the list can be represented as a string of$n=O(m \log m)$ size) and some number$K$ . The goal is to figure out whether it is possible to connect all the points by line segments of total length at most$K$ . This function is conjectured to be hard because it is NP complete - a concept that we'll encounter later in this course - and it is in fact reasonable to conjecture that as$m$ grows, the number of NAND lines required to compute this function grows exponentially in$m$ , meaning that the PECTT would predict that if$m$ is sufficiently large (such as few hundreds or so) then no physical device could compute$F$ . Yet, some people claimed that there is in fact a very simple physical device that could solve this problem, that can be constructed using some wooden pegs and soap. The idea is that if we take two glass plates, and put$m$ wooden pegs between them in the locations$(x_1,y_1),\ldots,(x_m,y_m)$ then bubbles will form whose edges touch those pegs in the way that will minimize the total energy which turns out to be a function of the total length of the line segments. The problem with this device of course is that nature, just like people, often gets stuck in "local optima". That is, the resulting configuration will not be one that achieves the absolute minimum of the total energy but rather one that can't be improved with local changes. Aaronson has carried out actual experiments (see aaronsonsoapfig{.ref}), and saw that while this device often is successful for three or four pegs, it starts yielding suboptimal results once the number of pegs grows beyond that.
- DNA computing. People have suggested using the properties of DNA to do hard computational problems. The main advantage of DNA is the ability to potentially encode a lot of information in relatively small physical space, as well as compute on this information in a highly parallel manner. At the time of this writing, it was demonstrated that one can use DNA to store about
$10^{16}$ bits of information in a region of radius about a millimeter, as opposed to about$10^{10}$ bits with the best known hard disk technology. This does not posit a real challenge to the PECTT but does suggest that one should be conservative about the choice of constant and not assume that current hard disk + silicon technologies are the absolute best possible.^[We were extremely conservative in the suggested parameters for the PECTT, having assumed that as many as$\ell_P^{-2}10^{-6} \sim 10^{61}$ bits could potentially be stored in a millimeter radius region.] -
Continuous/real computers. The physical world is often described using continuous quantities such as time and space, and people have suggested that analog devices might have direct access to computing with real-valued quantities and would be inherently more powerful than discrete models such as NAND machines. Whether the "true" physical world is continuous or discrete is an open question. In fact, we do not even know how to precisely phrase this question, let alone answer it. Yet, regardless of the answer, it seems clear that the effort to measure a continuous quantity grows with the level of accuracy desired, and so there is no "free lunch" or way to bypass the PECTT using such machines (see also this paper). Related to that are proposals known as "hypercomputing" or "Zeno's computers" which attempt to use the continuity of time by doing the first operation in one second, the second one in half a second, the third operation in a quarter second and so on.. These fail for a similar reason to the one guaranteeing that Achilles will eventually catch the tortoise despite the original Zeno's paradox.
- Relativity computer and time travel. The formulation above assumed the notion of time, but under the theory of relativity time is in the eye of the observer. One approach to solving hard problems is to leave the computer to run for a lot of time from its perspective, while ensuring that this is actually a short while from our perspective. One way to do so is for the user to start the computer and then go for a quick jog at close to the speed of light before checking on its status. Depending on how fast one goes, a few seconds from the point of view of the user might correspond to centuries in computer time (it might even finish updating its Windows operating system!). Of course the catch here is that the energy required from the user is proportional to how close one needs to get to the speed of light. A more interesting proposal is to use time travel via closed timelike curves (CTCs). In this case we could run an arbitrarily long computation by doing some calculations, remembering the current state, and then travelling back in time to continue where we left off. Indeed, if CTCs exist then we'd probably have to revise the PECTT (though in this case I will simply travel back in time and edit these notes, so I can claim I never conjectured it in the first place...)
- Humans. Another computing system that has been proposed as a counterexample to the PECTT is a 3 pound computer of about 0.1m radius, namely the human brain. Humans can walk around, talk, feel, and do other things that are not commonly done by NAND-CIRC programs, but can they compute partial functions that NAND-CIRC programs cannot? There are certainly computational tasks that at the moment humans do better than computers (e.g., play some video games), but based on our current understanding of the brain, humans (or other animals) have no inherent computational advantage over computers. The brain has about
$10^{11}$ neurons, each operating in a speed of about$1000$ operations per seconds. Hence a rough first approximation is that a Boolean circuit of about$10^{14}$ gates could simulate one second of a brain's activity.^[This is a very rough approximation that could be wrong to a few orders of magnitude in either direction. For one, there are other structures in the brain apart from neurons that one might need to simulate, hence requiring higher overhead. On the other hand, it is by no mean clear that we need to fully clone the brain in order to achieve the same computational tasks that it does.] Note that the fact that such a circuit (likely) exists does not mean it is easy to find it. After all, constructing this circuit took evolution billions of years. Much of the recent efforts in artificial intelligence research is focused on finding programs that replicate some of the brain's capabilities and they take massive computational effort to discover, these programs often turn out to be much smaller than the pessimistic estimates above. For example, at the time of this writing, Google's neural network for machine translation has about$10^4$ nodes (and can be simulated by a NAND-CIRC program of comparable size). Philosophers, priests and many others have since time immemorial argued that there is something about humans that cannot be captured by mechanical devices such as computers; whether or not that is the case, the evidence is thin that humans can perform computational tasks that are inherently impossible to achieve by computers of similar complexity.^[There are some well known scientists that have advocated that humans have inherent computational advantages over computers. See also this.] -
Quantum computation. The most compelling attack on the Physical Extended Church Turing Thesis comes from the notion of quantum computing. The idea was initiated by the observation that systems with strong quantum effects are very hard to simulate on a computer. Turning this observation on its head, people have proposed using such systems to perform computations that we do not know how to do otherwise. At the time of this writing, Scalable quantum computers have not yet been built, but it is a fascinating possibility, and one that does not seem to contradict any known law of nature. We will discuss quantum computing in much more detail in quantumchap{.ref}. Modeling quantum computation involves extending the model of Boolean circuits into Quantum circuits that have one more (very special) gate. However, the main take away is that while quantum computing does suggest we need to amend the PECTT, it does not require a complete revision of our worldview. Indeed, almost all of the content of this book remains the same regardless of whether the underlying computational model is Boolean circuits or quantum circuits.
::: {.remark title="Physical Extended Church-Turing Thesis and Cryptography" #pcettcrypto}
While even the precise phrasing of the PECTT, let alone understanding its correctness, is still a subject of active research, some variants of it are already implicitly assumed in practice.
Governments, companies, and individuals currently rely on cryptography to protect some of their most precious assets, including state secrets, control of weapon systems and critical infrastructure, securing commerce, and protecting the confidentiality of personal information.
In applied cryptography, one often encounters statements such as "cryptosystem
:::
- We can think of programs both as describing a process, as well as simply a list of symbols that can be considered as data that can be fed as input to other programs.
- We can write a NAND-CIRC program that evaluates arbitrary NAND-CIRC programs (or equivalently a circuit that evaluates other circuits). Moreover, the efficiency loss in doing so is not too large.
- We can even write a NAND-CIRC program that evaluates programs in other programming languages such as Python, C, Lisp, Java, Go, etc.
- By a leap of faith, we could hypothesize that the number of gates in the smallest circuit that computes a function $f$ captures roughly the amount of physical resources required to compute $f$. This statement is known as the Physical Extended Church-Turing Thesis (PECTT).

- Boolean circuits (or equivalently AON-CIRC or NAND-CIRC programs) capture a surprisingly wide array of computational models. The strongest currently known challenge to the PECTT comes from the potential for using quantum mechanical effects to speed up computation, a model known as quantum computers.
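The idea that a program is "just data" that another program can evaluate is concrete enough to demonstrate in a few lines. The sketch below evaluates a straight-line NAND program given as a list of triples; the particular data format (variable names as strings, inputs `X[i]`, outputs `Y[j]`) is an illustrative choice for this sketch, not the book's official encoding:

```python
def NAND(a, b):
    """The NAND of two bits."""
    return 1 - (a & b)

def eval_nand_program(program, x):
    """Evaluate a straight-line NAND program represented as data.

    `program` is a list of triples (target, left, right) of variable
    names, where "X[i]" denotes the i-th input and "Y[j]" the j-th
    output.  (Illustrative representation, not the book's encoding.)
    """
    table = {f"X[{i}]": bit for i, bit in enumerate(x)}
    for target, left, right in program:
        table[target] = NAND(table[left], table[right])
    # Collect the outputs Y[0], Y[1], ... in order.
    j, y = 0, []
    while f"Y[{j}]" in table:
        y.append(table[f"Y[{j}]"])
        j += 1
    return y

# XOR of two bits via the standard 4-line NAND construction:
xor_prog = [
    ("u",    "X[0]", "X[1]"),
    ("v",    "X[0]", "u"),
    ("w",    "X[1]", "u"),
    ("Y[0]", "v",    "w"),
]
print([eval_nand_program(xor_prog, [a, b]) for a in (0, 1) for b in (0, 1)])
# [[0], [1], [1], [0]]
```

Note that `eval_nand_program` is itself a program, and `xor_prog` is both a description of a computation and an ordinary piece of data that could be stored, transmitted, or fed to yet another program.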
This chapter concludes the first part of this book that deals with finite computation (computing functions that map a fixed number of Boolean inputs to a fixed number of Boolean outputs). The main take-aways from compchap{.ref}, finiteuniversalchap{.ref}, and codeanddatachap{.ref} are as follows (see also finiterecapfig{.ref}):
- We can formally define the notion of a function $f:\{0,1\}^n \rightarrow \{0,1\}^m$ being computable using $s$ basic operations. Whether these operations are AND/OR/NOT, NAND, or some other universal basis does not make much difference. We can describe such a computation either using a circuit or using a straight-line program.

- We define $SIZE_{n,m}(s)$ to be the set of functions that are computable by NAND circuits of at most $s$ gates. This set is equal to the set of functions computable by a NAND-CIRC program of at most $s$ lines, and up to a constant factor in $s$ (which we will not care about) this is also the same as the set of functions that are computable by a Boolean circuit of at most $s$ AND/OR/NOT gates. The class $SIZE_{n,m}(s)$ is a set of functions, not of programs/circuits.

- Every function $f:\{0,1\}^n \rightarrow \{0,1\}^m$ can be computed using a circuit of at most $O(m \cdot 2^n / n)$ gates. Some functions require at least $\Omega(m \cdot 2^n / n)$ gates.

- We can describe a circuit/program $P$ as a string. For every $s$, there is a universal circuit/program $U_s$ that can evaluate programs of length $s$ given their description as strings. We can use this representation also to count the number of circuits of at most $s$ gates, and hence prove that some functions cannot be computed by circuits of smaller-than-exponential size.

- If there is a circuit of $s$ gates that computes a function $f$, then we can build a physical device to compute $f$ using about $s$ basic components (such as transistors). The "Physical Extended Church-Turing Thesis" postulates that the reverse direction is true as well: if $f$ is a function for which every circuit requires at least $s$ gates, then every physical device that computes $f$ will require about $s$ "physical resources". The main challenge to the PECTT is quantum computing, which we will discuss in quantumchap{.ref}.
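The counting argument behind the exponential lower bound can be checked numerically. The sketch below uses the (assumed, simplified) bound that a NAND circuit with $s$ gates over $n$ inputs can be described by listing, for each gate, its two inputs, each chosen among at most $n+s$ possibilities, giving at most $(n+s)^{2s}$ descriptions, while there are $2^{2^n}$ functions from $\{0,1\}^n$ to $\{0,1\}$; it compares the two quantities on a logarithmic scale:

```python
from math import log2

def log2_num_circuit_descriptions(n, s):
    # Simplified bound: each of the s gates picks two inputs from among
    # the n inputs and the other gates, so there are at most
    # (n+s)^(2s) descriptions, whose log base 2 is 2*s*log2(n+s).
    return 2 * s * log2(n + s)

def log2_num_functions(n):
    # There are 2^(2^n) functions from {0,1}^n to {0,1}; log2 is 2^n.
    return 2 ** n

n = 20
s = 2 ** n // (10 * n)  # a gate budget noticeably below 2^n / n
# Far more functions than circuit descriptions, so some function on 20
# bits has no circuit of at most s gates:
print(log2_num_circuit_descriptions(n, s) < log2_num_functions(n))  # True
```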
Sneak preview: In the next part we will discuss how to model computational tasks on unbounded inputs, which are specified using functions $F:\{0,1\}^* \rightarrow \{0,1\}^*$ (or $F:\{0,1\}^* \rightarrow \{0,1\}$) that can take an unbounded number of Boolean inputs.
::: {.exercise #reading-comp}
Which one of the following statements is false:
a. There is an
b. There is an
c. There is an
For every
For every
:::
::: {.exercise title="Counting lower bound for multibit functions" #countingmultibitex}
Prove that there exists a number
:::
::: {.exercise title="Size hierarchy theorem for multibit functions" #sizehiearchyex}
Prove that there exists a number
:::
::: {.exercise title="Efficient representation of circuits and a tighter counting upper bound" #efficientrepresentationex}
Use the ideas of efficientrepresentation{.ref} to show that for every
:::
::: {.exercise title="Tighter counting lower bound" #efficientlbex}
Prove that for every
Suppose
:::
::: {.exercise }
The following is a tuple representing a NAND program:
- Write a table with the eight values $P(000)$, $P(001)$, $P(010)$, $P(011)$, $P(100)$, $P(101)$, $P(110)$, $P(111)$ in this order.

- Describe what the program does in words.
:::
::: {.exercise title="EVAL with XOR" #XOREVAL}
For every sufficiently large
Prove that for every sufficiently large
:::
::: {.exercise title="Learning circuits (challenge, optional, assumes more background)" #learningcircuitsex}
(This exercise assumes background in probability theory and/or machine learning that you might not have at this point. Feel free to come back to it at a later point and in particular after going over probabilitychap{.ref}.)
In this exercise we will use our bound on the number of circuits of size
Let
In other words, if
The
:::
While we've seen that "most" functions mapping
Scott Aaronson's blog post on how information is physical is a good discussion of issues related to the Physical Extended Church-Turing Thesis.
Aaronson's survey on NP complete problems and physical reality [@aaronson2005physicalreality] discusses these issues as well, though it might be easier to read after we reach cooklevinchap{.ref} on