What Might Lie Beyond Quantum Computing: Can We Overcome The Church-Turing Thesis?

Although in one sense this question is altogether beyond physics (as discussed below), there would seem to be a very natural answer to it, to wit:

Beyond Quantum Computers are computers that can overcome the Church-Turing thesis (see the Wikipedia page with this name) and, more generally, can compute the truth value of propositions that are undecidable by a finite computation or, equivalently, underivable in a finite number of steps from a fixed axiom system. The latter are often, and somewhat cheekily, called “Oracles” (see the Wikipedia page for “Oracle Machine”) by computer scientists, logicians and other researchers into the foundations of mathematics.

The Position of Mainstream Physics on Oracle Machines

As far as I know, there is as yet no serious suggestion that the class of problems solvable in principle by quantum computers is bigger than the class solvable by deterministic Turing machines. Indeed mainstream physics seems to hold that computers transcending the CT thesis are likely to be outside physics (see the Physics SE question “Church-Turing hypothesis as a fundamental law of physics”). I think most workers in this field would believe, on the evidence so far, that quantum computers will not overcome the Church-Turing thesis, i.e. that no algorithm “implemented in any physical computer” can compute anything that a classical Turing machine cannot in principle (the bit in quotes is my addition to the wonted statement of the CT thesis). Note that I say in principle: many, indeed most, of the serial sequences of Turing machine states that make up computations are so fantastically long that no classical computer could run them in the life span of a physical universe complying with known physical laws. Quantum computers may change all that and bring many computations formerly thought forbiddingly complex for any practical solution into the doable realm. So the factors that may set quantum computers apart from classical ones are:

  1. Massive storage arising from the tensor product structure of quantum states: an $N$-qubit state lives in a vector space spanned by $2^N$ quantum basis vectors, so even if we use only two digital levels of superposition weight for each separate basis vector in a general quantum superposition, this finite state space stores $2^N$ bits, or has $2^{2^N}$ distinguishable states;
  2. Extremely low power consumption: quantum state evolutions are unitary, hence reversible, and so by Landauer’s principle they need consume no energy (see also my post here), although there will be energy required to initialize the qubits when the computer first powers up, again in accordance with Landauer’s principle;
  3. Massive speed arising from massive parallelism inherent in the evolution of quantum state-represented computation space;

But overcoming the Church-Turing thesis does not, on the research so far, seem to be an attribute of quantum computing.
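To give a concrete feel for point 1 in the list above, here is a small Python sketch (purely illustrative; the 16 bytes per amplitude is just the usual storage convention for a double-precision complex number) of the classical memory needed merely to hold an $N$-qubit state vector:

```python
# Illustrative sketch: classical memory needed to store all 2^N complex
# amplitudes of an N-qubit state, at 16 bytes per amplitude (two doubles).

def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to hold the 2^N amplitudes of an N-qubit state."""
    return (2 ** n_qubits) * bytes_per_amplitude

for n in (10, 30, 50):
    print(n, statevector_bytes(n))
```

Already at 50 qubits this is roughly $1.8\times10^{16}$ bytes — tens of petabytes for the amplitudes alone — which is the exponential wall faced by classical simulation of quantum states.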

Speculative Futures Grounded on Yet-To-Be-Discovered Physics

Now, mainstream physics has no concept of how an oracle machine – effectively something that can do an infinite number of computing steps in a finite time – might be implemented, so in this sense the question is altogether beyond physics. However, if the standpoint taken by strong mathematical platonists such as Gödel, Quine and Penrose is the right one (a more extreme view still is Max Tegmark’s Mathematical Universe Hypothesis) then it is conceivable (but still not needful, mind you) that such oracles are physically realizable in some as yet unknown physics. To give some idea of how fantastical that is: if we indeed developed such technology, the observation of true infinities in the laboratory would become a commonplace, everyday idea!

What Current Physics Has to Say About the Limits of Oracle Machines

Even if we are thinking about yet-to-be-discovered (and possibly invalid) physics, there are a few things we can say are likely to be true of an oracle machine or other computer transcending the Church-Turing thesis, mainly grounded on thermodynamics and Landauer’s principle:

  1. It would have to be contained in a finite memory space: the initialization of $N$ bits of memory (whether classical bits or qubits) calls for, in accordance with Landauer’s principle, the expenditure of useful work $N\,k_B\,T\,\ln 2$. So if our speculative physics is still constrained by present thermodynamics, the oracle machine must have finite energy needs and therefore must be contained in a bounded memory;
  2. Interestingly, once we have initialized this memory, the running of an infinite number of reversible steps on such a computer may not need an infinite amount of energy. Conversely, any non-reversible oracle algorithm could not be implemented, because its energy needs would be infinite owing to the need to erase memory, and the attendant Landauer-principle energy cost, each time an irreversible erasure happened.
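The initialization cost in item 1 is easily evaluated; a minimal sketch (Python, using the 2019 SI value of $k_B$; room temperature of 300 K is my assumption):

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact in the 2019 SI)

def landauer_work(n_bits: float, temperature_k: float = 300.0) -> float:
    """Minimum work, in joules, to initialize n_bits of memory: N k_B T ln 2."""
    return n_bits * K_B * temperature_k * math.log(2)

# Landauer cost of initializing a gigabyte of memory at room temperature:
print(landauer_work(8e9))
```

The answer is of order $10^{-11}\,\mathrm{J}$: the thermodynamic floor on initialization is minuscule compared with the consumption of any real device, but it is strictly nonzero, which is what bounds the oracle machine's memory.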

Of course, in General Relativity there is no global energy conservation: for most metrics fulfilling the Einstein Field Equations no definition of a global time can be made, thus we cannot write down a time-shift-invariant Lagrangian, so Noether’s theorem cannot be applied. Time-shift-invariance of a system’s Lagrangian is taken to be the theoretical justification of the principle of conservation of energy, so there may even be speculative physics not subject to energy restrictions! But this is likely to be true only of a computer spread over cosmological timescales and distances. This is because General Relativity still mandates a local conservation of energy: the divergence of the stress-energy tensor vanishes:

$$\nabla_\beta T^{\alpha\beta}  \,  = T^{\alpha\beta}{}_{;\beta}  \, = 0$$

Some global metrics, technically those which have a “global timelike Killing vector”, i.e. one which fulfils $\nabla_{a}\xi_{b} + \nabla_{b} \xi_{a} = 0$, do allow a definition of global time. All this is simply a poncy way of saying that there is a vector field on the spacetime manifold which “generates” (can be integrated to) a time-shift symmetry – if you need a precise idea of what this means, see Theorem 11.14 in my Introduction to Lie Groups exposition. Our universe seems to be very near to flat, i.e. globally Minkowskian, so our energy-unlimited computer will need to be very, very big, very, very long-lived or parked near a very big black hole.
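As a small numerical illustration (not a proof): in Minkowski coordinates the covariant derivatives in Killing’s equation reduce to partial derivatives, so one can check by finite differences that, say, the boost generator $\xi^a = (x,\,t,\,0,\,0)$ satisfies $\partial_a\xi_b + \partial_b\xi_a = 0$; the signature convention $(-,+,+,+)$ is my assumption here:

```python
# Numerical check of Killing's equation in flat spacetime for the boost
# generator xi^a = (x, t, 0, 0); lowering with eta = diag(-1,1,1,1) gives
# the covector components xi_a = (-x, t, 0, 0).

def xi_lower(p):
    t, x, y, z = p
    return (-x, t, 0.0, 0.0)

def killing_residual(p, h=1e-6):
    """Max of |d_a xi_b + d_b xi_a| at point p, via central differences."""
    def d(a, b):  # partial derivative along coordinate a of xi_b, at p
        q_plus, q_minus = list(p), list(p)
        q_plus[a] += h
        q_minus[a] -= h
        return (xi_lower(q_plus)[b] - xi_lower(q_minus)[b]) / (2.0 * h)
    return max(abs(d(a, b) + d(b, a)) for a in range(4) for b in range(4))

print(killing_residual([0.3, 1.2, -0.5, 2.0]))  # essentially zero
```

The off-diagonal terms cancel exactly ($\partial_t\xi_x = 1$ against $\partial_x\xi_t = -1$), which is why the residual vanishes up to floating-point error.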

Examples of Undecidable Propositions, i.e. Those Needing Oracles for Decision

If the idea of undecidable and uncomputable propositions is unwonted to you, you should begin by looking up:

  1. The Halting Problem on Wikipedia: the proof is simple and given below: it can be proven impossible to decide with a Turing machine whether any general computer program will end normally or get stuck in an infinite loop;
  2. The computation of Kolmogorov Complexity: again, see Wikipedia; this one is also proven below. Roughly: there is no algorithm that can decide whether a full specification of an object might be made smaller and still be a full specification. This is a mathematically rigorous variation on the Berry Paradox;
  3. Gödel undecidable propositions: roughly: that there are always relationships between the natural numbers whose truth or falsehood cannot be decided by a Turing machine;
  4. One that I like (more in keeping with what I would think of as my background) is the Novikov–Boone theorem that one cannot decide by Turing machine whether two words in a finite group presentation are talking about the same group element or not;
  5. Chaitin’s constant and uncomputable numbers and undefinable real numbers (that’s “most” of them, insofar as the computable ones are countable!) in general.

Most of these problems involve proving something by the Cantor slash procedure (the procedure used to show that $\mathbb{R}$ is uncountable), so I’ll give a flavor of them by showing the existence of uncomputable functions and by proving the uncomputability of a total halting function. The proof of uncomputability of Kolmogorov complexity stands out as a little different from this general technique – so I’ll talk about it separately.

Theorem:

There exist uncomputable functions in any finitely defined programming language (i.e. with a specification by a finite forgathering of e.g. syntax diagrams or BNF statements).

Proof: The set of finite, syntactically legal computer programs, being a proper subset of the set of finite ASCII strings, can certainly be put in one-to-one correspondence with the natural numbers (strictly positive integers) $\mathbb{N}$, e.g. through lexicographic ordering.

However, the set of all infinite sequences of natural numbers, being equivalent to the set of natural number valued functions of one natural variable $\varphi:\mathbb{N}\to\mathbb{N}$ cannot be placed into such a one-to-one correspondence, as can be proven by the standard Cantor slash argument: we assume that there is such a correspondence defined by the set $\mathbb{F}=\{f_i:\mathbb{N}\to\mathbb{N}\}_{i=1}^\infty$ of natural number valued functions of one natural number and then consider the function:

$$f:\mathbb{N}\to\mathbb{N}:\;f(n) = f_n(n) + 1$$

and then it is clear that $f(n) = f_n(n)+1 \neq f_n(n),\,\forall n\in\mathbb{N}$ and this new, well defined function does not belong to the set $\mathbb{F}$ gainsaying our assumption.$\quad\square$
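A finite caricature of the slash in code may help fix ideas (of course, no finite list can stand in for the genuinely infinite $\mathbb{F}$; this only exhibits the mechanism):

```python
# Cantor slash in miniature: given any list of functions f_i : N -> N,
# the function f(n) = f_n(n) + 1 disagrees with the i-th listed function
# at argument i, so it cannot appear anywhere on the list.

fs = [lambda n: n, lambda n: n * n, lambda n: 2 ** n, lambda n: 7]

def diagonal(n: int) -> int:
    return fs[n](n) + 1

for i in range(len(fs)):
    assert diagonal(i) != fs[i](i)  # differs from every f_i somewhere
```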

It is instructive to flesh this proof out a bit in more computer science language as follows:

Proof: Suppose we have a computer language. We list all syntactically valid programs in order of their Gödel number (we can take this to be the code’s “value” as a string of ASCII digits). This list is clearly in one-to-one, onto correspondence with the positive integers. We then pick the programs that compute one integer function of one integer variable and put these in a condensed list: these too clearly form a list that can be put into bijective correspondence with the positive integers. Indeed, there is a function $\mathcal{L}:\mathbb{N}\to\{0,\,1\}$ such that $\mathcal{L}(N)$ tells whether the program with Gödel number $N$ is syntactically correct or “legal”. In other words, this is a parser in the target language, clearly implementable as a finite program (strictly, a pushdown automaton rather than a mere finite state machine for the usual context-free grammars; see the references at the end for a good summary of the background to this statement).

Next we throw out all the syntactically valid programs which do not halt when their own Gödel numbers are input to them. By the unsolvability of the halting problem, we cannot actually write such a finite program. Nonetheless, the subset $\mathbb{H}\subset\mathbb{N}$ of Gödel numbers belonging to programs that do halt exists and is well defined (as long, of course, as we accept the law of excluded middle: either a program halts or it does not). So the oracle $\mathcal{H}:\mathbb{N}\times\mathbb{N}\to\{0,\,1\}$ that returns true ($\mathcal{H}(N,\,M) = 1$) if and only if the program with Gödel number $N$ halts when its input is $M$ is well defined.

Note that we can construe any program as an integer valued function of one integer variable $f:\mathbb{N}\to\mathbb{N}$: we simply name a particular location in memory (on the Turing tape) as the beginning of the “input” and “output” and also define a reserved codeword to mark the end of the input / output. We can, if we like, even assume a Harvard architecture and keep output and input space apart on a separate memory / Turing tape. We then begin the program running with the input integer loaded into the input space and, when the program ends, read off the “output” and define it as the program’s output. Most programs, of course, are not written to calculate some function $f:\mathbb{N}\to\mathbb{N}$, so the output will be either nought or some very complicated pattern of 0s and 1s in memory. But we can always arrange programs that do calculate known functions to write their output in this way, therefore we can use the above definition and call any program a partial function $f:\mathbb{N}\to\mathbb{N}$. We say partial because, in general, the program may not halt for some input values.

We lastly use the Universal Turing Machine $\mathcal{U}:\mathbb{N}\times\mathbb{N}\to\mathbb{N}$, where $\mathcal{U}(N,\,n)$ is the value output by the program with Gödel number $N$ when it is run with the integer $n$ as input. Note that this program may or may not halt. $\mathcal{U}$ is, however, very wonted to us: it is what we now call an interpreter (or, loosely, a compiler together with the machine that runs its output).
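The flavour of $\mathcal{U}$ can be caught in a toy sketch: here the “programs” are Python source strings, the “Gödel number” is just a list index, and the interpreter is Python’s own `exec` — all hypothetical stand-ins, of course, for a genuine Gödel numbering and Universal Turing Machine:

```python
# Toy universal machine: U(N, n) runs program number N on input n.

PROGRAMS = [
    "def f(n): return n + 1",
    "def f(n): return n * n",
    "def f(n): return 0",
]

def U(N: int, n: int) -> int:
    env = {}
    exec(PROGRAMS[N], env)  # "load" program N into a fresh environment...
    return env["f"](n)      # ...then run it on the input n

print(U(1, 5))  # → 25
```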

So we are left with a countable list $\{f_i:\mathbb{N}\to\mathbb{N}\}_{i=1}^\infty$ of computable, integer valued integer functions. These include every such fully defined function of the integers that can be computed in that particular computer language. So now we call upon the Cantor Slash argument and consider the function:

$$f:\mathbb{N}\to\mathbb{N}:\;f(x) = f_x(x) + 1$$

and readily show that it is not on the list ($f_x(x) \neq f_x(x)+1 = f(x)$) of legal, always fully defined functions. Since $f$ itself is a fully defined function, it cannot be a partial function either, i.e. one which halts and returns a value for some input integers but not all of $\mathbb{N}$. This shows that there are well defined, uncomputable functions.

Indeed, to make matters more explicit, with the help of our oracle that decides whether a code halts, we can define this uncomputable function by a program that, thanks to our oracle, always halts:

$$\begin{array}{l}
\text{function uncomputable( }n\text{ : integer ) : integer;}\\
\text{begin}\\
\quad \text{var }i,\,j\text{: integer;}\\
\quad i:=0;\\
\quad j:=0;\\
\quad \text{while }i < n\text{ do}\\
\quad \text{begin}\\
\quad\quad j:=j+1;\\
\quad\quad \text{while (not }\mathcal{L}(j))\text{ or (not }\mathcal{H}(j,\,j))\text{ do}\\
\quad\quad\text{begin}\\
\quad\quad\quad j:=j+1;\\
\quad\quad\text{end}\\
\quad\quad i:=i+1;\\
\quad\text{end}\\
\quad\text{uncomputable} :=\mathcal{U}(j,\,j) + 1\\
\text{end};\end{array}$$

So, given the current $j$, the inner while loop advances $j$ to the next Gödel number belonging to a syntactically correct program that also halts when its own Gödel number is input to it. So by the time the outer while loop ends, $j$ is equal to the $n^{th}$ Gödel number of a program that halts on its own Gödel number. So let $f_n : \mathbb{N}\to\mathbb{N}$ be the well-defined function defined by this $n^{th}$ legal, halting program. So, to complete the Cantor slash, we evaluate the $n^{th}$ legal, halting program at its own Gödel number and return its value plus one. $\quad\square$
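The same enumeration can be transliterated into Python with the oracles passed in as parameters. Since $\mathcal{H}$ is uncomputable, the mock oracles below are hypothetical stand-ins, there purely so the looping logic can be exercised; $j$ tracks the Gödel number while $i$ counts how many legal, self-halting programs have been passed:

```python
def uncomputable(n, L, H, U):
    """Value of the n-th legal, self-halting program at its own Goedel
    number, plus one.  L, H, U stand in for the oracles/universal machine."""
    i, j = 0, 0
    while i < n:
        j += 1
        while (not L(j)) or (not H(j, j)):
            j += 1
        i += 1
    return U(j, j) + 1

# Mock world (purely illustrative): even numbers are legal programs, every
# legal program halts, and program j just echoes its input.  The n-th
# legal, self-halting program then has Goedel number 2n.
mock_L = lambda j: j % 2 == 0
mock_H = lambda j, m: mock_L(j)
mock_U = lambda j, m: m

print(uncomputable(3, mock_L, mock_H, mock_U))  # → 7, i.e. U(6, 6) + 1
```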

Notice the central role played by the halt tester $\mathcal{H}$. If it were realisable as a program, then our above code would be an ordinary finite program computing an uncomputable function, which is impossible; so no program can realise $\mathcal{H}$. A great deal more on the practical description of the Universal Turing Machine, i.e. parsers, compilers, is to be found in the following references.

Appel, Andrew W. (1998), “Modern Compiler Implementation in C”, Cambridge University Press, New York, Cambridge. ISBN 0-521-58390-X. See esp. Chapter 2.

Appel, Andrew W. (1998), “Modern Compiler Implementation in Java”, Cambridge University Press, New York, Cambridge. ISBN 0-521-58388-8.

Johnson, Steven. (1998),“Yacc: Yet-Another Compiler Compiler”, see this reference on-line.

Of course the proof that the halting problem is uncomputable is also a standard Cantor slash argument.


Theorem:

In any finitely defined programming language (i.e. with a specification by a finite forgathering of e.g. syntax diagrams or BNF statements), a halting function, i.e. a total function $\mathcal{H}:\mathbb{N}\times\mathbb{N}\to\{0,\,1\}$ such that $\mathcal{H}(n,\,m)=1$ if and only if the program with Gödel number $n$ halts when its input is $m$, is uncomputable.

Proof: It is readily shown that every total computable function $\varphi:\mathbb{N}\times\mathbb{N}\to\{0,\,1\}$ must be different from $\mathcal{H}$. To this end, for a given $\varphi$ consider the program:

$$\begin{array}{l}
\text{function nohalt( }n\text{ : integer ) : integer;}\\
\text{begin}\\
\quad \text{var }i\text{: integer;}\\
\quad \text{if }\varphi(n,\,n) = 1\text{ then }\\
\quad\quad \text{while true do } i:=0\\
\quad \text{else}\\
\quad\quad \text{nohalt}:= 0;\\
\text{end}\end{array}$$

So now let the Gödel number of the above code be $n_{nh}$; $\varphi$ is computable, so that $n_{nh}$ is well defined from a program for $\varphi$.

Suppose now we run the above code with $n_{nh}$ as input. $\varphi(n_{nh},\,n_{nh}) = 0$ if and only if the program above does not get stuck in the infinite loop $\text{while true do } i:=0$ for this input. But the halting function $\mathcal{H}(n_{nh},\,n_{nh}) = 0$ if and only if the program above does get stuck in the infinite loop $\text{while true do } i:=0$ for this input. Therefore $\mathcal{H}(n_{nh},\,n_{nh}) \neq \varphi(n_{nh},\,n_{nh})$, and so the halting function cannot be any totally computable function.$\quad\square$

Witness that the above is a Cantor Slash argument: from each totally computable function $\varphi$ we build a code fragment on which $\mathcal{H}$ is forced to disagree with $\varphi$: $\mathcal{H}(n_{nh},\,n_{nh})$ is exactly the opposite of $\varphi(n_{nh},\,n_{nh})$. Sometimes the above argument is validly made with $\mathcal{H}$ itself assumed totally computable and inserted into the code in place of $\varphi$; in this case we get a contradiction because we find that the code halts if and only if it doesn’t halt. The use of the separate, arbitrary $\varphi$ shows how the argument is a special kind of Cantor slash.
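The diagonal step can be sketched in Python. The numbering here is hypothetical (a variable standing in for $n_{nh}$), and we read off what nohalt would do from its construction rather than actually running the loop:

```python
# From any candidate total halt-tester phi we build nohalt, which loops
# forever exactly when phi predicts that it halts on its own number.

def make_nohalt(phi):
    def nohalt(n):
        if phi(n, n) == 1:
            while True:    # phi said "halts on itself": disobey by looping
                pass
        return 0           # phi said "loops on itself": disobey by halting
    return nohalt

phi = lambda n, m: 1 if n % 2 == 0 else 0  # an arbitrary total candidate
nohalt = make_nohalt(phi)                  # never called where it would loop

def truly_halts_on(n0):
    """What nohalt(n0) actually does, read off from its construction."""
    return phi(n0, n0) != 1

# Whatever Goedel number n0 the nohalt code is assigned, phi is wrong there:
for n0 in (3, 4):
    assert truly_halts_on(n0) != (phi(n0, n0) == 1)
```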

I end with the proof that the Kolmogorov complexity is uncomputable, that is, there is no finite algorithm that can find the length of the shortest possible description, in any given language, of a general string. I say more about the Kolmogorov Complexity in my post here.

The idea of the proof is this: we argue that if the Kolmogorov complexity of a general string were computable, then we could, with this Kolmogorov-Complexity-Finding algorithm as part of the definition, give a unique definition of certain strings that is much shorter than the strings’ Kolmogorov complexity, which gainsays the definition of the Kolmogorov complexity as the length in bits of the shortest description of a string that there can be. In other words, the algorithm could be used to compress certain strings to a smaller size than their Kolmogorov complexity itself, which contradicts the definition of Kolmogorov complexity. The proof is interesting insofar as it is the only “uncomputability” proof I know that doesn’t directly use the Cantor Slash argument. However, the uncomputability of Kolmogorov Complexity and the uncomputability of a general halting function are Turing equivalent: that is, given an “oracle” for one, one can build an “oracle” for the other using only the former oracle together with computable code. So the uncomputability of Kolmogorov Complexity can be derived from the uncomputability of the general halting function and contrariwise.


Theorem:

Given any language, there is no finite algorithm definable in that language that can compute the Kolmogorov complexity of a general string.

Proof: Firstly we witness that there are strings of Kolmogorov complexity greater than any positive integer $n$. For the total number of distinct descriptions of length less than or equal to $n$ bits is certainly no more than $1+2+2^2+\cdots+2^n$, which is finite, and thus algorithms of K-complexity at most $n$ can only describe a finite number of strings.
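The counting step is elementary and can be checked directly; a quick sketch:

```python
# Number of binary descriptions of length at most n bits:
# 1 + 2 + 2^2 + ... + 2^n = 2^(n+1) - 1, which is finite.  So at most that
# many strings can have Kolmogorov complexity <= n, and strings of higher
# complexity must exist among the infinitely many finite strings.

def descriptions_up_to(n: int) -> int:
    return sum(2 ** k for k in range(n + 1))

for n in range(20):
    assert descriptions_up_to(n) == 2 ** (n + 1) - 1
```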

So if the K-complexity of a string $s$ were computable by a finite algorithm $\mathcal{K}(s)$ (which is of some finite, fixed length $L_0$ bits), then we could define algorithmically the “first string $s_n$ with K-complexity of $n$ or greater”. This algorithm would iterate over all finite strings $s_j$ listed in lexicographic order $\{s_j\}_{j=1}^\infty$, compute the K-complexity of each one, and halt and return the string when it first finds one with complexity $n$ or greater. It must halt after a finite time, because we have shown above that there are finite strings with K-complexity greater than any natural number $n$, no matter how big.

But now, what is the length of this new description? It is the length $L_0$ of the fixed algorithm $\mathcal{K}$, plus the length of a unique description of $n$, which is at most about $\log_2 n$ bits, plus some fixed length $L_1$ of conventional “glue code”. The length of this description is thus less than or equal to $L = L_0+L_1+\log_2 n$. But $L$ is, by construction, an upper bound on the Kolmogorov complexity of $s_n$, which, by the definition of $s_n$, is greater than or equal to $n$. So therefore:

$$n\leq \mathcal{K}(s_n) \leq L=L_0+L_1+\log_2 n$$

or:

$$n \leq L_0+L_1+\log_2 n$$

Given that $L_0+L_1$ is fixed, and that there is an $s_n$ for every positive $n$, we can choose $n$ high enough that the above inequality fails, which is the sought contradiction. $\qquad\square$
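To see the failure of the inequality concretely, pick any fixed (here entirely hypothetical) bit-lengths $L_0$ and $L_1$; since $n$ grows faster than $\log_2 n$, the inequality must break for large enough $n$:

```python
import math

L0, L1 = 10_000, 500  # hypothetical lengths of K-finder and glue code, in bits

def inequality_holds(n: int) -> bool:
    """Whether n <= L0 + L1 + log2(n), as the (false) assumption requires."""
    return n <= L0 + L1 + math.log2(n)

assert inequality_holds(1000)           # small n: no contradiction yet
assert not inequality_holds(1_000_000)  # large n: the inequality fails
```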

Indeed our algorithm for finding the “first string $s_n$ with K-complexity of $n$ or greater” is simply as follows, given $\mathcal{K}$ as well as a function “dictionary(n)”, which gives the $n^{th}$ finite string:

$$\begin{array}{l}
\text{function first( }n\text{ : integer ) : string;}\\
\text{begin}\\
\quad \text{var }i\text{: integer;}\\
\quad i:=1;\\
\quad \text{while }\mathcal{K}(\text{dictionary}(i)) < n\text{ do } i:=i+1;\\
\quad \text{first}:=\text{dictionary}(i);\\
\text{end}\end{array}$$
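In Python, with the uncomputable $\mathcal{K}$ and the string enumeration replaced by mock stand-ins (a length-as-complexity fake and binary numerals, both purely illustrative), the search logic of "first" runs as:

```python
def first(n, K, dictionary):
    """First string, in dictionary order, whose K-complexity is n or more."""
    i = 1
    while K(dictionary(i)) < n:
        i += 1
    return dictionary(i)

mock_dictionary = lambda i: bin(i)[2:]  # "1", "10", "11", "100", ... stand-ins
mock_K = lambda s: len(s)               # pretend complexity = string length

print(first(3, mock_K, mock_dictionary))  # → "100", the first 3-symbol string
```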

Charles Bennett in the paper:

Charles Bennett, “The Thermodynamics of Computation: A Review”, Int. J. Theor. Phys., 21, No. 12, 1982

makes the interesting point that, given the uncomputability of Kolmogorov complexity, we cannot know what the most compressed, unambiguous description of a general molecule or of its crystalline structure could be. Descriptions of such things are, of course, strings, or are equivalent thereto (e.g. descriptions in a standardised graphical language). So it is not surprising that attempts to systematically calculate entropies of substances, so as to foretell Gibbs and other free energies wholly theoretically, have met with failure.

Ultimately, it may be a form of the Kolmogorov complexity that is important to the second law of thermodynamics. If we think about a reaction from the Landauer Principle account of the second law, i.e. that at a microscopic level the laws of physics are reversible, then the Kolmogorov complexity of the reactants’ microstate, relative to whatever “language Nature uses to encode her states”, will determine whether that microstate can be encoded in the degrees of freedom of the reaction products. If so, then the reaction may go forward without any work being done: the products can still encode the input microstate, and therefore there is no “excess entropy” that must be “thrown out of the system”, i.e. encoded in the state of the environment of the products: the mapping between system microstates at different times can still be one-to-one. If, however, there is excess entropy, i.e. the Kolmogorov complexity of the reactants means that the reactants’ microstate cannot fit in the degrees of freedom of the products, then work must be input to push that entropy out into the environment, thus keeping the World’s state evolution a one-to-one mapping.

Given that the Kolmogorov complexity is uncomputable, it is not surprising that we have no systematic way to compute whether the reactants’ microstate can be encoded in the products. Systematic theoretical calculation of entropies and free energies would thus seem implausible.