Finite of Sense and Infinite of Thought: A History of Computation, Logic and Algebra, Part III

19 Jan 2018

An almost metaphysical system of abstract signs, by which the motion of the hand performs the office of the mind.

— Dionysius Lardner, Babbage’s Calculating Engine, 1834

Even when the connotation of a term has been accurately fixed, and still more if it has been left in the state of a vague unanalyzed feeling of resemblance; there is a constant tendency in the word, through familiar use, to part with a portion of its connotation. It is a well-known law of the mind, that a word originally associated with a very complex cluster of ideas, is far from calling up all those ideas in the mind, every time the word is used… If this were not the case, processes of thought could not take place with any thing like the rapidity which we know they possess.

… The inferences… which are successively drawn, are inferences concerning things, not symbols; although as any Things whatever will serve the turn, there is no necessity for keeping the idea of the Thing at all distinct, and consequently the process of thought may, in this case, be allowed without danger to do what all processes of thought, when they have been performed often, will do if permitted, namely, to become entirely mechanical.

… Whenever the nature of the subject permits our reasoning processes to be, without danger, carried on mechanically, the language should be constructed on as mechanical principles as possible; while, in the contrary case, it should be so constructed that there shall be the greatest possible obstacles to a merely mechanical use of it.

— John Stuart Mill, A System Of Logic, Ratiocinative And Inductive, 1882

III. Ghostbusters

Lingua Characteristica

In his seminal book, From Frege to Gödel, A Source Book in Mathematical Logic, 1879-1931, the pioneering historian of logic (and the personal secratary of Leon Trotsky in exile) Jean van Heijenoort writes:

⊕From Frege to Gödel, p. vi ⊕While some logicians agree with Heijenoort’s statements, like Quine, who wrote ‘‘[L]logic became a substantial branch of mathematics only with the emergence of general quantification theory at the hands of Frege and Peirce. I date modern logic from there,’ others paint a more nuanced picture. See Peckhaus’s Calculus Ratiocinator vs. Characteristica Universalis? The Two Traditions in Logic, Revisited. Boole, De Morgan, and Jevons are regarded as the initiators of modern logic, and rightly so. This first phase, however, suffered from a number of limitations. It tried to copy mathematics too closely, and often artificially. The multiplicity of interpretations of what became known as Boolean algebra created confusion and for a time was a hindrance rather than an advantage. Considered by itself, the period would, no doubt, leave its mark upon the history of logic, but it would not count as a great epoch. A great epoch in the history of logic did open in 1879, when Gottlob Frege’s Begriffsschrift was published. This book freed logic from an artificial connection with mathematics but at the same time prepared a deeper interrelation between these two sciences.

But Frege’s ideas were ignored and even outright rejected at the time of their introduction, and it took another to spread them. In 1889, Giuseppe Peano (1858–1932), an Italian mathematician, wrote a pamphlet called, Arithmetices principia, nova methodo exposita (The principles of arithmetic, presented by a new method).

Peano begins:

⊕Princeps, in From Frege to Gödel, p. 85 Questions that pertain to the foundations of mathematics, although treated by many in recent times, still lack a satisfactory solution. The difficulty has its main source in the ambiguity of language.

That is why it is of the utmost importance to examine attentively the very words we use. My goal has been to undertake this examination, and in this paper I am presenting the results of my study, as well as some applications to arithmetic.

I have denoted by signs all ideas that occur in the principles of arithmetic, so that every proposition is stated only by means of these signs.

With these notations, every proposition assumes the form and the precision that equations have in algebra; from the propositions thus written other propositions are deduced, and in fact by procedures that are similar to those used in solving equations. This is the main point of the whole paper.

Heijenoort writes that

⊕From Frege to Gödel, p. 84 The ease with which we read Peano’s booklet today shows how much of his notation has found its way, either directly or in a somewhat modified form, into contemporary logic.

⊕
Jean van Heijenoort, left, with Leon Trotsky (middle) and his wife, Natalia Sedova, in Mexico, 1937 (Source: Wikipedia)

… There is, however, a grave defect. The formulas are simply listed, not derived; and they could not be derived, because no rules of inference are given. Peano introduces a notation for substitution but does not state any rule. What is far more important, he does not have any rule that would play the role of the rule of detachment. The result is that, for all his meticulousness in the writing of formulas, he has no logic that he can use… What is presented as a proof is actually a list of formulas that are such that, from the point of view of the working mathematician, each one is very close to the next. But, however close two successive formulas may be, the logician cannot pass from one to the next because of the absence of rules of inference. The proof does not get off the ground.

… [T]he proof … cannot be carried out by a formal procedure; it requires some intuitive logical argument, which the reader has to supply. [This] brings out the whole difference between an axiomatization, even written in symbols and however careful it may be, and a formalization.

… Peano’s writings, of minor significance for logic proper, showed how mathematical theories can be expressed in one symbolic language. These writings rapidly gained a wide influence and greatly contributed to the diffusion of the new ideas.

In 1892, Peano proclaimed he would embark on an ambitious project, which he called Furmulario Mathematico, to publish all known mathematical theorems in his symbolic language. He published five volumes and even bought a small printing office and studied the craft in order to serve his goal. Murawski writes that,Mechanization of Reasoning, Kindle Locations 2867-2872 “One of the results of the project was a further simplification of the mathematical symbolism. Peano treated Formulario as a fulfilment of Leibniz’s ideas. He wrote: ‘After two centuries, this dream of the inventor of the infinitesimal calculus has become a reality… We now have the solution to the problem proposed by Leibniz’. He hoped that Formulario would be widely used by professors and students. He himself immediately began to use it in his classes. But the response was very weak. One reason was the fact that almost everything was written there in symbols, the other was the usage of latino sine flexione (that is Latin without inflection — an artificial international language invented by Peano…) for explanations and commentaries.”

“As for Frege,” Heijenoort writes ⊕From Frege to Gödel, p. 83 Peano learned of his work immediately after the publication of Arithmetices principia.”

⊕Mechanization of Reasoning, Kindle Locations 2881-2894 The influence of Peano and his writings on the development of mathematical logic was great. In the last decade of the 19th century he and his school played first fiddle. Later the center was moved to England where B[ertrand] Russell was active — but Peano has played also here an important role. Russell met Peano at the International Philosophical Congress in Paris in August 1900 and later he wrote about this meeting in the following way:

⊕
A portrait of Bertrand Russell by Renee Bolinger (source: the artist’s website; used with permission)

The Congress was a turning point in my intellectual life, because I met there Peano. I already knew him by name and had seen some of his work, but had not taken the trouble to master his notation. In discussions at the Congress I observed that he was always more precise than anyone else, and that he invariably got the better of any argument upon which he embarked. As the days went by, I decided that this must be owing to his mathematical logic. I therefore got him to give me all his works, and as soon as the Congress was over I retired to Fernhurst to study quietly every word written by him and his disciples. It became clear to me that his notation afforded an instrument of logical analysis such as I had been seeking for years, and that by studying him I was acquiring a new and powerful technique for the work I had long wanted to do.

In a letter to Jourdain from 1912 B. Russell wrote: “Until I got hold of Peano, it had never struck me that Symbolic Logic would be any use for the Principles of mathematics, because I knew the Boolean stuff and found it useless. It was Peano’s, together with the discovery that relations could be fitted into his system, that led me to adopt symbolic logic.” It should be admitted also that Russell learned of Frege and his works just through Peano.

Just prior to meeting Peano in Paris, Russell published A Critical Exposition of the Philosophy of Leibniz, ⊕A Critical Exposition of the Philosophy of Leibniz, 1900 the first modern study of Leibniz’s work in logic. It was through Bertrand Russell (1872 – 1970) and his famous work with Alfred North Whitehead (1861-1947), the Principia Methematica, published in in 1910-1913 and continued Frege’s ideas, that the world learned and accepted Frege’s work.

As his first foray into logic, in 1879, Gottlob Frege (1848-1925) published Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens (Ideography: A Formal Language for Pure Thought Modeled on that of Arithmetic), on which Heijenoort writes, From Frege to Gödel, p. 1 “This is the first work that Frege wrote in the field of logic, and … it is perhaps the most important single work ever written in logic. Its fundamental contributions, among lesser points, are the truth-functional propositional calculus, the analysis of the proposition into function and argument(s) instead of subject and predicate, [and] the theory of quantification.”

The Begriffsschrift — literally, “concept-writing” but usually translated to English as “ideography” — was Frege’s attempt to address the ambiguities of natural language.

⊕Begriffsschrift, a formula language modeled upon that of arithmetic, for pure thought, 1879, in From Frege to Gödel, pp. 5-7 In apprehending a scientific truth we pass, as a rule, through various degrees of certitude. Perhaps first conjectured on the basis of an insufficient number of particular cases, a general proposition comes to be more and more securely established by being connected with other truths through chains of inferences, whether consequences are derived from it that are confirmed in some other way or hether, conversely, it is seen to be a consequence of propositions already established. Hence we can inquire, on the one hand, how we have gradually arrived at a given proposition and, on the other, how we can finally provide it with the most secure foundation…

⊕
Inference in Frege’s Begriffsschrift, from Begriffsschrift, a formula language modeled upon that of arithmetic, for pure thought. While Frege’s Begriffsschrift’s abstract syntax was that of the familiar predicate calculus, its concrete syntax was two-dimensional — it is, in fact, a syntax tree (and thus not requiring parentheses and operator precedence) — resembling a circuit diagram. A proposition was introduced with a horizontal line, as in $- a$, while the judgment that $a$ is true was marked by the addition of a vertical bar, as in $\vdash a$. Vertical branches represented disjunction, small vertical notches on the lines represented negation, and universal quantification was represented by an indentation with the name of the quantified variable, as in $-\!\!\!\stackrel{\mathfrak{a}}{\smile}\!\!\!-F(\mathfrak{a})$ . Conjunction and existential quantification were not primitive, but represented by the proper application of negation. Unlike the logic itself, the notation was never accepted, but some have argued that it demonstrates ingenious design.

The most reliable way of carrying out a proof, obviously, is to follow pure logic, a way that, disregarding the particular characteristics of objects, depends solely on those laws upon which all knowledge rests. Accordingly, we divide all truths that require justification into two kinds, those for which the proof can be carried out purely by means of logic and those for which it must be supported by facts of experience. But that a proposition is of the first kind is surely compatible with the fact that it could nevertheless not have come to consciousness in a human mind without any activity of the senses. Hence it is not the psychological genesis but the best method of proof that is at the basis of the classification. Now, when I came to consider the question to which of these two kinds the judgments of arithmetic belong, I first had to ascertain how far one could proceed in arithmetic by means of inferences alone, with the sole support of those laws of thought that transcend all particulars… I found the inadequacy of language to be an obstacle; no matter how unwieldy the expressions I was ready to accept, I was less and less able, as the relations became more and more complex, to attain the precision that my purpose required. This deficiency led me to the idea of the present ideography. Its first purpose, therefore, is to provide us with the most reliable test of the validity of a chain of inferences and to point out every presupposition that tries to sneak in unnoticed, so that its origin can be investigated. That is why I decided to forgo expressing anything that is without significance for the inferential sequence. … I called what alone mattered to me the conceptual content. Hence this definition must always be kept in mind if one wishes to gain a proper understanding of what my formula language is. That, too, is what led me to the name “Begriffsschrift” [ideography or concept language]. Since I confined myself for the time being to expressing relations that are independent of the particular characteristics of objects, I was also able to use the expression “formula language for pure thought”. That it is modeled upon the formula language of arithmetic, as I indicated in the title, has to do with fundamental ideas rather than with details of execution. Any effort to create an artificial similarity by regarding a concept as the sum of its marks [Merkmale] was entirely alien to my thought. The most immediate point of contact between my formula language and that of arithmetic is the way in which letters are employed.

… I believe that I can best make the relation of my ideography to ordinary language clear if I compare it to that which the microscope has to the eye. Because of the range of its possible uses and the versatility with which it can adapt to the most diverse circumstances, the eye is far superior to the microscope. Considered as an optical instrument, to be sufre, it exhibits many imperfections, which ordinarily remain unnoticed only on account of its intimate connection with our mental life. But, as soon as scientific goals demand great sharpness of resolution, the eye proves to be insufficient. The microscope, on the other hand, is perfectly suited to precisely such goals, but that is just why it is useless for all others.

This ideography, likewise, is a device invented for certain scientific purposes, and one must not condemn it because it is not suited to others. If it answers to these purposes in some degree, one should not mind the fact that there are no new truths in my work. I would console myself on this point with the realization that a development of method, too, furthers science. Bacon, after all, thought it better to invent a means by which everything could easily be discovered than to discover particular truths, and all great steps of scientific progress in recent times have had their origin in an improvement of method.

Leibniz, too, recognized—and perhaps overrated—the advantages of an adequate system of notation. His idea of a universal characteristic, of a calculus philosophicus or ratiocinator, was so gigantic that the attempt to realize it could not go beyond the bare preliminaries. The enthusiasm that seized its originator when he contemplated the immense increase in the intellectual power of mankind that a system of notation directly appropriate to objects themselves would bring about led him to underestimate the difficulties that stand in the way of such an enterprise. But, even if this worthy goal cannot be reached in one leap, we need not despair of a slow, step-by-step approximation. When a problem appears to be unsolvable in its full generality, one should temporarily restrict it; perhaps it can then be conquered by a gradual advance. It is possible to view the signs of arithmetic, geometry, and chemistry as realizations, for specific fields, of Leibniz’s idea. The ideography proposed here adds a new one to these fields, indeed the central one, which borders on all the others. If we take our departure from there, we can with the greatest expectation of success proceed to fill the gaps in the existing formula languages, connect their hitherto separated fields into a single domain, and extend this domain to include fields that up to now have lacked such a language.

… If it is one of the tasks of philosophy to break the domination of the word over the human spirit by laying bare the misconceptions that through the use of language often almost unavoidably arise concerning the relations between concepts and by freeing thought from that with which only the means of expression of ordinary language, constituted as they are, saddle it, then my ideography, further developed for these purposes, can become a useful tool for the philosopher.

… The mere invention of this ideography has, it seems to me, advanced logic. I hope that logicians, if they do not allow themselves to be frightened off by an initial impression of strangeness, will not withhold their assent from the innovations that, by a necessity inherent in the subject matter itself, I was driven to make. These deviations from what is traditional find their justification in the fact that logic has hitherto always followed ordinary language and grammar too closely. In particular, I believe that the replacement of the concepts subject and predicate by argument and function, respectively, will stand the test of time. It is easy to see how regarding a content as a function of an argument leads to the formation of concepts. Furthermore, the demonstration of the connection between the meanings of the words if, and, not, or, there is, some, all, and so forth, deserves attention.

The Begriffschrift contains all the familiar characteristics of formal logic as we know it today: connectives, variables, relations, quantifiers, and — most importantly — syntactic inference rules by which deduction is performed.

Frege states that his goal in creating a language for “pure thought” (i.e., thought that does not interact with the senses) is finding the best proof method, and not providing a psychological description of the mind’s operation. He expanded on what is meant by “laws of thought” in a later work:

⊕The Basic Laws of Arithmetic, xv, p. 12-13, 1893 And this brings me to what stands in the way of the influence of my book among logicians: namely, the corrupting incursion of psychology into logic. Our conception of the laws of logic is necessarily decisive for our treatment of the science of logic, and that conception in turn is connected with our understanding of the word “true”. It will be granted by all at the outset that the laws of logic ought to be guiding principles for thought in the attainment of truth, yet this is only too easily forgotten, and here what is fatal is the double meaning of the word “law”. In one sense a law asserts what is; in the other it prescribes what ought to be. Only in the latter sense can the laws of logic be called ‘laws of thought’: so far as they stipulate the way in which one ought to think. Any law asserting what is, can be conceived as prescribing that one ought to think in conformity with it, and is thus in that sense a law of thought. This holds for laws of geometry and physics no less than for laws of logic. The latter have a special title to the name “laws of thought” only if we mean to assert that they are the most general laws, which prescribe universally the way in which one ought to think if one is to think at all. But the expression “law of thought” seduces us into supposing that these laws govern thinking in the same way as laws of nature govern events in the external world. In that case they can be nothing but laws of psychology: for thinking is a mental process. And if logic were concerned with these psychological laws it would be a part of psychology;… is reduced to individuals’ taking something to be true. All I have to say to this is: being true is different from being taken to be true, whether by one or many or everybody, and in no case is to be reduced to it.

Frege’s formal notation marked a sharp departure from the algebraic school begun with George Boole. Instead of equations and reasoning by substitution and elimination, formal reasoning proceeds by syntactic inference rules more elaborate than those offered by algebra. And while it would prove to be the innovation that would ultimately make formal logic a serious object of study with the publication of the Principia Mathematica, it did not go over well with members of that tradition like Ernst Schröder (1841-1902):

⊕Review of Frege’s Conceptual Notation This very unusual book—obviously the original work of an ambitious thinker with a purely scientific turn of mind—pursues a course to which the reviewer is naturally highly sympathetic, since he himself has made similar investigations. The present work promises to advance toward Leibniz’s ideal of a universal language, which is still very far from its realization despite the great importance laid upon it by that brilliant philosopher!

The fact that a completed universal language, characteristic, or general conceptual notation {allgemeine Begriffschrift} does not exist even today justifies my trying to say from the beginning what is to be understood by it. I almost want to say, “it is a risk to state [what a completed universal language would be like]”; for, as history teaches, in the further pursuit of such ideals, we often find ourselves led to modify the original ones very significantly; especially once we have succeeded in advancing substantially toward [our goal].

… Even if, in spite of all earlier attempts and also the latest one now under discussion, the idea of a universal language has not yet been realized in a nearly satisfactory sense; it is still the case that the impossibility of the undertaking has not come to light. On the contrary, there is always hope, though remote, that by making existing scientific technical language {wissenschaftliche Kunstsprache} precise, or by developing a special such language, we may gain a firm foundation by means of which it would someday become possible to emerge from the confusion of philosophical controversies, terminologies, and systems whose conflict or disagreement is to be mainly attributed (as indeed can be generally seen) to the lack of definiteness of the basic concepts. The blame must be placed almost entirely upon the imperfections of the language in which we are forced to argue from the outset.

Given the sense [of ‘conceptual notation’] which I sought to indicate in the above remarks, it must be said that Frege’s title, Conceptual Notation, promises too much—more precisely, that the title does not correspond at all to the content [of the book]. Instead of leaning toward a universal characteristic, the present work (perhaps unknown to the author himself) definitely leans toward Leibniz’s “calculus ratiocinator”. In the latter direction, the present little book makes an advance which I should consider very creditable, if a large part of what it attempts had not already been accomplished by someone else, and indeed (as I shall prove) in a doubtlessly more adequate fashion.

The book is clearly and refreshingly written and also rich in perceptive comments. The examples are pertinent; and I read with genuine pleasure nearly all the secondary discussions which accompany Frege’s theory; for example, the excellently written Preface. On the other hand, I can pass no such unqualified judgement upon the major content—the formula notation itself.

… First of all, I consider it a shortcoming that the book is presented in too isolated a manner and not only seeks no serious connection with achievements that have been made in essentially similar directions (namely those of Boole), but even disregards them entirely. The only comment that the author makes which is remotely concerned with [Boole’s achievements] is the statement… which reads, “Any effort to create an artificial similarity by regarding a concept as the sum of its marks [Merkmale] was entirely alien to my thought.”I have copied the translation used in the above excerpt from Frege. This comment even by itself lends a certain probability to the supposition-which gains confirmation in other ways- that the author has an erroneous low opinion of “those efforts” simply because he lacks knowledge of them.

… However, the comment (which I shall prove below) that might contribute most effectively to the correction of opinions is that the Fregean “conceptual notation” does not differ so essentially from Boole’s formula language … With the exception of what is said on pages 15-22 about “function” and “generality” … the book is devoted to the establishment of a formula language, which essentially coincides with Boole’s mode of presenting judgements and Boole’s calculus of judgements, and which certainly in no way achieves more.

With regard to its major content, the “conceptual notation” could be considered actually a transcription of the Boolean formula language. With regard to its form, though, the former is different beyond recognition—and not to its advantage. As I have said already it was without doubt developed completely independently—all too independently!

If the author’s notation does have an advantage over the Boolean one, which eluded me, it certainly also has a disadvantage. I think that to anyone who is familiar with both, [the author’s notation] must above all give the impression of hiding—to be sure not intentionally, but certainly “artificially”—the many beautiful, real, and genuine analogies which the logical formula language naturally bears with regard to the mathematical one.

In the subtitle, “A Formula Language Modelled Upon that of Arithmetic”, I find the very point in which the book corresponds least to its advertised program, but in which a much more complete correspondence could be attained—precisely by means of the neglected emulation of previous works. If, to the impartial eye, the “modelling” appears to consist of nothing more than using letters in both cases, then it seems to me this does not sufficiently justify the epithet used.

In other words, that same correspondence between abstract algebra (or its modern generalization in category theory), and formal logic, was taken by Schröder as evidence for the complete superfluousness of the latter, to which Frege responds, of course they correspond; Boole and I invented both to discuss the same thing! Nevertheless, there can be value in different forms of notation:

⊕On the Purpose of the Begriffsschrift, 1882 Among other things I stand reproached with having left the works of Boole out of account. E. Schröder is one of those who offer this reproach: comparing my Begriffsschrift with the Boolean formula-language [Formelsprache] in his review (Zeitschr./. Math. u. Phys., Vol. 25), he comes to the conclusion that the latter is to be preferred in every respect. …

Regarding the aforementioned reproach I should like to remark that in the twenty years and more since it was devised the Boolean formula-language has by no means achieved such a signal success that to abandon henceforth the foundations it laid must be regarded as foolish, or that anything except a further development is out of the question. The problems Boole deals with, after all, seem for the most part to have been first thought up in order to be solved by means of his formulas.

Above all, though, that reproach overlooks the fact that my purpose was quite other than Boole’s. I was not trying to present an abstract logic in formulas; I was trying to express contents in an exacter and more perspicuous manner than is possible in words, by using written symbols. I was trying, in fact, to create a ‘lingua characteristica’ in the Leibnizian sense, not a mere ‘calculus ratiocinator‘—not that I do not recognize such a deductive calculus as a necessary constituent of a Begriffsschrift. If this has been misunderstood, perhaps it is because I concentrated overmuch on the abstract logic aspect in my exposition.

… Boole distinguishes primary propositions from secondary propositions.See The Monist in Part 2. The former compare concepts in respect of their extensions; the latter expression relations between possible contents of judgment. This classification is unsatisfactory since it provides no place for existential judgments. Let us take the primary propositions first. Here the letters stand for extensions of concepts. Individual things as such cannot be designated, and this is a considerable shortcoming of the Boolean formula-language; for even if only a single thing is comprehended by a concept, there is still always a vast difference between the concept and that thing.

… So consequential are the differences between logical and mathematical calculation that the solving of logical equations, which Boole is especially concerned with, has scarcely anything in common with the solving of algebraic ones. The subordination of one concept to another can now be expressed thus:

A = A . B

If e.g. A stands for the extension of the concept ‘mammal’ and B for that of the concept ‘airbreathing’, then the equation means the extensions of the concepts ‘mammal’ and ‘airbreathing mammal’ are equal; i.e.: all mammals are airbreathing. The falling of an individual under a concept, which is entirely different from the subordination of one concept to another, has no special expression in Boole: indeed, strictly speaking, it has no expression at all. Everything thus far is already to be found, with only superficial divergences, in Leibniz—of whose works relevant to this subject Boole knew nothing. In Boole, 0 designates the extension of a concept under which nothing falls; 1 stands for the extension of a concept under which falls everything which is directly under discussion (universe of discourse). We see that the meanings of these signs, too, and especially that of 1, diverge from their arithmetical ones. Leibniz has “non ens” and “ens” instead of them.

… Surveying the Boolean formula-language as a whole, we recognise that it is abstract logic clothed in the garments of algebraic symbols; it is not adapted to the reproduction of a content—nor is that its purpose. And just this is my intention. I want to fuse together the few signs I introduce, with the symbols of mathematics already at hand, into a single formula- language. In it the existing symbols correspond roughly to the roots of words in ordinary language, while the signs I have added are comparable to the word endings and particles which impose logical relations upon the contents contained in those roots.

I could not use the Boolean notation for this, for it is out of the question to have the + sign, for instance, occurring in one and the same formula with now a logical and now an arithmetical sense. The analogy, prized by Boole, between the logical and the arithmetical methods of calculation, can issue only in confusion if the two are brought into relation with each other. Boole’s sign-language is conceivable only in total separation from arithmetic.

Hence I had to invent other symbols for the logical relations. Schröder says that my Begriffsschrift has almost nothing in common with the Boolean calculus of concepts, but that it does have something in common with the Boolean calculus of judgmentsHe means the interpretation of the Boolean algebra where variables denote classes versus the one where they denote propositions. . In fact, one of the most important ways in which my interpretation differs from the Boolean (and, I might add, from the Aristotelian) is that I take judgments and not concepts as my starting-point though this is by no means to say that I am unable to express the relation of subordination between concepts.

… Schröder proceeds everywhere in his criticism from a direct comparableness of the Begriffsschrift with the Leibniz-Boole formula-language — a comparableness which is not to be had. His most effective contribution towards putting things to fights, he considers, is his observation that the two systems of notation are not essentially different, since it is possible to translate from one into the other. But this proves nothing. If the same subject-matter can be presented in two systems of symbols it follows automatically that translation or transcription from one to the other is possible. From this possibility, conversely, nothing more follows than the presence of a common subject-matter: the systems of symbols can still be basically different.

… I regard this notation [of the universal quantifier] as one of the most important features of my Begriffschrift, and one in respect of which it has a considerable advantage over Boole’s mode of expression, even as a mere representation of the logical forms. By its means the Boolean artifice is replaced by an organic relationship between primary and secondary propositions.

It is interesting that Frege points out a difference between the algebraic and the formal logic approach that lasts to this day: Whereas algebras (in general, and of logic in particular) talk about concepts (collections of objects, but this can be also be taken to mean any mathematical object which could be considered made of individuals, like functions that can be applied at a point) in a restricted universe of discourse and treat them as indivisible, formal logic can also talk about the individual through the use of quantifiers. Whether or not this is preferable depends, of course, on context and usage.The mathematician Terence Tao has described algebra essentially as a clever hiding of specific uses of quantifiers.

In an in-depth, nuanced discussion of the Frege-Schröder debate and their respective understanding of the Leibnizian program, Volker Peckhaus writes:

⊕Calculus Ratiocinator vs. Characteristica Universalis? The Two Traditions in Logic, Revisited, p. 10 [B]oth Frege and Schröder, criticized the concurring system as being a mere calculus ratiocinator. Both claimed, on the other hand, that their own system was the better realization of the Leibnizian idea of a characteristica universalis. Both accept that a language would require both elements, and both aim at such a language. Schröder’s algebraic attitude requiring an external semantics seems to be closer to the original Leibnizian idea of a lingua rationalis. Furthermore, the universality of a universal characteristics is not bound to modern quantification theory.

But the fundamental difference between Frege and Boole is one of goals. Whereas Boole wanted to use algebra in order to subsume logic into mathematics, Frege wanted to create a logical foundation upon which mathematics can be established:

⊕The Foundation of Arithmetic, 1884, pp. xv-xx It is sad and discouraging to observe how discoveries once made are always threatening to be lost again … and how much work promises to have been done in vain, because we fancy ourselves so well off that we need not bother to assimilate its results. My work too, as I am well aware, is exposed to this risk. A typical crudity confronts me, when I find calculation described as “aggregative mechanical thought”. I doubt whether there exists any thought whatsoever answering to this description. … Thought is in essentials the same everywhere: it is not true that there are different kinds of laws of thought to suit the different kinds of objects thought about. Such differences as there are consist only in this, that the thought is more pure or less pure, less dependent or more upon psychological influences and on external aids such as words or numerals, and further to some extent too in the finer or coarser structure of the concepts involved; but it is precisely in this respect that mathematics aspires to surpass all other sciences, even philosophy.

The present work will make it clear that even an inference like that from n to n + 1, which on the face of it is peculiar to mathematics, is based on the general laws of logic, and that there is no need of special laws for aggregative thought. It is possible, of course, to operate with figures mechanically, just as it is possible to speak like a parrot: but that hardly deserves the name of thought. It only becomes possible at all after the mathematical notation has, as a result of genuine thought, been so developed that it does the thinking for us, so to speak.

… We suppose, it would seem, that concepts sprout in the individual mind like leaves on a tree, and we think to discover their nature by studying their birth: we seek to define them psychologically, in terms of the nature of the human mind. But this account makes everything subjective, and if we follow it through to the end, does away with truth. What is known as the history of concepts is really a history either of our knowledge of concepts or of the meanings of words. Often it is only after immense intellectual effort, which may have continued over centuries, that humanity at last succeeds in achieving knowledge of a concept in its pure form, in stripping off the irrelevant accretions which veil it from the eyes of the mind. What, then, are we to say of those who, instead of advancing this work where it is not yet completed, despise it, and betake themselves to the nursery, or bury themselves in the remotest conceivable periods of human evolution, there to discover, like John Stuart Mill, some gingerbread or pebble arithmetic! It remains only to ascribe to the flavour of the bread some special meaning for the concept of number. A procedure like this is surely the very reverse of rational, and as unmathematical, at any rate, as it could well be. No wonder the mathematicians turn their backs on it. Do the concepts, as we approach their supposed sources, reveal themselves in peculiar purity?Frege may be criticizing the very goal of my text, but sometimes, after the concepts have been purified so much that their origins are forgotten, traveling back to their source — like tracing the evolution of the eye — may reveal exactly which stages of development were most crucial to which aspect of the concept, and so may reveal some possible paths forward. Not at all; we see everything as through a fog, blurred and undifferentiated. It is as though everyone who wished to know about America were to try to put himself back in the position of Columbus, at the time when he caught the first dubious glimpse of his supposed India. Of course, a comparison like this proves nothing; but it should, I hope, make my point clear. It may well be that in many cases the history of earlier discoveries is a useful study, as a preparation for further researches; but it should not set up to usurp their place.

⊕The Foundation of Arithmetic, pp. 21-22 … Statements in Leibniz can only be taken to mean that the laws of number are analytic, as was to be expected, since for him the a priori coincides with the analytic. Thus he declares that the benefits of algebra are due to its borrowings from a far superior science, that of the true logic. In another passage he compares necessary and contingent truths to commensurable and incommensurable magnitudes, and maintains that in the case of necessary truths a proof or reduction to identities is possible. However, these declarations lose some of their force in view of Leibniz’s inclination to regard all truths as provable: “Every truth”, he says, “has its proof a priori derived from the concept of the terms, notwithstanding it does not always lie in our power to achieve this analysis.”

… To quote Mill: “The doctrine that we can discover facts, detect the hidden processes of nature, by an artful manipulation of language, is so contrary to common sense, that a person must have made some advances in philosophy to believe it.”

Very true—if it be supposed that during the artful manipulation we do not think at all. Mill is here criticizing a kind of formalism that scarcely anyone would wish to defend. Everyone who uses words or mathematical symbols makes the claim that they mean something, and no one will expect any sense to emerge from empty symbols. But it is possible for a mathematician to perform quite lengthy calculations without understanding by his symbols anything intuitable, or with which we could be sensibly acquainted. And that does not mean that the symbols have no sense; we still distinguish between the symbols themselves and their content, even though it may be that the content can only be grasped by their aid. We realize perfectly that other symbols might have been arranged to stand for the same things. All we need to know is how to handle logically the content as made sensible in the symbols and, if we wish to apply our calculus to physics, how to effect the transition to the phenomena.

⊕The Foundation of Arithmetic, p. 99 … I hope I may claim in the present work to have made it probable that the laws of arithmetic are analytic judgements and consequently a priori. Arithmetic thus becomes simply a development of logic, and every proposition of arithmetic a law of logic, albeit a derivative one. To apply arithmetic in the physical sciences is to bring logic to bear on observed facts; calculation becomes deduction. The laws of number will not… need to stand up to practical tests if they are to be applicable to the external world; for in the external world, in the whole of space and all that therein is, there are no concepts, no properties of concepts, no numbers. The laws of number, therefore, are not really applicable to external things; they are not laws of nature. They are, however, applicable to judgements holding good of things in the external world: they are laws of the laws of nature. They assert not connexions between phenomena, but connexions between judgements; and among judgements are included the laws of nature.

Frege created a new field of study in logic — the foundations of mathematics — which seeks to construct formal systems — logics — which can describe all of known mathematics, a field of study that was first fully realized in Russell and Whitehead’s Principia. In his precise foundation, he used sets, as studied by Georg Cantor (1845-1918) in his set theory, to precisely model Leibniz’s “concepts”, Euler’s “notions” and Boole’s “classes”.

Unfortunately, Frege’s invention of formal logic did not bring about Leibniz’s dream of an end to human quarrel, even in matters of mathematics. The foundations of mathematics were grounds to famous disputes in the early twentieth century, as questions regarding the philosophy of mathematics turned out to be essential in the choice of axioms of the new formalisms. The philosophy of mathematics is a big and serious topic, one that is, unfortunately, almost always misrepresented when discussed online, and that I am certain to misrepresent here. My goal is simply to present the kind of disputes that arose, as they’re a result of a fundamental feature of Frege’s entire program, and one which was addressed in subsequent work on computation.

Several schools of thoughts emerged. The first, Frege and Russell’s, was that mathematics can be reduced to logic, meaning, a few self-evident logical axioms entail all of mathematics, and its truths stem from the obvious truths of logic (this view is evident in Frege’s words, “even an inference like that from n to n + 1, which on the face of it is peculiar to mathematics, is based on the general laws of logic, and that there is no need of special laws for aggregative thought”The Foundation of Arithmetic, p. xvi ); ⊕Heijenoort’s book, From Frege to Gödel contains several primary sources pertaining to this controversy: Brouwer’s On the Significance of the Principle of Excluded Middle in Mathematics, 1923, and Intuitionistic Reflections on Formalism, 1927; Hilbert’s On the Infinite, 1925, and The Foundations of Mathematics, 1927; and Hermann Weyl’s comments on Hilbert’s second lecture on the foundations of mathematics, 1927. It is always interesting to read what the debating parties actually said. this school was called Logicism. In addition, Russell espoused a kind of Platonist realism, believing that all mathematical objects — even infinitary ones — have a real existence that we uncover through the study of mathematics. L. E. J Brouwer (1881-1966) held that some logical axioms — in particular, the law of excluded middle that says that every proposition is either true or false — do not hold when they concern infinitary mathematical objects. The truth of mathematical statements, like that of statements concerning the natural world, is to be established empirically by testing (even if only in principle), and some mathematical propositions regarding infinitary objects cannot be tested by finite mechanisms like the human mind. For example, determining whether a real number is equal to zero requires examining all infinity of digits in its decimal expansion, and thus the truth or falsity of that proposition cannot be determined. Brouwer uncovered just how much mathematics, infinitesimal calculus in particular, relies on such reasoning which he considered simply incorrect, and so much of mathematics had to be chucked. The school he founded was called Intuitionism. ⊕Due to the intense personal quarrel between the two, it is sometimes believed that Brouwer saw in Formalism the enemy of Intuitionism. This is false. ‘While logicism and intuitionism were too far apart to allow a dialogue between them,’ Heijenoort writes (p. 490), ‘the emergence of Hilbert’s meta-mathematics created between Hilbert and Brouwer a ground on which a discussion could proceed.’ So much so that in Intuitionistic Reflections on Formalism, 1927, he writes that he expects the differences between the two ‘will vanish’. For example, he thought that the two philosophies agreed that in ‘intuitive’ or ‘contentual’ mathematics, the law of excluded middle can be applied ‘thoughtlessly’ only to finite objects. Finally, David Hilbert (1862-1943) found a way to admit classical mathematics while still rejecting the reality of infinitary objects with his philosophy of Formalism. He contended that mathematics comprises two kinds statements: “real” propositions — regarding finitary objects and relating to observations about the real world — and “ideal” propositions — regarding infinitary objects. As long as the ideal propositions are consistent with the real ones, and as long as the inferences used are finitary (i.e., operate on finite logical statements and are of finite length), then ideal propositions are allowed, even though they do not contain any real meaning, in the sense of referencing real objects. What matters is not the “real” correctness of the mathematical objects, but the reality and correctness of the laws of deriving mathematical statements.For a discussion of the deeper philosophical difference between Hilbert’s and Frege’s points of view, see The Frege-Hilbert Controversy, in the Stanford Encyclopedia of Philosophy.

It should be noted that all three schools, Logicism, Intuitionism and Formalims, suffer from severe flaws. For example, it turns out that formalizing mathematics requires some axioms (about the non-finiteness of some collections of objects and various induction rules) that are decidedly mathematical rather than purely logical in nature, not to mention the variant requiring that one should accept the Platonic reality of mathematical objects; intuitionism is arbitrary, and except for simply rejecting the law of excluded middle cannot prescribe a non-contrived philosophical principle (as opposed to a creed) by which it can be unambiguously determined which axioms and inferences are admissible or notFor example, in modern terminology all natural numbers are intuitionistically ‘realized’. This cannot be philosophically justified by anything other than what is an equivalent restatement of this axiom, or by a creed like that attributed to Leopold Kronecker of ‘God made the integers, all else is the work of man,’ here taken to mean ‘God made all the integers’. The fact that it is more obvious than the law of excluded middle, as it is logically weaker, does not make it actually obvious, and the philosophical principle by which the former is obvious enough to be an axiom while the latter is not is unclear. That the principles of intuitionism coincide with the notion of computability is not a philosophical justification because computability — as pointed out by Turing himself (Lecture to the London Mathematieal Society on 20 February 1947, p. 1) — is just a convenient approximation of a stronger notion (of feasibility). Indeed, in recent decades feasibility has replaced computability as the central theoretical notion in computer science. Put bluntly, intuitionism cannot explain why classical mathematics must be abandoned on the grounds of philosophical/physical fidelity in favor of intuitionism, while intuitionism should not be abandoned, by the same justification, in favor of ultrafinitism, or conversely, if intuitionism is an acceptable mathematical approximation of the reality of ultrafinitism, why is it that classical mathematics is not. ; Formalism relies on an assumption of consistency in mathematics that, as it turns out, can never be proven. None of the three (and the many other schools in the philosophy of mathematics established since) can be said to be clearly superior to the others or more obviously correct, and so the choice comes down to an aesthetic (perhaps Leibniz would call it “emotional”) preference. But much more importantly — at least to my focus as a programmer — the choice of a mathematical foundation cannot possibly have any effect whatsoever when its subject matter is computation. Computation — from Hobbes to Turing — is tied to the physical, and were a mathematical system to yield a result falsified in the physical world, it would be discarded immediately; all foundations must necessarily yield the same (physical) result. Thus, as far as computation is concerned, inasmuch as different mathematical philosophies prescribe the use of different formalisms, the choice among them should be purely pragmatic and aesthetic, and possibly not universal — one should pick whatever language is convenient and appealing.

The source of those disputes was a fundamental feature of Frege’s formal logic — and also of Boole’s and Leibniz’s. While Frege’s formal deduction was no doubt mechanical or “algorithmic”, arguably even more so than Boole’s system, in Frege’s logic — as in Boole’s — the mechanical rules of inference correspond to the “laws of thought,” but each and every symbol, sentence or part of a sentence, including the axioms and the inference rules themselves must also have a contentual meaning; they must directly refer to an intuitive mental notion (like a set, a function, intersection, addition etc.). “Calculation becomes deduction,” but “it only becomes possible at all after the mathematical notation has, as a result of genuine thought, been so developed that it does the thinking for us, so to speak,” and, “[e]veryone who uses words or mathematical symbols makes the claim that they mean something, and no one will expect any sense to emerge from empty symbols.” It is this obligation that every symbol or a combination of symbols must be meaningful in an intuitive sense — and it is that intuitive content that justifies the logic — that invites controversy over axioms and inference rules, as the axioms and inference rules must make intuitive sense. And what makes intuitive sense to one, may not make sense to another. This contentual or conceptual basis for the entire theory was the root of the deep disputes over formal systems.

Indeed, the meaning of symbols and its relation to the signs was a topic of great interest to Frege. When pondering the meaning of the equality sign, Frege observed that symbols and statements in both his Begriffschrift and natural language have two kinds of meanings, which he calls sense and reference (or denotation in some translations):

⊕On Sense and Reference, pp. 36-38, 1892 ⊕
A portrait of Gottlob Frege by Renee Bolinger, paired with Van Gogh’s Starry Night, as a pun on Frege’s famous puzzle about the significance of learning that ‘Hesperus is Phosphorus.’ (source: the artist’s website; used with permission)Equality gives rise to challenging questions which are not altogether easy to answer. Is it a relation? A relation between objects, or between names or signs of objects? In my Begriffsschrift I assumed the latter. The reasons which seem to favour this are the following: $a = a$ and $a = b$ and are obviously statements of differing cognitive value; $a = a$ holds a priori and, according to Kant, is to be labelled analytic, while statements of the form $a = b$ often contain very valuable extensions of our knowledge and cannot always be established a priori. The discovery that the rising sun is not new every morning, but always the same, was one of the most fertile astronomical discoveries… Now if we were to regard equality as a relation between that which the names ‘a’ and ‘b’ designate, it would seem that $a = b$ could not differ from $a = a$ (i.e. provided $a = b$ is true). A relation would thereby be expressed of a thing to itself, and indeed one in which each thing stands to itself but to no other thing. What is intended to be said by $a = b$ seems to be that the signs or names ‘a’ and ‘b’ designate the same thing, so that those signs themselves would be under discussion; a relation between them would be asserted. But this relation would hold between the names or signs only in so far as they named or designated something. It would be mediated by the connexion of each of the two signs with the same designated thing… If the sign ‘a’ is distinguished from the sign ‘b’ only as object (here, by means of its shape), not as sign (i.e. not by the manner in which it designates something), the cognitive value of $a = a$ becomes essentially equal to that of $a = b$, provided $a = b$ is true. A difference can arise only if the difference between the signs corresponds to a difference in the mode of presentation of that which is designated…

It is natural, now, to think of there being connected with a sign (name, combination of words, letter), besides that to which the sign refers, which may be called the reference of the sign, also what I should like to call the sense of the sign, wherein the mode of presentation is contained… The reference of ‘evening star’ would be the same as that of ‘morning star,’ but not the sense.Frege used the terms ‘morning star’ (Morgenstern) and ‘evening star’ (Abendstern); later philosophers changed the example to use the proper names Hesperus and Phosphorus, both referring to the planet Venus.

… The regular connexion between a sign, its sense, and its reference is of such a kind that to the sign there corresponds a definite sense and to that in turn a definite reference, while to a given reference (an object) there does not belong only a single sign… It may perhaps be granted that every grammatically well-formed expression representing a proper name always has a sense. But this is not to say that to the sense there also corresponds a reference. The words ‘the celestial body most distant from the Earth’ have a sense, but it is very doubtful if they also have a reference.This corresponds to a non-terminating term in a programming language; it has a computational sense, but not denotation (unless some denotation, such as $\bot$, is artificially assigned to it). The expression ‘the least rapidly convergent series’ has a sense but demonstrably has no reference, since for every given convergent series, another convergent, but less rapidly convergent, series can be found. In grasping a sense, one is not certainly assured of a reference.

⊕On Sense and Reference, pp. 42-44 … We are therefore driven into accepting the truth value of a sentence as constituting its reference. By the truth value of a sentence I understand the circumstance that it is true or false. There are no further truth values. For brevity I call the one the True, the other the False. Every declarative sentence concerned with the reference of its words is therefore to be regarded as a proper name, and its reference, if it has one, is either the True or the False.

… If our supposition that the reference of a sentence is its truth value is correct, the latter must remain unchanged when a part of the sentence is replaced by an expression having the same reference. And this is in fact the case. Leibniz gives the definition: ‘Eadem sunt, quae sibi mutuo substitui possunt, salva veritate.’Same are those that, when substituted for one another, preserve the truth. What else but the truth value could be found, that belongs quite generally to every sentence if the reference of its components is relevant, and remains unchanged by substitutions of the kind in question?This paragraph is possibly the earliest treatment of what’s known as ‘referential transparency,’ a term that has recently been subject to much misinterpretation. Virtually all programming languages are referentially transparent as long as they do not employ syntactic macros, while some useful mathematical logics — for example, the modal logics — are not, for reasons covered by Frege in this paper, namely, that they have the ability to talk about sentences. The term as now used by some in the context of functional programming is actually misapplied to refer to the particular denotation used in functional languages (in which a function application denotes the value that function evaluates to), and is of no general theoretical interest.

If now the truth value of a sentence is its reference, then on the one hand all true sentences have the same reference and so, on the other hand, do all false sentences. From this we see that in the reference of the sentence all that is specific is obliterated. We can never be concerned only with the reference of a sentence; but again the mere thought alone yields no knowledge, but only the thought together with its reference, i.e. its truth value. Judgments can be regarded as advances from a thought to a truth value…. One might also say that judgments are distinctions of parts within truth values. Such distinction occurs by a return to the thought. To every sense belonging to a truth value there would correspond its own manner of analysis.

⊕On Sense and Reference, p. 56 … Let us return to our starting point.

When we found ‘$a = a$’ and ‘$a = b$’ to have different cognitive values, the explanation is that for the purpose of knowledge, the sense of the sentence, viz., the thought expressed by it, is no less relevant than its reference, i.e. its truth value. If now $a = b$, then indeed the reference of ‘b’ is the same as that of ‘a,’ and hence the truth value of ‘$a = b$’ is the same as that of ‘$a = a$.’ In spite of this, the sense of ‘b’ may differ from that of ‘a’, and thereby the thought expressed in ‘$a = b$’ differs from that of ‘$a = a$.’ In that case the two sentences do not have the same cognitive value. If we understand by ‘judgment’ the advance from the thought to its truth value, as in the above paper, we can also say that the judgments are different.

⊕In the context of programming languages, J.Y. Girard (Proofs and Types, pp. 1-3), presents the reference of a program term as its denotational semantics and its sense as operational semantics. Frege points out that different senses corresponding to the same referent require different manners of analysis and in formal logic, sentences of different senses referring to the same truth value True, require different “judgments”, i.e., different proofs. A formal proof — one that is just a sequence of application of various syntactic inference rules — corresponds to computation; as Frege says, “calculation becomes deduction”.Frege does not relate sense to computation, but it is done by Girard, mentioned in the above note, and by Y. N. Moschovakis, in his 1994 Sense and denotation as algorithm and value.

Another problem with Fregeian tradition of formal logic that stood (and stands to this day) in the way of fulfilling Leibniz’s dream was that, with time, as logical systems grew ever more elaborate and sophisticated, they’ve become arcane and ever more estranged from the work of practitioners — namely mathematicians who, by and large, have rejected any and all formalisms. Emil Post wrote,Recursively enumerable sets… and their decision problems, p. 284, 1944 “That mathematicians generally are oblivious to the importance of this work… as it affects the subject of their own interest is in part due to the forbidding, diverse and alien formalisms in which this work is embodied”, and John von Neumann wrote,The General and Logical Theory of Automata, 1948 or 1951, in Collected Works, Volume 5, p. 303 “Everybody who has worked in formal logic will confirm that it is one of the technically most refractory parts of mathematics.”

Calculation Becomes Deduction

Work on formal logic in the style of Frege continued with Whitehead and Russell (who found an inconsistency, “Russell’s paradox in Frege’s mathematical foundation, which Frege took in strideSee Russell’s letter to Frege, 1902 (From Frege to Gödel, pp. 124-125 ), and Frege’s response (From Frege to Gödel, pp. 127-128), where he wrote: Your discovery of the contradiction caused me the greatest surprise and, I would almost say, consternation, since it has shaken the basis on which I intended to build arithmetic… I must reflect further on the matter. It is all the more serious since, with the loss of my Rule V, not only the foundations of my arithmetic, but also the sole possible foundations of arithmetic, seem to vanish. Yet, I should think, it must be possible to set up conditions… such that the essentials of my proofs remain intact. In any case your discovery is very remarkable and will perhaps result in a great advance in logic, unwelcome as it may seem at first glance. ), Hilbert, Paul Bernays (1888-1977), Leopold Löwenheim (1878-1957), Thoralf Skolem (1887-1963), and later Kurt Gödel and Gerhard Gentzen (1909-1945). In the meantime, abstract algebra developed independently with progress in group theory and Alfred North Whitehead’s 1898 publication of A Treatise On Universal Algebra, but with the development of Frege’s formal logic, logic was no longer identified with algebra as strongly as it had been during the nineteenth century (although many of those working on formal logic also did significant work in algebra), and so we will leave algebra behind. Computation and logic, however, were still considered a single subject.

Robin Gandy (1919-1995), Alan Turing’s friend and only student wrote a fascinating paper in which he summarized precisely who knew what, when and how in the years leading to the great discoveries in computation in 1936. It was clear to Hilbert and other logicians of his time (as it had been to Leibniz) that formal proof is a form of computation:

⊕Quoted and translated by Robin Gandy in The Confluence of Ideas in 1936, pp. 91-92 [S]tarting with [The Foundations of Logic and Arithmetic,] 1905, [Hilbert] developed his ‘Beweistheorie’. The logical apparatus as well as the mathematical subject matter had to be formalized; formal proofs are then finite objects. Metamathematics is the study of these objects, and here Hilbert accepted Kronecker’s restrictions to ‘finitistic’ methods. Operations on formal proofs (such as reduction to a normal form) are to be (in an intuitive sense of the word) recursive; relations between (parts of) proofs are to be decidable; the prime method of metamathematical proof is induction on the construction of formal proofs. Although (prior to 1931) Hilbert and his group did not code proofs by numbers, the connection with numerical calculation was fairly clear. By careful arrangement of the data, algebraic and symbolic operations can be reduced (as Babbage had realized) to numerical ones. A general theory of proofs would require the investigation and classification of effectively calculable processes.

Hilbert, who had already posed a famous algorithmic question in 1900, was impressed with the progress made in formal logic and the increased rigor in mathematics it has brought, and raised an algorithmic question on the nature of formal logic itself, and, indeed, the core of mathematics, concerning with the capabilities and limits of formal systems. The famous Entscheidungsproblem, the decision problem, saw several manifestations in the writings and talks of Hilbert and his students as well as others previously (Stephen Kleene writesGödel’s impression on students of logic in the 1930’s, p. 56 , “The Entscheidungsproblem for various formal systems had been posed by Schröder [in] 1895, Löwenheim [in] 1915, and Hilbert [in] 1918.”), but was given a definitive expression in a 1928 textbook Hilbert wrote with Wilhelm Ackermann (1896-1962):

⊕Quoted and translated by Robin Gandy in The Confluence of Ideas in 1936, p. 58 The Entscheidungsproblem is solved if one knows a procedure which will permit one to decide, using a finite number of operations, on the validity, respectively the satisfiability of a given [first-order] logical expression.

Hilbert was hopeful regarding a positive solution to the Entscheidungsproblem (though he did not assume it), famously saying in a recorded 1930 address that there are no unsolved problems in mathematics, but others did not share that optimism. More generally, on the question of whether there are any unsolvable problems in mathematics, Brouwer wrote:

⊕Quoted in The Confluence of Ideas in 1936, p. 61 It follows that the question of the validity of the principium tertii exclusi is equivalent to the question whether unsolvable mathematical problems can exist. There is not a shred of proof for the conviction which has sometimes been put forward that there exist no unsolvable mathematical problems.

Others, like John von Neumann (1903-1957), were particularly skeptical of a positive outcome, writing in 1927:

⊕Quoted and translated by Robin Gandy in The Confluence of Ideas in 1936, p. 62 So it appears that there is no way of finding a general criterion for deciding whether or not a well-formed formula is a theorem. (We cannot at the moment prove this. We have no clue as to how such a proof of undecidability would go.) But this ignorance does not prevent us from asserting: As of today we cannot in general decide whether an arbitrary well-formed formula can or cannot be proved from the axiom schemata given below. And the contemporary practice of mathematics, using as it does heuristic methods, only makes sense because of this undecidability. When the undecidability fails then mathematics, as we now understand it, will cease to exist; in its place there will be a mechanical prescription for deciding whether a given sentence is provable or not.

G. H. Hardy (1877-1947) similarly said in a Cambridge lecture in 1928:

⊕Quoted in The Confluence of Ideas in 1936, p. 62 Suppose, for example, that we could find a finite system of rules which enabled us to say whether any given formula was demonstrable or not. This system would embody a theorem of metamathematics. There is of course no such theorem and this is very fortunate, since if there were we should have a mechanical set of rules for the solution of all mathematical problems, and our activities as mathematicians would come to an end.

Hilbert’s belief in the solvability of all mathematical questions was shattered with the publication of the incompleteness theorems by Kurt Gödel (1906-1978), who first announced his result a day before Hilbert’s hopeful address. While, at the time, no one understood the result and its significance except for John von NeumannSolomon Feferman, The nature and significance of Gödel’s incompleteness theorems, p. 10 , it was a double-blow to Hilbert’s program of mechanizing mathematics. Gödel first showed that unsolvable formal mathematical questions do exist, and then showed that the consistency of a formal system — which Hilbert viewed as the keystone of his philosophy of Formalism — cannot be proven within the system itself (i.e., using its inference rules).

Gödel’s paperKurt Gödel, On Formally Undecidable Propositions of Principia Mathematica and Related Systems I, 1931 in From Frege to Gödel, pp. 596-616, or, in a different translation and modernized notation and mathematical terminology, by Martin Hirzel here. is very important to our discussion of the relationship between logic and computation, but its interesting parts are technical and should not be quoted partially, while its quotable parts are less interesting, so I will briefly summarize the pertinent ideas.

⊕
A portrait of Kurt Gödel by Renee Bolinger, rendered in the Art-Nouveau style, for its emphasis on recursion, since they echo Gödel’s prominent work in recursive proofs in logic. (source: the artist’s website; used with permission)

Gödel uses the predicate calculus (“Hilbert’s functional calculus”) with the Peano axioms of arithmetic as a multi-sorted logic, so variables are of three sorts: a sort-1 variable stands for a natural number, a sort-2 variable for a set of natural numbers, sort-3 variables for a set-of-sets of natural numbers and so on (he makes do with only three sorts). He then programs a meta-circular evaluator for this logic, i.e., he programs an interpreter for this logic in the logic itself, encoding formulas and proofs as natural numbers. He does so by first defining the notion of a recursive function (primitive recursive in modern terminology) — without making any claims about the supposed universality of recursive functions, i.e., that every “calculable” function is a primitive recursive one; indeed it isn’t — and shows that those can be formally defined and manipulated in the logic. He then encodes formulas and proofs as numbers, representing a proof as a sequence of formulas, each immediately inferrable from the previous ones; thus, when encoded, proofs become numbers — mathematical objects of the logic that can be discussed and manipulated within the logic itself. Finally, and most importantly, he represents the inference rules themselves as primitive-recursive functions. Using those he defines a relation $x By$ , which means that $x$ is (an encoding of) a proof of (the encoding of) formula $y$. The operator $Bew$ is a “proof checker” programmed in the logic. He then defines an operator $Bew(y)$ as $\exists x . xBy$ which means that $y$ encodes a provable formula, which is used to show his main result, that in a consistent logic (one that cannot prove both a sentence and its negation) there are true sentences that aren’t provable, by encoding the sentence, “this sentence is unprovable.” The first inconsistency theorem states that a logic rich enough to express arithmetic can have sentences that are true (for some particular definition of truth) yet are not provable, and the second says that such a logic cannot prove its own consistency.

Gödel completed Leibniz’s goal of showing that logic can be turned into arithmetic, and deduction into calculation. Note, however, that Gödel’s proof relies specifically on the predicate calculus.

With the incompleteness theorems the chances of a positive answer to the Entscheidungsproblem all but vanished. The theorems showed that there are formal propositions that can neither be proven nor refuted, but did not completely rule out the possibility of an algorithm deciding the status of a given proposition (provable, disprovable or neither). Still, if it were guaranteed that some certificate (proof or disproof) exists for every logical sentence, it would make sense to ask whether it could be found mechanically; if one may not even exist, the chances of mechanically determining the formula’s status seem slim. Gandy writes that,

⊕The Confluence of Ideas in 1936, pp. 63-64 ⊕
David Hilbert’s tombstone, with the inscription: We must know, we shall know (source: Wikipedia, By Kassandro CC BY-SA 3.0)Gödel’s result meant that it was almost inconceivable that the Entscheidungsproblem should be decidable: a solution could, so to speak, only work by magic.

A natural question to ask is why neither Gödel nor von Neumann proved the undecidability of the Entscheidungsproblem. Von Neumann was certainly excited by Gödel’s lecture at Königsberg and was the first person fully to understand Gödel’s results… But by then he was mainly interested in other branches of mathematics — after 1931 he did not, for many years, publish any papers about logic. More importantly, I think, his mind moved too fast for him to be able to tackle what is, in effect, a philosophical problem: one needed to reflect long and hard on the idea of calculability.

… Gödel was primarily concerned — in opposition to the climate of the time — with the analysis of nonfinitist concepts and methods… But a concern with nonfinitary reasoning is not what is needed for an analysis of calculations. Gödel admired and accepted Turing’s analysis, but it is not surprising that he did not anticipate it. Indeed, to the end of his life he believed that we might be able to use nonfinitary reasoning in (nonmechanical) calculations.

As Gandy explains, providing a negative answer — now the expected result — to the Entscheidungsproblem required defining precisely what was meant by the vague and intuitive notion of a calculation.

The road to tackling that problem began with attempts to simplify logic. Inspired by the success of reducing the propositional calculus to just a single connective (NAND), Moses Schönfinkel (1889–1942) attempted to simplify the Frege/Russell/Hilbert-style predicate calculus.

⊕On the Building Blocks of Mathematical Logic, 1924, in From Frege to Gödel pp. 356-360 It is in the spirit of the axiomatic method as it has now received recognition, chiefly through the work of Hilbert, that we not only strive to keep the axioms as few and their content as limited as possible but also attempt to make the number of fundamental undefined notions as small as we can; we do this by seeking out those notions from which we shall best be able to construct all other notions of the branch of science in question. Understandably, in approaching this task we shall have to be appropriately modest in our demands concerning the simplicity of the initial notions.

… That the reduction to a single fundamental connective is nevertheless entirely possible provided we remove the restriction that the connective be taken only from the sequence above [of negation, conjunction, disjunction, implication and equivalence] was discovered not long ago by Sheffer (1913).

… We are led to the idea, which at first glance certainly appears extremely bold, of attempting to eliminate by suitable reduction the remaining fundamental notions, those of proposition, propositional function, and variable, from those contexts in which we are dealing with completely arbitrary, logically general propositions … To examine this possibility more closely and to pursue it would be valuable not only from the methodological point of view that enjoins us to strive for the greatest possible conceptual uniformity but also from a certain philosophic, or, if you wish, aesthetic point of view.

… It seems to me remarkable in the extreme that the goal we have just set can be realized also; as it happens, it can be done by a reduction to three fundamental signs.

… It will … be necessary to leave our problem at the point reached above and to develop first a kind of function calculus [Funktionenkalkul] ; we are using this term here in a sense more general than is otherwise customary.

As is well known, by function we mean in the simplest case a correspondence between the elements of some domain of quantities, the argument domain, and those of a domain of function values … such that to each argument value there corresponds at most one function value. We now extend this notion, permitting functions themselves to appear as argument values and also as function values. We denote the value of a function $f$ for the argument value $x$ by simple juxtaposition of the signs for the function and the argument, that is, by $fx$

Functions of several arguments can, on the basis of our extended definition of function, be reduced to those of a single argument in the following way.

We shall regard
$F(x, y)$
for example, as a function of the single argument $y$, say, but not as a fixed given function; instead, we now consider it to be a variable function that depends on $x$ for its form. (Of course we are here concerned with a dependence of the function, that is, of the correspondence itself; we are not referring to the obvious dependence of the function value upon the argument.)

… We therefore write
$(fx)y$
in our symbolism, or, by agreeing … that parentheses around the left end of such a symbolic form may be omitted, more simply,
$fxy$
here the new function, $f$, must be clearly distinguished from the former one, $F$.

After introducing the device, first discussed by FregeThe Basic Laws of Arithmetic, p. 94: §35. Representation of second-level functions by first-level functions. and that we today call currying, to transform a multivariate function into a function that maps a single argument to another single-argument function, Schönfinkel introduces his calculus — today known as the SKI combinator calculus — which can express all the same sentences expressible in the predicate calculus by means of composing only two fundamental “functions”, S and K, where K is a constant function, and S is a substitution operator. The calculus has no other quantifiers and no free variables.

Gandy writes that,The Confluence of Ideas in 1936, p. 64 “In the first half of this century an up-and-coming mathematical logician was expected to win his spurs by producing his own individual system of logic; this should be — at least apparently — consistent and should provide a foundation for as much of standard mathematics (more particularly, analysis) as the author believed to be sound.” This was what Alonzo Church (1903-1995) set out to do in a 1932 paper. Like Schönfinkel, Church used a calculus based on functions and substitution, and borrowed Schönfinkel’s currying device (which he attributed to Schönfinkel, not to Frege), but his ambition was greater than Schönfinkel’s. Rather than rethinking just the calculus — i.e., the inference rules — and basing them on substitution alone, he sought to create an entire foundational logic, axioms and all, where functions, rather than sets, serve as the foundational objects of the logic. He had two goals for the logic: avoiding free (unbound, unquantified) variables, which he saw as an unnecessary complication, and avoiding inconsistencies like Russell’s paradox. The latter he hoped to achieve by virtue of his functions not being ordinary functions, but “partial functions”, meaning that some were not normalizing (the substitution process would never terminate), and so such function applications were “meaningless”, i.e., neither true nor false:

⊕A Set of Postulates for the Foundation of Logic pp. 346-349, 1932 1. Introduction. In this paper we present a set of postulates for the foundation of formal logic, in which we avoid use of the free, or real, variable, and in which we introduce a certain restriction on the law of excluded middle as a means of avoiding the paradoxes connected with the mathematics of the transfinite.

Our reason for avoiding use of the free variable is that we require that every combination of symbols belonging to our system, if it represents a proposition at all, shall represent a particular proposition, unambigouously, and without the addition of verbal explanations.

⊕Precisely why it was that Church’s logic turned out to be inconsistent is of interest. Unlike some conjectures one can find online, the reason was not only that its calculus (the λ-calculus) was Turing complete. That is a necessary condition in this case, but an insufficient one. Indeed, most non-trivial logics are Turing complete, including (as we’ll see), first-order logic, without being inconsistent. Haskell Curry (1900-1982) simplified the paradox found by Kleene and Rosser into what’s known as Curry’s paradox, and generalized the result into a theorem about certain formal systems, in The Inconsistency of Certain Formal Logics, 1942. The primitive objects of Church’s paradox were partial functions (rather than sets), but boolean-valued functions were also how propositions were modeled in the logic. The result was that the logic could directly ‘talk about’ and quantify over its own propositions, whereas ordinary first-order logic can only indirectly talk about propositions by encoding them into the objects of the logic (e.g. natural numbers in Peano arithemetic) as Gödel had done. It is this feature combined with the Turing-completeness of the calculus (the inference procedure) that invariably leads to paradoxes. This property remains in more modern logics based on the lambda calculus, like Martin-Löf type theory. The advantage of using such logics is that implementing them as a computer program (proof assistant) is relatively simple, as they require only a very small core (they also have other advantages, mostly of interest to logicians). The disadvantage is that, teetering on the brink of inconsistency, they must take great care not to introduce paradoxes, for example by verifying that lambda terms indeed terminate.

… Rather than adopt the method of Russell for avoiding the familiar paradoxes of mathematical logic, or that of Zermelo, both of which appear somewhat artificial, we introduce for this purpose, as we have said, a certain restriction on the law of excluded middle. This restriction consists in leaving open the possibility that a propositional function F may, for some values X of the independent variable, represent neither a true proposition nor a false proposition. For such a value X of the independent variable we suppose that F(X) is undefined and represents nothing, and we use a system of logical symbols capable of dealing with propositional functions whose ranges of definition are limited.

In the case of the Russell paradox the relevance of this proposed restriction on the law of excluded middle is evident. The formula P which leads to this paradox may be written, in the notation explained below, ${\lambda \varphi.\sim\varphi(\varphi)}(\lambda \varphi.\sim\varphi(\varphi))$. It has the property that if we assume ~P then we can infer P and if we assume P then we can infer ~P. On ordinary assumptions both the truth and the falsehood of P can be deduced in consequence of this property, but the system of this paper, while it provides for the existence of a propositional function $\lambda \varphi.\sim\varphi(\varphi)$ does not provide either that this propositional function shall be true or that it shall be false, for the value $\lambda \varphi.\sim\varphi(\varphi)$ of the independent variable.

Other paradoxes… disappear in the same way…

… Whether the system of logic which results from our postulates is adequate for the development of mathematics, and whether it is wholly free from contradiction, are questions which we cannot now answer except by conjecture. Our proposal is to seek at least an empirical answer to these questions by carrying out in some detail a derivation of the consequences of our postulates, and it is hoped either that the system will turn out to satisfy the conditions of adequacy and freedom from contradiction or that it can be made to do so by modifications or addition.

2. Relation to intuitionism. Since, in the postulate set which is given below, the law of the excluded middle is replaced by weaker assumptions, the question arises what the relation is between the system of logic which results from this set, and the intuitionism of L. E. J. Brouwer.

⊕
Philip Wadler, a computer science professor and a fan of the λ-calculus, dressed as Lambda Man. Like other creative works, formal logics spawn avid fans, cosplay, fanfic and heated religious debate. (source: Twitter)

The two systems are not the same, because, although both reject a certain part of the principle of the excluded middle, the parts rejected are different. The law of double negation, denied by Brouwer, is preserved in the system of this paper, and the principle, which Brouwer accepts, that a statements from which a contradiction can be inferred is false, we find it necessary to abandon in certain cases.

Our system appears, however, to have the property, which relates it to intuitionism, that a statement of the form Σ x . F (x) (read, “there exists x such that F (x)”) is never provable unless there exists a formula M such that F (M) is provable.

3. The abstract character of formal logic. We do not attach any character of uniqueness or absolute truth to any particular system of logic. The entities of formal logic are abstractions, invented because of their use in describing and systematizing facts of experience or observation, and their properties, determined in rough outline by this intended use, depend for their exact character on the arbitrary choice of the inventor. … [T]here exist, undoubtedly, more than one formal system whose use as a logic is feasible, and of these systems one may be more pleasing or more convenient than another, but it cannot be said that one is right and the other wrong.

In consequence of this abstract character of the system which we are about to formulate, it is not admissible, in proving theorems of the system, to make use of the meaning of any of the symbols, although in the application which is intended the symbols do acquire meanings. The initial set of postulates must of themselves define the system as a formal structure, and in developing this formal structure reference to the proposed application must be held irrelevant. There may, indeed, be other applications of the system than its use as a logic.

In 1934, Church’s students, Stephen Kleene (1909-1994) and J. Barkley Rosser (1907-1989), found an inconsistency in Church’s logic, on which Martin Davis remarked (according to Gandy), “Not exactly what one dreams of having one’s graduate students accomplish for one.”

Gandy writes that,

⊕The Confluence of Ideas in 1936, pp. 65-67 In 1933 Kleene gradually discovered the astonishing power of definition of the untyped λ-calculus. He has given … graphic accounts of his discovery that the predecessor function is λ-definable. Church had doubted that this could be so…. Thus in 1934 the interest of the group shifted from systems of logic to the λ-calculus.

[I]n his (December 1933) lecture Church uses the term ‘intuitively definable function’ but does not discuss exactly how this phrase should be understood… And soon after using the above phrase he writes:

Hence it appears possible that there should be a system of symbolic logic containing a formula to stand for every definable function of positive integers, and I fully believe that such a system exists.

It is certainly conceivable that he had in mind a rather narrow sense of ‘(intuitively) defined’ and believed that the λ-calculus, or some mild extension of it, would provide ‘such a system’.

According to [Martin] Davis … Church first proposed his thesis, identifying ‘effectively calculable’ with ‘λ-definable’ sometime in February 1934.

… It would obviously be a comfort to Church if something of foundational impor- tance should emerge from his work; that a phoenix should arise from the ashes of ‘A Set of Postulates’.

… Church’s type-free system, in which ‘function’ was the key concept, was designed to provide a foundation for logic and mathematics. Unlike its rivals it made iteration the starting point for arithmetic. Church did not, to begin with, realize that the logic free part of the system — the λ-calculus — was a thing of depth and power; it was Kleene’s researches which revealed this. With hindsight one can say that it provides a true foundation for Babbage’s key idea — conditional iteration — and because it is type-free it provides for a wider range of calculable functions than did Dedekind’s ‘definition by induction’. Church knew of the work on primitive and more general recursions, and of the importance of such functions in proof theory, and of the need to set limits on ‘effective calculability’. This, and an appreciation of the power of the λ-calculus, was what was ‘in the air’ for him. With insight and boldness he proposed his thesis.

Church and his students turned their attention from his inconsistent logic to just its inference rules — the λ-calculus. This calculus, based only on substitution, was far simpler than the calculus for Frege and Hilbert’s logic, and yet, as Kleene discovered (employing Schönfinkel’s combinator ideas) powerful enough. Church published his thesis — that the functions that are calculable by a mechanical procedure, a “definite method”, an algorithm — are those that are definable as a λ term in his calculus, in a 1936 paper:

⊕An Unsolvable Problem of Elementary Number Theory, p. 345, 1936 The purpose of the present paper is to propose a definition of effective calculability which is thought to correspond satisfactorily to the somewhat vague intuitive notion in terms of which problems of this class are often stated, and to show, by means of an example, that not every problem of this class is solvable.

⊕An Unsolvable Problem of Elementary Number Theory, p. 356 … We now define the notion, already discussed, of an effectively calculable function of positive integers by identifying it with the notion of a recursive function of positive integers (or of a λ-definable function of positive integers). This definition is thought to be justified by the considerations which follow, so far as positive justification can ever be obtained for the selection of a formal definition to correspond to an intuitive notion.

It has already been pointed out that, for every function of positive integers which is effectively calculable in the sense just defined, there exists an algorithm for the calculation of its values.

Conversely it is true, under the same definition of effective calculability, that every function, an algorithm for the calculation of the values of which exists, is effectively calculable.

Church doesn’t quite set out to demonstrate what calculation or computation is. Instead, he seeks to identify those functions that are calculable, and states that those functions correspond to those definable as λ-terms. He then states this not as a thesis about computation, but as a definition of “effective calculability.” It would be Kleene who would later restate it as a thesis, and call it Church’s Thesis. Using that definition/thesis, Church then proves that Hilbert’s Entscheidungsproblem is undecidable, provided that his notion of an algorithm is indeed universal.

That an algorithm for computing a function written as a λ term exists is obvious.It should be pointed out that a λ term does not uniquely describe a computation, as it can be reduced in different ways. The converse is harder to show, and indeed, without precisely analyzing what an algorithm is, Church finds himself relying on a bit of circular reasoning. He writes that the algorithm for computing a function can be broken down into steps (according to the intuitive notion of an algorithm) that are sufficiently primitive. Those steps would also need to be calculable, and so, if we take that to mean that they are λ-definable, then so is the function that is composed of those steps. Church notices this obvious circularity, and writes what can only be described as an exasperated footnote:

⊕An Unsolvable Problem of Elementary Number Theory, footnote 19, p. 357 If this interpretation or some similar one is not allowed, it is difficult to see how the notion of an algorithm can be given any exact meaning at all.

Gandy lists some of the objections to Church’s thesis, among them:

⊕The Confluence of Ideas in 1936, p. 73 The argument by exampleChurch does not use an argument by example explicitly in his paper, but Gandy writes that ‘I think he assumes that the interested reader would be familiar with some of the results of Kleene, and with familiar mathematical examples of definition by recursion.’ does justify the heuristic value of the thesis, and was entirely suitable for Babbage’s purpose. But it cannot settle the philosophical (or foundational) question. It might happen that one day some genius established an entirely new sort of calculation.

… A similar doubt arises in connection with the step-by-step argument. An entirely new kind of algorithm, or an entirely new kind of rule of proof, might proceed by (irreducible) steps which were not recursive. How can one make unassailable predictions about the future development of mathematics? (Turing showed how.)

Andrew Appel writes that,The Birth of Computer Science at Princeton in the 1930s, p. 2 “it was not at all obvious that the Herbrand-Gödel recursive functions or the λ-calculus really constitutes the essence of ‘computation.’”

Stephen Kleene put it succinctly:

⊕The Theory of Recursive Functions, Approaching its Centennial, p. 49 Turing’s computability is intrinsically persuasive in the sense that the ideas embodied in it directly support the thesis that the functions encompassed are all for which there are algorithms; λ-definability is not intrinsically persuasive (the thesis using it was supported not by the concept itself but rather by results established about it).

Yet he defended the λ-calculus thus:

⊕Origins of Recursive Function Theory, p. 62 λ-definability, has… the remarkable feature that it is all contained in a very simple and almost inevitable formulation, arising in a natural connection with no prethought of the result.

Of course, any chosen formulation would have led to the same result, whether devised in order to prove it or not (for reasons shown by Turing), but Kleene praises the λ-calculus’s simplicity, which, combined with it “arising naturally”, makes it “almost inevitable.”

More fundamentally, perhaps, Church does not at all attempt to address the question of what computation is. Instead, he proposes a particular formalism that he believes is sufficient to universally describe computations, but while he makes an partially successful attempt at showing how that could be done, he does not try to understand why.

Moreover, in our attempt to map the evolution of logic and computation, it is important to observe that while Church stripped away most of his logic and focused just on the calculus — i.e., on the computational content of his formal logic and all others (since Frege) — even the λ-calculus alone is very much a part of the Aristotelian framework, and the formal logic tradition in the sense that it is a language whose terms are meaningful as human “thought concepts” — functions in this case — combined to form more complex concepts.

The Uncluttered Mind

⊕
Alan Mathison Turing (left) with friends, on their way to school (source: The Turing Digital Archive)

We are now in a position to appreciate the remarkable revolution instigated by 24-year-old Alan Turing (1912-1954) in 1936. From Leibniz onward, the goal of logicians has been to study the human mind and precisely and even mechanically describe its operation (either descriptively or prescriptively). They did that by studying language. Inasmuch as they came near to actually studying the human mind, they did not only model its contentual concepts (as expressed in language) believed to be the building block of the human intellect, but took on faith Aristotle’s description, made in Ancient Greece, of how those concepts are combined. All of this came to an abrupt end (at least as being the sole way of looking at logic and the mind) with Turing’s 1936 paper, On Computable Numbers, with an Application to the Entscheidungsproblem.

There has, of course, been one notable exception — that of Charles Babbage — who, despite not being very rigorous about the theoretical underpinnings of his planned machines, nevertheless seemed to believe them capable of mimicking at least some if not all aspects of human intelligence. It must therefore be asked whether Turing was influenced by Babbage. Gandy writes:

⊕The Confluence of Ideas in 1936, pp. 77-78 It has been suggested (e.g., by Hyman…) that Turing was influenced by Babbage’s ideas. I think this can be ruled out: when Turing learned about them in the 1940’s he talked of them with enthusiasm… Had he known about them in 1936 he would certainly have referred to Babbage in his paper. He may have known that Babbage planned an Analytic Engine, either from the exhibit in the Science Museum in Kensington, or from the article ‘Calculating Machines’ in the 11th edition of the Encyclopaedia Britannica. But neither of these sources gives information about the theoretical principles — in particular, the use of conditional instructions — on which the design of the engine was based.

Turing’s paper contains a negative result for Hilbert’s Entscheidungsproblem, but while this is the main mathematical result of the paper, it is probably its smallest contribution (although, as we’ll see, the proof uses a quite exceptional device). Its main contribution is a careful analysis of the concept of computation that allows to understand and define what is meant by “algorithm”, and even by a “formal system”. This analysis, unlike all others that preceded it, does not depend on any particular chosen language or formalism.

Gandy writes:

⊕The Confluence of Ideas in 1936, p. 78 It is almost true to say that Turing succeeded in his analysis because he was not familiar with the work of others. In the version submitted in May 1936 the only logical works referred to are Hilbert and Ackermann … and Godel … The approach is novel, the style refreshing in its directness and simplicity. The bare-hands, do-it-yourself approach does lead to clumsiness and error. But the way in which he uses concrete objects such as exercise books and printer’s ink to illustrate and control the argument is typical of his insight and originality. Let us praise the uncluttered mind.

Turing begins thus:

⊕On Computable Numbers pp. 230-231 The “computable” numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means. Although the subject of this paper is ostensibly the computable numbers, it is almost equally easy to define and investigate computable functions of an integral variable or a real or computable variable, computable predicates, and so forth. The fundamental problems involved are, however, the same in each case, and I have chosen the computable numbers for explicit treatment as involving the least cumbrous technique. I hope shortly to give an account of the relations of the computable numbers, functions, and so forth to one another. This will include a development of the theory of functions of a real variable expressed in terms of computable numbers. According to my definition, a number is computable if its decimal can be written down by a machine.

… We have said that the computable numbers are those whose decimals are calculable by finite means. This requires rather more explicit definition. … For the present I shall only say that the justification lies in the fact that the human memory is necessarily limited.

He begins his analysis not with the question, “what are the computable functions”, but with a very different one — about the processes involved in computation. In this case, as in many others, asking the right question is what allowed Turing to answer what computation is:

⊕On Computable Numbers p. 249 No attempt has yet been made to show that the “computable” numbers include all numbers which would naturally be regarded as computable. All arguments which can be given are bound to be, fundamentally, appeals to intuition, and for this reason rather unsatisfactory mathematically. The real question at issue is “What are the possible processes which can be carried out in computing a number?” The arguments which I shall use are of three kinds.

(a) A direct appeal to intuition.

(b) A proof of the equivalence of two definitions (in case the new definition has a greater intuitive appeal).

(c) Giving examples of large classes of numbers which are computable.

Once it is granted that computable numbers are all “computable”, several other propositions of the same character follow. In particular, it follows that, if there is a general process for determining whether a formula of the Hilbert function calculus is provable, then the determination can be carried out by a machine.

The most famous argument is the first one, but we will quickly describe the second, in which Turing demonstrates the equivalence of his computation machines with formal proofs in a restricted predicate calculus (“the Hilbert functional calculus”) with Peano arithmetic.Gödel was able to make do with primitive recursive functions because each individual inference is indeed primitive recursive, but for his ‘provable’ operator (named $Bew$ in the paper and discussion above) he relied on the existential quantifier, which is not only not primitive-recursive but non-computable. He shows that the mechanical inference of formal logic is indeed computational, and establishes an equivalence between formal proofs and computation by describing a machine that can enumerate the proofs of the calculus, and, conversely, formulas in the logic that can describe the operations of a machine:

⊕On Computable Numbers p. 252 If the notation of the Hilbert functional calculus is modified so as to be systematic, and so as to involve only a finite number of symbols it becomes possible to construct an automatic machine K, which will find all the provable formulae of the calculus.

Now let $a$ be a sequence, and let us denote by $G_a(x)$ the proposition “The x-th figure of a is 1”, so that $-G_a(x)$ means “The x-th figure of a is 0”. Suppose further that we can find a set of properties which define the sequence a and which can be expressed in terms of $G_a(x)$ and of the prepositional functions $N(x)$ meaning “x is a non-negative integer” and $F(x, y)$ meaning “$y = x+1$ “. When we join all these formulae together conjunctively, we shall have a formula, $\mathfrak{U}$ say, which defines a. The terms of $\mathfrak{U}$ must include the necessary parts of the Peano axioms… which we will abbreviate to P.

… I say that a is then a computable sequence: a machine $K_a$ to compute a can be obtained by a fairly simple modification of $K$.

… It can also be shown that the numbers a definable in this way by the use of axioms include all the computable numbers. This is done by describing computing machines in terms of the function calculus.

It must be remembered that we have attached rather a special meaning to the phrase “U defines a”. The computable numbers do not include all (in the ordinary sense) definable numbers.

Turing’s first argument, however, is the best known, and the most brilliant. Instead of analyzing the concepts in the mind of the logician or mathematician and how they can be represented in a precise, formal language, Turing analyzes what it is that the mathematician actually does, without assigning any direct syntactic representation to the concepts she thinks about:

⊕On Computable Numbers p. 249-251 Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child’s arithmetic book. In elementary arithmetic the two-dimensional character of the paper is sometimes used. But such a use is always avoidable, and I think that it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, i.e. on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite. If we were to allow an infinity of symbols, then there would be symbols differing to an arbitrarily small extent. The effect of this restriction of the number of symbols is not very serious. It is always possible to use sequences of symbols in the place of single symbols. … The differences from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance. This is in accordance with experience. We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same.

⊕In 1972, Gödel wrote that he’d found a ‘philosophical error’ in Turing’s analysis (Some remarks on the undecidability results, 1972, in Collected Works, Vol 2, p. 306). He read Turing’s analysis not as pertaining to mechanical calculation alone, but to the mind in general, and hypothesized that the number of mental states may ‘converge towards infinity’ while carrying out a thought process. Stephen Kleene called Gödel’s hypothesis a ‘pie in the sky’, and said that regardless of the applicability of Turing analysis to human cognition in general, it is certainly correct in capturing the notion of an algorithm, because an algorithm is something that can be communicated in its entirety in a finite amount of time — and so must be ‘encoded’ in some finite mental states — and that it is necessarily digital (Kleene, Stephen C., Reflections on Church’s Thesis, 1987, p. 493-494 and Turing’s Analysis of Computability, and Major Applications of It, 1988, in The Universal Turing Machine, p. 46). Given Turing’s future work, however, it is clear that, if not in this paper then in others, Turing would eventually come to believe his analysis to apply to all of human cognition, and that there is no difference between ‘mechanical’ thought and other kinds; that is a view that Gödel rejected.

The behaviour of the computer at any moment is determined by the symbols which he is observing, and his “state of mind” at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite. The reasons for this are of the same character as those which restrict the number of symbols. If we admitted an infinity of states of mind, some of them will be ‘‘arbitrarily close” and will be confused. Again, the restriction is not one which seriously affects computation, since the use of more complicated states of mind can be avoided by writing more symbols on the tape.

Let us imagine the operations performed by the computer to be split up into “simple operations” which are so elementary that it is not easy to imagine them further divided. Every such operation consists of some change of the physical system consisting of the computer and his tape. We know the state of the system if we know the sequence of symbols on the tape, which of these are observed by the computer (possibly with a special order), and the state of mind of the computer. We may suppose that in a simple operation not more than one symbol is altered. Any other changes can be split up into simple changes of this kind. The situation in regard to the squares whose symbols may be altered in this way is the same as in regard to the observed squares. We may, therefore, without loss of generality, assume that the squares whose symbols are changed are always “observed” squares.

Besides these changes of symbols, the simple operations must include changes of distribution of observed squares. The new observed squares must be immediately recognisable by the computer. I think it is reasonable to suppose that they can only be squares whose distance from the closest of the immediately previously observed squares does not exceed a certain fixed amount. Let us say that each of the new observed squares is within L squares of an immediately previously observed square.

… Now if these squares are marked only by single symbols there can be only a finite number of them, and we should not upset our theory by adjoining these marked squares to the observed squares. If, on the other hand, they are marked by a sequence of symbols, we cannot regard the process of recognition as a simple process. This is a fundamental point and should be illustrated. In most mathematical papers the equations and theorems are numbered. Normally the numbers do not go beyond (say) 1000. It is, therefore, possible to recognise a theorem at a glance by its number. But if the paper was very long, we might reach Theorem 157767733443477 ; then, further on in the paper, we might find “… hence (applying Theorem 157767733443477) we have … “. In order to make sure which was the relevant theorem we should have to compare the two numbers figure by figure, possibly ticking the figures off in pencil to make sure of their not being counted twice. If in spite of this it is still thought that there are other “immediately recognisable” squares, it does not upset my contention so long as these squares can be found by some process of which my type of machine is capable.

⊕
A blueprint for Alan Turing’s analog zeta function machine (source: The Turing Digital Archive)

… The simple operations must therefore include:

(a) Changes of the symbol on one of the observed squares.

(b) Changes of one of the squares observed to another square within L squares of one of the previously observed squares.

It may be that some of these changes necessarily involve a change of state of mind. The most general single operation must therefore be taken to be one of the following:

(A) A possible change (a) of symbol together with a possible change of state of mind.

(B) A possible change (b) of observed squares, together with a possible change of state of mind.

The operation actually performed is determined… by the state of mind of the computer and the observed symbols. In particular, they determine the state of mind of the computer after the operation is carried out.

We may now construct a machine to do the work of this computer.

Even though Turing assigns no contentual meaning to the “states of mind”, he considers that some may consider even that description too vague, and so provides an alternative:

⊕On Computable Numbers p. 253-254 We suppose… that the computation is carried out on a tape; but we avoid introducing the “state of mind” by considering a more physical and definite counterpart of it. It is always possible for the computer to break off from his work, to go away and forget all about it, and later to come back and go on with it. If he does this he must leave a note of instructions (written in some standard form) explaining how the work is to be continued. This note is the counterpart of the “state of mind”. We will suppose that the computer works in such a desultory manner that he never does more than one step at a sitting. The note of instructions must enable him to carry out one step and write the next note. Thus the state of progress of the computation at any stage is completely determined by the note of instructions and the symbols on the tape.

Crucially, unlike Church who defined calculability using a specific formalism (the λ-calculus), Turing does not define computation through his particular machine construction. Turing’s construction of his “automatic machine” — later called the “Turing machine” by Alonzo Church in his review of Turing’s paper — is constructed completely arbitrarily based only on the assumption that the human mind is limited: he makes arbitrary choices, without loss of generality, on the geometry of the paper, the number of symbols that can be written in a square, the number of symbols that can be observed in one “glance”, the distance between one observation and the next etc.. Through those arbitrary choices, made to make the formalism simple enough to be primitive yet convenient enough for Turing’s further mathematical analysis, Turing makes it clear that any machine that can perform the basic operations outlined — under similar limitations — would serve just as well. In this way, Turing remains mathematically rigorous while at the same time not tying himself to any particular formalism. This arbitrariness of the construction is the source of its generality. Kurt Gödel called it a miracle:

⊕Remarks before the Princeton bicentennial conference on problems in mathematics, 1942, in Collected Works, Vol II, p. 150 Tarski has stressed in his lecture (and I think justly) the great importance of the concept of general recursiveness (or Turing’s computability). It seems to me that this importance is largely due to the fact that with this concept one has for the first time succeeded in giving an absolute definition of an interesting epistemological notion, i.e., one not depending on the formalism chosen. In all other cases treated previously, such as demonstrability or definability, one has been able to define them only relative to a given language, and for each individual language it is clear that the one thus obtained is not the one looked for. For the concept of computability, however, although it is merely a special kind of demonstrability or decidability, the situation is different. By a kind of miracle it is not necessary to distinguish orders, and the diagonal procedure does not lead outside the defined notion.

The final point deserves some explanation. To understand computation (and the general notion of what a formal system is) a clear, precise and convincing philosophical treatment was required. Computation is a process that can be carried out by a finite machine is one such explanation, but lambda definability is not, for the following reason: the lambda calculus is a formal language like any other, that starts with a set of axiomatic inference (or reduction) rules that are chosen arbitrarily. While Church showed that a certain set of axioms leads to the conclusion that no lambda term can determine whether or not another term has a normal form (“terminates”), it gives no justification to why another axiom stating that there exists some constant lambda term, H, that does precisely that (except, of course, for terms that themselves make use of H, thus leading to yet another, “relative” notion of undecidability and a layering of “orders”) must not be added to the language — i.e., the notion of lambda definability that can be “escaped” by a diagonalization procedure like that done by Church in his paper, can then be “filled in” with the addition of another axiom. It would be just as arbitrary as those axioms that are included in the lambda calculus. Turing’s notion of computation and computability, however, is absolute. While Turing himself suggested the notion of a machine consulting an “oracle” that can provide it with answers (and leads to the concept of relative computability), the computation performed by the oracle cannot itself be carried out by any machine, or human, for the reasons Turing had shown. A machine cannot have arbitrary axioms — i.e. capabilities — added to it, as it is bound by external limitations, such as the laws of physics.

The cause of this difference lies in how those two descriptions of computations came about. As we’ve seen, the lambda calculus grew from an attempt to simplify and reduce logic to its bare essentials, and, due to Kleene’s experiments with the formalism, it nevertheless seemed capable of expressing everything logicians intuitively viewed to be computable, leading Church to his conjecture. Adding more axioms, such as the existence of a “halting decider” lambda term to fill-in the hole of undecidable computations felt “unnatural”, just like adding an axiom to fill in the unprovability of Gödel’s unprovable proposition. Turing’s analysis, on the other hand, was different. It is clear that a human or a machine can carry out the operations of Turing’s machine, and it is also clear that — barring some as-of-yet unknown physical principle — this is all a finite mechanism can do.

Turing’s work was conclusive evidence that the incompleteness theorems (the first of which is a simple corollary of the halting theoremSuppose a system rich enough to express arithmetic, and so also Turing machines, were complete; every sentence in that system is provable or refutable. Given a TM, we could then decide whether it halts or not by enumerating all theorems in the system, as we’re guaranteed to eventually find a proof or a refutation — in contradiction with the halting theorem. , and the second is a result of the first) indeed apply to all formal systems, rather than particular formalisms like the predicate calculus or the λ-calculus. In 1964, Gödel added the following note to his 1931 paperAn earlier version of the note, with similar content, was added in 1963, and appears in From Frege to Gödel, p. 616 :

⊕Postscriptum (3 June 1964) to On Undecidable Propositions of Formal Mathematical Systems I (1934) in Collected Works Vol I, p. 369 In consequence of later advances, in particular of the fact that, due to A. M. Turing’s work, a precise and unquestionably adequate definition of the general concept of formal system can now be given, the existence of undecidable arithmetical propositions and the non-demonstrability of the consistency of a system in the same system can now be proved rigorously for every consistent formal system containing a certain amount of finitary number theory.

Turing’s work gives an analysis of the concept of “mechanical procedure” (alias “algorithm” or “computation procedure” or “finite combinatorial procedure”). This concept is shown to be equivalent with that of a “Turing machine”. A formal system can simply be defined to be any mechanical procedure for producing formulas, called provable formulas. For any formal system in this sense there exists one in the sense [discussed] above that has the same provable formulas (and likewise vice versa), provided the term “finite procedure” [mentioned above]… is understood to mean “mechanical procedure”. This meaning, however, is required by the concept of formal system, whose essence it is that reasoning is completely replaced by mechanical operations on formulas. (Note that the question of whether there exist finite non-mechanical procedures, not equivalent with any algorithm, has nothing whatsoever to do with the adequacy of the definition of “formal system” and of “mechanical procedure”.)

Alonzo Church also views this arbitrary nature of the machine as an important factor in his review of Turing’s paper, where he coined the term “Turing machine”:

⊕Review: A.M. Turing On Computable Numbers, 1937 The author proposes as a criterion that an infinite sequence of digits 0 and 1 be “computable” that it shall be possible to devise a computing machine, occupying a finite space and with working parts of finite size, which will write down the sequence to any desired number of terms if allowed to run for a sufficiently long time. As a matter of convenience, certain further restrictions are imposed on the character of the machine, but these are of such a nature as obviously to cause no loss of generality—in particular, a human calculator, provided with pencil and paper and explicit instructions, can be regarded as a kind of Turing machine. It is thus immediately clear that computability, so defined, can be identified with (especially, is no less general than) the notion of effectiveness as it appears in certain mathematical problems…

As a matter of fact, there is involved here the equivalence of three different notions: computability by a Turing machine, general recursiveness in the sense of Herbrand-Godel-Kleene, and λ-definability in the sense of Kleene and the present reviewer. Of these, the first has the advantage of making the identification with effectiveness in the ordinary (not explicitly defined) sense evident immediately—i.e. without the necessity of proving preliminary theorems. The second and third have the advantage of suitability for embodiment in a system of symbolic logic.

This is made especially clear in Church’s review, on the same page, of Emil Post’s paper that came out the same year (see below):

⊕Review: Emil L. Post Finite Combinatory Processes, 1937 The author proposes a definition of “finite 1-process” which is similar in formulation, and in fact equivalent, to computation by a Turing machine (see the preceding review). He does not, however, regard his formulation as certainly to be identified with effectiveness in the ordinary sense, but takes this identification as a “working hypothesis” in need of continual verification. To this the reviewer would object that effectiveness in the ordinary sense has not been given an exact definition, and hence the working hypothesis in question has not an exact meaning. To define effectiveness as computability by an arbitrary machine, subject to restrictions of finiteness, would seem to be an adequate representation of the ordinary notion, and if this is done the need for a working hypothesis disappears.

There appears to be some controversy about whether Turing and Church had a thesis about machines or just about humans doing computations; in other words, whether they personally theorized what today is known as the physical Church-Turing thesis, namely, that all possible physical computational systems can be simulated by Turing machines. Andrew Hodges, author of the definitive (and terrific) Turing biography seems to conclusively settle the matter with a positive answer.Did Church and Turing Have a Thesis about Machines?, 2006

The truly revolutionary aspect of Turing’s paper, however, is the radical break from considering computation only through the perspective of logic, a contentual language (and so modeling the mind with a contentual language). If Frege devised formal logic, his Begriffsschrift, to solve the problem of natural language and create a language for “pure thought” devoid of all but “conceptual content” that enables mechanical inference, Turing took that a step further, and considered system that is completely non-linguistic. In the logic of Frege and his successors, the process of inference alone was formal and non-contentual but had to start from some contentual foundation, and every symbol must mean something (recall Frege’s words, “It is possible, of course, to operate with figures mechanically, just as it is possible to speak like a parrot: but that hardly deserves the name of thought. It only becomes possible at all after the mathematical notation has, as a result of genuine thought, been so developed that it does the thinking for us, so to speak.”The Foundation of Arithmetic, p. xvi and “Everyone who uses words or mathematical symbols makes the claim that they mean something, and no one will expect any sense to emerge from empty symbols.”The Foundation of Arithmetic, p. 22 ). In Turing’s system there is no foundation other than the actions people actually do. What is the meaning of the symbols on the machine’s tape? What concepts do the states of mind represent? Turing doesn’t care; all he needs is for them to be some finite number of mental states. Turing’s formalism of a machine has no a priori meaning — only an interpretation (e.g., the symbols written on the tape are interpreted by a human observer as the decimals of some real number). It is not a language — it’s a natural process. It is not that computation or a formal language is devoid of meaning, but that meaning need not be inherent or an a priori feature of computation or a formal language. Meaning can be ascribed to computation a posteriori as an interpretation, and meaning can emerge from empty symbols. Even more radically, Turing’s machines are not even intended as a way to reach the truth. Unlike Frege’s insistence that the “laws of thought” as captured by logic are prescriptive rather than descriptive, or psychological, Turing’s analysis is entirely descriptive.

In fact, the word language or formalism — so central to Leibniz, Euler, Boole and Frege — is completely absent from Turing’s analysis (except when comparing his work to others). Turing was the first to free computation from the Aristotelian framework of a language for combining concepts, and perhaps ironically, it was this departure from content that uncovered the hidden meaning of a formal system, and convinced Gödel that the concept was now well-understood.

Turing’s explanation immediately clarified why it was that various formal systems appeared “complete” from a computational point of view; why Church’s lambda calculus was complete, why Moses Schönfinkel’s combinatory logic was complete, and why Frege’s and Hilbert’s first-order logic was complete: the contentual meaning of their axioms didn’t matter; all that matters was that they all, through their inference rules, could effect a small set of necessary symbol manipulation operations.

In addition, Turing — who was skeptical of all mathematical foundationsAs an undergraduate, he gave a talk in 1933 of which only the following minutes remain:
The sixth meeting of the Michaelmas term was held in Mr Turing’s rooms in King’s College. A. M. Turing read a paper on ‘Mathematics and logic’. He suggested that a purely logistic view of mathematics was inadequate; and that mathematical propositions possessed a variety of interpretations, of which the logistic was merely one. A discussion followed. (Alan Turing: The Enigma, p. 110) — and certainly of there being a single definitive or best one — finding them to be ultimately unnecessary for mathematics and useful only inasmuch as they serve some practical purpose, was able to avoid any philosophical debate in his system. It is completely neutral as it’s devoid of any a priori meaning, and so rests on no underlying philosophical framework.

In a paper about Turing’s various philosophical beliefs and contributions, professor of philosophy and Turing scholar Juliet Floyd writes:

⊕Turing on “Common Sense”: Cambridge Resonances Turing’s main philosophical contribution [in On Computable Numbers] was to recast and clarify the primordial notion of logic—by which we shall mean here nothing less than all the different routines, subroutines, patterns and regularities that humans design and follow in their procedures of describing, informing, proving, inferring, arguing about and coming to terms with one another and nature. Turing shifted attention away from the aim of “grounding” logic, mathematical truth, or meaning, and toward an emphasis on us—though not primarily, and certainly not only, via a theory of mind.

… [T]he philosophical backdrop at Cambridge, in which the nature of logic, meaning and mathematics were under discussion, brought to the fore intuitive, human aspects of logic as evolving, purposive technology. Logic was approached, not first and foremost axiomatically, but practically and in thought experiments. These features of the backdrop mattered centrally to Turing. If inspiration is to be found for the distinctive characteristics of his analysis of computability in his undergraduate years 1931–1934, it is here, in the philosophical foundations of logic.

… In the Hilbert school, the notion of a “general method” for deciding problems— the key notion at issue in the Entscheidungsproblem—was framed in terms of the idea of a “finitistic”, step-by-step calculation, “derivation” or proof in a formal system of logic. In attacking the intuitive idea of “effective calculability”, logicians of the early 1930s generated a more and more crowded and refined set of different logical systems and approaches, finding themselves awash in ideological, philosophical and heuristic discussions.

… By 1935 a desideratum, given the differing precisifications, was an analysis of the general notion of formal system itself. In order to gain this, it would not have been enough to write down another, slightly different formalism: one had instead to place the very idea of what such writing down is in a new light, one that did not turn on choice of a particular notation, a set of principles, or a specific “metalogic”. What was needed was a philosophical clarification of the first step, a novel way of thinking. This is what Turing provided: he made the notion of a “formal system” plain. Turing’s analysis of computability is, as Hodges remarks, “profoundly ordinary”.

⊕
A slate statue of Alan Turing at the Bletchley Park Museum (source: Flickr)

… Each and every individual Turing Machine may be regarded as a precise mathematical object—in fact, a formal system or system of equations, or a collection of quintuples. But, as Gödel noted, the distinctively wide applicability and robustness of Turing’s analysis of formal logic comes from its other face: the fact that it bypasses entanglement with this or that particular formal language or this or that choice of logic… It reduces our general notion of a formal system to an ordinary, informal comparison between the step-by-step routine of a human calculator and a machine—in other words, to something belonging to “common sense”. In this way Turing avoided entangling himself with dogmatism or ideology, formalism vs. Platonism, intuitionism vs. classical logic, and so on. He showed that it is no part of our general notion of a human being following a (formal) rule that the person must be a classical, or intuitionistic, or any other particular kind of theorist of logic and/or mathematics.

… In order to appreciate its philosophical significance, we must understand—contrary to what many have suggested elsewhere—that Turing’s analysis did not rest on any general theory of language, logic, or mind, even if he utilizes the notion of “state of mind” explicitly in “On Computable Numbers”.

… Turing’s analysis is subject to perfectly precise axiomatic presentation within the theory of computability. So in saying that Turing connected his analysis with an informal model of what we do— thereby doing something of great philosophical interest—we are not saying that he defeated, delimited or circumscribed in any way the mathematician’s autonomy to mathematically clarify mathematics: far from it. Turing—and independently Post, with his “workers”—showed how the finitistic, constructivistic elements of the proof theory of the Hilbert program, motivated by an idea of rigor that aimed to make mathematical arguments humanly controllable, had to be construed in terms of the model of a human computor with limited powers of taking in symbolic congurations in order to be made mathematically precise. It had to be done, this self-reflection, this beginning and ending with a snapshot of what we do in doing mathematics.

… The mathematical precisification had to be done with minimal fuss and maximal range of applicability. Unlike Post and Gödel, Turing did not believe that his proof rested upon, or even necessarily applied to, limits of the human mind per se. Such issues he dodged entirely. Taking up an “anthropological”, rather than an ideological perspective on logic, he was able to bypass or transform the issue of psychological fidelity at the foundations, de-psychologizing logic, thereby leaving his interest in the mind and its capacities to later works.

… It was Turing’s insight into the human purposiveness of the notion of computability specifically—and logic more generally—that enabled him to boil the relevant notions down to their most vivid, ordinary, essential and simple elements. This turns, however, on Turing’s having not offered a general definition of logicality, the nature of thought, or the nature of meaning or mind.

… Quine’s emphasis on syntactic finesse and the use of formalized notation to enunciate a view of the world as a whole stemmed from the Russellian and Carnapian traditions. Turing set his face against these. These philosophies of logic were oriented cosmically, toward an explicit enunciation or renunciation of an ontology of the world as a whole, and the articulation of meaning through logical consequence, implemented, especially in the hands of Quine and Carnap, through emphasis on syntax. Turing’s work stemmed from a different quarter. He was an artful and practical dodger in matters of ontology and meaning, oriented toward the values of use and simplicity all the way down.

What is common to philosophies of mathematics is that they all insist on mathematics being correct: Logicism wishes to base mathematics on self-evidently correct logical foundations (“[B]eing true is different from being taken to be true, whether by one or many or everybody, and in no case is to be reduced to it.”Frege, The Basic Laws of Arithmetic, xv, p. 12-13, 1893 ); Platonism sees the goal of mathematics as uncovering the truth of a world of ideal concepts; Formalism may not care about the essential correctness of the mathematical objects formulas allude to, but insists that the rules of inference themselves be correct which can be established through its consistence with inherently correct “real” propositions (“These rules form a closed system that can be discovered and definitively stated. The fundamental idea of my proof theory is none other than … to make a protocol of the rules according to which our thinking actually proceeds.”Hilbert, The Foundations of Mathematics, 1927, in From Frege to Gödel, p. 475 ); to Intuitionism this correctness of procedures does not suffice, and it is the mathematical objects themselves that must be correct in a strong empirical sense, even if empirical only in principle (“[A]n incorrect theory, even if it cannot be inhibited by any contradiction that would refute it, is none the less incorrect.”Brouwer, On the Significance of the Principle of Excluded Middle in Mathematics, 1923, in From Frege to Gödel, p. 336 ). But for Turing, the crucial question is not “is it correct?” but “does it work?” A formal system derives its value from its utility, not its verisimilitude. That is all the justification math needs.

Floyd also points out something truly remarkable about Turing’s proof of what later came to be known as the halting problem or halting theorem. The most widely known proof of the theorem — as well as its name — is due to Martin Davis. Essentially, we assume the existence of a halting-deciding machine H, and using it, construct a machine G in some specific way, pass G to itself and obtain a contradiction on its result: it should say “yes” if it says “no” and vice versa, hence H cannot exist.

By the way, Turing uses different notions instead of halting and non-halting: A “well-behaved” machine — analogous to a halting machine in the modern formulation — is circle-free, namely, it never repeats a state but, say, continues writing the digits of π forever, while a circular machine either halts or enters a cycle. This distinction is not very important in itself, except to show that our habit of designating non-terminating computations as ill-behaved is not at all intrinsic to the notion of computation but is a bias due to the programming languages we most commonly use, where the most common form of composition corresponds to sequencing of computations, and so most syntactic terms must correspond to terminating computations.

Turing alludes to this proof, but immediately rejects it, or, rather, suggests something different:

⊕On Computable Numbers p. 246 The simplest and most direct proof of this is by showing that, if this general process exists, then there is a machine which computes β[, a non-computable sequence obtained through diagonalization]. This proof, although perfectly sound, has the disadvantage that it may leave the reader with a feeling that “there must be something wrong”. The proof which I shall give has not this disadvantage.

His chosen proof constructs a machine, which, in the order of its operation, must in the k-th step of its algorithm, emit some digit, and that digit is to obtained by simulating itself and observing what digit it emits in the k-th step. In the k-th step, its instruction is, essentially, “do whatever it is that you do”, an instruction that is impossible (for a deterministic machine) to follow. This is not a violation of a logical axiom, not a reductio ad absurdum, but a contradiction of common sense — a machine encounters an instruction that is impossible to follow. It seems like Turing is intent to completely avoid relying even on the least controversial of formal axioms.

Hodges writes:

⊕Alan Turing: The Enigma, pp. 137-138 ⊕
Natural Wonders Every Child Should Know by Edwin Tenney Brewster, a children’s biology book, was given to Turing at age ten and sparked his interest in science. (source: Flickr)[He] had resolved a central question about mathematics, and had done so by crashing in as an unknown and unsophisticated outsider. But it was not only a matter of abstract mathematics, not only a play of symbols, for it involved thinking about what people did in the physical world. It was not exactly science, in the sense of making observations and predictions. All he had done was to set up a new model, a new framework. It was a play of imagination like that of Einstein or von Neumann, doubting the axioms rather than measuring effects. Even his model was not really new, for there were plenty of ideas, even in Natural Wonders, about the brain as a machine, a telephone exchange or an office system. What he had done was to combine such a naive mechanistic picture of the mind with the precise logic of pure mathematics. His machines — soon to be called Turing machines — offered a bridge, a connection between abstract symbols and the physical world. Indeed, his imagery was, for Cambridge, almost shockingly industrial.

Indeed, Max Newman (1897–1984), Turing’s teacher who first got him interested in logic, and fellow WWII codebreaker, wrote in his 1955 memoir of Turing for the Royal SocietyTuring was admitted as a fellow (FRS) in 1951, sponsored by Newman and Bertrand Russell. :

⊕Royal Society Memoir in Collected Works of A. M. Turing, Vol 4, p. 271 [T]hat combination of powerful mathematical analysis and intuitive short cuts… showed him at heart more of an applied than a pure mathematician.

In the same year, 1936, Emil Post (1897-1954) independently published a short 3-page paper in which he described a “worker” remarkably similar to Turing’s machine. He, too, remarks on the issue of meaning:

⊕Finite Combinatory Processes—Formulation 1, pp. 104-105, 1936 We do not concern ourselves here with how the configuration of marked boxes corresponding to a specific problem, and that corresponding to its answer, symbolize the meaningful problem and answer. In fact the above assumes the specific problem to be given in symbolized form by an outside agency and, presumably, the symbolic answer likewise to be received.

… The writer expects the present formulation to turn out to be logically equivalent to recursiveness in the sense of the Gödel-Church development. Its purpose, however, is not only to present a system of a certain logical potency but also, in its restricted field, of psychological fidelity… On the other hand, our aim will be to show that all such are logically reducible to formulation 1. We offer this conclusion at the present moment as a working hypothesis. And to our mind such is Church’s identification of effective calculability with recursiveness. … The success of the above program would, for us, change this hypothesis not so much to a definition or to an axiom but to a natural law. Only so, it seems to the writer, can Gödel’s theorem concerning the incompleteness of symbolic logics of a certain general type and Church’s results on the recursive unsolvability of certain problems be transformed into conclusions concerning all symbolic logics and all methods of solvability.

However, Post had access to Church’s paper and, more importantly, does not provide any justification for his design. Gandy writes:

⊕The Confluence of Ideas in 1936, pp. 89-90 It must seem to us that ‘psychological analysis’ of the steps by which theorems are proved in any system of symbolic logic would lead, by an argument similar to — perhaps simpler than — Turing’s, to a complete justification of Post’s thesis. From the 1941 footnotes it is plain that Post did not think so, although it appears that he did think that Church and Turing had correctly characterized the notion of decision procedure.

… Post does not analyze nor justify his formulation, nor does he indicate any chain of ideas leading to it. … Evidently in 1936 he knew of Church’s work (to which he refers) and this led him to write up and publish his work. The importance he attached to psychological analysis and psychological fidelity leads one to suppose that he must have thought, however tentatively, along the same lines as Turing. More cannot be said.

Turing’s wholehearted adoption of common sense and rejection of the “straightjacket” of dogma, freed scientific thought — at least when it comes to computation — from the monism that had characterized it at least since Leibniz. Meaning is not inherent and fixed, it is a separate act of interpretation of a phenomenon.In another post I showed how we can literally measure — at least by proxy — the amount of a priori meaning is built into a computational model or a programming language. The same approach would be seen in Turing’s later discussion of the “imitation game” (now more commonly known as the “Turing test”). What matters isn’t some inherent quality of the actors — man or machine — but the interpretation we ascribe to their actual behavior. What we know, what common sense necessitates is merely an action, a behavior, taking place, to which one is later free to assign various interpretive meanings. “To speak like a parrot”, to quote Frege, does deserves the name of thought, as thought is an attribute we assign to behavior as an interpretation. Whether “thought” has some intrinsic qualities is one of the philosophical questions — like the source of math’s correctness — was one of those philosophical questions that Turing sought to avoid. Hodges notes that Turing’s view was also largely due to his general inclination to avoid comitting to a specific philosophy or answering hard philosophical questions.

⊕Turing provided us with two interesting philosophical tools to grapple with the question of mind-body duality. One is the concept of non-computability (later refined, by computational complexity theory, to the concept of infeasibility) that places strict limits on reduction. That a process is really just bits printed on a tape does not mean that the machine printing the bits explains the behavior, even though it creates it, thanks to the halting theorem. In other words, while the game of Super Mario Bros. — let alone the human brain — is indeed a result of electric processes, it cannot be reduced to such physical process in any usefully explanatory way (i.e., the physics guiding the real electrons in the Nintendo box are not a useful explanation for the mechanics of the simulated Super Mario world). The other is the notion of computational universality, which means that both our internal constructions of the mind and the workings of the entire universe can both be placed on an equal footing, despite one existing inside the other. The barrier of non-computability means that an external universe cannot predetermine the workings of simulations created within it.

This post-hoc interpretation can be seen in On Computable Numbers notion of compilation (of first-order logic and the λ-calculus to Turing machines). One may argue that the concept had certainly existed in Babbage’s treatment, and to a far lesser degree in all preceding mechanical calculators, but Turing gives this notion a thorough, rigorous treatment. As a first approximation, we think of compilation as translation, but in computing — and certainly in Turing’s examples — compilation is often deeply different from natural language translation, and not only because it is a purely mechanical process. When we translate from one natural language to another, what we want to preserve is, first and foremost, the conceptual content of the text (and as a secondary goal, the style, too). Some are delighted to point out correspondences or even “isomorphisms” between concepts in different formalisms,like the correspondence between typed lambda calculi and natural deduction in intuitionistic logic called propositions-as-types or the Curry-Howard correspondence, where types correspond to propositions and reductions correspond to inferences. This is a conceptual translation between those respective concepts (which arises unsurprisingly, as we’ve seen, because it is a correspondence between systems that ultimately were built to describe the same conceptual framework), whereas Turing’s correspondence between proofs and Turing machines, is of a very different nature. Propositions and proofs, or even functions and sets, don’t have any corresponding concepts in the system of Turing machines. Turing machines don’t attempt to represent any concept at all. Nevertheless, contentual frameworks can still be mapped to Turing machines and back, because those concepts may be viewed as interpretations of certain physical phenomena, rather than essential components without which computation cannot exist.

This idea wasn’t entirely new in the realm of logic. Set theory encodes various concepts, such as numbers and functions, as sets, and Gödel numbering encodes propositions as numbers, but in both cases one primitive concept — that of a set or a number — which normally means one thing, is used to encode another. When Leibniz and Boole spoke of algebra as the science of symbol manipulation, the symbols stood for contentual concepts, and even when Hilbert spoke of some ideal propositions being nothing more than text and of mathematics as a symbol-manipulation game, the rules of the game, the axioms, were created to represent intuitive, contentual notions. But the symbols on the tape mean absolutely nothing in themselves. This is liberating. We can try on multiple subjective interpretations and multiple conceptual frameworks, each appealing to some under some circumstances, yet still be talking about the same objective phenomenon.

Turing realized the importance of programming languages (contentual and intended for human programmers) — whose design would draw inspiration from formal logic systems — in the future of computing, saying:

⊕Lecture to the London Mathematieal Society on 20 February 1947, p. 12 ⊕
From Alan Turing;s, Programmers’ handbook for Manchester electronic computer. Mark II (source: The Turing Digital Archive)I expect that digital computing machines will eventually stimulate a considerable interest in symbolic logic and mathematical philosophy. The language in which one communicates with these machines, i.e. the language of instruction tables, forms a sort of symbolic logic. The machine interprets whatever it is told in a quite definite manner without any sense of humour or sense of proportion. Unless in communicating with it one says exactly what one means, trouble is bound to result. Actually one could communicate with these machines in any language provided it was an exact language, i.e. in principle one should be able to communicate in any symbolic logic, provided that the machine were given instruction tables which would enable it to interpret that logical system. This would mean that there will be much more practical scope for logical systems than there has been in the past. Some attempts will probably be made to get the machine to do actual manipulations of mathematical formulae. To do so will require the development of a special logical system for the purpose. This system should resemble normal mathematical procedure closely, but at the same time should be as unambiguous as possible. As regards mathematical philosophy, since the machines will be doing more and more mathematics themselves, the centre of gravity of the human interest will be driven further and further into philosophical questions of what can in principle be done etc.

Few people were better suited than Alan Turing for the task of designing the first programming languages, being an expert in computer hardware design, programming (he was one the world’s very first programmers, if not the very first, and was the first to introduce the idea of a subroutine, for which he designed a specific machine instruction) and symbolic logic, including the λ-calculus (e.g., Turing was the first to prove that the simply typed λ-calculus is strongly-normalizingSee Collected Works of A. M. Turing, Vol. 4: Mathematical Logic, p. 201 ). But he chose not to pursue that direction, which did not hold his interest.

Writing On Computable Numbers seems to have had a powerful effect on Turing’s personal beliefs. In 1933, in the letter to the mother of his deceased high-school friend, Christopher Morcom, Turing wrote:

⊕Alan Turing: The Enigma, p. 97 20/4/33

My dear Mrs Morcom,

… Chris is in some way alive now. One is perhaps too inclined to think only of him alive at some future time when we shall meet him again; but it is really so much more helpful to think of him as just separated from us for the present.

But by 1936, his thoughts were different:

⊕Alan Turing: The Enigma, p. 138 Obviously there was a connection between the Turing machine and his earlier concern with the problem of Laplacian determinism. The relationship was indirect. For one thing, it might be argued that the ‘spirit’ he had thought about was not the ‘mind’ that performed intellectual tasks. For another, the description of the Turing machines had nothing to do with physics. Nevertheless, he had gone out of his way to set down a thesis of ‘finitely many mental states’, a thesis implying material basis to the mind, rather than stick to the safer ‘instruction note’ argument. ⊕
From Alan Turing’s Intelligent Machinery manuscript, with handwritten additions and corrections (source: The Turing Digital Archive) And it would appear that by 1936 he had indeed ceased to believe in the ideas that he had described to Mrs Morcom as ‘helpful’ as late as 1933 — ideas of spiritual survival and spiritual communication. He would soon emerge as a forceful exponent of the materialist view and identify himself as an atheist. Christopher Morcom had died a second death, and Computable Numbers marked his passing.

Turing went on to investigate and write about what today we’d call machine learning and artificial intelligence (thus establishing those fields), toying particularly with neural networks. Unlike Leibniz who believed that only much of our thinking is made of “blind thoughts”, Turing showed — although he only explicitly claimed so in his later writing — that all thought could be completely mechanistic, or “blind”, or, at least, indistinguishable from it. In a sense, Turing reconciled Decartes’s psychologistic view of logic and Leibniz’s formalistic, linguistic view, by essentially showing they are one and the same. While a particular language may not guide our thoughts, many formalisms are universal in the sense of being able to describe any kind of human cognition.

Computation is not just a faculty of the human mind, but the human mind is what the notion of computation, from Hobbes on, has tried to describe. Turing was indeed particularly drawn to artificial intelligence — writing,A letter to William Ross Ashby, 1946 ⊕
Coloured diagrams showing patterns of dappling and calculations, made by AMT in connection with work on morphogenesis (The Chemical Basis of Morphogenesis, 1952), his attempt to understand the mathematical and computational foundation of life and his experiments on artificial life simulations (source: The Turing Digital Archive) “I am more interested in the possibility of producing models of the action of the brain than in the practical applications to computing” — but the idea (even if we disregard Hobbes’s explicit reference to the Arificial Man) of a precise, mechanical description of human reasoning that may aid or replace humans in its application has been a main motivation not only of Turing’s, but of Leibniz, Boole, Jevons, Peirce, Frege, and the principal driving force in the development of the logic, algebra and computation.

Deciding, as was his way, that perhaps prior to modeling the brain it’s best to understand how complex biological organisms are formed, Alan Turing then turned his focus to the formation of life and the problem of morphogenesis, establishing in the process the fields of mathematical and computational biology. Turing’s simulation biological systems on the Manchester University computer, one of the first electronic computers built — thus fulfilling Leibniz’s dream that “[o]ne could also give exact descriptions of natural things by means of it, such, for example, as the structure of plants and animals” — was one of the earliest uses of computers.

In the previous chapter I mentioned that formal logic in the style of Frege and Hilbert suffered from two problems: dogmatism in the choice of axioms and increasing detachment from the work of practicing mathematicians. Turing managed to avoid dogmatism in his own work, even though he, too, didn’t put an end to quarrel among men, but he was also concerned with logic growing arcane and further away from Leibniz’s dream of formal logic helping mathematicians and scientists express themselves, becoming “forbidding and alien” (a la Post) and “refractory” (a la von Neumann). Turing wrote:

⊕The Reform of Mathematical Notation and Phraseology, 1944, unpublished, in Collected Works of A. M. Turing: Mathematical Logic, p. 215. Turing had written some papers on type theory, and in this one his goal was to describe a ‘naive’ type theory, in order to make it accessible to practicing mathematicians. It has long been recognised that mathematics and logic are virtually the same and that they may be expected to merge imperceptibly into one anotherJuliet Floyd remarks that Turing here was not arguing for logicism as a conceptual or metaphysical doctrine in the foundations of mathematics. He was making an observation. Turing’s aim was to revise both symbolic logic and mathematics to effect a better ‘merging’, where this would be a matter of degree rather than a matter of either a sharply ‘perceptible’ line or a reduction of one subject to the other. — Turing on ‘Common Sense’: Cambridge Resonances p. 42 . Actually this merging process has not gone at all far, and mathematics has profited very little from researches in symbolic logic. The chief reasons for this seem to be a lack of liaison between the logician and the mathematician-in-the-street. Symbolic logic is a very alarming mouthful for most mathematicians, and the logicians are not very much interested in making it more palatable.

… In particular it seems that symbolic logic will help the mathematicians to improve their notation and phraseology, which are at present exceedingly unsystematic, and constitute a definite handicap both to the would-be-learner and to the writer who is unable to express ideas because the necessary notation for expressing them is not widely known.

… It would not be advisable to let the reform take the form of a cast-iron logical system into which all the mathematics of the future are to be expressed. No democratic mathematical community would stand for such an idea, nor would it be desirable. Instead one must put forward a number of definite small suggestions for improvement, each backed up by good argument and examples. It should be possible for each suggestion to be adopted singly. Under these circumstances one may hope that some of the suggestions will be adopted in one quarter or another, and that the use of all will spread.

Although it is not desirable to try and put mathematics into the straight-jacket of a logical system, it may be desirable to use such a system when investigating notation. One is likely to be taking typical phrases from mathematical text-books and analysing their meaning. It is useful to have a logical system for expressing these meanings in a fairly unambiguous way. It may not greatly matter what system is used for this purpose, and it would be quite possible for different workers to use different systems.

In this endeavor, Turing was far less successful.

In any event, just as Frege separated logic from algebra by seeking not to embed logic in mathematics but to construct logic as something more basic, more primitive, than mathematics, Turing, by foregoing a contentual language, separated computation from logic, and established it as something more basic, more primitive — a process that precedes meaning. A logic (or a programming languages) is a formalisms on top of the more foundational notion of computation that imbues it with one particular meaning out of many possible ones. Computation is the meaningless substrate on which meaningful logic is built. Computation, logic and abstract algebra, that began as a single discipline, have now become three separate ones, but with a deep connection due to their common provenance.

Conceptions of Intelligence

Warren McCulloch (1898-1969), a neurophysiologist and aspiring philosopher, and Walter Pitts (1923-1969), a math wunderkind who, at age 12, after finding some mistakes in the Principia Mathematica, was invited by Bertrand Russell to come to Cambridge as a graduate student, made a rather unique pair of researchers. McCulloch, who was fascinated by Russell and Whitehead’s Principia, considered his work “experimental epistemology”, seeking to understand “the physiological substrate of knowledge.Mind as Machine: A History of Cognitive Science, p. 184 He recalled:

⊕Mind as Machine: A History of Cognitive Science, p. 184

In the fall of 1917, I entered Haverford College with two strings to my bow — facility in Latin and a sure foundation in mathematics. I ‘‘honored’’ in the latter and was seduced by it. That winter [the Quaker philosopher] Rufus Jones called me in. ‘‘Warren’’, said he, ‘‘what is thee going to be?’’ And I said ‘‘I don’t know.’’ ‘‘And what is thee going to do?’’ And again I said, ‘‘I have no idea; but there is one question I would like to answer: What is a number, that a man may know it, and a man, that he may know a number?’’ He smiled and said, ‘‘Friend, thee will be busy as long as thee lives.’’

It was Pitts who told him about Leibniz’s attempt to mechanize logic and reason:

⊕Jerome Lettvin, ‘Walter Pitts’ in The MIT Encyclopedia of the Cognitive Sciences, p. 651 In late 1942, after weeks of reviewing the material in neurophysiology, Pitts told McCulloch of Leibniz’s dictum that any task which can be described completely and unambiguously by a finite set of terms can be performed by a logical machine. Six years earlier Turing had published his marvelous essay on the universal computing engine. The analogy of neurons (as pulsatile rather than two-state devices) to the elements of a logical machine was inescapable.

In 1943 the pair published a seminal paper about a mechanistic model of the brain, inspired by Turing’s machines, formal logic and neurophysiology. Their paper echoes and revisits Peirce’s claim about logic in the very structure of the brain, making it into a research hypothesis:

⊕A Logical Calculus of the Ideas Immanent in Nervous Activity, 1943 Theoretical neurophysiology rests on certain cardinal assumptions. The nervous system is a net of neurons, each having a soma and an axon. Their adjunctions, or synapses, are always between the axon of one neuron and the soma of another. At any instant a neuron has some threshold, which excitation must exceed to initiate an impulse. This, except for the fact and the time of its occurence, is determined by the neuron, not by the excitation. From the point of excitation the impulse is propagated to all parts of the neuron.

… Many years ago one of us, by considerations impertinent to this argument, was led to conceive of the response of any neuron as factually equivalent to a proposition which proposed its adequate stimulus. He therefore attempted to record the behavior of complicated nets in the notation of the symbolic logic of propositions. The “all-or-none” law of nervous activity is sufficient to insure that the activity of any neuron may be represented as a proposition. Physiological relations existing among nervous activities correspond, of course, to relations among the propositions; and the utility of the representation depends upon the identity of these relations with those of the logic of propositions. To each reaction of any neuron there is a corresponding assertion of a simple proposition. This, in turn, implies either some other simple proposition or the disjunction of the conjunction, with or without negation, of similar propositions, according to the configuration of the synapses upon and the threshold of the neuron in question. Two difficulties appeared. The first concerns facilitation and extinction, in which antecedent activity temporarily alters responsiveness to subsequent stimulation of one and the same part of the net. The second concerns learning, in which activities concurrent at some previous time have altered the net permanently, so that a stimulus which would previously have been inadequate is now adequate. But for nets undergoing both alterations, we can substitute equivalent fictitious nets composed of neurons whose connections and thresholds are unaltered. But one point must be made clear: neither of us conceives the formal equivalence to be a factual explanation. Per contra!—we regard facilitation and extinction as dependent upon continuous changes in threshold related to electrical and chemical variables, such as after-potentials and ionic concentrations; and learning as an enduring change which can survive sleep, anaesthesia, convulsions and coma. The importance of the formal equivalence lies in this: that the alterations actually underlying facilitation, extinction and learning in no way affect the conclusions which follow from the formal treatment of the activity of nervous nets, and the relations of the corresponding propositions remain those of the logic of propositions.

⊕
McCulloch and Pitts’s nervous nets, from Logical Calculus for Nervous Activity.

… It is easily shown: first, that every net, if furnished with a tape, scanners connected to afferents, and suitable efferents to perform the necessary motor-operations, can compute only such numbers as can a Turing machine; second, that each of the latter numbers can be computed by such a net; and that nets with circles can be computed by such a net; and that nets with circles can compute, without scanners and a tape, some of the numbers the machine can, but no others, and not all of them. This is of interest as affording a psychological justification of the Turing definition of computability and its equivalents, Church’s λ-definability and Kleene’s primitive recursiveness: if any number can be computed by an organism, it is computable by these definitions, and conversely.

Causality, which requires description of states and a law of necessary connection relating them, has appeared in several forms in several sciences, but never, except in statistics, has it been as irreciprocal as in this theory. Specification for any one time of afferent stimulation and of the activity of all constituent neurons, each an “all-or-none” affair, determines the state. Specification of the nervous net provides the law of necessary connection whereby one can compute from the description of any state that of the succeeding state, but the inclusion of disjunctive relations prevents complete determination of the one before. Moreover, the regenerative activity of constituent circles renders reference indefinite as to time past. Thus our knowledge of the world, including ourselves, is incomplete as to space and indefinite as to time. This ignorance, implicit in all our brains, is the counterpart of the abstraction which renders our knowledge useful. The role of brains in determining the epistemic relations of our theories to our observations and of these to the facts is all too clear, for it is apparent that every idea and every sensation is realized by activity within that net, and by no such activity are the actual afferents fully determined.

… To psychology, however defined, specification of the net would contribute all that could be achieved in that field-even if the analysis were pushed to ultimate psychic units or “psychons,” for a psychon can be no less than the activity of a single neuron. Since that activity is inherently propositional, all psychic events have an intentional, or “semiotic,” character. The “all-or-none” law of these activities, and the conformity of their relations to those of the logic of propositions, insure that the relations of psychons are those of the two-valued logic of propositions. Thus in psychology, introspective, behavioristic or physiological, the fundamental relations are those of two-valued logic.

… From the irreciprocity of causality it follows that even if the net be known, though we may predict future from present activities, we can deduce neither afferent from central, nor central from efferent, nor past from present activities–conclusions which are reinforced by the contradictory testimony of eye-witnesses, by the difficulty of diagnosing differentially the organically diseased, the hysteric and the malingerer, and by comparing one’s own memories or recollections with his contemporaneous records.

… Certainly for the psychiatrist it is more to the point that in such systems “Mind” no longer “goes more ghostly than a ghost.” Instead, diseased mentality can be understood without loss of scope or rigor, in the scientific terms of neurophysiology.

The work of Turing, McCullogh and Pitts inspired the mathematician and philosopher Norbert Wiener (1894-1964) to envision what he saw as a new science and perhaps a social movement, Cybernetics, which he defined as “Control and Communication in the Animal and the Machine,” and considered information and logic to be the most appropriate ingredients of useful descriptions of the world. Cybernetics would later become the social and scientific field of artificial intelligence:

⊕Wiener, Cybernetics, or Control and Communication in the Animal and the Machine, 1948, p. 12 At this point there enters an element which occurs repeatedly in the history of cybernetics–the influence of mathematical logic. If I were to choose a patron saint for cybernetics out of the history of science, I should have to choose Leibniz. The philosophy of Leibniz centers about two closely related concepts–that of a universal symbolism and that of a calculus of reasoning. From these are descended the mathematical notation and the symbolic logic of the present day. Now, just as the calculus of arithmetic lends itself to a mechanization progressing through the abacus and the desk computing machine to the ultra-rapid computing machines of the present day, so the calculus ratiocinator of Leibniz contains the germs of the machina ratiocinatrix, the reasoning machine. Indeed, Leibniz himself, like his predecessor Pascal, was interested in the construction of computing machines in the metal. It is therefore not in the least surprising that the same intellectual impulse which has led to the development of mathematical logic has at the same time led to the ideal or actual mechanization of processes of thought.

⊕pp. 124-125 We have already spoken of the computing machine, and consequently the brain, as a logical machine. It is by no means trivial to consider the light cast on logic by such machines, both natural and artificial. Here the chief work is that of Turing. We have said before that the machina ratiocinatrix is nothing but the calculus ratiocinator of Leibniz with an engine in it; and just as modern mathematical logic begins with this calculus, so it is inevitable that its present engineering development should cast a new light on logic. The science of today is operational; that is, it considers every statement as essentially concerned with possible experiments or observable processes. According to this, the study of logic must reduce to the study of the logical machine, whether nervous or mechanical, with all its non-removable limitations and imperfections.

It may be said by some readers that this reduces logic to psychology, and that the two sciences are observably and demonstrably different. This is true in the sense that many psychological states and sequences of thought do not conform to the canons of logic. Psychology contains much that is foreign to logic, but–and this is the important fact–any logic which means anything to us can contain nothing which the human mind–and hence the human nervous system–is unable to encompass. All logic is limited by the limitations of the human mind when it is engaged in that activity known as logical thinking.

Hodges writes that,

⊕Hodges, Alan Turing: The Enigma, pp. 507-508 Wiener regarded Alan as a cybernetician, and indeed ‘cybernetics’ came close to giving a name to the range of concerns that had long gripped him, which the war had given him an opportunity to develop, and which did not fit into any existing academic category. In spring 1947, on his way to Nancy, Wiener had been able to ‘talk over the fundamental ideas of cybernetics with Mr Turing,’ as he explained in the introduction to his book.

… Alan was more than a match for Wiener, and although genuinely sharing many common interests, their outlooks were different. Wiener had an empire-building tendency which rendered almost every department of human endeavour into a branch of cybernetics. Another difference lay in Wiener’s total lack of a sense of humour. While Alan always managed to convey his solid ideas with a light English touch of wit, Wiener delivered with awesome solemnity some pretty transient suggestions, to the general effect that solutions to fundamental problems in psychology lay just around the corner, rather than putting them at least fifty years in the future. Thus in Cybernetics it was seriously suggested that McCulloch and Pitts had solved the problem of how the brain performed visual pattern recognition. The cybernetic movement was rather liable to such over-optimistic stabs in the dark. One story going around, which later turned out to be a hoax, but which found its way into serious literature, was of an experiment supposed to measure the memory capacity of the brain by hypnotising bricklayers and asking them such questions as ‘What shape was the crack in the fifteenth brick on the fourth row above the damp course in such and such a house?’. Alan’s reaction to these cybernetic tales was one of amusement.

When McCulloch visited Turing in Manchester in 1949, Hodges reports,Alan Turing: The Enigma, p. 518 Turing thought him ‘a charlatan’. Nevertheless, it is probably McCulloch and Pitts paper (and not Shannon’s work) which inspired the electronic designs of both von Neumann’s and Turing’s computers.

As logic has always sought to describe the operations of the mind — although, at least until Turing, thoughts mediated by language — and as the view of the mind and its operation changed considerably as a result of Turing’s work, have there been attempts at creating new kinds of logic to represent the new perspective? Indeed there have. For example, in 1948, John von Neumann imagined new kinds of logic that would be appropriate for describing automata — the name given to a generalization of abstract machines in the style of Turing’s machine:

⊕The General and Logical Theory of Automata, 1948/1951, in Collected Works, Volume 5, pp. 302-304 THE FUTURE LOGICAL THEORY OF AUTOMATA

We are very far from possessing a theory of automata which deserves that name, that is, a properly mathematical-logical theory. There exists today a very elaborate system of formal logic, and, specifically, of logic as applied to mathematics. This is a discipline with many good sides, but also with certain serious weaknesses. This is not the occasion to enlarge upon the good sides, which I have certainly no intention to belittle. About the inadequacies, however, this may be said: Everybody who has worked in formal logic will confirm that it is one of the technically most refractory parts of mathematics. The reason for this is that it deals with rigid, all-or-none concepts, and has very little contact with the continuous concept of the real or of the complex number, that is, with mathematical analysis. Yet analysis is the technically most successful and best-elaborated part of mathematics. Thus formal logic is, by the nature of its approach, cut off from the best cultivated portions of mathematics, and forced onto the most difficult part of the mathematical terrain, into combinatorics.

The theory of automata, of the digital, all-or-none type, as discussed up to now, is certainly a chapter in formal logic. It would, therefore, seem that it will have to share this unattractive property of formal logic. It will have to be, from the mathematical point of view, combinatorial rather than analytical.

Now it seems to me that this will in fact not be the case. In studying the functioning of automata, it is clearly necessary to pay attention to a circumstance which has never before made its appearance in formal logic.

Throughout all modern logic, the only thing that is important is whether a result can be achieved in a finite number of elementary steps or not. The size of the number of steps which are required, on the other hand, is hardly ever a concern of formal logic. Any finite sequence of correct steps is, as a matter of principle, as good as any other. It is a matter of no consequence whether the number is small or large, or even so large that it couldn’t possibly be carried out in a lifetime, or in the presumptive lifetime of the stellar universe as we know it. In dealing with automata, this statement must be significantly modified. In the case of an automaton the thing which matters is not only whether it can reach a certain result in a finite number of steps at all but also how many such steps are needed.

… Thus the logic of automata will differ from the present system of formal logic in two relevant respects.

The actual length of “chains of reasoning,” that is, of the chains of operations, will have to be considered.

The operations of logic (syllogisms, conjunctions, disjunctions, negations, etc., that is, in the terminology that is customary for automata, various forms of gating, coincidence, anti-coincidence, blocking, etc., actions) will all have to be treated by procedures which allow exceptions (malfunctions) with low but non-zero probabilities. All of this will lead to theories which are much less rigidly of an all-or-none nature than past and present formal logic. They will be of a much less combinatorial, and much more analytical, character. In fact, there are numerous indications to make us believe that this new system of formal logic will move closer to another discipline which has been little linked in the past with logic. This is thermodynamics, primarily in the form it was received from Boltzmann, and is that part of theoretical physics which comes nearest in some of its aspects to manipulating and measuring information. Its techniques are indeed much more analytical than combinatorial…

All of this re-emphasizes the conclusion… that a detailed, highly mathematical, and more specifically analytical, theory of automata and of information is needed.

A logic of automata began to be realized when Amir Pnueli (1941-2009) introduced temporal logic (first developed in the 1950s) to computer science.The Temporal Logic of Programs, 1977 While not completely in line with von Neumann’s goals, temporal logic can be seen as a fusion of formal logic and dynamical systems (from analysis). Temporal logic has since become the logic most widely used when reasoning about computer programs, especially reactive (programs that continuously interact with their environment) and concurrent programs.

As to the “the actual length of ‘chains of reasoning’”, in 1956, Gödel sent a letter to (the then sick) von Neumann, echoing Hardy and von Neumanns’s point about the possibility of computation making mathematicians redundant:

⊕A Letter to John von Neumann, March 1956 One can obviously easily construct a Turing machine, which for every formula $F$ in first order predicate logic and every natural number $n$, allows one to decide if there is a proof of $F$ of length $n$ (length = number of symbols)This letter is considered to be the first formulation of the P vs. NP problem. . Let $\Psi(F,n)$ be the number of steps the machine requires for this and let $\varphi(n) = max_F \Psi(F, n)$. The question is how fast $\varphi(n)$ grows for an optimal machine. One can show that $\varphi(n) \geq k \cdot n$. If there really were a machine with $\varphi(n) \sim k \cdot n$ (or even $\sim k \cdot n^2$), this would have consequences of the greatest importance. Namely, it would obviously mean that in spite of the undecidability of the Entscheidungsproblem, the mental work of a mathematician concerning Yes-or-No questions could be completely replaced by a machine. After all, one would simply have to choose the natural number $n$ so large that when the machine does not deliver a result, it makes no sense to think more about the problem. Now it seems to me, however, to be completely within the realm of possibility that $\varphi(n)$ grows that slowly. … It would be interesting to know, for instance, the situation concerning the determination of primality of a number and how strongly in general the number of steps in finite combinatorial problems can be reduced with respect to simple exhaustive search.

A theory of automata where “the actual length of ‘chains of reasoning’” is of the utmost concern was indeed developed about a decade later, starting with the work of Juris Hartmanis (1928-) and Richard E. Stearns (1936-), who published the time hierarchy theorem,On The Computational Complexity of Algorithms, 1965 a generalization of the halting theorem, which (roughly) shows that for every computable function $f$ on the natural numbers, there exists a problem whose inherent time complexity grows faster than $f$ (i.e., it cannot be solved by a Turing machine in under $f(n)$ for an input of size $n$). That result was not inspired by any work in formal logic, but rather with the view of computation as a natural phenomenon. Some years later, Stephen Cook (1939-) showedThe Complexity of Theorem Proving Procedures, 1971 that mechanically proving problems expressed even in the simplest of all logics, the propositional calculus, cannot, in general, be feasibly accomplished (unless P = NP etc.). These developments led to some of the most fundamental areas of research – and discoveries – in modern theoretical computer science, but all that is part of another story, best told at another time.

Epilogue

On a question-and-answer website I found the following question: Do aliens have LISP or Scheme?

Presumably, the question refers to intelligent aliens, and the accepted answers is that they inevitably do, because those aliens would be bound to discover the very relationship between computation, logic and algebra that’s been a focus of this text, and that since this relationship is somehow best expressed by the λ-calculus (or so the answer’s author believes), those aliens would undoubtedly find it a useful programming paradigm and from there the road to Lisp is short. In light of the actual history of computation, logic and algebra, I am not so certain we should answer in the positive so definitively. But the question, as asked, is very hard to answer given that we don’t even know what intelligence is, let alone how it manifests in extra-terrestrial species and whether it would bear a strong resemblance to our own. We could, however, ask a different question: “do all programming entities have a LISP?” Where, by “programming” I don’t mean anything that creates a physical system that could be considered computational — as they all could be — but only things that create computational systems that share the properties that compositional programming languages — inspired by or derived from compositional logics — were created to encourage: modularity, maintainability, abstraction. It is those desired properties that supposedly drives us, for convenience reasons, to adopt programming languages based on the composition of humanly-intuitive concepts. We can even be more generous and relax the requirement on Lisp — let’s chalk the belief that the λ-calculus is somehow the ultimate formalism for the composability of concepts up to a limited familiarity with the general principles of logic, and so why not a language based on a more Hilbert-like calculus, like, say, Prolog or SQL, or a language based on temporal logic, like Esterel, or one based on Hoare logic, like C or Java, or one based on a process calculus — and instead ask whether any formalism in the compositional spirit of the Aristotelian framework would necessarily be used.

The answer to that question, even in this very general form, would be a definitive “no.” Two of the most elaborate, impressive and successful computing systems we know of are life and the brain. The former was “programmed” by evolution and perhaps by other forces as well, as well; the latter was “programmed” by the former. In both cases, the systems clearly exhibit behavior that suggests abstraction (e.g., legs are constructed quite differently in different species, yet perform similar functions), modularity and composition (cells, organs, symbiosis, brain regions), maintainability and even meta-programming (chromosomal crossover, epigenetics, learning). Both even exhibit remarkable ability to handle concurrent tasks and to exploit parallelism for efficiency. And yet, while life is programmed as a composition of proteins, and brains are a composition of neurons and their synapses, there is no indication that this programming model has any kind of “compositional semantics”, namely that the composition of “terms” by which the program is built (proteins, neurons) corresponds directly to the composition of concepts (legs, ability to speak). That those systems are programmed so differently from software is what makes them hard to reverse-engineer and makes genetic engineering and mind reading or control so challenging.

So of three examples we have of programming entities — evolution (and possibly something else) programming life, life programming brains, and brains, i.e. humans programming computers — humans so far seem to be the minority, preferring the Aristotelian framework. It is therefore clear that the programming language of choice is tied to the nature of the programmer and is not at all universal. Claims like “Type theory and the lambda calculus are eternal, and trans-universal,” or, “The genomics of programming is type theory, the systematic study of computation,” seem unsubstantiated or a result of conflating programming — the indirect description of computation in some contentual, meaningful language, where, as in logic, the computation is usually just the sense of the language¹ that has some other mathematical denotation (be it Horn-clause predicates as in Prolog, partial functions as in Haskell or predicate transformers as in Java) — with the more basic, mechanical and inherently meaningless computation, explicitly described by Frege as the mechanical operation of his inference rules and later analyzed and understood by Turing. Those claims would have been more reasonable had they been made prior to Alan Turing’s isolation of the concept of computation that separated it from logic (or programming).

The computer scientist Leslie Lamport says that programmers are prone to what he calls the Whorfian Syndrome (after the Sapir-Whorf hypothesis of linguistic relativity) — confusing language with reality, or a description with that being described — perhaps because programmers create a reality with language. Separating the two is hard, and it is not surprising that until Turing, logicians thought that the language people used is a precise model of their thoughts. Realizing that thought could be something more basic and that language is just one possible interpretation took millennia, but it was just what was needed to understand how very different languages can be similarly expressive.

The ontological concept turned out to be computation — but not logic (the assignment of a meaning to computation through the use of a language) — and perhaps an even more fundamental notion that we find over and over in algebra and mathematics in general: the “axiom of causality,” which gives rise to the notions of transitivity (and associativity), because if A (necessarily/possibly) causes B, and B (necessarily/possibly) causes C, then A (necessarily/possibly) causes C. ²

The digital computer, designed by Turing, von Neumann and others, has indeed turned out to be “a new kind of instrument [to] increase the power of the mind,” as Leibniz predicted, and an invention as useful as the magnetic needle, although it has not — so far — put an end to disputes among men and to “that burdensome raising of objections”, nor has it ushered an era of religious piety. But at the same time, Turing’s development — and the subsequent work on complexity theory — demonstrated the limitations of logic and computation. Leibniz’s “mechanical thread of meditation”, cannot, as it turned out, “easily resolve any idea whatever”. It is to be wielded not with arrogance, but with humility.

Bibliography

Primary Sources

Church, Alonzo, A Set of Postulates for the Foundation of Logic, 1932
Church, Alonzo, An Unsolvable Problem of Elementary Number Theory, 1936
Church, Alonzo, Review: A. M. Turing, On Computable Numbers, with an Application to the Entscheidungsproblem, 1937
Church, Alonzo, Review: Emil L. Post, Finite Combinatory Processes—Formulation 1, 1937
Cook, Stephen A., The Complexity of Theorem-Proving Procedures, 1971
Curry, Haskell, The Inconsistency of Certain Formal Logics, 1942
Frege, Gottlob, Montgomery Furth, tran., The Basic Laws of Arithmetic, 1964
Frege, Gottlob, J. L. Austin tran., The Foundations of Arithmetic, 2nd ed. 1980
Frege, Gottlob, On the purpose of the begriffsschrift, Australasian Journal of Philosophy, 1968
Frege, Gottlob, On Sense and Reference, 1892
Girard, Jean-Yves, Proofs and Types, 1989
Gödel, Kurt, Solomon Feferman et al. eds., Collected Works, Vol. I, Publications 1929-1936, 1986
Gödel, Kurt, Solomon Fefermanet al. eds., Collected Works, Vol. II, Publications 1938-1974, 1990
Gödel, Kurt, A Letter to John von Neumann, March 1956
Hartmanis, Juris. and Stearns, Richard E., On The Computational Complexity of Algorithms, 1965
van Heijenoort, Jean, From Frege to Gödel, A Source Book in Mathematical Logic, 1879-1931, 1967
Kleene, Stephen C., Gödel’s impression on students of logic in the 1930’s in Gödel Remembered, 1987
Kleene, Stephen C., Origins of Recursive Function Theory, 1981
Kleene, Stephen C., The Theory of Recursive Functions, Approaching its Centennial, 1981
Kleene, Stephen C., Reflections on Church’s Thesis, 1987
McCulloch, Warren S., and Pitts, Walter, A Logical Calculus of the Ideas Immanent in Nervous Activity, 1943, reprinted 1990
Mill, John S., A System of Logic, Rationative and Inductive, 1882
Russell, Bertrand, A Critical Exposition of the Philosophy of Leibniz, 1900
von Neumann, John, A. H. Taub ed., Collected Works, Volume 5, 1963
Pnueli, Amir, The Temporal Logic of Programs, FOCS 1977: 46-57
Post, Emil L., Finite Combinatory Processes—Formulation 1, 1936
Post, Emil L., Recursively enumerable sets of positive integers and their decision problems, 1944
Schröder, Ernst, Review of Frege’s Conceptual Notation, 1880. A different translation, by Victor H. Dudman, 1969
Turing, Alan M., The Chemical Basis of Morphogenesis, 1952
Turing, Alan M., R. O. Gandy & C.E.M Yates ed., Collected Works of A. M. Turing, Vol. 4: Mathematical Logic, 2001
Turing, Alan M., On Computable Numbers, with an Application to the Entscheidungsproblem, J. of Math 58, 1936
Turing, Alan M., Lecture to the London Mathematieal Society on 20 February 1947
Turing, Alan M., A letter to William Ross Ashby, 1946
Wiener, Norbert, Cybernetics, or Control and Communication in the Animal and the Machine, 1948 (second edition, 1965)

Secondary Sources

Appel, Andrew W., The Birth of Computer Science at Princeton in the 1930s, in A. Appel ed., Alan Turing’s Systems of Logic, The Princeton Thesis, 2012
Boden, Margaret, Mind as Machine: A History of Cognitive Science, 2006
Feferman, Solomon, The nature and significance of Gödel’s incompleteness theorems, 2006
Floyd, Juliet, Turing on “Common Sense”: Cambridge Resonances in Floyd, Juliet, Bokulich, Alisa eds. Philosophical Explorations of the Legacy of Alan Turing, 2017
Gandy, Robin O., The Confluence of Ideas in 1936, 1988, in Rolf Herken, ed., The Universal Turing Machine (2nd Ed.), 1995
Herken, Rolf ed., The Universal Turing Machine (2nd Ed.), 1995
Hodges, Andrew, Alan Turing: The Enigma, 1983, Film Tie-In edition, 2014
Hodges, Andrew, Did Church and Turing Have a Thesis about Machines?, 2006
Marciszewski, Witold and Murawski, Roman, Mechanization Of Reasoning In A Historical Perspective, 1995, Kindle Edition
Peckhaus, Volker, Calculus Ratiocinator vs. Characteristica Universalis? The Two Traditions in Logic, Revisited, 2004
Wilson, Robert A., Frank Keil, The MIT Encyclopedia of the Cognitive Sciences, MIT Press (1999)

Part 1, Part 2

But not always — remember temporal logic, the “logic of automata”. ↩
The same algebraic principles would arise if the universe wasn’t causal but teleological. ↩