29 August, 2010

What makes mathematical thinking different from all other kinds of thinking is that it is *exact*.
Mathematical thinking proceeds by the application of exact rules, and produces answers which are guaranteed
to be correct. And no matter how many steps occur within a sequence of mathematical deductions, if each
step is mathematically correct, then the final result will be correct.

In the "real world", most of our thinking is *inexact*. We apply rules which work most of the time.
Like: "if a girl smiles at me, then she likes me". Or "birds can fly".

In the real world, there is the concept of **over-thinking**. If we think about something too much
(which, for example, tends to happen when people do *philosophy*), we are likely to come to conclusions
which are ridiculous.

But in mathematics, there is no such thing as over-thinking. Once the rules of exact thinking have been laid down, we are free to apply them as many times as we wish, without any fear of falling into error.

The "benefit" of learning mathematics, if indeed there is such a benefit (and this can vary somewhat depending on the learner and the circumstances of their life), is the benefit of learning how to think "exactly", and of learning that there is indeed such a thing as exact thinking, which may be somewhat different from the ordinary "inexact" thinking that makes up most of one's everyday thoughts.

A mathematical *system* is typically defined formally as follows:

- A set of *axioms* is defined. The axioms constitute an initial set of *theorems* which are deemed to be true within the system.
- One or more *rules of deduction* are defined. These rules of deduction specify how new theorems can be deduced (or *proven*) from an existing set of known theorems.
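As a concrete illustration of this definition, here is a small Python sketch of a toy formal system: Hofstadter's well-known MIU system, which has a single axiom, "MI", and four string-rewriting rules of deduction.

```python
# A toy formal system (Hofstadter's MIU system): one axiom, plus
# string-rewriting rules of deduction that generate new theorems.

AXIOMS = {"MI"}

def deductions(t):
    """All theorems derivable from theorem t in a single step."""
    out = set()
    if t.endswith("I"):          # Rule 1: xI  -> xIU
        out.add(t + "U")
    if t.startswith("M"):        # Rule 2: Mx  -> Mxx
        out.add("M" + t[1:] * 2)
    for i in range(len(t) - 2):  # Rule 3: III -> U
        if t[i:i+3] == "III":
            out.add(t[:i] + "U" + t[i+3:])
    for i in range(len(t) - 1):  # Rule 4: UU  -> (deleted)
        if t[i:i+2] == "UU":
            out.add(t[:i] + t[i+2:])
    return out

# Apply the rules repeatedly: the set of known theorems grows.
theorems = set(AXIOMS)
for _ in range(3):
    for t in list(theorems):
        theorems |= deductions(t)

print(sorted(theorems))  # includes "MI", "MII", "MIU", "MIIII", ...
```

Note that the system says nothing about what strings like "MIU" *mean* — which is exactly the point made below.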

(I've left out a few technical details here, like how the axioms and theorems have
to be written in some *symbolic language*, and the rules have to be
given as a set of operations which are permitted to be performed on
theorems expressed as sequences of symbols in order to generate new sequences of
symbols representing newly proven theorems.)

A set of rules like this tells us how to *do* mathematics, within a particular
mathematical system, but it doesn't tell us what the theorems in the system actually
*mean*.

In order for the results of our theorem-proving activities to be useful, we want
the theorems to be statements *about* something, so that the process of proving
theorems is then telling us something new that is useful to know.

If our knowledge and understanding of the real world were itself exact, then we could freely apply mathematical thinking to all aspects of thinking about the real world. Unfortunately we don't have an exact knowledge and understanding of the real world, which somewhat limits the applicability of mathematics to the daily problems of real life.

*However*, we can always *assume* that we have exact knowledge and understanding of
some component of reality. Having made such an assumption – or assumptions –
we can freely apply mathematical thinking to deduce any number of consequences of our assumptions,
and we can *compare* those deductions to our observations of the real world.

In effect, this is a simplified definition of what *science* is, and a
description of how science is done. (When talking about this from a scientific point of
view, the "assumptions" are normally called **theories**,
and the "deductions" are normally called **predictions**.)

The benefits of this process are somewhat indirect. After all, our initial assumption
could just be wrong, and it could be wrong even if no discrepancy is detected between
the consequences deduced from the assumptions and our observations of the real world.
However, in practice, if we maximise the *simplicity* of our assumptions, and maximise
the number of consequences that we deduce and test, then any assumptions that pass
enough tests usually turn out to give us some useful information about the world, in
that typically such assumptions *continue* to give us correct answers. And even
when a previously un-falsified set of assumptions is falsified by some new observation
(or some new deduction compared to an existing observation), typically we can *evolve*
the newly falsified set of assumptions into a new, better, more encompassing set of assumptions,
taking into account information about how the old assumptions worked for all the
observations which they did explain.

Mathematics is most useful for understanding a system which follows exact and known rules.
Which means that mathematics is very good for talking and thinking about *mathematics*.

In particular, we can formulate one mathematical system, let us call it **System X**, and
then we can formulate a second mathematical system, **System Y**, and we can interpret
the theorems of System Y as telling us statements about the provability or otherwise of
theorems of System X.

It might not be immediately clear what the benefit of such an arrangement is. If a theorem
**T** can be proven in System X, what is the point of being able to prove a theorem
**T'** in System Y whose meaning is that **T** is a theorem of System X?

In practice, the benefit is that often the number of steps to prove theorem **T'**
may be much *smaller* than the number of steps required to prove the original theorem
**T**.

To give a very simple example, System X might represent a theory of arithmetic which tells us how to add numbers by simple counting (for example, to compute 3 + 4, count up four steps from 3: 4, 5, 6, 7, so the answer is 7), and System Y might be a theory about System X that tells us how to add decimal numbers using the normal method of adding digits from the right and carrying where necessary. So I can use System Y to add 341 + 299 to get 640, and System Y is telling me that I would get the same answer if I started at 341 and counted 299 steps to reach 640, which is how I would have to do the addition in System X. (I've left out a technical detail here: I am assuming that numbers already have a decimal representation, and that System X includes rules about counting forward with decimal numbers.)
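The two systems in this example can be sketched in Python (the function names are mine, chosen for illustration):

```python
def add_by_counting(a, b):
    """System X: add by counting upward b times from a."""
    result = a
    for _ in range(b):
        result += 1   # one counting step
    return result

def add_by_columns(a, b):
    """System Y: the usual digit-by-digit algorithm with carrying."""
    xs, ys = str(a)[::-1], str(b)[::-1]   # digits, least significant first
    digits, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        d = int(xs[i]) if i < len(xs) else 0
        e = int(ys[i]) if i < len(ys) else 0
        carry, r = divmod(d + e + carry, 10)
        digits.append(str(r))
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))

print(add_by_columns(341, 299))   # 640, in a handful of digit steps
print(add_by_counting(341, 299))  # 640, after 299 counting steps
```

The column method reaches the same answer in a handful of digit operations, where counting takes 299 steps — which is precisely the practical benefit of the "meta" system.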

One of the more profound discoveries of modern mathematics (made by
the Austrian mathematician
Kurt Gödel) is that there are limitations
to how much mathematics can be applied even to itself. In particular, no mathematical system
which has an interpretation as describing itself can be *complete* as a description
of truths about itself (i.e. there will be statements about the system which are true, but which cannot be proven within the system),
and no such system can be used to prove its own *consistency*
(i.e. to prove that it won't give wrong answers).

However these limitations do not alter the fact that a lot of mathematics is about other mathematics, and that such "meta-mathematics" often saves us a lot of work in practice.

In theory, mathematics and **computation** are the same thing. This follows because we can describe
computation in the same terms in which I formally defined mathematical deduction above:

- Start with data consisting of items belonging to a symbolic language.
- Apply a set of exact rules to generate new data items.

So, for example, we can regard 341 + 299 = 640 as a computation, or we can regard it as a theorem which is proven to be true.

In practice, the difference is that mathematics is something that *people* do, and
computation is something that *machines* do (or which a machine *could* do, even if a person
might do it sometimes, so the distinction relates to the *difficulty* of the thinking involved).

In this article I've tried to explain what I think is the *essence* of mathematics.

To keep it simple, I've left out all sorts of important details, and I've probably even said a few things that aren't completely true. In this section I attempt to make up for some of these shortcomings (or at least confess to them).

In the discussion above, I mocked (within parentheses) the inexact nature of philosophical thinking. But of course this whole article is an article about the Philosophy of Mathematics. And I don't think there is any way it can be reduced to the application of a set of formally defined rules of deduction to a set of initial axioms.

There are some parts of mathematics which explicitly deal with types of knowledge which are
not exact. The biggest of these is **Probability and Statistics**.

It is a peculiar fact about probability that its ultimate definition is somewhat
circular, i.e. the **Law of Large Numbers**, which more or less says that if you observe an
event which has probability p a "large" number of times, you will *probably* observe that it
occurs with a frequency "close" to p.
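The circularity doesn't stop the law from being observable in practice. A quick simulation (a sketch, with arbitrarily chosen parameters) shows the observed frequency settling near the underlying probability:

```python
import random

random.seed(0)                 # fixed seed, so the run is reproducible
p = 1 / 6                      # e.g. the probability of rolling a six
n = 100_000                    # a "large" number of observations
hits = sum(random.random() < p for _ in range(n))
frequency = hits / n
print(frequency)               # "close" to 1/6, i.e. about 0.167
```

Of course, interpreting this output as evidence for the law already presupposes the probabilistic reasoning it is meant to illustrate — which is the circularity in question.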

Another type of mathematics dealing with "inexactness" is **Modal Logic**, which, among
other things, sometimes deals with statements which "might" be true (without assigning any specific
probability to the truth of such statements).

Mathematicians can make mistakes. Computers can also make mistakes (actually computers can make several kinds of mistakes for various reasons, including mistakes by the people who program them, faulty hardware, and "acts of nature" such as high-energy cosmic rays).

There are even areas of mathematics which deal with methods for reducing the probability of errors in systems which can't avoid the occurrence of certain types of errors. In effect one can think exactly about how to mitigate the inexactness of exact thinking.

A mathematical proof is defined by a precise series of formal steps. But in practice, actually
filling in all the details is a lot of work. When publishing or explaining a proof,
mathematicians typically provide enough detail that a reader could, in principle,
fill in *all* the details if they were so inclined.

However, a modern alternative is the **interactive theorem assistant**, such
as Coq or Isabelle.
These software tools are like very strict math journal reviewers that refuse to
accept a proof unless *every single required detail* is provided (or
provided in a manner such that the proof assistant itself can fill in any gaps).
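To make this concrete, here is a minimal (and admittedly trivial) example of the kind of fully formal proof such a tool accepts, written in Coq:

```coq
(* A complete, machine-checked proof: Coq rejects the proof
   script if any required step is missing or invalid. *)
Theorem two_plus_two : 2 + 2 = 4.
Proof. reflexivity. Qed.
```

Real formalised theorems are, of course, vastly longer, but the standard is the same: every step must be justified to the satisfaction of the machine.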

Furthermore, for many of these assistants there are proofs of major mathematical theorems which have been posted online and proven true to the satisfaction of the assistants. (For example, see here for an archive of theorems proven using Isabelle.)

"Standard" set-theory combined with "classical" logic is supposedly "about" something, yet some of its axioms are explicitly "non-constructive", which implies that they have no computational meaning.

On the one hand, such non-constructive mathematics is simpler to do than explicitly constructive mathematics (because with non-constructive mathematics you believe in "more truths" to start with, which makes it easier to prove new theorems); on the other hand, the value of theorems proved within it is less certain.

As it happens, most of modern mathematics is supposedly based on this foundation of non-constructive set theory; however, in practice much of the mathematics which actually matters for real-world applications can be proven using constructive methods only.

In my informal definition of formal deduction given above, I did not say anything about
the *order* in which rules are to be applied, and this is because *there is no
specified order*. At any point in a procedure for proving new theorems, one has to *choose*
which deduction to apply next.

Much of the "skill" of human mathematicians consists of deciding which choice to make next.
Since any choice from the list of available choices is valid, it follows that, once we are restricted
to the set of valid choices, *there are no rules as to which choice should be made*.
Mathematics is all about following rules, yet the very doing of mathematics in a practical sense
involves making choices, and there are *no rules at all* saying which choice should be made.

The caveat "in a practical sense" matters, because if we are prepared to be enormously patient, then it is always possible to define a deterministic enumeration of all possible deductions which is guaranteed to eventually enumerate every provable theorem in a mathematical system. One could even say that the "art" of mathematics consists of picking a finite amount of "good stuff" out of an infinite amount of "junk", and getting there sooner rather than later.
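Such a deterministic enumeration can be sketched as a breadth-first search over deductions. Here is a toy version in Python, using two string-rewriting rules as stand-ins for a real system's rules of deduction:

```python
from collections import deque

def deductions(t):
    """Two toy string-rewriting rules of deduction."""
    out = set()
    if t.endswith("I"):       # xI -> xIU
        out.add(t + "U")
    if t.startswith("M"):     # Mx -> Mxx
        out.add("M" + t[1:] * 2)
    return out

def enumerate_theorems(axiom, limit):
    """Breadth-first enumeration: every provable theorem is
    guaranteed to appear eventually (given enough patience)."""
    seen, queue, order = {axiom}, deque([axiom]), []
    while queue and len(order) < limit:
        t = queue.popleft()
        order.append(t)
        for u in sorted(deductions(t)):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return order

print(enumerate_theorems("MI", 8))  # the first 8 theorems, in order
```

The enumeration is complete but utterly indiscriminate: the "good stuff" appears in no privileged position among the "junk", which is why mechanical enumeration is no substitute for mathematical judgement.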

Which raises a further question: What is "good mathematics"? This is an open-ended subject
in itself, and an interesting read on *that* subject is
What is good mathematics? by Terence Tao.