Probability is logic

Tivadar Danka
Classical versus probabilistic thinking

Understanding math will make you a better engineer.

So, I am writing the best and most comprehensive book about it.

"Probability is the logic of science."

There is a deep truth behind this conventional wisdom: probability is the mathematical extension of logic, augmenting our reasoning toolkit with the concept of uncertainty.

In-depth exploration of probabilistic thinking incoming.

Our journey ahead has three stops:

  1. an introduction to mathematical logic,
  2. a touch of elementary set theory,
  3. and finally, understanding probabilistic thinking.

First things first: mathematical logic.

Mathematical logic 101

In logic, we work with propositions. A proposition is a statement that is either true or false, like "it's raining outside" or "the sidewalk is wet". These are often abbreviated as variables, such as

A = "it's raining outside".

We can formulate complex propositions from smaller building blocks with logical connectives.

For example, consider the proposition "if it is raining outside, then the sidewalk is wet". This is the combination of two propositions, connected by the implication connective.

There are four essential connectives:

• NOT (¬), also known as negation,
• AND (∧),
• OR (∨),
• THEN (→), also known as implication.

Connectives are defined by the truth values of the resulting propositions. For instance, if A is true, then NOT A is false; if A is false, then NOT A is true. Denoting true by 1 and false by 0, we can describe connectives with truth tables. Here is the one for negation (¬).

NOT truth table

AND (∧) and OR (∨) connect two propositions. A ∧ B is true if both A and B are true, and A ∨ B is true if at least one of them is.

AND and OR truth table

The implication connective THEN (→) formalizes the deduction of a conclusion B from a premise A.

By definition, A → B is true if B is true or A is false; it is false only when A is true and B is false. An example: if "it's raining outside", THEN "the sidewalk is wet".

Implication truth table
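The four connectives can be checked directly in code. Here is a minimal sketch in Python, with booleans as truth values; the function names are my own shorthand for the connectives above.

```python
from itertools import product

# The four essential connectives as functions on truth values.
def NOT(a):
    return not a

def AND(a, b):
    return a and b

def OR(a, b):
    return a or b

def THEN(a, b):
    # A -> B is false only when A is true and B is false.
    return (not a) or b

# Print the combined truth table, using 1 for true and 0 for false.
print("A B | A AND B | A OR B | A THEN B")
for a, b in product([True, False], repeat=2):
    print(int(a), int(b), "|",
          int(AND(a, b)), "|", int(OR(a, b)), "|", int(THEN(a, b)))
```

Running the loop reproduces the truth tables row by row, including the perhaps surprising fact that A → B is true whenever A is false.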

Science is just a collection of complex propositions like "if X is a closed system, THEN the entropy of X cannot decrease". (This is the second law of thermodynamics.)

The entire body of scientific knowledge is made of A → B propositions.

In practice, our thinking process is the following: "I know that A → B is true and A is true. Therefore, B must be true as well."

This is called modus ponens, the cornerstone of scientific reasoning. (If you don't understand modus ponens, take a look at the truth table of the → connective, a few paragraphs above. The case when A → B is true and A is true is described by the very first row, which can only happen if B is true as well.)

Set theory = logic (more or less)

Logical connectives can be translated to the language of sets.

Union (∪) and intersection (∩), two fundamental operations, are particularly relevant to us. Notice how similar the symbols for AND (∧) and intersection (∩) are? This is not an accident.

Union and intersection

By definition, an element x belongs to A ∩ B if and only if (x is an element of A) AND (x is an element of B).

Similarly, union corresponds to the OR connective, as the figure below shows.

Logical connectives as set operations

What's most important for us is that the implication connective THEN (→) corresponds to the "subset of" relation, denoted by the ⊆ symbol.

Implication as subset relation
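The dictionary between connectives and set operations can be verified with Python's built-in sets. A small sketch, with example events of my own choosing:

```python
# Events as sets: logical connectives become set operations.
A = {1, 2, 3, 4}   # e.g. "the die shows less than five"
B = {2, 4, 6}      # e.g. "the die shows an even number"

print(A & B)   # intersection corresponds to AND
print(A | B)   # union corresponds to OR

# Implication corresponds to the subset relation:
# "x in C implies x in B" holds for all x exactly when C is a subset of B.
C = {2, 4}
print(C <= B)  # True: every element of C is in B
print(all((x not in C) or (x in B) for x in range(1, 7)))  # same check, via THEN
```

The last two lines make the correspondence concrete: checking C ⊆ B element by element is literally evaluating the implication "x ∈ C → x ∈ B".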

Now that we understand how to formulate scientific truths as "premise \to conclusion" statements and see how this translates to sets, we are finally ready to talk about probability.

Probability as logic

What is the biggest flaw of mathematical logic? That we rarely have all the information needed to decide whether a proposition is true or false.

Consider the following: "it'll rain tomorrow". During the rainy season, all we can say is that rain is more likely, but tomorrow can be sunny as well.

Probability theory generalizes classical logic by measuring truth on a scale between 0 and 1, where 0 is false and 1 is true. If the probability of rain tomorrow is 0.9, it means that rain is significantly more likely, but not absolutely certain.

Instead of propositions, probability operates on events. In turn, events are represented by sets.

For example, if I roll a die, the event "the result is less than five" is represented by the set A = {1, 2, 3, 4}. In fact, P(A) = 4/6 = 2/3. (P denotes the probability of an event.)
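For a uniform sample space like a fair die, an event's probability is simply its size divided by the size of the sample space. A minimal sketch, using exact fractions:

```python
from fractions import Fraction

# Sample space of a fair die.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    # Uniform probability: P(A) = |A| / |omega|.
    return Fraction(len(event & omega), len(omega))

A = {1, 2, 3, 4}  # "the result is less than five"
print(P(A))       # 2/3
```

Using Fraction keeps the arithmetic exact, so 4/6 is automatically reduced to 2/3.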

As discussed earlier, the logical connectives AND and OR correspond to basic set operations: AND is intersection, OR is union.

This translates to probabilities as well.

Probability and set operations

How can probability be used to generalize the logical implication? A "probabilistic A → B" should represent the likelihood of B, given that A is observed. This is formalized by conditional probability.

Conditional probability

(If you want to know more about conditional probabilities, here is a brief explainer.)

At the deepest level, the conditional probability P(B | A) is the mathematical formulation of our belief in the hypothesis B, given the empirical evidence A. A high P(B | A) means that B is likely to happen when A is observed.

High conditional probability

On the other hand, a low P(B | A) means that B is unlikely to happen when A occurs. This is why probability is called the logic of science.

Low conditional probability
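In the set picture, conditional probability has a simple formula: P(B | A) = P(A ∩ B) / P(A). A sketch continuing the die example, with events of my own choosing:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # sample space of a fair die

def P(event):
    # Uniform probability: P(A) = |A| / |omega|.
    return Fraction(len(event & omega), len(omega))

def P_cond(B, A):
    # P(B | A) = P(A ∩ B) / P(A), the probabilistic analogue of A -> B.
    return P(A & B) / P(A)

A = {1, 2, 3, 4}  # "the result is less than five"
B = {2, 4, 6}     # "the result is even"
print(P_cond(B, A))  # 1/2
```

Conditioning on A shrinks the sample space to A itself: among the four outcomes less than five, exactly two are even, hence 1/2.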

To give you a concrete example, let's go back to the one mentioned earlier: the rain and the wet sidewalk. For simplicity, denote the events by

A = "the sidewalk is wet",
B = "it's raining outside".

The sidewalk can be wet for many reasons, say the neighbor just watered the lawn. Yet, the primary cause of a wet sidewalk is rain, so P(B | A) is close to 1. If somebody comes in and tells you that the sidewalk is wet, it is safe to infer rain.

Probabilistic inference like the above is the foundation of machine learning.

For instance, the output of (most) classification models is the distribution of class probabilities, given an observation.

Probability as inference
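This is easy to see in code. Most classifiers produce raw scores that are turned into a probability distribution over the classes, commonly with the softmax function. A minimal sketch; the scores and class names are made up for illustration:

```python
import math

def softmax(scores):
    # Turn raw classifier scores (logits) into a probability distribution:
    # exponentiate, then normalize so the entries sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]   # hypothetical scores for "cat", "dog", "bird"
probs = softmax(logits)

print(probs)       # each entry is P(class | observation)
print(sum(probs))  # the entries sum to 1
```

Each output entry plays the role of a conditional probability P(class | observation), which is exactly the probabilistic inference described above.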

To wrap up, here is how Maxwell — the famous physicist — thinks about probability.

"The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind. — James Clerk Maxwell"

By now, you can fully understand what Maxwell meant.

Having a deep understanding of math will make you a better engineer.

I want to help you with this, so I am writing a comprehensive book that takes you from high school math to the advanced stuff.
Join me on this journey and let's do this together!