Part 3/8:
To illustrate the principles of information theory, one can start with a simple experiment: flipping a coin. When predicting the outcome of a single fair coin flip, heads and tails are equally likely, each with probability 1/2. If the result turns out to be heads, one could say they are "one bit surprised." This follows the mathematical definition of surprise (also called self-information): the surprise of an event with probability p is log2(1/p) bits, so an outcome with probability 1/2 carries log2(2) = 1 bit.
When two coins are flipped, the number of possible outcomes rises. There are now four equally likely results: HH, HT, TH, and TT, each with probability 1/4. Observing any one of them yields log2(4) = 2 bits of surprise. This establishes a foundational relationship between the predictability of events and the amount of information conveyed: the less probable an outcome, the more information its occurrence carries.
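The calculation above can be sketched in a few lines of Python. The function below implements the standard self-information formula, surprise = -log2(p); the name `surprise_bits` is chosen here for illustration.

```python
import math

def surprise_bits(p: float) -> float:
    """Surprise (self-information) of an event with probability p, in bits."""
    return -math.log2(p)

# One fair coin flip: p = 1/2 -> 1 bit of surprise
print(surprise_bits(1 / 2))   # 1.0

# Two fair coin flips: each of HH, HT, TH, TT has p = 1/4 -> 2 bits
print(surprise_bits(1 / 4))   # 2.0
```

Because the probabilities are exact powers of two, the surprises come out as whole numbers of bits; for other probabilities the result is fractional.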