Part 5/8:
Scenarios with heavily skewed probabilities highlight Shannon’s principles even more clearly, and the lottery is a good example. If there is roughly a 1 in a billion chance of winning, that corresponds to a probability of about ( 2^{-30} ). Winning is therefore an event of very high surprise (30 bits), while losing, which happens almost every time, carries essentially no surprise.
This contrast captures the essence of entropy in information theory: it is not simply about having information, but about weighing likely outcomes against unlikely ones.
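As a quick check of those numbers, here is a minimal Python sketch (not from the source) that computes the surprise, or self-information, ( -\log_2 P ) for the hypothetical 1-in-a-billion lottery described above:

```python
import math

# Hypothetical lottery: roughly a 1-in-a-billion chance of winning (2^-30).
p_win = 2 ** -30
p_lose = 1 - p_win

# Surprise (self-information) in bits: -log2(p)
surprise_win = -math.log2(p_win)    # 30.0 bits  -- highly surprising
surprise_lose = -math.log2(p_lose)  # ~1.3e-9 bits -- essentially no surprise

print(f"Surprise of winning: {surprise_win:.1f} bits")
print(f"Surprise of losing:  {surprise_lose:.2e} bits")
```

The rare outcome carries enormous surprise, but because it almost never happens it contributes very little on average, which is exactly the balance entropy measures.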
Defining Entropy
Entropy, denoted by ( H ), measures the average uncertainty, or expected surprise, over a set of possible outcomes weighted by their probabilities. For a discrete random variable ( X ) with outcomes ( x_i ) occurring with probabilities ( P(x_i) ), it is defined as:
[ H(X) = -\sum_{i} P(x_i) \log(P(x_i)) ]
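A short Python sketch (my own illustration, with a hypothetical `entropy` helper) shows how this formula behaves, using base-2 logarithms so the result is in bits:

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p * log2(p), in bits.
    Outcomes with p == 0 contribute nothing (convention: 0 * log 0 = 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Fair coin: maximum uncertainty for two outcomes -> 1 bit
print(entropy([0.5, 0.5]))          # 1.0

# The lottery from above: despite the 30-bit surprise of winning,
# the average uncertainty is tiny because winning is so rare.
p_win = 2 ** -30
print(entropy([p_win, 1 - p_win]))  # ~2.9e-8 bits
```

The fair coin maximizes entropy for two outcomes, while the lottery distribution has almost none: the high-surprise event is too improbable to matter on average.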