Physlib

QuantumInfo.ClassicalInfo.Entropy

Shannon entropy

Definitions and facts about the Shannon entropy function -x*ln(x), both on a single variable and on a distribution.

There is significant overlap with `Real.negMulLog` and `Real.binEntropy` in Mathlib, and probably these files could be combined in some form.

14 declarations

definition

One-event Shannon entropy H1(p)=plnpH_1(p) = -p \ln p

The one-event entropy function H1:[0,1]RH_1: [0, 1] \to \mathbb{R} maps a probability pp to its Shannon entropy, defined as H1(p)=plnpH_1(p) = -p \ln p. Here, pp is an element of the unit interval [0,1][0, 1], and by convention (inherited from the underlying `negMulLog` function), 0ln0=00 \ln 0 = 0.

definition

H1(0)=0H_1(0) = 0

For the Shannon entropy function H1:[0,1]RH_1: [0, 1] \to \mathbb{R} defined by H1(p)=pln(p)H_1(p) = -p \ln(p), the value of the function at p=0p = 0 is equal to 00.

definition

H1(1)=0 H_1(1) = 0

The Shannon entropy function H1 H_1 evaluated at the probability p=1 p = 1 is equal to 0 0 .

theorem

H1(p)0H_1(p) \ge 0 for p[0,1]p \in [0, 1]

Let p[0,1]p \in [0, 1] be a probability. The Shannon entropy of pp, denoted by H1(p)H_1(p), is non-negative, i.e., 0H1(p)0 \le H_1(p).

theorem

H1(p)<1H_1(p) < 1 for all pProbp \in \text{Prob}

For any probability pp in the unit interval [0,1][0, 1], the Shannon entropy H1(p)H_1(p) is strictly less than 1.

theorem

H1(p)1/eH_1(p) \le 1/e

For any probability p[0,1]p \in [0, 1], the Shannon entropy H1(p)H_1(p) satisfies the inequality H1(p)e1H_1(p) \le e^{-1}, where ee is Euler's number.

theorem

Concavity of the Shannon Entropy H1H_1

For any probabilities x,y[0,1]x, y \in [0, 1] and any weight p[0,1]p \in [0, 1], the Shannon entropy function H1H_1 is concave, satisfying the inequality: pH1(x)+(1p)H1(y)H1(px+(1p)y) p \cdot H_1(x) + (1 - p) \cdot H_1(y) \le H_1(p \cdot x + (1 - p) \cdot y) where p[ab]p[a \leftrightarrow b] denotes the convex combination pa+(1p)bp \cdot a + (1 - p) \cdot b.

definition

Shannon entropy HsH_s of a discrete distribution

Let α\alpha be a finite set and dd be a probability distribution on α\alpha. The Shannon entropy Hs(d)H_s(d) of the distribution is defined as the sum over all elements xαx \in \alpha of the individual entropy contributions H1(px)H_1(p_x), where pxp_x is the probability assigned to xx by dd. Mathematically, this is expressed as: Hs(d)=xαH1(d(x))H_s(d) = \sum_{x \in \alpha} H_1(d(x)) where H1(p)=pln(p)H_1(p) = -p \ln(p) for p[0,1]p \in [0, 1].

theorem

The Shannon entropy Hs(d)H_s(d) is non-negative

Let α\alpha be a finite set and let dd be a probability distribution on α\alpha. The Shannon entropy of the distribution, denoted by Hs(d)H_s(d), is non-negative, i.e., 0Hs(d)0 \leq H_s(d).

theorem

Hs(d)lnαH_s(d) \le \ln |\alpha|

Let α\alpha be a finite set and dd be a probability distribution on α\alpha. The Shannon entropy of the distribution, denoted Hs(d)H_s(d), is less than or equal to the natural logarithm of the cardinality of α\alpha, i.e., Hs(d)lnαH_s(d) \le \ln |\alpha|.

theorem

The Shannon entropy of a constant distribution is 00

Let α\alpha be a type and iαi \in \alpha be an element. Let d=δid = \delta_i be the constant probability distribution centered at ii (where δi(y)=1\delta_i(y) = 1 if y=iy = i and 00 otherwise). Then the Shannon entropy of this distribution is zero, i.e., H(d)=0H(d) = 0.

theorem

Hs(uniform)=lnαH_s(\text{uniform}) = \ln |\alpha|

Let α\alpha be a non-empty finite set. For the uniform probability distribution PP on α\alpha, the Shannon entropy Hs(P)H_s(P) is equal to the natural logarithm of the cardinality of α\alpha, denoted lnα\ln |\alpha|.

theorem

Hs(coin p)=Hb(p)H_s(\text{coin } p) = H_b(p)

Let p[0,1]p \in [0, 1] be a probability. The Shannon entropy HsH_s of a coin distribution (a two-event distribution) with success probability pp, denoted by ProbDistribution.coin p\text{ProbDistribution.coin } p, is equal to the binary entropy of pp, denoted by Hb(p)H_b(p) (or `Real.binEntropy p`).

theorem

Shannon entropy HsH_s is invariant under reordering of probabilities

Let d1d_1 and d2d_2 be probability distributions on finite sets α\alpha and β\beta, respectively. If the multiset of probabilities occurring in d1d_1 is equal to the multiset of probabilities occurring in d2d_2, then the Shannon entropy of d1d_1 is equal to the Shannon entropy of d2d_2, denoted Hs(d1)=Hs(d2)H_s(d_1) = H_s(d_2).