Problem Set 11 Compression of a Random Source



EECS 126: Probability and Random Processes
Problem Set 11

1. Midterm
Solve all of the problems on the midterm again (including the ones you got correct).
2. Compression of a Random Source
Suppose I’m trying to send a text message to my friend. In general, I know I need log2
bits for every letter I want to send because there are 26 letters in the alphabet. However, it
turns out if I have some information on the distribution of the letters, I can do better. For
example, I might give the letter e a shorter bit representation because I know it’s the most
common. Actually, it turns out the number of bits I need on average is the entropy, and in
this problem, we try to show why this is true in general.
Let (Xi)∞
i.i.d. ∼ p(·), where p is a discrete PMF on a finite set X . We know the entropy of a
random variable X is
H(X) = −
p(x) log2 p(x)
Since entropy is really a function of the distribution, we could write the entropy as H(p).
(a) Show that

log2 p(X1, . . . , Xn)
n→∞ −−−→ H(X1) almost surely.
(Here, we are extending the notation p(·) to denote the joint PMF of (X1, . . . , Xn):
p(x1, . . . , xn) := p(x1)· · · p(xn).)
(b) Fix  > 0 and define A
 as the set of all sequences (x1, . . . , xn) ∈ X n
such that:
−n(H(X1)+) ≤ p(x1, . . . , xn) ≤ 2
Show that P((X1, . . . , Xn) ∈ A
 ) > 1 −  for all n sufficiently large. Consequently,
 is called the typical set because the observed sequences lie within A
 with high
(c) Show that (1 − )2n(H(X1)−) ≤ |A
 | ≤ 2
, for n sufficiently large. Use the
union bound.
Parts (b) and (c) are called the asymptotic equipartition property (AEP) because
they say that there are ≈ 2
nH(X1) observed sequences which each have probability
≈ 2
. Thus, by discarding the sequences outside of A
 , we need only keep track
of 2nH(X1)
sequences, which means that a length-n sequence can be compressed into
≈ nH(X1) bits, requiring H(X1) bits per symbol.
(d) Now show that for any δ > 0 and any positive integer n, if Bn ⊆ X n
is a set with
|Bn| ≤ 2
, then P((X1, . . . , Xn) ∈ Bn) → 0 as n → ∞.
This says that we cannot compress the observed sequences of length n into any set smaller
than size 2nH(X1)
[Hint: Consider the intersection of Bn and A
(e) Next we turn towards using the AEP for compression. Recall that in order to encode
a set of size n in binary, it requires dlog2 ne bits. Therefore, a na¨ıve encoding requires
|X |e bits per symbol.
From (b) and (d), if we use log2
 | ≈ nH(X1) bits to encode the sequences in A
ignoring all other sequences, then the probability of error with this encoding will tend
to 0 as n → ∞, and thus an asymptotically error-free encoding can be achieved using
H(X1) bits per symbol.
Alternatively, we can create an error-free code by using 1 + dlog2
 |e bits to encode
the sequences in A
 and 1 + ndlog2
|X |e bits to encode other sequences, where the first
bit is used to indicate whether the sequence belongs in A
 or not. Let Ln be the length
of the encoding of X1, . . . , Xn using this code; show that limn→∞ E[Ln]/n ≤ H(X1) + .
In other words, asymptotically, we can compress the sequence so that the number of bits
per symbol is arbitrary close to the entropy.


There are no reviews yet.

Be the first to review “Problem Set 11 Compression of a Random Source”

Your email address will not be published.