## Source Coding Theorem

One way to understand the source coding theorem [1, 2] is to first consider a general discrete source with an output alphabet of ${J}$ symbols. Transmitting a single symbol requires ${\log_2 J}$ bits. Similarly, a block of ${n}$ sequential ordered symbols from the alphabet admits ${J^n}$ possible messages, so it requires ${\log_2 J^n= n\log_2 J}$ bits to send a message of ${n}$ symbols.
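As a quick numerical check of this counting argument, a short sketch (the alphabet size ${J=8}$ and block length ${n=100}$ are illustrative assumptions, not values from the text):

```python
import math

# Hypothetical source: J = 8 symbols, blocks of n = 100 symbols.
J = 8
n = 100

bits_per_symbol = math.log2(J)      # bits to name one of J symbols
bits_per_block = math.log2(J ** n)  # bits to name one of J^n messages

# log2(J^n) = n * log2(J), as stated above.
print(bits_per_symbol, bits_per_block)
```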

Next, specialize to a discrete memoryless source, also with an output alphabet of ${J}$ symbols. Let the ${i^{th}}$ member of the alphabet be denoted ${x_i}$ and let the probability of occurrence of ${x_i}$ be denoted ${p_i}$. Now consider a block of ${n}$ symbols where ${n}$ is very large, and let the expected number of occurrences of ${x_i}$ be denoted ${n_i}$. For ${n}$ very large we have

$\displaystyle n_i = p_in.$

That is to say, for ${n}$ very large the number of occurrences of each of the ${x_i}$ is essentially fixed, and the number of typical messages is not ${J^n}$ but instead the number of distinct permutations of a block of symbols with

$\displaystyle p_0n \mbox{ occurrences of } x_0,$

$\displaystyle p_1n \mbox{ occurrences of } x_1,$

$\displaystyle p_2n \mbox{ occurrences of } x_2,$

$\displaystyle ~~~~~~~~~\vdots$

$\displaystyle p_{J-1}n \mbox{ occurrences of } x_{J-1}.$

So the number of possible messages that can be sent in one block of ${n}$ symbols is

$\displaystyle \mbox{Num. of messages} = \frac{n!}{\left(p_0n\right)!\left(p_1n\right)!\left(p_2n\right)!\ldots\left(p_{J-1}n\right)!}.$
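This multinomial count is easy to evaluate directly. A minimal sketch, assuming a three-symbol alphabet with ${p = (0.5, 0.3, 0.2)}$ and ${n = 10}$ chosen so that each ${p_jn}$ is an integer:

```python
import math

# Occurrence counts p_j * n for the assumed distribution, n = 10.
counts = [5, 3, 2]   # occurrences of x_0, x_1, x_2
n = sum(counts)

# Multinomial coefficient: n! / ((p_0 n)! (p_1 n)! ... (p_{J-1} n)!)
num_messages = math.factorial(n)
for c in counts:
    num_messages //= math.factorial(c)

print(num_messages)  # 10! / (5! 3! 2!) = 2520
```

So only 2520 of the ${3^{10} = 59049}$ possible length-10 blocks are typical under this distribution.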

Therefore, the number of bits required to send these messages or, equivalently, the number of bits needed to indicate to a receiver which of these messages was transmitted would be

$\displaystyle{}N_{\rm bits}=\log_2\left[\frac{n!}{\left(p_0n\right)!\left(p_1n\right)!\left(p_2n\right)!\ldots\left(p_{J-1}n\right)!}\right]$

and

$\displaystyle{}N_{\rm bits}\ln 2=\ln n!-\sum_{j=0}^{J-1}\ln\left[\left(p_jn\right)!\right].$

Using Stirling's approximation ${\ln n! \approx n\ln n - n}$ [3] shows

$\displaystyle{}N_{\rm bits}\ln 2=n\ln n-n-\sum_{j=0}^{J-1}\left[p_j n\ln\left(p_j n\right)-p_j n\right]$

$\displaystyle=n\ln n-n-n\sum_{j=0}^{J-1}\left[p_j\ln p_j + p_j\ln n-p_j\right]$

$\displaystyle=-n\sum_{j=0}^{J-1}p_j\ln p_j.$

Dividing by ${\ln 2}$ to convert back to base 2,

$\displaystyle{}N_{\rm bits}=-n\sum_{j=0}^{J-1}p_j\log_2 p_j=nH\left({\cal X}\right),$

where ${H\left({\cal X}\right)}$ is the entropy of the alphabet.
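The derivation above can be checked numerically: the exact bit count ${\log_2}$ of the multinomial coefficient, divided by ${n}$, should approach ${H\left({\cal X}\right)}$ as ${n}$ grows. A sketch, again assuming the illustrative distribution ${p = (0.5, 0.3, 0.2)}$:

```python
import math

# Assumed distribution for x_0, x_1, x_2.
p = [0.5, 0.3, 0.2]
H = -sum(pj * math.log2(pj) for pj in p)  # entropy in bits/symbol

for n in (10, 100, 1000, 10000):
    counts = [round(pj * n) for pj in p]  # expected occurrences p_j * n
    m = math.factorial(n)                 # multinomial coefficient
    for c in counts:
        m //= math.factorial(c)
    # Exact bits per symbol vs. the entropy; gap shrinks like O(log n / n).
    print(n, math.log2(m) / n, H)
```

The Stirling error term is ${O(\log n)}$ in total bits, so the per-symbol rate converges to ${H\left({\cal X}\right)}$ from below.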

Of course, more bits can always be used to code all of these different messages, but this represents the minimum needed when ${n}$ is very large. The minimum required average number of bits per transmitted symbol, denoted ${\bar L_{\rm min}}$, thus satisfies

$\displaystyle{}\bar L_{\rm min}=\frac{N_{\rm bits}}{n}=H\left({\cal X}\right).$

This result is the source coding theorem [1, 2]. We have not shown that uniquely decodable codes exist that operate on one symbol at a time and achieve this limit; we have only shown that the limit exists.

I have provided a more detailed write-up introducing information theory in [4].

References

[1] S. Haykin, Digital Communications, Wiley, N.Y. (1988), Chapter 2.

[2] C.E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal (1948).

[3] M. Abramowitz and I. Stegun, Handbook of Mathematical Functions, Dover, N.Y. (1965), pp. 255 – 257.

[4] H.L. Rappaport, Notes on Information Theory I and the Source Coding Theorem, 7G Communications, 7GCTN01 (2014).