Geometric Interpretation of the Channel Capacity Theorem

Papoulis [1] shows that a band-limited signal ${x\left(t\right)}$ may be written

$\displaystyle{}x\left(t\right)=\sum_{j=-\infty}^\infty x_jS_j\left(t\right)=\sum_{j = - \infty}^\infty x_j\,\frac{\sin\left(2\pi Bt-j\pi\right)}{2\pi Bt - j\pi},\ \ \ \ \ (1)$

where ${x_j}$ is the value of ${x\left(t\right)}$ at the ${j^{\rm th}}$ sample time and ${B}$ is the bandwidth. The signal spectrum is assumed to vanish for frequencies outside the domain ${\left[-B,B\right]}$. Reza [2] considers a signal of this type that is approximately limited to a time interval of duration ${T}$, i.e., the signal vanishes for times ${t}$ outside the domain ${\left[-T/2,T/2\right]}$. The duration of the functions $S_j\left(t\right)$ is on the order of $1/B$ and the time between samples is ${1/2B}$. The duration of the signal $x\left(t\right)$ is thus given approximately as (number of samples + 1)$/2B$. The samples taken at times $j/2B$ within $\left[-T/2,T/2\right]$ are those with $\left|j\right|\le BT$, and since ${BT\gg1}$ is assumed throughout these calculations,

$\displaystyle x\left(t\right)\approx\sum_{j = - BT}^{BT} x_j\,\frac{\sin\left(2\pi Bt-j\pi\right)}{2\pi Bt -j\pi}.$
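As a quick numerical check of this truncated expansion, the sketch below reconstructs a band-limited test signal from its samples. This is an illustration only; the test signal, bandwidth ${B}$, and interval ${T}$ are assumed values, not taken from the references.

```python
import numpy as np

# Sketch of Eq. (1): reconstruct a band-limited signal from samples taken
# at the spacing 1/2B.  np.sinc(u) = sin(pi u)/(pi u), so
# np.sinc(2*B*t - j) equals S_j(t) as written in Eq. (1).
B = 4.0                              # bandwidth in Hz (assumed for the example)
T = 10.0                             # interval length in s, so BT = 40 >> 1

def x_true(t):
    # test signal containing only frequencies below B = 4 Hz
    return np.sin(2 * np.pi * 1.3 * t) + 0.5 * np.cos(2 * np.pi * 3.1 * t)

j = np.arange(-int(B * T), int(B * T) + 1)   # sample indices, about 2BT of them
x_j = x_true(j / (2 * B))                    # x_j = x(j/2B)

t = np.linspace(-T / 4, T / 4, 1001)         # interior points, away from the edges
x_rec = np.sum(x_j[:, None] * np.sinc(2 * B * t[None, :] - j[:, None]), axis=0)

print(np.max(np.abs(x_rec - x_true(t))))     # small interior truncation error
```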

Next, the average power ${P_x}$ in the signal is computed

$\displaystyle P_x = \frac{1}{T}\int_{-T/2}^{T/2}\left[x\left(t\right)\right]^2\,dt\approx\frac{1}{T}\int_{-\infty}^{\infty}\left[x\left(t\right)\right]^2\,dt$

$\displaystyle =\frac{1}{2BT}\sum_{j=-BT}^{BT} x_j^2\ \ \ \ \ \ \ \ \ \ (2)$

as shown in detail in [3]. It turns out that the functions $S_j\left(t\right)$ in Eq. (1) form an orthogonal set [2, 3]. Thus Reza argues that the sum in Eq. (2) can be viewed as the norm squared of a vector in a ${2BT}$ dimensional vector space. The vector coordinates are given by the ${x_j}$; since the sum contains ${2BT+1}$ terms, ${x_0 = 0}$ is assumed so that exactly ${2BT}$ coordinates remain. If the length of the vector is ${d_x}$ then from Eq. (2)

$\displaystyle d_x=\sqrt{2BTP_x}.$
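
Eq. (2) and the resulting vector length can be checked numerically. The sketch below uses randomly chosen sample values (an assumption made for the illustration) and compares the time average of $\left[x\left(t\right)\right]^2$ with the sample sum:

```python
import numpy as np

# Numerical check of Eq. (2) and of d_x = sqrt(2BT P_x).  The random
# sample values are an assumption made for the illustration.
B, T = 4.0, 40.0                     # BT = 160, so edge effects are small

rng = np.random.default_rng(0)
j = np.arange(-int(B * T), int(B * T) + 1)
x_j = rng.normal(size=j.size)        # coordinates x_j of the signal vector
x_j[j == 0] = 0.0                    # x_0 = 0, as assumed in the text

t = np.linspace(-T / 2, T / 2, 8001)
x_t = np.sum(x_j[:, None] * np.sinc(2 * B * t[None, :] - j[:, None]), axis=0)

P_direct = np.sum(x_t ** 2) * (t[1] - t[0]) / T   # (1/T) * integral of x(t)^2
P_sum = np.sum(x_j ** 2) / (2 * B * T)            # right-hand side of Eq. (2)

print(P_direct, P_sum)                            # nearly equal
print(np.sqrt(2 * B * T * P_sum))                 # d_x, the vector length
```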

Thus in the Gaussian noise channel with input ${X}$, output ${Y}$ and noise ${N}$ satisfying

$\displaystyle Y=X+N,$

the input signal is represented by a point a distance $d_x$ from the origin in the $2BT$ dimensional space. The output signal is represented by a point a distance

$\displaystyle d_y=\sqrt{2BT\left(P_x+P_n\right)},$

from the origin, given that the input signal and the noise are uncorrelated. The noise is represented by a point a distance

$\displaystyle d_n=\sqrt{2BTP_n},$

from the origin.
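
The expression for ${d_y}$ uses the fact that uncorrelated signal and noise vectors in a high dimensional space are nearly orthogonal, so their squared lengths add: ${d_y^2\approx d_x^2+d_n^2}$. A minimal sketch (with assumed powers ${P_x}$ and ${P_n}$ and an assumed dimension standing in for ${2BT}$) illustrates this:

```python
import numpy as np

# Illustration that independent signal and noise vectors are nearly
# orthogonal in high dimension, so the squared lengths add.  The powers
# and the dimension are assumed values for the example.
p = 2000                                     # plays the role of 2BT
P_x, P_n = 4.0, 1.0

rng = np.random.default_rng(1)
x = rng.normal(scale=np.sqrt(P_x), size=p)   # signal coordinates
n = rng.normal(scale=np.sqrt(P_n), size=p)   # independent noise coordinates
y = x + n

print(np.linalg.norm(y))                     # measured d_y
print(np.sqrt(p * (P_x + P_n)))              # predicted sqrt(2BT(P_x + P_n))
```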

The requirement for transmission of signals without error in the presence of noise is that the allowable signal points in the ${2BT}$ dimensional space must be separated by a distance of at least twice the length of the noise vector. Each of the received signals is represented by a point on a sphere of radius ${d_y}$ in the ${2BT}$ dimensional space.

The question is now: how many distinct signals (points on the sphere) can be allowed while keeping the separation between the points equal to ${2d_n}$? Enforcing this requirement permits decoding the signal without ambiguity. Alternatively, one can ask how many non-overlapping noise spheres, each of radius ${d_n}$ with its center on the surface of the output signal’s sphere, can be packed onto that surface. Reza argues this problem is equivalent to asking how many spheres of radius ${d_n}$ can be placed within the sphere of radius ${d_y}$, because for ${2BT}$ very large, i.e., in a very high dimensional space, most of the volume of a sphere is close to its surface.
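
The claim about volume concentration follows directly from the ${r^p}$ scaling of volume: the fraction of a ${p}$ dimensional sphere’s volume lying within a fraction ${\epsilon}$ of the surface is ${1-\left(1-\epsilon\right)^p}$, which approaches 1 as ${p}$ grows. A short computation (with illustrative values of ${p}$ standing in for ${2BT}$) makes the point:

```python
# Fraction of a p-dimensional sphere's volume lying within 1% of its
# surface is 1 - 0.99**p, since volume scales as r**p.  The values of p
# (playing the role of 2BT) are illustrative.
for p in [10, 100, 1000, 10000]:
    print(p, 1 - 0.99 ** p)
```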

According to these prescriptions the number of allowed signals ${M}$ is given by

$\displaystyle M\approx\frac{\mbox{Volume of sphere with radius }d_y}{\mbox{Volume of sphere with radius }d_n},$

where the volumes are to be computed in $2BT$ dimensional space. Since the volume of a sphere in ${p}$ dimensional space is proportional to ${r^p}$, where $r$ is the sphere’s radius,

$\displaystyle M\approx\left(\frac{d_y}{d_n}\right)^{2BT}=\left(\frac{P_x+P_n}{P_n}\right)^{BT}.$

The number of bits conveyed by selecting one of the ${M}$ allowed signals is

$\displaystyle \mbox{number of bits}=\log_2 M=BT\log_2\left(1+\frac{P_x}{P_n}\right),$

and so the channel capacity ${C_t}$ in bits/s is

$\displaystyle C_t = \frac{1}{T}\log_2 M=B\log_2\left(1+\frac{P_x}{P_n}\right),$

which is Shannon’s [4] famous result. A more detailed derivation is provided in [3].
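
As a quick illustration (the numbers are chosen for this example, not taken from the references), a telephone-grade channel with bandwidth ${B=3000}$ Hz and signal-to-noise ratio ${P_x/P_n=1000}$ (30 dB) has capacity

$\displaystyle C_t=3000\log_2\left(1+1000\right)\approx3\times10^4\mbox{ bits/s}.$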

References

[1] A. Papoulis, Probability, Random Variables and Stochastic Processes, McGraw-Hill, N.Y. (1965), p. 176.

[2] F. M. Reza, An Introduction to Information Theory, McGraw-Hill, N.Y. (1961), pp. 318–320.

[3] H. L. Rappaport, Notes on Information Theory II and the Geometric Interpretation of the Shannon Channel Capacity, 7G Communications, 7GCTN02 (2014).

[4] C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, vol. 27, pp. 379–423, 623–656 (1948).