Geometric Interpretation of the Channel Capacity Theorem

Papoulis [1] shows that a band-limited signal {x\left(t\right)} may be written

\displaystyle{}x\left(t\right)=\sum_{j=-\infty}^\infty x_jS_j\left(t\right)=\sum_{j = - \infty}^\infty x_j\,\frac{\sin\left(2\pi Bt-j\pi\right)}{2\pi Bt - j\pi},\ \ \ \ \ (1)

where {x_j} is the value of {x\left(t\right)} at the {j^{\rm th}} sample time and {B} is the bandwidth. The signal spectrum is assumed to vanish for frequencies outside the domain {\left[-B,B\right]}. Reza [2] considers a signal of this type that is approximately limited to a time interval of duration {T}, i.e., the signal approximately vanishes for times {t} outside the domain {\left[-T/2,T/2\right]}. The duration of each function {S_j\left(t\right)} is on the order of {1/B} and the time between samples is {1/2B}. The duration of the signal {x\left(t\right)} is thus given approximately by (number of samples + 1)/2B, so about {2BT} samples describe the signal. Now {BT\gg1} is assumed throughout these calculations, so

\displaystyle x\left(t\right)\approx\sum_{j = - BT}^{BT} x_j\,\frac{\sin\left(2\pi Bt-j\pi\right)}{2\pi Bt -j\pi}.
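The truncated series can be checked numerically. The following is a minimal sketch, not taken from the references: the values of `B` and `T`, the random samples, and the helper name `sinc_reconstruct` are all illustrative assumptions. It relies on the fact that NumPy's `np.sinc(u)` equals {\sin(\pi u)/(\pi u)}, so that {S_j\left(t\right)} is `np.sinc(2*B*t - j)`.

```python
import numpy as np

def sinc_reconstruct(samples, indices, B, t):
    """Evaluate x(t) = sum_j x_j * S_j(t) for the truncated series.

    S_j(t) = sin(2*pi*B*t - j*pi) / (2*pi*B*t - j*pi) = np.sinc(2*B*t - j),
    because np.sinc(u) is defined as sin(pi*u) / (pi*u).
    """
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # Rows index the evaluation times, columns index the sample number j.
    basis = np.sinc(2.0 * B * t[:, None] - indices[None, :])
    return basis @ samples

# Illustrative (assumed) parameters: B*T = 32, so j runs from -32 to 32.
B = 4.0   # bandwidth in Hz
T = 8.0   # approximate signal duration in s
rng = np.random.default_rng(0)
indices = np.arange(-int(B * T), int(B * T) + 1)
samples = rng.standard_normal(indices.size)

# At the sample times t_j = j / (2B) the series must return x_j itself,
# since S_j(t_k) is 1 for j == k and 0 otherwise.
sample_times = indices / (2.0 * B)
recovered = sinc_reconstruct(samples, indices, B, sample_times)
max_sample_error = float(np.max(np.abs(recovered - samples)))
print(max_sample_error)
```

Evaluating `sinc_reconstruct` on a finer time grid gives the interpolated signal between the sample times.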

Next, the average power in the signal {P_x} is computed

\displaystyle P_x = \frac{1}{T}\int_{-T/2}^{T/2}\left[x\left(t\right)\right]^2\,dt\approx\frac{1}{T}\int_{-\infty}^{\infty}\left[x\left(t\right)\right]^2\,dt

\displaystyle =\frac{1}{2BT}\sum_{j=-BT}^{BT} x_j^2\ \ \ \ \ \ \ \ \ \ (2)

as shown in detail in [3]. It turns out that the functions {S_j\left(t\right)} in Eq. (1) form an orthogonal set [2, 3]. Thus Reza argues that the sum in Eq. (2) can be viewed as the squared norm of a vector in a {2BT} dimensional vector space. The vector coordinates are given by the {x_j}, and {x_0 = 0} is assumed so that the {2BT+1} terms of the sum reduce to {2BT} coordinates. If the length of the vector is {d_x}, then from Eq. (2)

\displaystyle d_x=\sqrt{2BTP_x}.
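The step from orthogonality to Eq. (2) can be made explicit. As a sketch, assuming the standard normalization of the sampling functions (the factor {1/2B} is stated here as an assumption, not quoted from the references),

\displaystyle \int_{-\infty}^{\infty}S_j\left(t\right)S_k\left(t\right)\,dt=\frac{1}{2B}\,\delta_{jk},

where {\delta_{jk}} is the Kronecker delta, the cross terms in the square of the truncated sum drop out:

\displaystyle \int_{-\infty}^{\infty}\left[x\left(t\right)\right]^2\,dt\approx\sum_{j=-BT}^{BT}\sum_{k=-BT}^{BT}x_jx_k\int_{-\infty}^{\infty}S_j\left(t\right)S_k\left(t\right)\,dt=\frac{1}{2B}\sum_{j=-BT}^{BT}x_j^2.

Dividing by {T} gives Eq. (2), and identifying {d_x^2} with {\sum_j x_j^2} gives {d_x^2=2BTP_x}.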

Thus in the Gaussian noise channel with input {X}, output {Y} and noise {N} satisfying

\displaystyle Y=X+N, \ \ \ \ \

the input signal is represented by a point a distance {d_x} from the origin in the {2BT} dimensional space. The output signal is represented by a point a distance

\displaystyle d_y=\sqrt{2BT\left(P_x+P_n\right)}, \ \ \ \ \

from the origin, given that the input signal and the noise are uncorrelated. The noise is represented by a point a distance

\displaystyle d_n=\sqrt{2BTP_n}, \ \ \ \ \

from the origin.
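The three distances can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of Reza's argument: it draws the signal and noise coordinates as independent zero-mean Gaussian samples, and the dimension and power values are arbitrary. With uncorrelated coordinates the cross term in {\left|X+N\right|^2} is small compared with the squared lengths, so {d_y^2\approx d_x^2+d_n^2}.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 2000            # plays the role of 2BT (assumed value)
P_x, P_n = 4.0, 1.0   # illustrative signal and noise powers

# Independent zero-mean coordinates with per-coordinate variance P_x and P_n.
x = rng.normal(0.0, np.sqrt(P_x), dim)
n = rng.normal(0.0, np.sqrt(P_n), dim)
y = x + n

d_x = np.linalg.norm(x)
d_n = np.linalg.norm(n)
d_y = np.linalg.norm(y)

# Prediction from the text: d_y = sqrt(2BT * (P_x + P_n)).
predicted_d_y = np.sqrt(dim * (P_x + P_n))
relative_gap = abs(d_y - predicted_d_y) / predicted_d_y

# Cosine of the angle between signal and noise vectors; near zero in high dimension.
cosine = float(x @ n) / (d_x * d_n)
print(d_y, predicted_d_y, relative_gap, cosine)
```

The agreement is statistical, and it tightens as `dim` grows, which is consistent with the assumption {BT\gg1}.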

The requirement for transmitting signals without errors caused by noise is that the allowable signal points in the {2BT} dimensional space be separated by a distance of at least twice the length of the noise vector, {2d_n}. Each received signal is represented by a point on a sphere of radius {d_y} in {2BT} dimensional space.

The question is now how many distinct signals (points on the sphere) can be allowed while keeping the separation between any two points at least {2d_n}. Enforcing this requirement permits decoding each signal without ambiguity. Alternatively, one can ask how many non-overlapping noise spheres can be embedded in the surface of the output signal’s sphere. Each noise sphere has radius {d_n} and has its center on the surface of the output signal’s sphere. Reza argues that this problem is equivalent to asking how many spheres of radius {d_n} can be placed within the sphere of radius {d_y} because for {2BT} very large, i.e., in a very high dimensional space, most of the volume of a sphere is close to its surface.
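The claim that most of the volume lies near the surface follows from the {r^p} scaling of volume: the inner ball of radius {(1-\epsilon)r} holds a fraction {(1-\epsilon)^p} of the total. The short sketch below tabulates the remaining shell fraction; the function name `shell_volume_fraction` and the chosen values of `p` and `eps` are illustrative assumptions.

```python
def shell_volume_fraction(p, eps):
    """Fraction of a p-dimensional ball's volume lying within a relative
    distance eps of its surface, i.e. 1 - (1 - eps)**p."""
    return 1.0 - (1.0 - eps) ** p

# Compare a thin shell (1% of the radius) across dimensions.
for p in (3, 20, 200, 2000):
    print(p, shell_volume_fraction(p, 0.01))
```

For a fixed shell thickness the fraction tends to one as `p` increases, which is the property the argument uses when {2BT} is very large.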

According to these prescriptions the number of allowed signals {M} is given by

\displaystyle M\approx\frac{\mbox{Volume of sphere with radius }d_y}{\mbox{Volume of sphere with radius }d_n}\ \ \ \

where the volumes are to be computed in {2BT} dimensional space. Since the volume of a sphere in {p} dimensional space is proportional to {r^p}, where {r} is the sphere’s radius,

\displaystyle M\approx\left(\frac{d_y}{d_n}\right)^{2BT}=\left(\frac{P_x+P_n}{P_n}\right)^{BT}.
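The cancellation behind this ratio can be written out. As a sketch, assuming the standard formula for the volume of a {p} dimensional ball, {V_p\left(r\right)=C_pr^p} with {C_p=\pi^{p/2}/\Gamma\left(p/2+1\right)} (the constant is not given in the text above), the factor {C_p} is the same in numerator and denominator, so

\displaystyle \frac{V_{2BT}\left(d_y\right)}{V_{2BT}\left(d_n\right)}=\frac{C_{2BT}\,d_y^{2BT}}{C_{2BT}\,d_n^{2BT}}=\left(\frac{\sqrt{2BT\left(P_x+P_n\right)}}{\sqrt{2BTP_n}}\right)^{2BT}=\left(\frac{P_x+P_n}{P_n}\right)^{BT}.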

The number of bits sent by these {M} allowed signals is

\displaystyle \mbox{number of bits}=\log_2 M=BT\log_2\left(1+\frac{P_x}{P_n}\right),

and so the channel capacity {C_t} in bits/s is

\displaystyle C_t = \frac{1}{T}\log_2 M=B\log_2\left(1+\frac{P_x}{P_n}\right),

which is Shannon’s [4] famous result. A more detailed derivation is provided in [3] and is available by clicking below.
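As a small numerical illustration of the last two formulas, the sketch below evaluates {\log_2 M} and {C_t} directly. The function names and the example values (a 3000 Hz bandwidth, a signal-to-noise power ratio of 1000, a 2 s duration) are assumptions chosen only for illustration. Working with {\log_2 M} rather than {M} itself avoids overflow, since {M} grows exponentially with {BT}.

```python
import math

def log2_signal_count(B, T, snr):
    """log2 of the number of distinguishable signals, BT * log2(1 + P_x/P_n)."""
    return B * T * math.log2(1.0 + snr)

def capacity_bits_per_second(B, snr):
    """Channel capacity C_t = B * log2(1 + P_x/P_n) in bits/s."""
    return B * math.log2(1.0 + snr)

# Illustrative (assumed) values.
B = 3000.0    # bandwidth in Hz
snr = 1000.0  # P_x / P_n as a power ratio (30 dB)
T = 2.0       # signal duration in s

bits = log2_signal_count(B, T, snr)
capacity = capacity_bits_per_second(B, snr)
print(bits, capacity)
```

Dividing `bits` by `T` reproduces `capacity`, mirroring the step from the number of bits to {C_t}.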


[1] A. Papoulis, Probability, Random Variables and Stochastic Processes, McGraw-Hill, N.Y. (1965), p. 176.

[2] F. M. Reza, An Introduction to Information Theory, McGraw-Hill, N.Y. (1961), pp. 318 – 320.

[3] H. L. Rappaport, Notes on Information Theory II and the Geometric Interpretation of the Shannon Channel Capacity, 7G Communications, 7GCTN02 (2014); infoII

[4] C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal (1948).


