Monthly Archives: February 2008

Journal Club: A Mathematical Theory of Communication by Claude Shannon


As promised before, I have finally worked through the majority of this paper, enough to give a brief introduction and discussion.

The key point of this paper is to demonstrate the importance of statistical analysis in determining how much information a source generates and how much a channel can transmit. The measure H, or entropy, can be thought of as the amount of uncertainty in a communication system. This lets us define the theoretical capacity of a communication system from the known statistical properties of its constituents, and apply the same analysis to practical systems.

Information entropy measures how uncertain we are about the next symbol a source will produce. Although the concept is rooted in statistical mechanics, the intuition is simple: highly predictable information carries little uncertainty, and therefore has lower entropy, than more random information. From this measure of information entropy, we can determine the number of bits needed to efficiently encode the information, or to put it another way, how many symbols we can transmit per bit (assuming a digital communication medium). Although the case of a uniform probability distribution over all information symbols is the easiest to analyze and leads to the highest entropy, most practical applications have particular statistical distributions for symbol/information generation. Shannon goes to lengths to demonstrate this with the English language, noting that the selection of letters, or even words, is highly structured and far from random. This structure is a measure of the redundancy of information, so that if I typ like ths, you cn stil undersnd me. (Spammers have been rediscovering this fact for years.)
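To make the entropy measure concrete, here is a minimal sketch that computes H for a uniform source and for a skewed one. The four-symbol alphabet and its probabilities are made up for illustration and do not come from the paper; redundancy is taken as one minus the ratio of the actual entropy to the maximum possible entropy.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy H = -sum(p * log2(p)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical four-symbol sources (illustrative numbers only).
uniform = [0.25, 0.25, 0.25, 0.25]   # every symbol equally likely
skewed = [0.70, 0.15, 0.10, 0.05]    # one symbol dominates

h_uniform = entropy_bits(uniform)    # maximum entropy for 4 symbols
h_skewed = entropy_bits(skewed)      # lower, since the source is more predictable

# Redundancy: the fraction of the maximum entropy that the source does not use.
redundancy = 1 - h_skewed / h_uniform

print(f"uniform: {h_uniform:.3f} bits/symbol")
print(f"skewed:  {h_skewed:.3f} bits/symbol")
print(f"redundancy of skewed source: {redundancy:.1%}")
```

The skewed source needs fewer bits per symbol on average, which is the same effect that lets heavily abbreviated English remain readable.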

Once the information entropy of every stage in the communication system is determined, the channel capacity can be determined in symbols per second, given the raw channel bit rate and the error rate. Shannon gives a fine example of a digital channel operating at 1000 bits/s with a 1% error rate, leading to an effective bit rate of ~919 bits/s once the uncertainty introduced by the errors is accounted for. Some communication system examples are given, which I will not discuss in depth; however, I will try to reiterate the important steps in efficient communication design. Although Shannon gives a mathematical formulation for determining the theoretical limit for channel throughput, it is up to the designer to create a system which comes close to that limit. To do this, it is imperative to know the statistical properties of all of the sub-systems involved and of the noise that may be present; only then can efficiency be achieved.
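The 919 bits/s figure can be reproduced with a short sketch, assuming the channel is modeled as a binary symmetric channel in which each bit is flipped independently with probability 0.01. The function names are my own; the formula is the raw rate multiplied by one minus the binary entropy of the error probability.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary event with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def effective_rate(raw_bits_per_s, error_probability):
    """Capacity of a binary symmetric channel: raw rate * (1 - H(p))."""
    return raw_bits_per_s * (1 - binary_entropy(error_probability))

raw_rate = 1000.0   # bits/s, from the example in the paper
p_error = 0.01      # 1% of bits received incorrectly

lost = raw_rate * binary_entropy(p_error)   # rate lost to uncertainty about errors
capacity = effective_rate(raw_rate, p_error)

print(f"lost to errors: {lost:.1f} bits/s")
print(f"effective rate: {capacity:.1f} bits/s")
```

Note that the loss is about 81 bits/s, not the 10 bits/s one might guess from a 1% error rate, because the receiver does not know which bits are wrong.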

The paper is far more in-depth than this introduction and the math is not too hard. If anything, it is worth a look for its commentary on the statistical nature of the English language. As always, feel free to post a comment to discuss something about the paper, add something, or correct a mistake I have made. As a small bonus, I am adding Shannon’s patent for PCM-encoded voice/telephone service for those who like to read those types of things.

( 1948shannon-a-mathematical-theory-of-communication.pdf )
( 1946shannon-communication-syste-memploying-pulse-code-modulation-patent.pdf )

Biting the bullet: Vista Ultimate 32bit


I decided to install Vista on my home workstation today in hopes of determining which of the software our lab group uses will work fine and which will have problems. To be more specific, I want to see if the data collection will continue to run on Vista machines. This test was partially motivated by growing support for Vista drivers, and neglect of XP drivers, by hardware manufacturers.

The system under test is an Athlon64 3200+ with 2GB of RAM and a GeForceFX 5600 graphics adapter. The software tested will be MATLAB 2007b/2008a, LabView 8.2/8.5 with PCI-based DAQ, and Cadence/Allegro 15.x.

The very short time that I have used Vista (on this machine) has been mostly pleasant. The good news is that everything seems to work fairly smoothly, all of the hardware was identified at bootup, and all drivers have been loaded. The main downside is that Vista has needed my permission for almost every action.

More OpenWRT goodies


Some time ago, I wrote a guide for compiling OpenWRT firmware for the la Fonera router. I began to really like OpenWRT and decided that I may want to put it on some other devices I have around, namely a Linksys WAP54G and WRT54G. I could have modified my development suite; however, I figured that it is better to let someone else do the work this time. Freifunk has done just that and has posted modified OpenWRT images that will even fit on the limited WAP54G. I have one of the TRX files loaded on my version 2.0 WAP54G and running without problems. The only slight hiccup was that the Linksys firmware did not want to “downgrade”, so I pointed a browser to http://router_ip/fw-conf.asp, disabled both checks there, and then simply uploaded the new TRX file using the updater. When everything was done, the router was back up on the same IP and was accepting ssh connections with username “root” and password “admin”. [I previously posted that the password was "password"; that is incorrect, sorry for the error.]


Accidental finding: new “greener” HDD from Western Digital


We got some new drives in the lab today and I happened to look at the power ratings of these 1TB SATA drives, discovering that the +5V line required 700mA and the +12V line required a mere 550mA to operate. I compared this to a 200GB Maxtor drive and noted that the +5V rating was about the same; however, the +12V rating was 1500mA. The resulting reduction in rated power, roughly 11W, is impressive. WD’s product specifications page notes that read/write power is about 7W while idle power consumption is around 4W. Anandtech claims that Seagate’s 1TB drive is also fairly efficient. Please understand that I have no financial interest in selling these drives; I am simply impressed that we can get 1TB of storage in such an energy-efficient footprint. Combining this with an energy-efficient x86 system could soon become the new trend in always-on home media servers.
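The label arithmetic is easy to check with a short sketch. The +5V current for the Maxtor is assumed to be exactly 700mA here, since I only noted that it was "about the same"; the other figures are the label ratings quoted above.

```python
def rated_power_w(rail_currents_a):
    """Sum of voltage * current over the supply rails, in watts."""
    return sum(volts * amps for volts, amps in rail_currents_a.items())

# Label ratings: {rail voltage in V: rated current in A}
wd_1tb = {5.0: 0.700, 12.0: 0.550}
maxtor_200gb = {5.0: 0.700, 12.0: 1.500}   # +5V value is an assumption

wd_power = rated_power_w(wd_1tb)
maxtor_power = rated_power_w(maxtor_200gb)
savings = maxtor_power - wd_power

print(f"WD 1TB:       {wd_power:.1f} W")
print(f"Maxtor 200GB: {maxtor_power:.1f} W")
print(f"difference:   {savings:.1f} W")
```

With these numbers the difference comes out near 11.4W, all of it from the +12V line. Label ratings are worst-case values, so typical consumption, such as the 4W to 7W that WD quotes, is lower.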

National Semi application note on practical uses of undersampling


If I had to sum up this application note in a phrase, I would reiterate that the minimum sampling rate needed to adequately capture a signal depends not only on its frequency content, but also on its bandwidth. To demonstrate, we can look at GSM-based mobile communications, which operate at around 1700MHz in the US. Even though the frequency content is high, each GSM channel is only 200kHz wide, so we can use a relatively slow ADC and a bit of good design. The typical trick employed in RF equipment is to set up an oscillator to run at the center frequency (~1700MHz) of the desired GSM channel and multiply its output by the incoming RF signal (also ~1700MHz). As with a Fourier transform, the DC component of the result will represent the power at the oscillator frequency, and the adjacent frequencies will be shifted to center around DC and will show up as “beats”. This new signal will have much lower frequency content, on the order of 200kHz, and will therefore allow slower ADCs to be used, with benefits in economics (cheaper handsets) and accuracy (better reception).
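The mixing trick rests on the identity cos(a)·cos(b) = ½[cos(a−b) + cos(a+b)]: multiplying two sinusoids produces one component at the difference frequency and one at the sum frequency. Here is a minimal sketch that checks this numerically. The 60kHz offset from the oscillator is an arbitrary illustrative value, and the function names are my own.

```python
import math

def mixer_products(f_rf_hz, f_lo_hz):
    """Return the (difference, sum) frequencies produced by an ideal multiplier."""
    return abs(f_rf_hz - f_lo_hz), f_rf_hz + f_lo_hz

def identity_error(f_rf_hz, f_lo_hz, t_s):
    """|cos(a)cos(b) - 0.5*(cos(a-b) + cos(a+b))| evaluated at time t_s."""
    f_diff, f_sum = mixer_products(f_rf_hz, f_lo_hz)
    lhs = math.cos(2 * math.pi * f_rf_hz * t_s) * math.cos(2 * math.pi * f_lo_hz * t_s)
    rhs = 0.5 * (math.cos(2 * math.pi * f_diff * t_s) + math.cos(2 * math.pi * f_sum * t_s))
    return abs(lhs - rhs)

f_lo = 1_700_000_000   # oscillator at the channel center, Hz
f_rf = 1_700_060_000   # hypothetical tone 60 kHz above center, Hz

f_diff, f_sum = mixer_products(f_rf, f_lo)
worst_error = max(identity_error(f_rf, f_lo, n * 1e-9) for n in range(1000))

print(f"difference term: {f_diff} Hz")   # the low-frequency "beat"
print(f"sum term:        {f_sum} Hz")    # near 3.4 GHz, removed by a low-pass filter
print(f"largest identity error over 1 us: {worst_error:.2e}")
```

Only the difference term survives the low-pass filter that follows the mixer, so the ADC only ever sees the low-frequency copy of the channel.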

The application note presents a similar type of trick, except that this time digital undersampling is involved. The idea is that unfiltered frequency content outside of the Nyquist band will be aliased into the Nyquist band and still provide meaningful information, as long as it has narrow bandwidth and is the only frequency content coming in. To use the previous example, if we can set up a well-tuned bandpass filter centered on the GSM channel of choice, we can run an ADC at 400kHz and expect the higher-frequency content to be aliased in.
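Where the aliased content lands can be worked out with a few lines of integer arithmetic. This is a sketch under my own assumptions, not something taken from the application note: it treats the ADC as an ideal sampler, and the channel edges are illustrative. One detail it exposes is that the whole band has to sit inside a single Nyquist zone (a slice fs/2 wide), otherwise the band folds onto itself.

```python
def alias_frequency(f_hz, fs_hz):
    """Apparent frequency (0..fs/2) of a tone at f_hz after sampling at fs_hz."""
    remainder = f_hz % fs_hz
    return min(remainder, fs_hz - remainder)

def band_fits_one_zone(f_low_hz, f_high_hz, fs_hz):
    """True if [f_low, f_high) lies inside one Nyquist zone of width fs/2."""
    zone_of_low = (2 * f_low_hz) // fs_hz
    zone_of_high = (2 * f_high_hz - 1) // fs_hz
    return zone_of_low == zone_of_high

fs = 400_000   # ADC sample rate, Hz

# A tone 50 kHz above 1700 MHz shows up at 50 kHz after sampling.
tone_alias = alias_frequency(1_700_050_000, fs)

# A 200 kHz channel aligned to a zone edge aliases cleanly...
aligned_ok = band_fits_one_zone(1_700_000_000, 1_700_200_000, fs)

# ...but one centered exactly on 1700 MHz straddles two zones and folds over.
centered_ok = band_fits_one_zone(1_699_900_000, 1_700_100_000, fs)

print(tone_alias, aligned_ok, centered_ok)
```

With a 200kHz channel and a 400kHz sample rate there is no slack at all, so in practice one would probably pick a somewhat higher sample rate to leave room for the bandpass filter's roll-off.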

On a final note, I have to apologize for my negligence in keeping up the “Journal Club”. I have not forgotten about discussing Shannon’s work and plan to write a post about it at the earliest convenient time.