Journal Club: Power-constrained high-frequency circuits for the IBM POWER6 microprocessor


The inaugural paper for the Journal Club is “Power-constrained high-frequency circuits for the IBM POWER6 microprocessor” by Brian Curran et al., published in the November 2007 issue of the IBM Journal of Research and Development. I have a lot of respect for the whole POWER microarchitecture; consequently, I am interested in learning a little about the design methodology that led to a near-5 GHz core logic clock rate. The IBM design team responsible for the POWER6 applied a three-pronged strategy to reach this performance goal: cutting-edge technology, manual circuit optimization and thorough testing.

The processor was designed for a 65 nm manufacturing node, so various technologies had to be employed to keep leakage current to a minimum and thereby maintain acceptable power usage. The first was silicon-on-insulator (SOI), which reduces back-gate current due to parasitic capacitances and can prevent CMOS latch-up. The processing steps to implement SOI are well understood; however, extra care must be given to layout, as it is no longer possible to bias the back-gate by connecting the whole substrate to a fixed potential. Another technological advance was the use of dielectrics with low relative permittivity between traces, which lowers the capacitance between interconnects and so further reduces transmission line effects and the associated propagation delay. Since less energy is stored in the dielectric material between interconnects, this also reduces power consumption.
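To make the permittivity argument concrete, here is a minimal sketch that treats two neighbouring traces as a parallel-plate capacitor and applies the textbook switching-power relation P = a·C·V²·f. None of the numbers come from the paper: the geometry, the supply voltage and the low-k value of 3.0 are assumptions chosen only for illustration, and the function names are hypothetical. The only reference figure is the relative permittivity of silicon dioxide, roughly 3.9.

```python
# Illustrative only: geometry, voltage and the low-k value are assumptions,
# not figures taken from the POWER6 paper.

EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m


def parallel_plate_capacitance(rel_permittivity, area_m2, spacing_m):
    """Capacitance of an idealized parallel-plate capacitor, in farads."""
    return EPSILON_0 * rel_permittivity * area_m2 / spacing_m


def switching_power(capacitance_f, vdd, freq_hz, activity=1.0):
    """Dynamic power a * C * V^2 * f in watts for one switched capacitance."""
    return activity * capacitance_f * vdd ** 2 * freq_hz


if __name__ == "__main__":
    area = 1e-12     # hypothetical facing area between two traces, m^2
    spacing = 1e-7   # hypothetical trace spacing, m
    vdd = 1.0        # hypothetical supply voltage, V
    freq = 5e9       # clock rate discussed in the post, Hz

    c_oxide = parallel_plate_capacitance(3.9, area, spacing)  # SiO2, k ~ 3.9
    c_low_k = parallel_plate_capacitance(3.0, area, spacing)  # assumed low-k

    ratio = switching_power(c_low_k, vdd, freq) / switching_power(c_oxide, vdd, freq)
    print(f"capacitance ratio (low-k / oxide): {c_low_k / c_oxide:.3f}")
    print(f"switching power ratio:             {ratio:.3f}")
```

In this simplified model both capacitance and switching power scale linearly with the relative permittivity, which is the qualitative point above. Real interconnect capacitance also depends on fringing fields and neighbouring layers, which this sketch ignores.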

From a design standpoint, the goal of the team was to distribute the clock properly and to keep the latency of the core logic circuits below “13 FO4”. Propagation delays, loading and transmission line effects play a very important role in the 5 GHz regime. It was very interesting to see how multiple layers of buffers and clock delays were included to guarantee that clock pulses would be synchronized across the various cells while maintaining an adequate slew rate. The 13 FO4 latency means that each processing cycle had to be completed in the time it would take a signal to propagate along a chain of thirteen inverters, each driving a load of four identical inverters. This is the criterion that allowed for a 5 GHz core logic clock rate. The paper mentions that threshold voltages were tuned, probably through ion implantation, to minimize leakage while maximizing speed.
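The relationship between the clock rate and the budget of thirteen fan-out-of-four inverter delays is simple arithmetic, sketched below. It uses only the two figures quoted in this post (5 GHz and thirteen inverter delays per cycle); the function names are hypothetical, and the sketch assumes the whole clock period is available to the inverter chain, with no separate allowance for clock skew or jitter.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the post.
# Assumption: the full clock period is divided evenly among the FO4 delays.


def cycle_time_ps(clock_ghz):
    """Clock period in picoseconds for a clock frequency given in GHz."""
    return 1000.0 / clock_ghz


def fo4_delay_budget_ps(clock_ghz, fo4_per_cycle):
    """Delay available to a single fan-out-of-four inverter stage, in ps."""
    return cycle_time_ps(clock_ghz) / fo4_per_cycle


if __name__ == "__main__":
    period = cycle_time_ps(5.0)
    per_stage = fo4_delay_budget_ps(5.0, 13)
    print(f"clock period at 5 GHz: {period:.1f} ps")
    print(f"implied delay per FO4 stage: {per_stage:.2f} ps")
```

At 5 GHz the clock period is 200 ps, so each of the thirteen inverter stages has roughly 15.4 ps under these assumptions.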

Simulations, the last major piece of the paper, were less interesting as they relied mostly on proprietary tools. The part most likely to matter to readers was the iterative cycle of debugging and performance tuning: going from schematic overview to transmission line calculations to back-annotation to placement and routing made some sense.

Please feel free to contribute your thoughts on this paper, my interpretation, or another paper that would be an interesting read in the comments section. Next, let's look at Claude Shannon’s paper titled ‘A Mathematical Theory of Communication’, as suggested by Adam. As the full paper is quite long, we may want to look at only the first thirty pages in detail. Those who want to brush up on their mathematics before attempting the paper should start on page thirty-two.

