A Method for Low-Power-Oriented VLSI Design

Akio Miyoshi, Nobuo Mii

Yasu Laboratory, IBM Japan, 800 Ichimiyake, Yasu-cho, Yasu-gun, Shiga-ken; IBM Corp., Boca Raton, Florida, U.S.A.

Summary: In recent years, as ASICs have grown larger and [illegible], the development of low-power VLSI has come to be demanded. From the viewpoint of the design system, this paper proposes a new low-power-oriented design method, focusing in particular on placement and wiring techniques that reduce the wiring capacitance inside the LSI.

A Method of Power Driven Design

Akio Miyoshi, Nobuo Mii

Yasu Technology Application Laboratory, IBM Japan, 800 Ichimiyake, Yasu-cho, Yasu-gun, Shiga-ken 520-23; IBM Corporation ESD, Boca Raton, Florida, U.S.A.

Abstract

This paper describes a method of Power Driven Design that focuses on the layout of an LSI chip to reduce power consumption. Functions that respect the timing constraints of critical paths are also discussed, so that timing problems are eliminated. Using a power estimation tool, it was concluded that this design method will bring about a 20% to 30% power reduction on internal wiring nets and a 10 to 15% reduction on the whole LSI.

# INTRODUCTION

The number of circuits on an LSI chip is increasing rapidly, and the high power caused by the resulting increase in total current is becoming a serious problem. LSIs with lower power consumption are also strongly required for recent portable products. For these reasons, we take a new approach to meeting this requirement from the design system viewpoint.

The new design methodology presented in this paper is named Power Driven Design (PDD). The main function of PDD is called Power Control (PC). PC is performed in the physical design phase and consists of two sub-functions, named PC-P (Partitioning) and PC-S (Switching factor driven). The basic ideas behind these new functions for reducing power consumption are as follows. PC-P generates the control files for placing latches (Shift Register Latches, SRLs) close together in the circuit, so as to shorten the CLOCK nets connected to the SRLs. PC-S generates the control file for performing placement and wiring under switching factor constraints on each net, using a logic simulator.
A common goal of both sub-functions is to shorten nets with a high switching factor, such as CLOCK nets, and thereby reduce the power consumed by wiring capacitance. In this design method, power consumption can be estimated at each design phase (logic-fixed, placement-completed and wiring-completed) to judge whether the power target can be met. The combination of PC-P and PC-S is determined flexibly from the results of power estimation.

For the TIMING concerns that the PDD functions might cause, timing-respecting functions are also implemented in the PC functions. The method of eliminating timing concerns such as latch-to-latch critical paths is discussed in 'TIMING CONSIDERATIONS'.

These new approaches in PDD (Power Driven Design) for lowering LSI power consumption are discussed in detail in the following sections.

# LSI POWER CONSUMPTION

Power consumption is generally estimated with the following basic equation.

$$POWER = \left(\frac{1}{2} \cdot C \cdot V^2 \cdot SWF\right) \cdot \frac{1}{T}$$

V is the voltage to which the capacitance is charged, and T is the cycle time of the machine in ns. C (capacitance) includes the effective capacitance of each basic macro and the wiring capacitance, as described below:

$$POWER = \left(\frac{1}{2} \cdot (C_{\text{eff}} + C_{\text{wire}}) \cdot V^2 \cdot SWF\right) \cdot \frac{1}{T}$$

SWF (Switching Factor) is defined as the number of times the circuit switches during 1000 machine cycles, divided by 1000. TRANSITIONS is the number of transitions per 1000 machine cycles:

$$SWF = \frac{TRANSITIONS}{1000} \qquad (0 < SWF \leq 2.00)$$

The switching factors of all nets in the circuit are generated by a logic simulator and an SWF file generator, as shown in Fig.01. A test case is used as input to the logic simulator to calculate the switching factors. SWF is a number between 0 and 2.00. In an LSI chip, the SWF of the base clock is 2.00 because a clock signal makes two transitions in one cycle. We calculate power consumption with this equation and these factors.
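
As a rough illustration of the equation above, the following Python sketch evaluates the power of a single net. The function names and the numeric values (0.1 pF, 0.4 pF, 5 V, 20 ns) are assumptions made for this example only and are not taken from the power estimation tool used in this work. With C in pF and T in ns, the result comes out in mW.

```python
def swf_from_transitions(transitions, cycles=1000):
    """SWF = TRANSITIONS / 1000 for a 1000-cycle simulation window.

    The `cycles` argument is a hypothetical generalisation of the
    fixed 1000-cycle window used in the text.
    """
    return transitions / cycles


def net_power_mw(c_eff_pf, c_wire_pf, voltage_v, swf, cycle_ns):
    """POWER = (1/2 * (C_eff + C_wire) * V^2 * SWF) * (1/T).

    Capacitances in pF and cycle time in ns give a result in mW,
    because pF * V^2 / ns = 1e-12 J / 1e-9 s = 1e-3 W.
    """
    if not 0.0 < swf <= 2.0:
        raise ValueError("SWF must satisfy 0 < SWF <= 2.00")
    return 0.5 * (c_eff_pf + c_wire_pf) * voltage_v ** 2 * swf / cycle_ns


# Hypothetical clock net: two transitions per cycle, so SWF = 2.0.
clock_swf = swf_from_transitions(2000)
clock_power = net_power_mw(0.1, 0.4, 5.0, clock_swf, 20.0)
print(clock_power)  # 0.625 (mW), up to floating-point rounding
```

Because C_wire enters the equation linearly, halving the wiring capacitance of this hypothetical net would remove 0.25 mW of its 0.625 mW, which is the effect the placement and wiring functions below aim for.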
### **PLACEMENT**

To perform low-power-oriented placement, PC (Power Control) was developed. PC consists of two functions, PC-P (Power Control-Partitioning) and PC-S (Power Control-Switching factor driven). These two functions focus on latches (Shift Register Latches, SRLs) and on high-switching-factor nets such as CLOCK signals. After PC generates the control files, a placement program is applied with those control files. The placement algorithm, which is basically Simulated Annealing <REF1><REF5>, considers estimated wire congestion, the access-pin density and congestion of basic macros, and estimated wire capacitance as cost functions.

Simulated annealing placement with PC-P is carried out as the primary step. If PC-P cannot achieve the expected (estimated) power reduction, PC-S is applied as a secondary step. Alternatively, placement is performed once with PC-P and PC-S applied at the same time. Fig.02 shows the placement design flow.

#### PC-P (Power Control - Partitioning)

PC-P generates two control files (an area definition file and a logic partitioning file) and passes them to the placement program.

Before describing the ideas behind PC-P, we briefly explain LSSD (Level Sensitive Scan Design), which is well known as a highly testable design method. We usually adopt this design to obtain high testability of the LSI chip. LSSD is basically a synchronous design, and all SRLs in an LSI chip are directly connected to the CLOCK signals (B and C clocks).

Given this design structure, we expect that the clock signals, which probably have the highest frequency among all nets in the LSI chip, can be shortened when the SRLs are placed close together. Therefore, the function of PC-P is to gather all SRLs and clock drivers in a logic network into one proper area, defined as the SRL Group Area. The size of the SRL Group Area is determined by a parameter p1 as p1 x cell# (where cell# is the total number of SRLs and clock drivers).
It is desirable that p1 be equal to or greater than 1.2, so that the extra 20% of the SRL Group Area, which the SRLs themselves do not occupy, can be used effectively as available area for other basic macros. The SRL Group Area can be put anywhere in the chip image, taking into account the physical position of the clock signal pad. Fig.03A shows the SRL Group Area on the SOUTH and on the WEST. When floor planning is involved in the PDD design flow, the SRLs in each logic portion defined there are likewise gathered in specific areas (see Fig.03B).

Another function of PC-P is to define the proper size of the total placement area as p2 x (total cell#), where p2 is a parameter. The optimal values of p1 and p2 generally depend on the logic structure. In our experiments, we obtained good results for wirability and power reduction at p1=1.2 and p2=1.5.

#### PC-S (Power Control - Switching Factor Driven)

PC-S generates a control file (the Target Net Capacitance file) and passes it to the placement program. PC-S decides the Target Net Capacitance so as to shorten high-switching-factor nets (the net switching factor is given by a logic simulator and the SWF generator). Nets with COST > c1 are regarded as target nets, and their target capacitances are set to the estimated capacitance multiplied by c2 (0 < c2 < 1). The coefficients c1 and c2 are parameters that designers can define; they should be optimized for the characteristics of the logic network. COST and Target Net Capacitance are defined as follows.

For each net,

$$COST = Capacitance(pF) \cdot SWF$$

For nets with COST > c1,

$$TARGET = c2 \cdot Capacitance(pF)$$

The capacitance is estimated at the logic-fixed phase or after placement with PC-P. The placement with PC-S is carried out using the Target Net Capacitance file. This method focuses not on specific clock nets but on high-switching-factor nets in general, which makes it effective for power reduction even in asynchronous circuits.

### **WIRING**

The Target Net Capacitance file, generated by PC-S after placement is completed, is used in the wiring phase.
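
The COST and TARGET rule that PC-S applies to produce the Target Net Capacitance file can be sketched as follows. This is a minimal sketch and not the actual PC-S program: the `Net` record, the function name and the sample nets are hypothetical, and only the rule itself (COST = Capacitance x SWF, with TARGET = c2 x Capacitance for nets whose COST exceeds c1) comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Net:
    """Hypothetical record for one net: estimated capacitance and SWF."""
    name: str
    capacitance_pf: float
    swf: float


def target_net_capacitance(nets: List[Net], c1: float, c2: float) -> Dict[str, float]:
    """Return target capacitances (pF) for nets with COST > c1."""
    if not 0.0 < c2 < 1.0:
        raise ValueError("c2 must satisfy 0 < c2 < 1")
    targets = {}
    for net in nets:
        cost = net.capacitance_pf * net.swf  # COST = Capacitance(pF) * SWF
        if cost > c1:
            targets[net.name] = c2 * net.capacitance_pf  # TARGET
    return targets


# Hypothetical nets: a clock net (SWF = 2.0) and a quiet data net.
nets = [Net("CLK", 0.8, 2.0), Net("DATA0", 0.3, 0.25)]
targets = target_net_capacitance(nets, c1=0.1, c2=0.5)
print(targets)  # {'CLK': 0.4}
```

Only the clock net exceeds the threshold here (COST = 1.6), so only it receives a target; the data net (COST = 0.075) is left unconstrained. The same rule is applied again in the wiring phase with its own pair of coefficients.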
The wiring program, based on a maze-router algorithm, tries to meet the targets given in that file. COST and TARGET for the Target Net Capacitance are defined in the same way as for PC-S at placement. c3 and c4 are parameters that the designer defines. c3 will generally be equal to or greater than the c1 given at placement, and c4 should be equal to or greater than c2. In the wiring phase, PC-S does not have to set an aggressive target, because it is less effective in wiring than in placement.

For each net,

$$COST = Capacitance(pF) \cdot SWF$$

For nets with COST > c3,

$$TARGET = c4 \cdot Capacitance(pF)$$

The capacitance is estimated after the placement is fixed. Fig.04 shows the design flow of wiring.

# TIMING CONSIDERATIONS

So far, we have discussed the method of reducing POWER consumption. In practice, however, timing constraints may exist in circuits and must never be ignored. Here, new ideas are applied to eliminate timing problems using a timing analysis tool. The principles of timing analysis are described in a paper <REF2>.

- Basic macros on critical paths detected by timing analysis are included in the SRL Group Area.
- Critical nets detected by timing analysis are shortened as well <REF3><REF4>.

A typical case of timing constraints is shown in Fig.05. We assume:

- Clock frequency: F MHz
- Block delay of gates 1, 2, 3: Tp ns
- SRL2 data setup time: Ts ns
- Wiring delay of nets 1, 2, 3, 4: Tw ns

For the circuit to function normally, the following condition must be satisfied.

$$A = \frac{1}{F}$$

$$B = \sum_{i=1}^{3} Tp_i + Ts + \sum_{j=1}^{4} Tw_j$$

$$Slack = A - B > 0$$

When Slack is negative, the path is judged to be critical. For example, when we assume F=50MHz, Tp=6ns, Ts=1ns and Tw=0.5ns, we have A = 1/F = 20 ns and B = 3 x 6 + 1 + 4 x 0.5 = 21 ns, so Slack = -1 ns.

Therefore, the path SRL1-30 to SRL2-D0 will be judged to be a critical timing path. PC-P generates control files that include Gate1, Gate2 and Gate3 in the SRL Group Area so as to shorten all nets on the critical timing path (see Fig.06). Gate1, Gate2 and Gate3 are detected as macros on a critical path by the timing analysis tool.
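
The slack check in the example above can be reproduced with a short Python sketch. The function name and argument layout are assumptions made for illustration and do not describe the timing analysis tool itself; the delay values are those of the example (F = 50 MHz, Tp = 6 ns, Ts = 1 ns, Tw = 0.5 ns).

```python
def slack_ns(freq_mhz, gate_delays_ns, setup_ns, wire_delays_ns):
    """Slack = A - B for one latch-to-latch path, in ns.

    A is the clock period (1/F); B is the sum of the gate delays,
    the setup time of the receiving SRL, and the wiring delays.
    """
    cycle_ns = 1000.0 / freq_mhz  # A: 1/F with F in MHz, expressed in ns
    path_ns = sum(gate_delays_ns) + setup_ns + sum(wire_delays_ns)  # B
    return cycle_ns - path_ns


# Three gates of 6 ns each and four nets of 0.5 ns each.
slack = slack_ns(50.0, [6.0] * 3, 1.0, [0.5] * 4)
is_critical = slack < 0  # a negative slack marks the path as critical
print(slack, is_critical)  # -1.0 True
```

In this example, shortening the four nets alone cannot recover the full 1 ns unless their total wiring delay drops from 2 ns to below 1 ns, which shows why the gates on the path are also pulled into the SRL Group Area.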
Another function, PC-S, generates a Target Net Capacitance file that accounts for timing as well as power. The Target Net Capacitances of the critical nets detected by timing analysis (net1, net2, net3 and net4 in Fig.05) are generated <REF3> and overlaid on the power-oriented ones, so that the Target Net Capacitance derived from timing constraints takes the higher priority. Thus, finally, control files incorporating both POWER and TIMING are applied to the placement and wiring programs.

### **EXPERIMENTAL RESULTS**

The effectiveness of this approach is shown by experimental results with actual ASICs. The reduction of power consumed in internal wiring is calculated relative to the conventional way of placement and wiring. The results are given below.

| SAMPLE | GATE# | Function | Power on NETs (mW/MHz) | Reduction | Condition |
|--------|-------|----------|------------------------|-----------|-----------|
| (1) | 0.35K | - | 0.0447 | - | parm: c1=0.1, c2=0.5, c3=0.2, c4=0.6; SWF: clk=2.0, oth=0.25 |
| | | PC-P | 0.0358 | -20.0% | |
| | | PC-P/S | 0.0352 | -25.1% | |
| (2) | 18.6K | - | 3.4624 | - | |
| | | PC-P | 2.8739 | -16.9% | |
| | | PC-P/S | 2.7402 | -20.8% | |
| (3) | 12.8K | - | 5.5056 | - | |
| | | PC-P | 5.1302 | -6.82% | |
| | | PC-P/S | 4.5917 | -16.6% | |
| (4) | 27.0K | - | 0.0742 | - | parm: c1=c3=0.1, c2=c4=0.7; SWF: actual T/C, ave=0.015 |
| | | PC-P | 0.0537 | -27.7% | |
| | | PC-S | 0.0476 | -35.9% | |

(power consumption estimated by simulation)

Actual placement views of SAMPLE(4) are shown in Fig.07 to Fig.09. Fig.07 is the view produced with PC-P (the SRL Group Area is on the SOUTH) and Fig.08 is the one produced with PC-S. Fig.09 shows the placement produced by the conventional method.
The CPU time for the SAMPLE(4) design is as follows.

CPU time comparison (3090-600S)

| SAMPLE(4) | PC-P method | PC-S method | conventional method |
|-----------|-------------|-------------|---------------------|
| Placement | 1:06:32 | 1:07:39 | 1:09:48 |
| Wiring | 56:14 | 49:57 | 47:10 |

### CONCLUSION

We established a method and system of low power driven design based on the idea of Power Control (PC). It includes consideration of the timing constraints of critical paths using a timing analysis tool. We can summarize Power Driven Design as follows:

1. Placement with partitioning of SRLs into one proper rectangular area
2. Placement with a Target Net Capacitance file based on the switching factors of all internal nets
3. Wiring with a Target Net Capacitance file

From the results of our designs, this method can bring about a 15 to 30% power reduction on internal nets, which leads to a 10 to 15% power reduction on the total chip. Refer to Fig.10 for the total PDD flow.

### **ACKNOWLEDGEMENTS**

The authors wish to thank M. Kudoh, Y. Okada, T. Yokota, K. Kuroda and I. Mashima for valuable discussions and comments.

### REFERENCES

- <REF1> S. Kirkpatrick, C. D. Gelatt Jr. and M. P. Vecchi, "Optimization by Simulated Annealing", Science, vol. 220, no. 4598, 1983
- <REF2> R. B. Hitchcock Sr., G. L. Smith and D. D. Cheng, "Timing Analysis of Computer Hardware", IBM Journal of Research and Development, vol. 26, no. 1, Jan. 1983
- <REF3> W. E. Donald, "Timing Driven Placement Using Complete Path Delays", DAC, June 1990
- <REF4> Wing K. Luk, "A Fast Physical Constraint Generator for Timing Driven Layout", DAC, June 1991
- <REF5> Carl Sechen, "VLSI Placement and Global Routing Using Simulated Annealing", Kluwer Academic Publishers, 1988

<Fig.06> <Fig.07> <Fig.08>