# Analysis of SER Improvement by Soft Error Tolerant Latches

Ken Yano<sup>†1†2</sup> Takanori Hayashida<sup>†1</sup> Toshinori Sato<sup>†1†2</sup>

In this paper we investigate the SER improvement of various soft error tolerant latches. Many soft error tolerant circuit designs have been proposed, not only for DRAM and SRAM but also for latches and flip-flops. However, the relative comparison of different designs has not been fully investigated so far. We propose a simulation-based SER analysis method that can be applied to these circuits so that the relative SER improvement can be accurately analyzed. For the experiments, we use three soft error tolerant latches, namely DICE, BISER, and TMR, and investigate the SER of each circuit. The DICE latch increases soft error resilience by adopting local redundancy that enables feedback from cross-coupled nodes. The BISER and TMR latches, on the other hand, achieve soft error resilience by duplicating vulnerable normal D-latches and adding logic gates to filter erroneous outputs. Our simulation results confirm that the DICE circuit outperforms the other circuits, with the smallest area, delay, and energy overhead.

# 1. Introduction

When a high-energy neutron originating from cosmic radiation, or an alpha particle emitted from package material, strikes a sensitive part of a transistor, the stored bit may flip. This phenomenon is called an SEU (Single Event Upset). As device technology scales, the amount of electric charge stored at a node becomes smaller, so nodes have become more vulnerable to particle strikes in recent years. Although soft errors were long considered a threat only in DRAM and SRAM, they have also become a serious concern in latches and flip-flops. To mitigate soft errors, various methods have been studied at different levels of the design hierarchy: device process, circuit, and architecture. At the device process level, there are techniques such as coating the chip surface with special materials that prevent neutrons or alpha rays from striking, and inserting DRAM-type capacitors in vulnerable parts of devices [13][14]. At the circuit level, vulnerable parts of a circuit are made tolerant by duplicating them locally, or multiple copies of an equivalent circuit block are used to increase resilience to soft errors [15]. At the architecture level, ECC (Error Correcting Code) is known to be an effective method to tackle SEUs in the last stage of a memory circuit [15]. Software-side soft error protections have also been proposed, such as multithreading, multistrobe, and software-implemented hardware fault tolerance [16][17].

A soft error is triggered if the charge collected at a node reaches the critical charge level and flips the state of the bit. Qcrit is defined as the minimum charge required to flip the state of the bit stored at a node. The larger the Qcrit of a node, the more difficult it is to reverse the state of the bit, so the node can be judged to be robust against soft errors. Since Qcrit is proportional to the capacitance of a node and to the power supply voltage, Qcrit tends to become smaller as technology scales; hence devices in the deep submicron domain have become more vulnerable to soft errors.

Many soft error tolerant circuit designs have been proposed, not only for DRAM and SRAM but also for latches and flip-flops. However, the relative comparison of different designs has not been fully investigated so far. We propose a simulation-based SER analysis method [19] that can be applied to these circuits so that the relative SER improvement can be accurately analyzed. We evaluate three radiation hardened latches: DICE, BISER, and TMR. We also analyze the area, delay, and power overhead of each latch design. Our simulation results confirm that the DICE circuit outperforms the other circuits, with the smallest area, delay, and power overhead. The rest of the paper is organized as follows. Section 2 describes the background of our study. Section 3 explains in detail the soft error tolerant latches used in our experiments. Section 4 describes the proposed soft error simulation method. Section 5 presents the simulation results, and conclusions are drawn in Section 6.

# 2. Background

Various radiation hardening circuit designs have been proposed to mitigate SEUs. The DICE (Dual Interlocked Cell) latch increases soft error resilience by utilizing local redundancy [1]. How a corrupted node recovers after a particle strike is described in detail in Section 3. In reference [12], BISER (Built-In Soft-Error Resilience) is proposed to reduce the impact of soft errors that affect flip-flops. It reuses scan flip-flops to implement the circuit, which reduces area overhead and overcomes the drawbacks of existing soft-error protection techniques. Reference [9] proposes a modified BISER called BCDMR (Bistable Cross-coupled Dual Modular Redundancy), which uses cross-coupled C-elements and weak keepers to prevent unnecessary bit flips caused by particle strikes on the C-element. The authors claim that SEUs induced by particle strikes at C-elements are dominant under intensive irradiation conditions. TMR (Triple Modular Redundancy) protects functionality against soft errors by copying an identical circuit block three times and placing a 2-out-of-3 voter at the output.

<sup>†1</sup> Fukuoka University

<sup>†2</sup> CREST, Japan Science and Technology Agency



Figure 1. Normal D-latch

Various approaches have been studied to estimate SER. One approach irradiates devices with particles and measures SER experimentally [1][9]. Another measures Qcrit at a vulnerable node by circuit simulation, such as HSPICE, to analyze its relative resilience to soft errors [2][3][6]. There are also methods that estimate SER analytically using probability theory [4][5]. To the best of our knowledge, there are no simulation-based methods that quantitatively estimate SER while taking CMOS process variation into consideration.

Accurate SER estimation is essential to develop efficient soft error tolerant schemes and to determine the contribution of circuit nodes to the overall system SER.

## 3. Soft Error tolerant latches

In this study, we use three different soft error tolerant latches and analyze the SER improvement of each circuit along with its area and power overhead. We only consider SEU (Single Event Upset), that is, a bit flip caused by a particle strike while the latch is operating in opaque mode. The contribution of the latch SER to the overall system SER is not discussed in this paper; interested readers should refer to the literature [3][5][8]. We use the D-latch shown in Fig. 1 as the baseline normal D-latch for analyzing the rad-hardened latch circuits.

The DICE (Dual Interlocked Cell) latch was proposed to increase resistance to soft errors by duplicating the sensitive pair of nodes that store the state of the bit [1]. Each sensitive node is connected to adjacent complementary gates as shown in Fig. 2. When the latch operates in opaque mode, the logical values at nodes N0a, N1a, N0b, N1b are either LHLH or HLHL. Consider the case in which a particle strikes node N0a and the state of the bit starts to flip. Suppose that, before the strike, logical value L is stored at nodes N0a and N0b and logical value H is stored at nodes N1a and N1b. If the voltage at N0a is pulled up by the collision, the NMOS T5b is turned on and tries to pull down the voltage at node N1b. At the same time, however, the voltage at node N0a is pulled down by the NMOS transistor T3a and the voltage at node N1b is pulled up by the PMOS transistor T4b.

Figure 3. Recovery of soft error by DICE latch (particle strikes at 500 ps).

After a certain amount of time has elapsed, the latch returns to its original state of logical values LHLH.

Fig. 3 shows the transient voltage levels at nodes N0a, N0b, N1a, and N1b from the time when the current pulse is injected by the particle hit until the latch has recovered its original state.

Fig. 4 shows the BISER (Built-In Soft-Error Resilience) latch [12]. It duplicates the normal D-latch and adds a C-element and a weak keeper. If the state of the bit is flipped in one of the latches by a particle strike, the C-element and weak keeper prevent the error from propagating to the output. BISER is vulnerable to soft errors when the bit flips in both latches simultaneously. Fig. 5 shows the TMR (Triple Modular Redundancy) latch. It consists of three identical copies of the normal D-latch and a majority voting circuit. Even if the state of the bit is flipped in any one of the three latches, the other two hold the correct value, so the majority voting gates can select the correct value. When the state of the bit is flipped simultaneously in two or all of the three latches, the error propagates to the output. Both of these latches achieve resilience to soft errors by duplicating the vulnerable circuit block, the D-latch in this case, at the cost of large area and power overhead.

Figure 2. DICE latch [1]

Figure 4. BISER latch [12]

Figure 5. TMR latch

## 4. Soft error simulation method

To analyze SER, we use a Monte-Carlo simulation [19]. SER estimation based on Qcrit is also well known. However, that method cannot accurately calculate SER, especially for rad-hardened circuits, since interactions between charges injected at multiple nodes must be accounted for. We first explain how Qcrit is defined and then describe the proposed irradiation modeling in detail.

## 4.1 Qcrit – Critical Charge

Qcrit is known as an effective parameter for measuring the vulnerability of a storage cell to particle collisions [2][3]. When the weak pulse current generated by a particle strike is defined as i(t), Qcrit is defined as the minimum time integral of i(t) that flips the state of the bit.

$$Q_{Crit} = \int_0^t i(t)dt \tag{1}$$

Here i(t) is defined as a double exponential function as follows [3].

$$i(t) = \frac{Q_{total}}{\tau_f - \tau_r} \left( e^{-t/\tau_f} - e^{-t/\tau_r} \right) \tag{2}$$

In this equation, $Q_{total}$ represents the total charge generated by the particle collision, and $\tau_r$ and $\tau_f$ are the rise and fall time constants of the current pulse, respectively. Typically $\tau_r$ is much smaller than $\tau_f$, hence the accumulation of charge takes place immediately after the steep current rise. The shape of the current i(t) is close to triangular, so it is often modeled as a triangular waveform [2][3]. This approximation also reduces the complexity of simulating current injection.

Table I. Parameters of the random current pulse

| Parameter        | Distribution | Average | 3σ       | Range    |
|------------------|--------------|---------|----------|----------|
| τ<sub>r</sub>    | Gaussian     | 0.05 ps | 0.005 ps |          |
| τ<sub>f</sub>    | Gaussian     | 5 ps    | 0.5 ps   |          |
| T<sub>d</sub>    | Uniform      |         |          | 0-500 ps |
| C<sub>inj</sub>  | Uniform      |         |          | 0-100 fC |
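As a quick numerical check of equation (2), the sketch below integrates the double exponential pulse and locates its peak. The time constants are the averages from Table I; the value used for `Q_TOTAL` and the helper names are illustrative assumptions, not part of the original method.

```python
import math

def double_exp_current(t, q_total, tau_r, tau_f):
    """Equation (2): pulse current in amperes at time t (seconds)."""
    return q_total / (tau_f - tau_r) * (math.exp(-t / tau_f) - math.exp(-t / tau_r))

def integrate(f, t_end, steps):
    """Trapezoidal integration of f over [0, t_end]."""
    dt = t_end / steps
    total = 0.5 * (f(0.0) + f(t_end))
    for k in range(1, steps):
        total += f(k * dt)
    return total * dt

def peak_time(tau_r, tau_f):
    """Time at which di/dt = 0 for the double exponential pulse."""
    return tau_r * tau_f / (tau_f - tau_r) * math.log(tau_f / tau_r)

Q_TOTAL = 64.5e-15  # illustrative total charge in coulombs (assumption)
TAU_R = 0.05e-12    # average rise time constant from Table I
TAU_F = 5e-12       # average fall time constant from Table I

# Integrate far enough into the tail (20 fall time constants) to collect the charge.
collected = integrate(
    lambda t: double_exp_current(t, Q_TOTAL, TAU_R, TAU_F), 20 * TAU_F, 100_000
)
t_peak = peak_time(TAU_R, TAU_F)

print(f"collected / Q_total = {collected / Q_TOTAL:.6f}")
print(f"peak at {t_peak * 1e12:.3f} ps")
```

The integral recovers essentially all of `Q_TOTAL`, and the peak falls well before one fall time constant, which is why a triangular pulse with a steep leading edge is a reasonable stand-in.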

## 4.2 Particle Irradiation Modeling

In this subsection, we explain the quantitative estimation method for SER [19]. To approximate the variation in the characteristics of the current pulse generated by particle collisions, we define $\tau_r$ and $\tau_f$ as random variables that follow Gaussian distributions. The means of $\tau_r$ and $\tau_f$ are taken from reference [3], and 3 sigma is empirically set to 10% of each mean value. The amount of charge injected by particle collisions, C<sub>inj</sub>, depends on the external environment and packaging material and changes over time. We define it as a uniform random variable to keep the model general. The range of injected charge is defined as 0fC to 100fC. For reference, the Qcrit of the D-latch at a supply voltage of 1.8V is calculated as 64.5fC, so injecting the maximum charge of 100fC is guaranteed to cause a soft error. The range and distribution of injected charge can be altered and incorporated in the simulation if accurate experimental data can be obtained. To approximate the distribution of particle strikes, the time of occurrence of each particle strike is shifted randomly. The time shift, Td, is defined as a uniform random variable. For our experiment, the range is defined as 0ps to 500ps. All the random variables used to define the triangular current pulse are listed in Table I.

Figure 6. Test circuit with current sources

The peak of the current pulse can be defined as

$$I_{peak} = \frac{2C_{inj}}{\tau_r + \tau_f} \tag{3}$$

Current sources defined as random pulses are inserted at each sensitive node storing the state of the bit, as shown in Fig. 6. Each current source is given its own random waveform by drawing different random values.
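The sampling of one random pulse per sensitive node can be sketched as follows. This is a minimal illustration that assumes the Table I values, with the Gaussian standard deviation taken as one third of the listed 3σ; the `Pulse` class, `sample_pulse`, and the choice of four nodes are hypothetical names and values introduced here.

```python
import random
from dataclasses import dataclass

@dataclass
class Pulse:
    tau_r: float  # rise time constant (s)
    tau_f: float  # fall time constant (s)
    t_d: float    # strike time shift (s)
    c_inj: float  # injected charge (C)

    @property
    def i_peak(self):
        """Equation (3): peak of the triangular pulse in amperes."""
        return 2.0 * self.c_inj / (self.tau_r + self.tau_f)

def sample_pulse(rng):
    """Draw one pulse using the distributions listed in Table I."""
    # Table I lists 3-sigma values, so sigma is one third of each.
    tau_r = rng.gauss(0.05e-12, 0.005e-12 / 3)
    tau_f = rng.gauss(5e-12, 0.5e-12 / 3)
    t_d = rng.uniform(0.0, 500e-12)
    c_inj = rng.uniform(0.0, 100e-15)
    return Pulse(tau_r, tau_f, t_d, c_inj)

rng = random.Random(0)
# One independent pulse per sensitive node; four nodes is an example count.
pulses = [sample_pulse(rng) for _ in range(4)]
for node, pulse in enumerate(pulses):
    print(f"node {node}: Td = {pulse.t_d * 1e12:6.1f} ps, "
          f"Ipeak = {pulse.i_peak * 1e3:5.2f} mA")

# Peak for the maximum charge with average time constants (about 39.6 mA).
nominal_max = Pulse(0.05e-12, 5e-12, 0.0, 100e-15)
```

Each sampled pulse would then be translated into the current source statement of the circuit simulator in use; that translation is not shown here.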

## 4.3 Approximation of CMOS Process Variation

To approximate CMOS process variation, the threshold voltage (VTH0) of NMOS and PMOS, the channel length offset (XL), the channel width offset (XW), and the gate oxide thickness (TOX) are defined as random variables that follow Gaussian distributions. The average value and $3\sigma$ of each random variable are calculated by referring to the SPICE model of the Rohm 0.18um process [11]. Note that the gate length and width are fixed at L=180n, W=2u for NMOS and L=180n, W=5u for PMOS, except for transmission gates, which use the same gate width for both transistor types. No parameter optimization has been done for any of the circuits.

# 5. Results

We analyze the SER improvement of these rad-hardened latches using the soft error simulation method described in Section 4. To reduce complexity, we assume that particle hits occur only at the vulnerable parts of the latches that store the state of the bit. To estimate SER, the Monte-Carlo simulation is run 10,000 times for each circuit, each time with different random values.

Prior to the start of each simulation, the nodes corresponding to N0 are initialized to a low voltage and the nodes corresponding to N1 to a high voltage, as shown in Fig. 6. At the end of each simulation, the state of the bit at the nodes is examined to check whether the bit has been reversed. SER is calculated by dividing the number of runs in which the bit flipped by the total number of simulation runs.
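The counting step of this procedure can be sketched as below. The transient circuit simulation is replaced by a deliberately crude, hypothetical stand-in (`toy_trial`) that flips the bit whenever a single uniformly drawn charge exceeds the 64.5fC Qcrit quoted in Section 4.2. It ignores pulse shape, timing, process variation, and multi-node interaction, so it is not expected to reproduce the SER values reported here.

```python
import random

QCRIT_DLATCH = 64.5e-15  # Qcrit of the D-latch at 1.8 V, from Section 4.2 (C)

def toy_trial(rng):
    """Hypothetical stand-in for one transient simulation run.

    Returns True if the stored bit is considered flipped.
    """
    c_inj = rng.uniform(0.0, 100e-15)  # injected charge range from Table I
    return c_inj > QCRIT_DLATCH

def estimate_ser(trial, runs, seed=0):
    """SER = (number of runs in which the bit flipped) / (number of runs)."""
    rng = random.Random(seed)
    flips = sum(1 for _ in range(runs) if trial(rng))
    return flips / runs

ser = estimate_ser(toy_trial, 10_000)
print(f"estimated SER of the toy model: {ser:.3f}")
```

In an actual flow, `toy_trial` would be replaced by a function that launches one circuit simulation with freshly sampled pulse and process parameters and inspects the final node voltages.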

Fig. 7 shows the SER of the D-latch and of each rad-hardened latch. Compared with the normal D-latch, SER is improved by a factor of 13.0 for the DICE latch, 3.94 for the BISER latch, and 1.54 for the TMR latch. This result shows that the DICE latch improves SER far more than the other rad-hardened latches.

Figure 7. SER of D-latch and radiation hardened latches (VDD=1.8V, Temp=27C, Td<sub>avg</sub>=500ps)

Figure 8. SER of D-latch and radiation hardened latches at different time interval of particle strikes (VDD=1.8V Temp=27C)

For the DICE latch, a soft error occurs when pulse currents are injected at a pair of nodes in succession within a short time frame, while the other pair of nodes is unaffected, and the amount of injected charge reaches the critical level for flipping the state of the bit. If we define the average current pulse cycle length as $T_{cyc}$ and the short time frame as $T_{fr}$, the SER of the DICE latch can be approximated as

$$SER_{DICE} = 2\Pr(Q_{N0a}, Q_{N0b}) \left(\frac{T_{fr}}{T_{cyc}}\right)^2 \left(\frac{T_{cyc} - T_{fr}}{T_{cyc}}\right)^2 \tag{4}$$

where Pr(Q<sub>N0a</sub>,Q<sub>N0b</sub>) is the probability that the charges injected at nodes N0a and N0b reach the critical level. Since there are two pairs of nodes, the multiplier 2 appears in equation 4. T<sub>cyc</sub> is obtained from the external irradiation environment. However, T<sub>fr</sub> and Pr(Q<sub>N0a</sub>,Q<sub>N0b</sub>) are not known unless they are estimated by accurate simulation. Hence it is difficult to estimate the SER improvement rate of the DICE latch analytically.

If the SER of the normal D-latch is defined as $SER_{DLatch}$, the SERs of BISER and TMR are given as follows.

$$SER_{BISER} = SER_{DLatch}^{2} \tag{5}$$

$$SER_{TMR} = {}_{3}C_{2}SER_{DLatch}^{2}(1 - SER_{DLatch}) + SER_{DLatch}^{3} \tag{6}$$

We confirmed that the SERs of BISER and TMR obtained from equations 5 and 6 are consistent with the SER results in Fig. 7.
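This consistency can be illustrated with a short calculation. The D-latch SER is not stated numerically in the text, so the sketch below infers it by inverting equation 5 with the reported BISER improvement of 3.94; that inferred value is an assumption made only for this check.

```python
from math import comb

def ser_biser(p):
    """Equation (5): both duplicated latches must flip."""
    return p ** 2

def ser_tmr(p):
    """Equation (6): two or three of the three latches flip."""
    return comb(3, 2) * p ** 2 * (1 - p) + p ** 3

BISER_IMPROVEMENT = 3.94  # reported for Fig. 7
TMR_IMPROVEMENT = 1.54    # reported for Fig. 7

# Inferred, not reported: equation (5) gives SER_DLatch / SER_BISER = 1 / SER_DLatch.
p = 1.0 / BISER_IMPROVEMENT

predicted_tmr_improvement = p / ser_tmr(p)
print(f"inferred D-latch SER      : {p:.3f}")
print(f"predicted TMR improvement : {predicted_tmr_improvement:.2f}")
print(f"reported TMR improvement  : {TMR_IMPROVEMENT:.2f}")
```

Under this assumption, equation 6 predicts a TMR improvement of roughly 1.58, close to the reported 1.54.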

The SER of the DICE latch depends on the timing at which multiple nodes are struck by particles. Fig. 8 shows the SER on a log scale when the average time interval of particle strikes is set to 500ps, 2.5ns, 5ns, and 10ns. The figure clearly shows the dependence of the DICE latch's reliability on the timing of particle strikes. As the interval between particle strikes becomes longer, the interference from strikes at adjacent nodes decreases, so the SER decreases. The SER of the D-latch depends little on the interval between particle strikes; it is mainly determined by whether the amount of injected charge exceeds Qcrit. Since the SERs of the BISER and TMR latches mainly depend on the SER of the D-latch, as shown in equations 5 and 6, they also depend little on the timing of particle strikes.
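The trend can be illustrated by evaluating equation 4 at the four intervals. Two things are assumed here: the average strike interval is used as T<sub>cyc</sub>, and the values of `PR_PAIR` and `T_FR` are purely hypothetical placeholders, since the text notes that both quantities are unknown without accurate simulation. Only the direction of the trend is meaningful, not the magnitudes.

```python
def ser_dice(pr_pair, t_fr, t_cyc):
    """Equation (4): approximate SER of the DICE latch."""
    x = t_fr / t_cyc
    return 2.0 * pr_pair * x ** 2 * (1.0 - x) ** 2

PR_PAIR = 0.1   # hypothetical Pr(Q_N0a, Q_N0b)
T_FR = 50e-12   # hypothetical short time frame (s)

# Average strike intervals used for Fig. 8, treated here as T_cyc.
intervals = [500e-12, 2.5e-9, 5e-9, 10e-9]
values = [ser_dice(PR_PAIR, T_FR, t_cyc) for t_cyc in intervals]

for t_cyc, value in zip(intervals, values):
    print(f"T_cyc = {t_cyc * 1e9:5.2f} ns -> SER_DICE ~ {value:.3e}")
```

As long as T<sub>fr</sub> is less than half of T<sub>cyc</sub>, the expression decreases as T<sub>cyc</sub> grows, which matches the qualitative behavior described for Fig. 8.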

Table II shows the number of PMOS and NMOS transistors used in each circuit. Among the rad-hardened latches, the DICE latch is the most efficient in terms of area overhead. Table III lists the delay from input D to output QBar when the latch is in transparent mode. Delay (H->L) is the delay from the time when D crosses half of VDD from Low to High to the time when QBar crosses half of VDD from High to Low. Delay (L->H) is the delay from the time when D crosses half of VDD from High to Low to the time when QBar crosses half of VDD from Low to High. The DICE latch has almost no delay overhead compared with the D-latch, whereas the BISER and TMR latches have a large delay overhead due to the additional logic gates at the output.

Fig. 9 shows the energy consumed by each latch during one cycle, and Fig. 10 shows the energy-delay product of each latch. The worst-case delay is used to calculate the energy-delay product. These figures show that DICE is also very efficient in terms of energy consumption. From these results, it can be concluded that the DICE latch is a highly efficient radiation hardened circuit. It achieves a high SER improvement rate without sacrificing area, delay, or power. Moreover, the simplicity and compactness of the circuit make it well suited for implementing rad-hardened SRAM. In fact, the DICE latch is reported to be used in commercial processors such as the quad-core Itanium processor [18].

Table II. Number of PMOS, NMOS used in each latch design

|             | D-Latch | DICE | BISER | TMR |
|-------------|---------|------|-------|-----|
| PMOS        | 6       | 10   | 17    | 28  |
| NMOS        | 7       | 12   | 20    | 31  |
| PMOS + NMOS | 13      | 22   | 37    | 59  |

Table III. Delay of D-latch and radiation hardened latches

| Delay [ps]   | D-Latch | DICE | BISER | TMR |
|--------------|---------|------|-------|-----|
| Delay (H->L) | 38.3    | 36.1 | 176   | 163 |
| Delay (L->H) | 50.8    | 52.8 | 171   | 160 |

Figure 9. Energy of D-latch and radiation hardened latches

Figure 10. Energy-Delay product of D-latch and radiation hardened latches
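The overheads relative to the normal D-latch can be read off Tables II and III with a few lines of arithmetic. The sketch below uses total transistor count as a proxy for area, which is an assumption since layout area is not reported, and takes the larger of the two delays as the worst-case delay.

```python
# Values copied from Table II (total transistors) and Table III (delays in ps).
TRANSISTORS = {"D-Latch": 13, "DICE": 22, "BISER": 37, "TMR": 59}
DELAY_HL_PS = {"D-Latch": 38.3, "DICE": 36.1, "BISER": 176.0, "TMR": 163.0}
DELAY_LH_PS = {"D-Latch": 50.8, "DICE": 52.8, "BISER": 171.0, "TMR": 160.0}

def overheads(baseline="D-Latch"):
    """Return {latch: (transistor ratio, worst-case delay ratio)} vs. the baseline."""
    base_count = TRANSISTORS[baseline]
    base_delay = max(DELAY_HL_PS[baseline], DELAY_LH_PS[baseline])
    result = {}
    for name, count in TRANSISTORS.items():
        worst_delay = max(DELAY_HL_PS[name], DELAY_LH_PS[name])
        result[name] = (count / base_count, worst_delay / base_delay)
    return result

table = overheads()
for name, (count_ratio, delay_ratio) in table.items():
    print(f"{name:8s} transistors x{count_ratio:.2f}  worst delay x{delay_ratio:.2f}")
```

By this measure DICE needs about 1.7 times the transistors of the D-latch with a worst-case delay within about 4%, while BISER and TMR need roughly 2.8 and 4.5 times the transistors and more than 3 times the worst-case delay.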

## 6. Conclusion

In this study, the SER of various radiation-hardened latches is investigated using the soft error simulation method proposed in [19]. For the experiments, we use a normal D-latch and three radiation hardened latches: DICE, BISER, and TMR. DICE increases its resilience to soft errors by adopting a locally redundant design. A temporary bit flip at a node is restored to its original state by feedback from adjacent nodes. The mechanism of soft error tolerance in the BISER and TMR latches is fundamentally different from that of DICE. They increase resilience to soft errors by duplicating the vulnerable normal D-latch and adding logic gates to filter erroneous outputs, which incurs a large area and power overhead.

The simulation results confirm that the SER improvement rate of DICE is much larger than that of BISER and TMR. Besides the SER improvement rate, DICE is also very efficient in terms of area, delay, and energy overhead, as shown in our results. It can be concluded that DICE is a very efficient rad-hardening technique. DICE can become vulnerable when multiple nodes are struck simultaneously by particles. Hence it is essential to design the cell layout so that the pair of nodes storing the same logical value are placed as far apart as possible. Reference [10] introduces the Double-DICE storage element, which reduces SEUs by interleaving two DICE storage cells in the layout. In this paper, we do not discuss the contribution of the SER at individual nodes to the overall system SER. This is left for future work.

# ACKNOWLEDGMENT

This study is supported in part by CREST project "Fundamental technologies for dependable VLSI system" of Japan Science and Technology Agency. This work is supported by VLSI Design & Education Center (VDEC), the University of Tokyo [11] in collaboration with Synopsys, Inc., Cadence Design Systems, Inc. and ROHM Co., Ltd.

## References

- [1] Hazucha, P.; Karnik, T.; Walstra, S.; Bloechel, B.; Tschanz, J.; Maiz, J.; Soumyanath, K.; Dermer, G.; Narendra, S.; De, V.; Borkar, S.; "Measurements and analysis of SER tolerant latch in a 90 nm dual-Vt CMOS process," Custom Integrated Circuits Conference, 2003. Proceedings of the IEEE 2003, pp. 617-620, Sept. 2003.
- [2] Qian Ding; Rong Luo; Hui Wang; Huazhong Yang; Yuan Xie; "Modeling the Impact of Process Variation on Critical Charge Distribution," SOC Conference, 2006 IEEE International, pp. 243-246, 24-27 Sept. 2006.
- [3] Chandra, V.; Aitken, R.; "Impact of Technology and Voltage Scaling on the Soft Error Susceptibility in Nanoscale CMOS," Defect and Fault Tolerance of VLSI Systems, 2008. DFTVS '08. IEEE International Symposium on, pp. 114-122, 1-3 Oct. 2008.
- [4] Quming Zhou; Mohanram, K.; "Cost-effective radiation hardening technique for combinational logic," Computer Aided Design, 2004. ICCAD-2004. IEEE/ACM International Conference on, pp. 100-106, 7-11 Nov. 2004.
- [5] Asadi, G.; Tahoori, M.B.; "An analytical approach for soft error rate estimation in digital circuits," Circuits and Systems, 2005. ISCAS 2005. IEEE International Symposium on, pp. 2991-2994 Vol. 3, 23-26 May 2005.
- [6] Ramanarayanan, R.; Degalahal, V.; Vijaykrishnan, N.; Irwin, M.J.; Duarte, D.; "Analysis of soft error rate in flip-flops and scannable latches," SOC Conference, 2003. Proceedings, pp. 231-234, 17-20 Sept. 2003.
- [7] Shivakumar, P.; Kistler, M.; Keckler, S.W.; Burger, D.; Alvisi, L.; "Modeling the effect of technology trends on the soft error rate of combinational logic," Dependable Systems and Networks, 2002. DSN 2002. Proceedings. International Conference on, pp. 389-398, 2002.
- [8] Pontes, Julian; Calazans, Ney; Vivet, Pascal; "An accurate Single Event Effect digital design flow for reliable system level design," Design, Automation & Test in Europe Conference & Exhibition (DATE), 2012, pp. 224-229, 12-16 March 2012.
- [9] Furuta, J.; Hamanaka, C.; Kobayashi, K.; Onodera, H.; "A 65nm Bistable Cross-coupled Dual Modular Redundancy Flip-Flop capable of protecting soft errors on the C-element," VLSI Circuits (VLSIC), 2010 IEEE Symposium on, pp. 123-124, 16-18 June 2010.
- [10] Haghi, M.; Draper, J.; "The 90 nm Double-DICE storage element to reduce Single-Event upsets," Circuits and Systems, 2009. MWSCAS '09. 52nd IEEE International Midwest Symposium on, pp. 463-466, 2-5 Aug. 2009.
- [11] VLSI Design & Education Center, the University of Tokyo.
- [12] Mitra, S.; Seifert, N.; Zhang, M.; Shi, Q.; Kim, K.S.; "Robust system design with built-in soft-error resilience," Computer, vol.38, no.2, pp. 43-52, Feb. 2005.
- [13] Baumann, R.C.; "Soft errors in advanced semiconductor devices-part I: the three radiation sources," Device and Materials Reliability, IEEE Transactions on, vol.1, no.1, pp. 17-22, Mar 2001.
- [14] Baumann, R.; "Soft errors in advanced computer systems," Design & Test of Computers, IEEE, vol.22, no.3, pp. 258-266, May-June 2005.
- [15] Nicolaidis, M.; "Design for soft error mitigation," Device and Materials Reliability, IEEE Transactions on, vol.5, no.3, pp. 405-418, Sept. 2005.
- [16] Reis, G.A.; Chang, J.; Vachharajani, N.; Rangan, R.; August, D.I.; "SWIFT: software implemented fault tolerance," Code Generation and Optimization, 2005. CGO 2005. International Symposium on, pp. 243-254, 20-23 March 2005.
- [17] Wang, Cheng; Kim, Ho-seop; Wu, Youfeng; Ying, Victor; "Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection," Code Generation and Optimization, 2007. CGO '07. International Symposium on, pp. 244-258, 11-14 March 2007.
- [18] Stackhouse, B.; Cherkauer, B.; Gowan, M.; Gronowski, P.; Lyles, C.; "A 65nm 2-Billion-Transistor Quad-Core Itanium® Processor," Solid-State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE International, pp.92-598, 3-7 Feb. 2008.
- [19] Yano, K.; Hayashida, T.; Sato, T.; "Accurate analysis of SER improvement by Soft Error Tolerant Logic Circuits," (Under submission)