Chapter 17. Testing Aspects of Nanotechnology Trends

Mehdi B. Tahoori, Northeastern University, Boston, Massachusetts

Niraj K. Jha, Princeton University, Princeton, New Jersey

R. Iris Bahar, Brown University, Providence, Rhode Island

About This Chapter

As complementary metal oxide semiconductor (CMOS) devices are scaled down into the nanometer regime, new challenges at both the device and system levels are arising. Many of these problems were discussed in the previous chapters. Though the semiconductor industry and system designers will continue to find innovative solutions to many of these problems in the short term, it is essential to investigate CMOS alternatives. New devices and structures are being researched with vigor within the device community. They include resonant tunneling diodes, quantum-dot cellular automata, silicon nanowires, single electron transistors, and carbon nanotubes. Each of these devices promises to overcome the fundamental physical limitations of lithography-based silicon (Si) very-large-scale integration (VLSI) technology.

Although it is premature to predict which device will emerge as a possible candidate to either replace or augment CMOS technology, it is clear that most of these nanoscale devices will exhibit high defect rates. Consequently, one key question system designers will have to address is how to build reliable circuits, architectures, and systems from unreliable devices. New testing and defect tolerance methodologies at all levels of the design hierarchy will be needed to target these devices.

The focus of this chapter is to provide a brief overview of test technology trends for circuits and architectures composed of some of these emerging nanoscale devices. In particular, defect characterization, fault modeling, and test generation of circuits based on resonant tunneling diodes and quantum-dot cellular automata are discussed. This provides a perspective on circuit-level test generation for nanoscale devices. To complement this discussion and provide a perspective on system-level testing, testing of architectures based on carbon nanotubes and silicon nanowires is also presented. Finally, techniques to tolerate manufacturing imperfections and variations in logic circuits implemented using carbon nanotube field effect transistors are presented.

Introduction

Continued improvement in chip manufacturing technology has resulted in an explosive increase in the speed and complexity of circuits. The results are multigigahertz clock rates and billion-gate chips. As CMOS devices are scaled down into the nanometer regime, new challenges at both the device and system levels are arising. Some of the challenges at the device level are manufacturing variability, subthreshold leakage, power dissipation, increased circuit noise sensitivity, and cost/performance improvement. At the system level, some of the challenges are effective utilization of more than a billion gates, system integration issues, power, and performance. Historically, the semiconductor industry has overcome many hurdles through innovative solutions, the most recent being the transition to silicon-on-insulator (SOI) technology [Cellar 2003]. The ever-persistent need to develop new materials (e.g., high-K and low-K dielectrics [Luryi 2004]), shrink device geometries (e.g., dual-gate or fin field-effect transistor (FinFET) devices [Wong 2002]), manage power, decrease supply voltages, and reduce manufacturing costs poses ever-increasing challenges with each shrink in the CMOS technology node.

Although temporary solutions to these challenges will continue to be found, alternative devices need to be explored for possible replacement of or integration within CMOS. Some of the emerging candidates include carbon nanotubes (CNTs) [Iijima 1991] [Fuhrer 2000] [Rueckes 2000] [Bachtold 2001], silicon nanowires (NWs) [Kamins 2000] [Huang 2001], resonant tunneling diodes (RTDs) [Chen 1996], single electron transistors (SETs) [Likharev 1999], and quantum-dot cellular automata (QCA) [Tougaw 1994]. The goal is to try to introduce some of these devices at the 22-nm node or below [Wong 2006]. These devices promise to overcome many of the fundamental physical limitations of lithography-based silicon VLSI technology. Some of these include thermodynamic limits (power), gate oxide thickness, and changes in dopant concentrations that affect device behavior (i.e., variability). Owing to their small size, it is projected that these nanoscale devices will be able to achieve densities of 10^12 devices per cm^2 and operate at terahertz frequencies [Butts 2002].

Although it is premature to predict how far and how fast CMOS will down-scale and which of the aforementioned nanoscale devices will eventually enter production, it is certain that nanoscale devices will exhibit high manufacturing defect rates. Furthermore, it is also clear that the supply voltage, VDD, will be aggressively scaled down to reduce power consumption. The International Technology Roadmap for Semiconductors (ITRS) published by the Semiconductor Industry Association (SIA) in 2004 predicts VDD will be at 0.5 V for low-power CMOS in 2018 [SIA 2004]. However, VDD at 0.3 V has also been predicted in [Iwai 2004]. This reduction in noise margins will further reduce circuit reliability and expose computation to high transient error rates (e.g., soft errors). In addition, the economics of manufacturing devices at the nanoscale might dictate the use of regular layout or self-assembled structures for a large proportion of the design. This would be a stark paradigm shift from CMOS, where the designer has significantly more control over the placement and configuration of individual components on the silicon substrate.

Today’s integrated circuits are designed using a top-down approach where lithography imposes a pattern. Unnecessary bulk material is then etched away to generate the desired structure. An alternative bottom-up approach, which avoids the sophisticated and expensive lithographic process, utilizes self-assembly, in which nanoscale devices can be self-assembled on a molecule-by-molecule basis. Examples of such devices include QCA, silicon nanowires, and carbon nanotubes. Self-assembly processes promise to lower manufacturing costs considerably, but at the expense of reduced control of the exact placement of these devices. Without fine-grained control, these devices will certainly exhibit higher defect rates.

In addition to lacking control of precise placement of nanoscale devices, designers will also need to consider the effect of using a fabrication process that yields devices that are only a few atoms in diameter. For instance, the contact area between silicon nanowires is a few tens of atoms. With such small cross-sectional and contact areas, fragility of these devices will be orders of magnitude higher than devices currently being fabricated using conventional lithographic techniques. This will result in higher susceptibility to static and transient faults.

Integrated circuits, in general, require thorough testing to identify defective components. If a circuit is found to be defective, it can simply be discarded. However, to improve yield, defect tolerance techniques can be applied. For example, spare rows and columns are added in memories in case other rows or columns are found to be defective. Defect tolerance schemes often require high-resolution diagnosis to precisely locate defective resources. Only then can repair be attempted. To improve overall system reliability, periodic testing may also be employed during system operation to identify defects that appear later because of breakdown from device wearout. Finally, defect tolerance can also be used to detect temporary failures (i.e., failures caused by transient or intermittent faults).

Figure 17.1a shows how the various test and diagnosis techniques are used. Application-dependent test and diagnosis techniques are useful for defect tolerance and also for detection, location, and repair of permanent faults during the normal operation of a fault-tolerant reconfigurable system. Application-independent test and diagnosis techniques are used after manufacturing, mainly for identifying defective parts and also for defect tolerance. Test and diagnosis during system operation are complex tasks; however, they help detect permanent and transient faults and hence improve the overall system reliability (see Figure 17.1b).

Figure 17.1. Test and diagnosis for defect and fault tolerance.

Given that nanoscale circuits will have higher rates of faults and defects, several works have stressed the need for aggressive defect and fault tolerance for such circuits [Collier 1999] [Butts 2002] [Goldstein 2002] [DeHon 2003a] [Mishra 2003] [Mitra 2004]. Different fault models, test generation, and fault tolerance schemes will need to be developed for these nanoscale devices. In this chapter, we review some issues and trends arising from nanoscale computing and the related test and defect tolerance challenges. In particular, we focus on three promising emerging devices: RTDs and QCA, as well as carbon nanotubes and silicon nanowires. We discuss test methodologies for each of the devices. Also, we discuss defect tolerance by presenting techniques to tolerate imperfections and variations in carbon nanotube transistors. We conclude this chapter with a discussion of future challenges and trends in nanoscale computing.

Resonant Tunneling Diodes and Quantum-Dot Cellular Automata

Although CMOS technology is not predicted to reach fundamental scaling limits for another decade, alternative emerging technologies are being researched in hopes of launching a new era in nanoelectronics. In this section, we concentrate on two promising nanotechnologies: resonant tunneling diodes (RTDs) and quantum-dot cellular automata (QCA). A combination of RTDs and heterostructure field-effect transistors (HFETs) can implement threshold gates, and QCA can implement majority gates, which are a special type of threshold gate. Threshold and majority network design was an active area of research in the 1950s and 1960s. Because of the emergence of these nanotechnologies, interest in this area has been revived. Although nanotechnologies are expected to bring us very high logic density, low power consumption, and high performance, as mentioned earlier, they are also expected to suffer from high defect rates. This makes efficient test generation for such circuits a necessity.

In this section, we first discuss testing of threshold networks with application to RTDs. Next, we describe testing of majority networks with application to QCA. An example is presented to illustrate testing of stuck-at faults. Furthermore, testing of bridging faults is required for QCA majority gates and interconnects. Finally, a test generation methodology is presented to target QCA defects in majority gates and interconnects.

Testing Threshold Networks with Application to RTDs

A threshold function is a multi-input function in which each digital input xi, i ∊ {1, 2, ..., n}, is assigned a weight wi such that the output function assumes the value 1 if and only if the weighted sum of the inputs equals or exceeds the value of the function's threshold, T [Muroga 1971], that is:

Equation 17.1.

f(x1, x2, ..., xn) = 1 if w1x1 + w2x2 + ... + wnxn ≥ T, and 0 otherwise

A threshold gate is a multiterminal device that implements a threshold function. We will use the weight-threshold vector <w1, w2, ... , wn; T> to denote the weights and threshold of a threshold gate.
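The definition above is easy to state behaviorally. The following minimal sketch (Python; the function name is illustrative, not from the source) evaluates a threshold function from its weight-threshold vector, using the gate <2, 1, 1; 3> that serves as a running example later in this section:

```python
def threshold_gate(weights, T, inputs):
    """Output 1 iff the weighted sum of the inputs equals or exceeds T."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= T else 0

# The gate <2, 1, 1; 3> realizes f(x1, x2, x3) = x1x2 + x1x3:
print(threshold_gate([2, 1, 1], 3, (1, 1, 0)))  # 1, since 2 + 1 >= 3
print(threshold_gate([2, 1, 1], 3, (0, 1, 1)))  # 0, since 1 + 1 < 3
```

Enumerating all eight input combinations confirms that this weight-threshold vector realizes exactly the Boolean function x1x2 + x1x3.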

A threshold function can be realized by a monostable-bistable transition element (MOBILE), which is composed of RTDs and HFETs, such as the one shown in Figure 17.2a [Chen 1996]. Figure 17.2b shows a MOBILE's equivalent threshold gate representation. The modulation current, ΔI, applied at the output node determines what digital state the device transitions to [Pacha 1999]. The modulation current is obtained from Kirchhoff's current law and is given as:

Equation 17.2.

ΔI = (Σ(i=1..Np) wi xi − Σ(j=1..Nn) wj xj) · I(Vgs)

where Np and Nn are the number of positively and negatively weighted inputs, respectively, and I(Vgs) is the peak current of a minimum-sized RTD. The net RTD current for the load and driver is IT = T·I(Vgs). Consequently, the output is logic high if ΔI − IT is positive and logic low otherwise.
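This switching condition can be sketched behaviorally as follows (Python; the function name and the value used for I(Vgs) are illustrative assumptions, and the tie case ΔI = IT is mapped to logic high here so that the behavior agrees with the "equals or exceeds" condition of Equation 17.1):

```python
def mobile_output(pos_inputs, neg_inputs, T, i_peak=1.0):
    """Behavioral model of a MOBILE threshold gate.

    pos_inputs / neg_inputs: lists of (weight, bit) pairs for the
    positively and negatively weighted inputs; i_peak models I(Vgs),
    the peak current of a minimum-sized RTD (illustrative value).
    """
    # Equation 17.2: modulation current applied at the output node
    delta_i = (sum(w * x for w, x in pos_inputs)
               - sum(w * x for w, x in neg_inputs)) * i_peak
    i_t = T * i_peak  # net RTD current of the load and driver
    # Tie case taken as logic high to match Equation 17.1
    return 1 if delta_i - i_t >= 0 else 0
```

For the gate <2, 1, 1; 3>, mobile_output([(2, 1), (1, 1), (1, 0)], [], 3) evaluates to 1, matching the threshold-function definition for input 110.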

Figure 17.2. (a) A threshold gate implemented using a MOBILE and (b) its schematic representation.

Research in Boolean testing has flourished since the 1960s [Jha 2003]. On the other hand, there has been virtually no research in testing arbitrary threshold networks. The bulk of research in threshold logic was done in the 1950s and 1960s and focused primarily on the synthesis of threshold networks [Muroga 1971]. A practical methodology for synthesis of multilevel threshold networks has been presented [Zhang 2004a]. A survey of technologies capable of implementing threshold logic can be found in [Beiu 2003].

The first step in test generation is to decide which fault model to test for [Gupta 2004]. To obtain a fault model, it is important to evaluate the impact of cuts (opens) and shorts on MOBILEs. Figure 17.3a shows the cuts and shorts in a MOBILE that can be modeled as single stuck-at faults (SSFs) at the logic level. A cut (sites 1, 2, and 3) on an HFET or on a line connecting the RTD and HFET will render it permanently nonconducting and is modeled as a stuck-at-0 (SA0) fault. Similarly, a short across an RTD (site 4) or the driver RTD (site 8) is also modeled as an SA0 fault because in the former, the input weight becomes zero, whereas in the latter, there is a direct connection between the output and ground. A cut at site 6 represents either an SA0 or a stuck-at-1 (SA1) fault, depending on the threshold of the gate: if the threshold is less than zero, the cut is modeled as an SA1 fault; otherwise, it is modeled as an SA0 fault. On the other hand, faults at sites 5 and 7 are modeled as SA1 faults. A short across the HFET will make it conduct permanently, whereas a direct connection between the output and the bias voltage will exist in the presence of a short across the load RTD, making the fault appear as an SA1 fault when the MOBILE is active. These fault models have been verified through HSPICE simulations. HSPICE models for RTD-HFET gates are available from [Prost 2000].

Figure 17.3. (a) Fault modeling of a threshold gate with (b) no faults, (c) an SA0 fault, and (d) an SA1 fault.

The next step in test generation is redundancy identification and removal. In Boolean testing, irredundant networks are intricately related to circuit testability [Jha 2003]. The same is true for irredundant threshold networks. If no test vector exists for detecting fault s in a threshold network G, then s is redundant. In such a case, the corresponding node or edge in the network can be removed without affecting the functionality of G.

The rules for removing a redundant fault in a threshold network are as follows:

  1. If an SA0 fault on an edge is redundant, the edge in the network can be removed, as shown in Figure 17.3c.

  2. If an SA1 fault on an edge is redundant, the edge in the network can be removed, as shown in Figure 17.3d. In addition, the threshold of the nodes in the edge’s fanout must be lowered by the weight of the removed edge.

Furthermore, all nodes and edges in the subnetwork that do not fan out and are in the transitive fanin of the removed edge can be removed from the network in both cases.

The next step is the actual test generation step. To find a test vector for a fault at input xi of a threshold gate, it is necessary that the term wixi in Equation (17.1) be the dictating factor in determining the output value of the function. The following theorem gives the conditions the test vector has to satisfy:

Theorem 17.1

To find test vectors for xi SA0 and xi SA1 in a threshold gate implementing the threshold function f(x1, x2, ..., xn), we must find an assignment on the remaining input variables such that one of the following inequalities is satisfied:

Equation 17.3.

T − wi ≤ Σ(j≠i) wj xj < T

or

Equation 17.4.

T ≤ Σ(j≠i) wj xj < T − wi

If an assignment exists, then <x1, x2, ..., xi = 1, ..., xn> and <x1, x2, ..., xi = 0, ..., xn> are test vectors for the xi SA0 and xi SA1 faults, respectively. If no assignment exists, then both faults are untestable and, therefore, redundant.

The following theorem shows how to reduce the test generation burden by obtaining the test vector of one fault directly from that of another:

Theorem 17.2

In a threshold gate implementing the threshold function f(x1, x2,...,xn), if there exist two (or more) inputs xj and xk such that wj = wk, then test vectors to detect xk SA0 and xk SA1 can be obtained simply by interchanging the bit positions of xj and xk in the SA0 and SA1 test vectors for xj, respectively, assuming they exist.

The preceding theorems can best be illustrated by the following example. Consider the threshold gate that realizes the threshold function f(x1, x2, x3) = x1x2 + x1x3, with weight-threshold vector <2, 1, 1; 3>. To test for x1 SA0, the inequalities to be satisfied are 1 ≤ (w2x2 + w3x3) < 3 or 3 ≤ (w2x2 + w3x3) < 1; the latter is unsatisfiable here because w1 is positive. This leads to three test vectors, namely 101, 110, and 111. The test vectors for x1 SA1 can be easily obtained by replacing x1 = 1 with x1 = 0 in the original test vectors. Thus, vectors 001, 010, and 011 detect x1 SA1. Finally, given that vector 110 is a test for x2 SA0 and because w2 = w3, a test vector that detects x3 SA0 is obtained by interchanging the bit positions of x2 and x3 in 110 to get 101.
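The inequality check of Theorem 17.1 translates directly into a small test generator. The following sketch (Python; the helper name is hypothetical) enumerates the SA0 and SA1 test vectors for one input by exhausting the assignments of the remaining inputs:

```python
from itertools import product

def input_fault_tests(weights, T, i):
    """Tests for x_i SA0 (x_i = 1) and x_i SA1 (x_i = 0) per Theorem 17.1."""
    n = len(weights)
    others = [j for j in range(n) if j != i]
    sa0, sa1 = [], []
    for bits in product((0, 1), repeat=n - 1):
        s = sum(weights[j] * b for j, b in zip(others, bits))
        # Equation (17.3) or (17.4): the term w_i * x_i decides the output
        if (T - weights[i] <= s < T) or (T <= s < T - weights[i]):
            vec = [0] * n
            for j, b in zip(others, bits):
                vec[j] = b
            vec[i] = 1
            sa0.append(tuple(vec))       # test for x_i SA0
            vec = vec.copy()
            vec[i] = 0
            sa1.append(tuple(vec))       # test for x_i SA1

    return sa0, sa1

# Gate <2, 1, 1; 3> realizing f = x1x2 + x1x3, faults on x1:
sa0, sa1 = input_fault_tests([2, 1, 1], 3, 0)
print(sorted(sa0))  # [(1, 0, 1), (1, 1, 0), (1, 1, 1)], i.e., 101, 110, 111
print(sorted(sa1))  # [(0, 0, 1), (0, 1, 0), (0, 1, 1)], i.e., 001, 010, 011
```

The output reproduces the hand-derived vectors of the example above; an empty result for both lists would flag the pair of faults as redundant, per Theorem 17.1.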

The D-algorithm for threshold networks depends on the concepts of primitive D-cube of a fault (PDCF), propagation D-cubes, and singular covers [Jha 2003] [Wang 2006]. A PDCF for a stuck-at fault in a threshold gate can be obtained simply by obtaining a test vector for the fault that results in a D or D′ at its output, using the theorems given previously.

Propagation D-cubes are used to sensitize a path from the fault site to one (or more) primary outputs. Knowing the threshold function that is implemented by a threshold gate, we can use algebraic substitution to determine the propagation D-cubes by using the D-notation. For example, to determine the propagation D-cubes from x1 in f(x1, x2, x3) = x1x2 + x1x3, substituting D for x1 in f we get Dx2 + Dx3. For the fault to propagate, it is required that only the cubes containing D (or D′) get “activated” in f. In this case, because both cubes contain D, activating either or both cubes will result in a propagation D-cube. Thus, the propagation D-cubes for x1 are {D10D, D01D, D11D}. Of course, {D′10D′, D′01D′, D′11D′} are also propagation D-cubes.
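The substitution can also be performed mechanically: a side-input assignment propagates the fault exactly when flipping xi flips the output (a Boolean-difference check). A sketch (Python; the helper name is hypothetical):

```python
from itertools import product

def propagation_d_cubes(f, n, i):
    """Side-input assignments that sensitize x_i to the output of f."""
    cubes = []
    for bits in product((0, 1), repeat=n):
        if bits[i] != 0:         # enumerate each side-input assignment once
            continue
        lo = list(bits)           # x_i = 0
        hi = list(bits)
        hi[i] = 1                 # x_i = 1
        if f(*lo) != f(*hi):
            cube = ''.join('D' if j == i else str(bits[j]) for j in range(n))
            # Output is D when it follows x_i's value, D' when it is inverted
            cubes.append(cube + ('D' if f(*hi) == 1 else "D'"))
    return cubes

f = lambda x1, x2, x3: (x1 and x2) or (x1 and x3)
print(propagation_d_cubes(f, 3, 0))  # ['D01D', 'D10D', 'D11D']
```

This reproduces the three propagation D-cubes for x1 derived algebraically above.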

Singular covers are used in test generation to justify the assignments made to the output of a threshold gate. They are easily obtained from the threshold function of the threshold gate. Consider the threshold network in Figure 17.4. Suppose we want to derive a test vector for x1 SA1. The PDCF for this fault in gate G1 is 0000D′. Using the propagation D-cube of gate G2 as shown, the fault effect can be propagated to circuit output f1. This requires 1 to be justified on line c2 through the application of the relevant singular cube to gate G3 as shown. Thus, a test vector for the above fault is (0, 0, 0, 0, φ, 1, 0, 0, φ, φ).

Figure 17.4. Testing for x1 SA1.

To reduce the run-time of test generation, it is necessary to reduce the number of faults in the fault list. This fault collapsing can be done by exploiting fault dominance relationships:

Theorem 17.3

The following fault dominance relationships hold in a threshold gate that implements the threshold function f(x1, x2, ..., xn):

  1. An output f SA0 (SA1) fault dominates an xi SA0 (SA1) fault if Equation (17.3) is satisfied.

  2. An output f SA1 (SA0) fault dominates an xi SA0 (SA1) fault if Equation (17.4) is satisfied.

To demonstrate this theorem, consider the threshold function f(x1, x2, x3) = x1x2 + x1x3 again. Applying the theorem, we see that f SA0 (SA1) dominates x1 SA0 (SA1), x2 SA0 (SA1), and x3 SA0 (SA1). Hence, f SA0 and f SA1 can be discarded from the fault list. Exploiting this theorem leads to the following theorem on test generation for irredundant combinational threshold networks, which is similar to the one used for Boolean testing:

Theorem 17.4

In an irredundant combinational threshold network G, any test set V that detects all SSFs on the primary inputs and fanout branches detects all SSFs in G.
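For the running example f = x1x2 + x1x3, the dominance relationships of Theorem 17.3 can be confirmed by exhaustive fault simulation. A small sketch (Python; the helper names are illustrative):

```python
from itertools import product

f = lambda x1, x2, x3: (x1 and x2) or (x1 and x3)
vectors = list(product((0, 1), repeat=3))

def input_fault_tests(i, stuck):
    """Vectors on which forcing x_i to `stuck` changes the output."""
    hits = set()
    for v in vectors:
        faulty = list(v)
        faulty[i] = stuck
        if f(*v) != f(*faulty):
            hits.add(v)
    return hits

def output_fault_tests(stuck):
    """Vectors on which forcing the output f to `stuck` is observable."""
    return {v for v in vectors if f(*v) != stuck}

# Dominance: every test for an x_i SA0 (SA1) fault also detects f SA0
# (SA1), so the output faults can be dropped from the fault list.
for i in range(3):
    assert input_fault_tests(i, 0) <= output_fault_tests(0)
    assert input_fault_tests(i, 1) <= output_fault_tests(1)
print("dominance confirmed")
```

The subset checks pass for all three inputs, matching the conclusion in the text that f SA0 and f SA1 can be discarded from the fault list.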

Testing Majority Networks with Application to QCA

We next present a test generation framework for majority networks for application to the testing of QCA circuits [Gupta 2006]. QCA is a nanotechnology that has attracted significant recent attention and shows immense promise as a viable future technology [Amlani 1998, 1999] [Lieberman 2002] [Tougaw 1994]. In QCA, logic states, rather than being encoded as voltage levels as in conventional CMOS technology, are represented by the configuration of an electron pair confined within a quantum-dot cell. QCA promises small feature size and ultralow power consumption. It is believed that a QCA cell of a few nanometers can be fabricated through molecular implementation by a self-assembly process [Hennessy 2001]. If this does hold true, then it is anticipated that QCA can achieve densities of 10^12 devices/cm^2 and operate at THz frequencies [Tahoori 2004a].

Since its initial proposal, QCA has attracted significant attention. Consequently, various researchers have addressed different computer-aided design (CAD) problems for QCA. In [Zhang 2004a], the authors have developed a majority logic synthesis tool for QCA called MALS. Majority logic synthesis has also been addressed in [Zhang 2004b]. A tool called QCADesigner for manual layout of QCA circuits has been presented in [Walus 2004]. This tool also offers various simulation engines, each offering a tradeoff between speed and accuracy, to simulate the layout for functional correctness. The authors of [Tahoori 2004a] and [Tahoori 2004b] characterized in detail the types of defects that are most likely to occur in the manufacturing of QCA circuits.

We first introduce some basic concepts. A QCA cell, shown in Figure 17.5, contains four quantum dots positioned at the corners of a square and two electrons that can move to any quantum dot within the cell through electron tunneling [Tougaw 1994]. Because of Coulombic interactions, only two stable configurations of the electron pair exist. Assigning a polarization P of –1 and +1 to distinguish between these two configurations leads to a binary logic system.

Figure 17.5. A QCA cell. The logic states are encoded in the electron pair configuration.

The fundamental logic gate in QCA is the majority gate. The output of a three-input majority gate M is logic 1 if two or more of its inputs are logic 1. That is:

Equation 17.5.

M(A, B, C) = AB + BC + CA

From this point onward, a three-input majority gate will be simply referred to as a majority gate.

Computation in a QCA majority gate, as shown in Figure 17.6a, is performed by driving the device cell to its lowest energy state. This is achieved when the device cell assumes the polarization of the majority of the three input cells. The device cell always assumes the majority polarization because this is the polarization state in which the Coulombic repulsion between electrons in the input cells is minimized [Tougaw 1994]. The polarization of the device cell is then transferred to the output cell.

Figure 17.6. A QCA (a) majority gate, (b) an inverter, (c) binary wire, and (d) inverter chain.

The schematic diagrams of an inverter and interconnects are also shown in Figure 17.6. In the QCA binary wire shown in Figure 17.6c, information propagates from left to right. An inverter chain, shown in Figure 17.6d, can be constructed if the QCA cells are rotated by 45°. Furthermore, it is possible to implement two-input AND and OR gates by permanently fixing the polarization of one of the input cells of a majority gate to –1 (logic 0) and +1 (logic 1), respectively. Finally, the majority gate and inverter constitute a functionally complete set (i.e., they can implement any arbitrary Boolean function).
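The majority gate and the fixed-input AND/OR constructions described above are easy to state behaviorally. A minimal sketch (Python; the function names are illustrative):

```python
def majority(a, b, c):
    """Three-input majority gate M (Equation 17.5): M(A, B, C) = AB + BC + CA."""
    return 1 if a + b + c >= 2 else 0

# Fixing the polarization of one input cell specializes the gate:
def and2(a, b):
    return majority(a, b, 0)   # input fixed to polarization -1 (logic 0)

def or2(a, b):
    return majority(a, b, 1)   # input fixed to polarization +1 (logic 1)

print(and2(1, 1), and2(1, 0))  # 1 0
print(or2(0, 0), or2(1, 0))    # 0 1
```

Together with the inverter, these gates form the functionally complete set the text refers to.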

The types of defects that are likely to occur in the manufacturing of QCA devices have been investigated in [Momenzadeh 2004], [Tahoori 2004a], and [Tahoori 2004b] and are illustrated in Figure 17.7. They can be categorized as follows:

  1. In a cell displacement defect, the defective cell is displaced from its original position. For example, in Figure 17.7b the cell with input B is displaced to the north by Δ nm from its original position (see Figure 17.7a).

  2. In a cell misalignment defect, the direction of the defective cell is not properly aligned. For example, in Figure 17.7c the cell with input B is misaligned to the east by Δ nm from its original position.

  3. In a cell omission defect, the defective cell is missing as compared to the defect-free case. For example, in Figure 17.7d the cell with input B is not present.

Figure 17.7. (a) Defect-free majority gate, (b) displacement defect, (c) misalignment defect, and (d) omission defect.

In QCA interconnects, cell displacement and omission defects on binary wires and inverter chains were also considered in [Momenzadeh 2004]. These defects can be modeled using a dominant bridging fault model in which the logic value of the dominated wire is determined by the logic value on the dominant wire. Many scenarios can occur in the presence of a bridging fault, as illustrated in Figure 17.8. In the first scenario, shown in Figure 17.8b, the second cell is displaced to the north from its original position by Δ nm. In this case, the dominated wire O2 will have a logic value equal to that of the dominant wire O1. However, if the fourth cell is displaced, as shown in Figure 17.8c, then O2 will have a logic value equal to the complement of the logic value on O1 (i.e., O2 = O1′). Finally, if multiple cells are displaced, as shown in Figure 17.8d, then O2 will also equal O1′.
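Behaviorally, the dominant bridging fault model reduces to a one-line rule. The sketch below (Python; the function name is hypothetical) captures the scenarios of Figure 17.8, with `inverting` selecting whether the dominated wire takes O1 or its complement:

```python
def dominant_bridge(o1, inverting=False):
    """Dominant bridging fault on a QCA wire pair: the dominant wire O1
    is unaffected, and the dominated wire O2 is overwritten with O1
    (Figure 17.8b) or its complement O1' (Figures 17.8c and 17.8d)."""
    o2 = (1 - o1) if inverting else o1
    return o1, o2

print(dominant_bridge(1))                  # (1, 1): O2 follows O1
print(dominant_bridge(1, inverting=True))  # (1, 0): O2 = O1'
```

Note that the value driven onto O2 by its own source is irrelevant under this model, which is what distinguishes a dominant bridge from a wired-AND/OR bridge.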

Figure 17.8. (a) Fault-free binary wire and (b)–(d) binary wire with a bridging fault.

As a motivational example, consider the majority circuit shown in Figure 17.9a, which contains three majority gates (M1 to M3), seven primary inputs (A to G), and one primary output (O). Because there are 10 lines in this circuit, there are 20 SSFs. However, we only need to target SA0/SA1 faults at the primary inputs in this circuit to guarantee detection of all 20 SSFs. This is because majority gates are also threshold gates with weight-threshold vector <1, 1, 1; 2>. Hence, Theorem 17.4 is also applicable to majority networks. Because there are 7 primary inputs, there are 14 SSFs that need to be targeted during test generation. Figure 17.9b shows a minimal single SSF test set for the circuit. For each test vector in Figure 17.9b, all the SSFs detected by it are also shown.

Figure 17.9. (a) Example majority circuit, (b) minimal SSF test set, (c) input vectors received by each majority gate, and (d) QCA layout of the circuit with a bridging fault between inputs C and D.

Given the test set in Figure 17.9b, Figure 17.9c shows the input vectors that the individual majority gates of the circuit receive in the fault-free case. For example, note that M2 receives input vectors 001, 011, 100, and 101. Now, consider fault I SA0, which is detected by three test vectors, namely 1100110, 0110110, and 0001011. Given the application of vectors 1100110 and 0110110, M3 will receive input vector 110 in the fault-free case. If vector 0001011 is applied, M3 will receive input vector 011. In all these cases, the expected fault-free logic value at output O is logic 1. However, in the presence of I SA0, output O becomes logic 0, thus detecting the fault.

Figure 17.9c shows that M1 receives input vectors 000, 001, 011, 100, and 110. Even though these vectors form a complete SSF test set for a majority gate, they are not a complete test set for detecting all simulated defects that were presented in [Momenzadeh 2004], [Tahoori 2004a], and [Tahoori 2004b]. In fact, five defects (three misalignments and two displacements on the QCA cells in M1) cannot be detected by these input vectors. If we add vector 0100110 to our original test set, then M1 will also receive input vector 010. Note that the effect of the last four of the preceding five vectors is also propagated from the output of M1 to output O. This is now a complete test set for all simulated defects in M1. Gates M2 and M3 already receive input vectors, which form a complete defect test set and hence require no additional test vectors (the effect of these vectors is also propagated to output O).

Figure 17.9d shows a possible QCA layout for the circuit in Figure 17.9a (different shades in QCA cells indicate different phases of a four-phase clock that is typically used for QCA circuits [Hennessy 2001]). Consider the QCA cell displaced from the binary wire of input D so that there exists a bridging fault between inputs C and D. In this case, C dominates D. Furthermore, it is unknown whether the bridging fault will result in D = C or D = C′. To detect this defect, we need vectors to test for two of four possible conditions. Condition D SA0 with C = 0 or D SA1 with C = 1 must be tested, and condition D SA1 with C = 0 or D SA0 with C = 1 must be tested. Note that we will explain how we obtained these conditions in the next section. The first and second conditions can be tested by vectors 0001011 and 0010011 that are already present in our test set. However, we currently have no vector that can test either of the latter two conditions. Therefore, additional test generation is required, and vectors 0000011 and 0011011 are derived as tests for the third and fourth conditions, respectively. Adding either of these two vectors is sufficient as we only need to satisfy either the third or fourth condition for fault detection. The bridging fault between C and D, when C is dominant and D is dominated, can now be completely tested for all simulated defects.

The preceding example illustrates that the SSF test set of a circuit cannot guarantee the detection of all simulated defects in the QCA majority gates. Therefore, test generation will be required to cover the defects not covered by the SSF test set. In addition, test generation will also be needed to cover bridging faults on QCA interconnects. For a majority gate, there are nine minimal SSF test sets that contain four test vectors each. The minimal test sets were applied to a QCA majority gate, and all defects described in [Momenzadeh 2004], [Tahoori 2004a], and [Tahoori 2004b] were simulated using QCADesigner [Walus 2004]. Table 17.10 shows the results of this experiment. Of the nine minimal test sets, three test sets were unable to detect all the simulated defects.
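The count of nine minimal four-vector SSF test sets can be verified by brute force over the majority gate's six collapsed input faults (per Theorem 17.4, only input faults need targeting). A sketch (Python):

```python
from itertools import combinations, product

def majority(a, b, c):
    return 1 if a + b + c >= 2 else 0

vectors = list(product((0, 1), repeat=3))
faults = [(i, s) for i in range(3) for s in (0, 1)]  # x_i stuck-at-s

def detects(v, i, s):
    """True if vector v detects the fault x_i stuck-at-s."""
    faulty = list(v)
    faulty[i] = s
    return majority(*v) != majority(*faulty)

def complete(test_set):
    return all(any(detects(v, i, s) for v in test_set) for i, s in faults)

# No 3-vector set suffices, and exactly nine 4-vector sets do:
assert not any(complete(ts) for ts in combinations(vectors, 3))
minimal = [ts for ts in combinations(vectors, 4) if complete(ts)]
print(len(minimal))  # 9
```

Note that 000 and 111 detect no input SSF of a majority gate, so every minimal set draws two vectors from {110, 101, 011} (for the SA0 faults) and two from {001, 010, 100} (for the SA1 faults), giving 3 × 3 = 9 sets, as enumerated in Table 17.10.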

Table 17.10. Minimal SSF test sets for a majority gate. Some of these test sets, however, are not 100% defect test sets for a QCA majority gate.

SSF Test Set            100% Defect Coverage?
{001, 010, 011, 101}    Yes
{001, 100, 101, 110}    Yes
{010, 011, 100, 101}    No (3 uncovered defects)
{001, 011, 100, 101}    Yes
{010, 100, 101, 110}    Yes
{001, 010, 011, 110}    Yes
{001, 010, 101, 110}    No (1 uncovered defect)
{001, 011, 100, 110}    No (5 uncovered defects)
{010, 011, 100, 110}    Yes
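
The claim that a three-input majority gate has exactly nine minimal SSF test sets of four vectors each can be checked by exhaustive enumeration. The following sketch (function names are illustrative) simulates every single stuck-at fault on the gate's inputs and output:

```python
from itertools import combinations, product

def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)

def detects(vec, line, stuck):
    """True if vector `vec` detects line `line` stuck-at-`stuck`.
    Lines 0..2 are the inputs a, b, c; line 3 is the output F."""
    good = majority(*vec)
    if line == 3:
        return good != stuck             # output fault: fault-free value must differ
    if vec[line] == stuck:
        return False                     # input fault not excited
    faulty = list(vec)
    faulty[line] = stuck
    return majority(*faulty) != good

all_vectors = list(product((0, 1), repeat=3))
faults = [(line, stuck) for line in range(4) for stuck in (0, 1)]

def complete_ssf(test_set):
    return all(any(detects(v, l, s) for v in test_set) for (l, s) in faults)

# No 3-vector set is complete, so the complete 4-vector sets are minimal.
minimal = [s for s in combinations(all_vectors, 4) if complete_ssf(s)]
print(len(minimal))   # 9
```

The nine sets found this way are exactly those of Table 17.10; the table's defect-coverage column, by contrast, comes from QCADesigner defect simulation and cannot be reproduced from the SSF model alone.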

Consider the majority gate M in Figure 17.11 that is embedded in a larger network. Suppose that after SSF test generation for the entire circuit, it is determined that M receives input vectors 010, 011, 100, and 101. According to Table 17.10, this is not a complete defect test set, as three defects remain uncovered. We can make it complete by ensuring that M also receives either 001 or 110 as its input vector.

Figure 17.11. Testing of defects in a QCA majority gate.

It is very important to test for defects on QCA interconnects because it is predicted that interconnects will consume the bulk of chip area in future nanotechnologies. As Figures 17.8b to 17.8d show, a displacement of the QCA cells on a wire will result in a bridging fault between the two wires. Using QCADesigner [Walus 2004], it can be verified that such defects can be modeled using the dominant bridging fault model [Tahoori 2004a]. However, depending on the displacement distance, the lower wire will assume either the upper wire’s logic value or its complement.

Table 17.12 shows the possible scenarios that can result if a bridging fault is present between two wires. Which scenario occurs depends on the displacement distance Δ of the defective cell, or on the number of cells that get displaced. The table also shows the conditions that must be satisfied to detect each scenario. In the first scenario (Table 17.12a), the lower wire's logic value equals that of the upper wire, whereas in the second scenario (Table 17.12b), the lower wire's logic value equals the complement of that of the upper wire. If the first scenario occurs, then a vector is required to test for one of the two conditions shown in Table 17.12a. Similarly, if the second scenario occurs, a vector is required to test for either of the two conditions shown in Table 17.12b. In reality, it will not be known which scenario has occurred, because Δ will not be known beforehand. Consequently, we need vectors that test for both scenarios to obtain a test that completely detects this fault.

Table 17.12. Modeling of bridging faults in a QCA binary wire.

(a) Scenario 1: Δ is such that B = A

Fault-free A B    Faulty A B    Equivalent Condition
0 0               0 0           -
0 1               0 0           B SA0 with A = 0
1 0               1 1           B SA1 with A = 1
1 1               1 1           -

(b) Scenario 2: Δ is such that B = A′

Fault-free A B    Faulty A B    Equivalent Condition
0 0               0 1           B SA1 with A = 0
0 1               0 1           -
1 0               1 0           -
1 1               1 0           B SA0 with A = 1

Note: Assumption is that B is dominated by A.
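
The equivalent conditions in Table 17.12 follow mechanically from the two displacement scenarios. A small sketch (the helper name is illustrative) derives them by comparing the fault-free and faulty values of the dominated wire B:

```python
def bridging_conditions(complemented):
    """Equivalent stuck-at conditions for a dominant bridging fault on wire B
    (dominated by A).  complemented=False models scenario 1 (B = A);
    complemented=True models scenario 2 (B = A')."""
    conds = []
    for a in (0, 1):
        for b in (0, 1):
            faulty_b = a ^ 1 if complemented else a
            if faulty_b != b:            # faulty value differs -> fault detectable
                conds.append(f"B SA{faulty_b} with A = {a}")
    return conds

print(bridging_conditions(False))   # ['B SA0 with A = 0', 'B SA1 with A = 1']
print(bridging_conditions(True))    # ['B SA1 with A = 0', 'B SA0 with A = 1']
```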

As an example, consider once again the majority gate in Figure 17.11. Assume there is a bridging fault between inputs A and B and it is not known whether the defective cell is on wire A or wire B. In addition, Δ is unknown. To test for this fault, four conditions need to be satisfied (the first two result from A dominating B, and the last two result from B dominating A). The conditions are as follows:

1. B SA0 with A = 0 or B SA1 with A = 1

2. B SA1 with A = 0 or B SA0 with A = 1

3. A SA0 with B = 0 or A SA1 with B = 1

4. A SA1 with B = 0 or A SA0 with B = 1

If the test set contains vectors that can satisfy the preceding conditions, then this bridging fault can be detected. Otherwise, the fault is not completely testable.

Given a QCA circuit with n lines, there are at most n(n – 1) possible bridging faults involving two wires (each ordered pair corresponds to one wire dominating the other). This assumes that layout information is not available. Consequently, 2n(n – 1) conditions need to be satisfied to obtain a test set for all bridging faults. However, at least n(n – 1) (i.e., 50%) of these conditions are already satisfied by any complete SSF test set, because whenever such a test set detects an SA0/SA1 fault on one wire of a pair, the other wire must carry either a 0 or a 1, thereby satisfying one of the conditions for that pair. If the QCA layout of the circuit is available, then the designer can determine a priori which pairs of adjacent wires need to be tested for bridging faults.
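
For an isolated majority gate, whether a test set satisfies the four disjunctive bridging-fault requirements can be checked directly by fault simulation. This sketch (illustrative names; a standalone gate is assumed, unlike the embedded gate of Figure 17.11) encodes the four numbered conditions above for a bridge between inputs A and B:

```python
def majority(vec):
    a, b, c = vec
    return (a & b) | (a & c) | (b & c)

def detects_sa(vec, line, stuck):
    """True if input vector `vec` detects a stuck-at-`stuck` fault on `line`."""
    if vec[line] == stuck:
        return False                     # fault not excited
    faulty = list(vec)
    faulty[line] = stuck
    return majority(tuple(faulty)) != majority(vec)

def satisfies(test_set, line, stuck, other, other_val):
    """Some vector detects `line` SA`stuck` while wire `other` carries other_val."""
    return any(v[other] == other_val and detects_sa(v, line, stuck)
               for v in test_set)

A, B = 0, 1
# The four disjunctive requirements for an A/B bridging fault (dominance
# direction and displacement both unknown), as listed above:
conditions = [
    [(B, 0, A, 0), (B, 1, A, 1)],   # 1. B SA0 with A = 0  or  B SA1 with A = 1
    [(B, 1, A, 0), (B, 0, A, 1)],   # 2. B SA1 with A = 0  or  B SA0 with A = 1
    [(A, 0, B, 0), (A, 1, B, 1)],   # 3. A SA0 with B = 0  or  A SA1 with B = 1
    [(A, 1, B, 0), (A, 0, B, 1)],   # 4. A SA1 with B = 0  or  A SA0 with B = 1
]

def bridge_testable(test_set):
    return all(any(satisfies(test_set, *c) for c in clause)
               for clause in conditions)

tests = [(0, 1, 1), (1, 0, 1), (0, 0, 1), (1, 0, 0)]
print(bridge_testable(tests))   # True
```

For a gate embedded in a larger circuit, `detects_sa` would have to be replaced by full-circuit fault simulation, but the clause structure is unchanged.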

Table 17.1. Singular Cover of a Majority Gate

A  B  C  F
1  1  x  1
1  x  1  1
x  1  1  1
0  0  x  0
0  x  0  0
x  0  0  0

Because a majority network is also a threshold network, the D-algorithm steps outlined earlier for threshold networks are also applicable here. For example, to obtain the propagation D-cubes of x1 SA1 in M = x1x2 + x1x3 + x2x3, substituting D for x1 in M, we get Dx2 + Dx3 + x2x3. For the fault to propagate, only the cubes containing D (or D′) may be “activated” in M. Thus, the propagation D-cubes for x1 SA1 are D10D and D01D. Of course, D′10D′ and D′01D′ are also propagation D-cubes. The propagation D-cubes for x2 and x3 are {0D1D, 1D0D, 0D′1D′, 1D′0D′} and {01DD, 10DD, 01D′D′, 10D′D′}, respectively. These can be stored in a table and used for fault propagation. The singular cover of a majority gate is shown in Table 17.1. Thus, it is possible to employ the D-algorithm to perform test generation targeting defects in QCA majority gates as well.
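
The propagation D-cubes above can be generated automatically: an input is sensitized exactly when toggling it toggles the gate output. A minimal sketch (names are illustrative):

```python
from itertools import product

def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)

def propagation_cubes(i):
    """Propagation D-cubes for a fault on input i (0-based) of a majority
    gate: assignments of the other two inputs that sensitize input i, with
    the output following the faulty input (D propagates to D)."""
    cubes = []
    for vals in product((0, 1), repeat=2):
        vec0, vec1 = list(vals), list(vals)
        vec0.insert(i, 0)
        vec1.insert(i, 1)
        if majority(*vec0) != majority(*vec1):   # input i is sensitized
            cube = list(map(str, vals))
            cube.insert(i, "D")
            cubes.append("".join(cube) + "D")
    return cubes

print(propagation_cubes(0))   # ['D01D', 'D10D']
```

Replacing D with D′ throughout each cube yields the complementary cubes (e.g., D′01D′), matching the sets listed above.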

In summary, we have discussed testing of threshold and majority networks and shown how RTD- and QCA-based circuits can be tested using these approaches. The D-algorithm can be extended to test both types of networks. Other test generation approaches, such as path-oriented decision making (PODEM) and satisfiability-based algorithms, are also possible. Although there are some similarities with testing of Boolean networks, traditional test generation approaches need to be augmented to account for the different logic primitives (threshold and majority gates) as well as the inadequacy of the SSF model in the case of QCA.

Crossbar Array Architectures

Today’s approach to designing integrated circuits uses a top-down methodology: layers are added on top of a silicon wafer, requiring hundreds of steps before the final circuit is complete. Although this process has allowed the manufacture of reliable circuits and architectures, future scaling will make the production of reliable mask sets extremely expensive. In the near future, a major shift away from top-down lithography-based fabrication may be needed to cost-effectively fabricate devices at true nanoscale dimensions.

As an alternative, bottom-up approaches rely on self-assembly to define feature sizes and may drastically reduce the number of steps required to produce a circuit. However, the biggest consequence of moving from top-down to bottom-up designs is the inability to arbitrarily determine the placement of devices or wires. Without fine control of the design, devices made with self-assembly techniques tend to be restricted to simple structures, such as two-terminal devices. Because these devices are usually nonrestoring, one design challenge is providing signal restoration between nanoscale logic stages. Furthermore, self-assembly also leads to defect rates orders of magnitude higher than those of traditional top-down approaches, so fabricating defect-free circuits will be virtually impossible. Therefore, some means of defect or fault tolerance (whether at the circuit, logic, or architecture level) must be incorporated into the design if reliable computation is to be achieved. Testing will also take on a new role: not so much to sort out defective parts, but rather to identify faulty devices/cells within the circuit so that they can be avoided during operation.

Bottom-up assembly techniques require fabrication regularity. In addition, taking a hybrid approach that uses self-assembled structures as an add-on to a CMOS subsystem may create a design framework where fault-tolerance techniques can be more effectively applied. These photo-lithographically manufactured components may be built from regular structures as well, in order to lend themselves more easily to reconfigurable architectures. That is, the desired circuit may be designed by configuring around faulty structures; because all structures are identical, one faulty element can be easily swapped out and replaced with an operational one, thereby creating a reliable system out of an unreliable substrate. On the testing side, the challenge now becomes how to quickly identify faulty structures so that they can be avoided when configuring the desired circuit.

Of the molecular-scale devices being developed using these self-assembly techniques, the nonvolatile programmable switch has gained much attention. With these bottom-up techniques, it is possible to build features (e.g., wires and programmable switches) without relying on lithography. Recent work shows how to build nanoscale programmable logic arrays (PLAs) using the bottom-up synthesis techniques being developed by physical chemists [Goldstein 2001] [Luo 2002] [DeHon 2004]. The molecular switches can provide connection and disconnection states at the crosspoints of vertical and horizontal wires in a crossbar, thereby providing a path to continue the advance of field-programmable technology beyond the end of the traditional lithographic roadmap [SIA 2005]. Such a switch can be fabricated using two layers of parallel nanowires, with the two layers perpendicular to each other, forming a 2-D array. At every crosspoint, the wires are connected together via a two-terminal nanodevice formed by the layering [Huang 2001]. These crossbar arrays are similar to PLAs and can be used as building blocks for implementing logic. The programmable feature of such crossbars can serve the purpose of making the circuits fault tolerant [Huang 2004] [DeHon 2005a] [Tahoori 2006]. The array can then be interconnected, using CMOS circuitry, as part of a hybrid nanoscale/CMOS design architecture. Recent developments suggest both plausible fabrication techniques and viable architectures for building crossbars using nanowires or nanotubes and molecular-scale switches. We describe some of these hybrid architectures in Section 17.3.1.

Comparing CMOS-scale and nanoscale crossbar-based circuits reveals fundamental differences in the defects that occur during fabrication. The transistor is the basic element in CMOS-scale circuits, whereas the crossbar is the main component in reconfigurable nanoscale circuits. The types of faults arising in these two components are significantly different as well. A crossbar is composed of wires and switches, so its defects are limited to these two components. In CMOS technology, faults appear in both transistors and interconnects, and they manifest themselves as stuck-at, open, short, and bridging faults; these faults are commonly targeted during manufacturing testing.

The two most probable defects in crossbars are (1) defects in programmable crosspoints and (2) defects in wires. Defective nanowires can be easily detected with the procedure suggested in [Tahoori 2005] and in the field programmable gate array (FPGA) literature [Stroud 1998]. The time required to test the wires of each array is linear in the code space size of the stochastic address decoder. The wire fault model includes broken wires, stuck-at-0, and stuck-at-1.

Defects in programmable switches are caused by the structure of the junctions, each of which sandwiches bistable molecules between two layers of wires. Each crosspoint contains only a few molecules [Chen 2003], and the programmability of a crosspoint comes from the bistable attribute of the molecules located in the crosspoint area. For instance, if there are too few molecules at the crosspoint, the junction may never be programmable to the closed state, or the closed state may have a higher resistance than the threshold chosen for correct operation and timing of the crossbar.

In general, the model can be abstracted into a simple crosspoint defect model. Crosspoints will be in one of these two states:

  • Stuck-closed. Crosspoint cannot be programmed to an open state; its associated vertical and horizontal wires are always connected. Crosspoints that cannot be programmed into a suitable open state will result in the entire horizontal and vertical nanowires being unusable.

  • Stuck-open. Crosspoint cannot be programmed to a closed state; its associated vertical and horizontal wires are always disconnected.

Figure 17.13 shows two logic functions (AND and OR) implemented on a simple crossbar and its equivalent diode-resistor logic. No defects are considered in these crossbars. Figure 17.14, however, shows two implementations of the function f = ab + b′c on a defective crossbar-based programmable logic array (PLA). The PLA, shown in the figure, combines an AND plane and an OR plane and can be implemented in diode logic by using a crossbar of nanowires with configurable molecular switches at the crosspoints. The PLA may also be implemented using the switching properties of FETs created with carbon nanotubes. As the figure shows, even in the presence of several defects in the crossbar, there are multiple choices for fault-free implementations of f. This is because a crossbar has a high degree of inherent redundancy and a large number of resources (switches and wires) available for implementing the function.
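
Configuring a function around crosspoint defects can be sketched as a simple placement problem: each product term needs a row whose required crosspoints are not stuck-open and that contains no stuck-closed crosspoint. The following is a minimal greedy sketch (column order and names are hypothetical, not taken from the cited work):

```python
OK, OPEN, CLOSED = 0, 1, 2   # crosspoint states: good, stuck-open, stuck-closed

def map_terms(defects, terms):
    """Greedily assign each product term (a set of column indices that must be
    programmed closed) to a distinct crossbar row, avoiding defects."""
    assignment, used = {}, set()
    for name, cols in terms.items():
        for r, row in enumerate(defects):
            if r in used:
                continue
            # A stuck-closed crosspoint shorts the whole row; a stuck-open
            # crosspoint only matters where a closed switch is required.
            if any(s == CLOSED for s in row):
                continue
            if all(row[c] != OPEN for c in cols):
                assignment[name] = r
                used.add(r)
                break
        else:
            return None   # term could not be placed on any defect-free row
    return assignment

# 4x4 AND plane with a few defects; f = ab + b'c needs two product terms.
# Hypothetical column order: 0 = a, 1 = b, 2 = b', 3 = c.
defects = [
    [OK, OPEN, OK, OK],       # row 0: stuck-open crosspoint under column b
    [OK, OK, OK, OK],
    [OK, OK, CLOSED, OK],     # row 2: stuck-closed crosspoint -> row unusable
    [OK, OK, OK, OK],
]
terms = {"ab": {0, 1}, "b'c": {2, 3}}
print(map_terms(defects, terms))   # {'ab': 1, "b'c": 0}
```

The redundancy argument in the text shows up directly: the term b′c can still use row 0, because the stuck-open crosspoint there lies under a column it does not need.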

Figure 17.13. Simple logic functions using diode-resistor logic: (a) AND gate implemented on a crossbar and its diode-resistor equivalent circuitry and (b) OR gate implemented on a crossbar and its diode-resistor equivalent circuitry.

Figure 17.14. Two different implementations of f = ab + b’c on a defective crossbar. Complements of input signals are also provided because it is diode-resistor logic.

Hybrid Nanoscale/CMOS Structures

In this section, we provide two examples of hybrid nanoscale/CMOS circuits and architectures being recently proposed. These hybrid designs combine nanoscale devices and nanowires with larger CMOS components. The main advantage of these approaches is that the CMOS subsystem can serve as a reliable medium for connecting nanoscale circuit blocks, providing long interconnects and I/O functions.

The nanoPLA

In [DeHon 2004] and [DeHon 2005a], the authors proposed a programmable interconnect architecture built from hybrid components. The main building block, called the nano programmable logic array (nanoPLA), is built from a crossed set of N-type and P-type nanowires. An electrically switchable diode is formed at each crosspoint. The diodes then provide a programmable wired-OR plane that can be used to configure or program arbitrary logic into the PLA. The nanoPLA is programmed using lithographic-scale wires along with stochastically coded nanowire addressing [DeHon 2003b].

The nanoPLA block is shown in Figure 17.15. The block is composed of two stages of programmable crosspoints. The first stage defines the logical product terms (pterms) by creating a wired-OR of appropriate inputs. The outputs of this wired-OR plane are restored through field-effect controlled nanowires that invert the outputs (thus creating the logical NOR of the selected input signals). These restored signals are then sent to the inputs of the next stage of programmable crosspoints. Each nanowire in this plane computes the wired-OR of one or more restored pterms. The outputs of this stage are then restored in the same manner as in the first stage. The two stages together provide NOR-NOR logic (equivalent to a conventional PLA) [DeHon 2005a].

Figure 17.15. A simple nanoPLA block (taken from [DeHon 2005b]).

The nanoPLA blocks are interconnected by overlapping the restored output nanowires from each block with the wired-OR input region of adjacent nanoPLA blocks. This organization allows each nanoPLA block to receive inputs from a number of different nanoPLA blocks. With multiple input sources and outputs routed in multiple directions, the nanoPLA block can also serve as a switching block by configuring the overlap appropriately. Their experiments mapping benchmark circuits onto the proposed architecture have suggested that device density could be one to two orders of magnitude better than what is projected for the 22nm roadmap node [DeHon 2005a].

Testing of the nanoPLA needs to be done through a process of probing and discovery. A testing process is required to identify a working set of address lines to access crossbars in a nanoscale device. Once a crossbar becomes accessible, defective nanowires and crosspoints can then be identified. This information can then be stored in a defect map and used during reconfiguration.

Restoration columns (as shown in Figure 17.15) are used to identify useful addresses. The gate-side supply (the top lithographic wire contacts in the figure) is driven high, and by sensing the voltage change on the opposite supply line (the bottom set of lithographic wire contacts in the figure), the presence of a restored address can be deduced. Broken nanowires, or those with high resistance, will not be able to pull up the bottom supply contact quickly enough. This probing of addresses is repeated until enough live wires are discovered. The live addresses can then be used to program a single junction in a diode-programmable OR plane. For additional discussion of this architecture and how it is tested, see [DeHon 2005b].
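
The probing loop can be sketched abstractly: sample addresses from the stochastic decoder's code space and keep those whose nanowires respond, until enough live wires are found. This is only a toy model of the discovery process (all names and the fault model are illustrative):

```python
import random

def discover_addresses(probe, code_space, needed, max_trials=10_000):
    """Stochastic address discovery for a nanoPLA restoration plane:
    repeatedly probe addresses from the decoder's code space; `probe(addr)`
    models the supply-sensing step and returns True if a live (unbroken,
    low-resistance) nanowire responds at that address."""
    live = set()
    for _ in range(max_trials):
        addr = random.choice(code_space)
        if addr not in live and probe(addr):
            live.add(addr)
            if len(live) == needed:
                break
    return live

# Toy model: 12 addresses in the code space, 3 of the nanowires are broken.
random.seed(0)
broken = {2, 5, 9}
code_space = list(range(12))
live = discover_addresses(lambda a: a not in broken, code_space, needed=6)
print(sorted(live))   # six live addresses, none of them broken
```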

Note that the programmability of the nanoPLA allows defective devices to be avoided as a means of fault tolerance. The authors in [DeHon 2005c] have shown that when 20% of devices (i.e., crossbar diodes) were defective, only a 10% overhead in devices was needed to correctly configure the array around the defects. The lithographic circuitry and wiring that the nanoPLA is built on top of provide a reliable means of probing for defects and configuring the logic.

Molecular CMOS (CMOL)

The molecular CMOS (CMOL) circuits proposed in [Likharev 2005] and [Ma 2005] use the same crossbar array structure as the nanoPLA design, consisting of two levels of nanowires. The main difference with CMOL is how the CMOS and nanodevice layers are interfaced. Pins are distributed over the circuit in a square array on top of the CMOS stack to connect to either the lower or the upper nanowire level. The nano crossbar is turned by some angle less than 90° relative to the CMOS pin array.

By activating two pairs of perpendicular CMOS lines, two pins together with the two nanowires they contact are connected to the CMOS lines (see Figure 17.16). Each nanodevice may be uniquely accessed using this approach. That is, each device may be switched ON/OFF by applying some voltage to the selected nanowires such that the total voltage applied to the device exceeds the switching threshold of the selected nanodevices. By angling the nanoarray, the nanowires do not need to be precisely aligned with each other and the underlying CMOS layer in order to be able to uniquely access a nanodevice.

Figure 17.16. The CMOS logic cell consisting of two pass transistors and an inverter (taken from [Strukov 2005]). The function is determined by how the overlaying nanowires are programmed. Note that typically, there are many nanowires (and thus many nanodevices) available per CMOS cell.

The most straightforward application of CMOL would be for memories (embedded or stand-alone). The authors project that a CMOL-based memory chip about 2 × 2 cm in size will be able to store about 1 Tb (terabit) of data [Ma 2005]. To improve the reliability and yield of the memory array, the authors proposed adding spare lines and error-correcting codes (ECC), a standard procedure in memory array design.

The CMOL circuits have also been proposed for building FPGA-like architectures for implementing random logic [Strukov 2005]. A CMOS cell, composed of an inverter and two pass transistors, is connected to the nanowire crossbar via two pins, as shown in Figure 17.16. This essentially creates a configurable logic block (CLB) structure, similar to that found in an FPGA. The CMOS cell is then programmed by disabling the inverter and selectively switching devices ON in the crossbar array. After configuration, the pass transistors act as pulldown resistors while the nanodevices programmed to be in the ON state serve as pullup resistors. In this way, wired-NOR gates may be formed within a CMOS cell. Note that the inverter provides signal restoration. Any arbitrary Boolean function (represented as a product-of-sums) may be implemented as a connection of two or more CMOS cells. Further, the idea is to have many nanodevices per CMOS cell. This allows gates with high fanin or high fanout to be formed, with extra devices available as “spares” for reconfiguring around faulty devices.

The testing of this FPGA-like CMOL architecture is tied to its reconfigurable programming. Before the circuit can be programmed onto the FPGA fabric, however, the authors implicitly assume that each nanodevice in a cell can be tested to determine whether it is faulty. The authors consider only stuck-open faults, which arise from the absence of nanodevices at certain nanowire crosspoints. CMOL FPGA configuration is then carried out in two stages. The first stage maps the desired circuit onto the FPGA cells, assuming a defect-free CMOL fabric. During the second stage, defective components are reconfigured around, as necessary, until a defect-free mapping is found. The algorithm sequentially attempts to move each gate from a cell with bad input or output connections to a new cell, while keeping the gates in its immediate fanin/fanout in fixed positions. The main goal is to reassign cells to gates such that the interconnect length is minimized. A more detailed description of the reconfiguration algorithm can be found in [Strukov 2005], where Monte Carlo simulations of a 32-bit Kogge-Stone adder demonstrated that this simple configuration procedure may achieve 99% circuit yield with as many as 22% defective nanodevices. Similarly, simulations of a 64-bit fully connected crossbar switch have shown a defect tolerance of about 25% [Strukov 2005].
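
The second configuration stage can be sketched as a gate-by-gate repair loop. This is a heavily simplified, hypothetical model of the idea (after [Strukov 2005]); it ignores interconnect length and the limited reach of real nanowires:

```python
def repair_placement(placement, netlist, spare_cells, link_ok):
    """Second-stage configuration sketch: move each gate whose input/output
    connections use defective nanodevices to a spare cell, keeping the gates
    in its immediate fanin/fanout in fixed positions."""
    for gate in list(placement):
        cell = placement[gate]
        neighbours = [placement[g] for g in netlist[gate]]
        if all(link_ok(cell, n) for n in neighbours):
            continue
        for spare in sorted(spare_cells):   # a real tool would prefer nearby cells
            if all(link_ok(spare, n) for n in neighbours):
                spare_cells.discard(spare)
                spare_cells.add(cell)       # the vacated cell becomes a spare
                placement[gate] = spare
                break
        else:
            return None   # no usable cell found for this gate
    return placement

# Toy fabric: cells are integers; the nanodevices linking cells 0 and 1 are
# defective, so gate g1 must be moved to a spare cell.
bad_links = {(0, 1)}
ok = lambda a, b: (min(a, b), max(a, b)) not in bad_links
placement = {"g1": 0, "g2": 1}
netlist = {"g1": ["g2"], "g2": ["g1"]}
print(repair_placement(placement, netlist, {2, 3}, ok))   # {'g1': 2, 'g2': 1}
```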

Built-In Self-Test

In the previous section, we reviewed two examples using hybrid nanoscale/CMOS structures. Although testing is required for these designs, the authors focused more on the architecture itself and its configuration. In this section, we turn more toward the testing aspects of these architectures, and in particular, focus on built-in self-test (BIST) approaches [Wang 2005] [Tehranipoor 2007].

In a BIST scheme, specific components of the architecture are configured to generate the test vectors and observe the outputs in order to test and diagnose the defective components. Here, we consider an island-style architecture for nanodevices containing nanoblocks and switchblocks, as shown in Figure 17.17. Each nanoblock (e.g., crossbar or PLA) is configured as either a test pattern generator (TPG) or output response analyzer (ORA). Because of the reconfigurability of nanodevices, no extra BIST hardware is required to be permanently fabricated on-chip. Moreover, dedicated on-chip nanoscale test hardware can be highly susceptible to defects itself.

Figure 17.17. (a) Island-style architecture and (b) cluster structure in the nanoarchitecture.

Test configuration generation is first performed externally and then delivered to the nanodevice during each test session. A test session refers to one particular test architecture with configured nanoblocks and switchblocks. Note that nanoblocks and switchblocks are constructed similarly to the crossbar architecture, but the former are used to implement logic functions and the latter for routing. The architecture used in the BIST procedure is called the test architecture (TA). Each TA includes test groups, and each test group contains one TPG, one ORA, and one switchblock associated with the TPG and ORA. Note that this BIST scheme is similar to that used for traditional FPGAs, discussed in Chapter 12. The TPG tests itself and sends a pattern through a switchblock to the ORA to test it; the response is then generated and read back by the programming device or tester.

TAs are generated based on the detection of faults in each nanoblock and switchblock. Several TAs are therefore generated for each test group, and several test configurations are generated to detect the faults under consideration. During BIST, all TAs are configured similarly (i.e., the same faults are targeted within the test groups in each test session).

The programming device can configure the nanoblocks and switchblocks as required by the test architectures and configurations. Test results should be read back from the ORA blocks using a tester (on or off chip). Interconnect resources provided in the nanodevice architecture should be used for transmitting these results to the tester. Because a fine-grained architecture with small clusters is considered, each cluster is composed of a small number of nanoblocks. Therefore, the number of test groups in a cluster will be small, and the interconnect resources of the nanodevice are assumed to be sufficient to implement the read-back mechanism. Hence, when a test session is done, the output of the ORAs is read for evaluation and analysis by the tester.

The BIST procedure can be performed using an on-chip tester (a microprocessor or dedicated BIST hardware) implemented in reliable CMOS-scale circuitry on the substrate of the nanodevice. This reduces test time, because external testers are generally slower than on-chip testers. The on-chip tester can execute the BIST procedure and collect the test results. It may also eliminate the need to store the defect map on-chip, because it can find the faulty blocks before configuration in each cluster. Figure 17.18 shows two different TAs for testing nanoblocks and switchblocks [Wang 2005]. Figure 17.19 shows a test configuration for detecting SA1 faults on all vertical and horizontal lines [Tehranipoor 2007]. By applying 0 and then 1 to the inputs of the crossbar, stuck-open (broken line) faults can be detected. After the locations of all faults have been identified, they are stored in a defect map that is used during configuration.

Figure 17.18. Two different test architectures (TAs): (a) nanoblock is used as a TPG in one TA and (b) ORA in another.

Figure 17.19. A test configuration for detecting SA1 faults on all vertical and horizontal lines.

Simultaneous Configuration and Test

In the examples described in Section 17.3.1, the defect tolerance of circuits built from these fabrics relies on a test procedure, completed as a preprocessing step, that identifies the exact locations of the defects. This is nontrivial, because locating all defects in a reconfigurable architecture with a high defect density is a challenging and time-consuming task. Implementing an on-chip BIST circuit or using an on-chip microprocessor will significantly speed up this process, but finding the locations of all defects in a chip with an extremely large number of blocks will remain time consuming. Furthermore, even if these techniques can be applied, storing this information in a defect map will require large (defect-tolerant) memories. Recall that the configuration process for the nanoPLA or CMOL architectures requires that this defect map be accessed repeatedly during reconfiguration so that defective cells can be avoided during mapping. On-chip nanoscale resources cannot be used as memory for the defect map because they are unreliable, whereas using on-chip CMOS-scale memory for the defect map will result in considerable area overhead. Therefore, it is impractical to store this large defect map on-chip. It is also impractical to ship the defect map of each chip along with the chip to the customer. Lastly, aging-related faults need to be considered as well (i.e., the defect map must be updated regularly). Another important issue for defect-map-based approaches is that different chips have different defect maps, which results in different performance across chips. This also requires per-chip placement and routing, which is prohibitively expensive [Rad 2006].

An alternative to having separate test and configuration stages is to combine them as part of the same step. This approach would also eliminate the need for storing a defect map. The combined test and configuration method that avoids the time-consuming process of locating all defects is called simultaneous configuration and test (SCT) [Rad 2006]. SCT assumes that the crossbar array architecture offers rich interconnect resources and is able to provide efficient access to its logic blocks through its input/output interfaces. The method is conceptually similar to those proposed for FPGAs, except that the TPG and ORA are components of the BIST circuit to provide test patterns and analyze the responses, respectively. A key conceptual difference of this method compared with other BIST approaches is that the goal of testing is not to confirm the correct functionality of a block under test (BUT) for every possible function. Instead, the goal is to make sure that each function (fi) of an application configured into a block is working correctly. Hence, the test patterns should be applied for testing that function only.

Instead of testing all resources of a reconfigurable architecture to locate all defects in the device, each block of the architecture is tested using the SCT method for a specific function (fi), after fi is configured into a block of the fabric. The applied test here just checks the correct functionality of the configured fi, rather than diagnosing all defects of the block. Hence, there might be defects in molecular switches or wires of the block, but as long as those defects do not cause any malfunction, function fi is identified as fault-free. In other words, creating function fi on a block bj requires just a subset of all wires and switches of that block. Hence, if the defective components of the block are not used for configuring fi into that block, then the function can operate without a fault. Therefore, the defects of the block are tolerated.

Using the SCT procedure, the application is divided into m-input functions; each function (fi) should be configured into a block of the fabric and the input and output lines from the BIST circuit to fi must also be configured. Finally, the same function fi should be configured into the look-up table (LUT) of the BIST circuit (see Figure 17.20). Next, the BIST circuit can simply apply an exhaustive set of 2m test patterns to the function and test its functionality. If the implemented function passes the test, then it can be reliably used in the circuit. The process of selecting a function, mapping it to a block of the fabric, creating connections between that function and the BIST circuit, and testing the function will be repeated for all functions of the application. If a function fails the test, then it must be mapped onto another block and the test process should be repeated.
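The per-function test loop at the heart of SCT can be sketched as follows. This is a minimal illustration with hypothetical names (`block_fn` stands for the function as configured into a nanofabric block, `reference_lut` for the same function configured into the BIST LUT), not the actual [Rad 2006] implementation:

```python
from itertools import product

def sct_test(block_fn, reference_lut, m):
    """Exhaustively test a configured m-input function against its
    reference LUT: apply all 2**m patterns and compare the block's
    response with the expected LUT output (the comparator's job)."""
    for bits in product((0, 1), repeat=m):
        if block_fn(bits) != reference_lut[bits]:
            return False   # mismatch: remap fi onto another block
    return True            # fi operates correctly on this block

# Example: a 2-input NAND configured into a (hypothetical) block.
m = 2
lut = {bits: int(not (bits[0] and bits[1]))
       for bits in product((0, 1), repeat=m)}
healthy_block = lambda bits: lut[bits]   # defect-free behavior
stuck_block = lambda bits: 1             # output stuck-at-1 defect
assert sct_test(healthy_block, lut, m) is True
assert sct_test(stuck_block, lut, m) is False
```

Note that only the function as configured is verified; defects in unused switches or wires of the block never cause a mismatch, which is exactly the defect-tolerance property described above.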

CMOS BIST circuit used to test nanodevices.

Figure 17.20. CMOS BIST circuit used to test nanodevices.

Note that methods and tools for configuring nanoscale reconfigurable architectures are similar to those used for FPGAs; however, some modifications may be required because of architectural differences. The BIST circuit shown in Figure 17.20 [Rad 2006] is composed of an m-bit counter, an m-input LUT, and a comparator, resulting in low BIST area overhead. The BIST circuit is assumed to be implemented in reliable CMOS-scale circuitry.

The low area overhead of BIST circuits provides an opportunity for parallel implementation of these circuits on-chip so at any time more than one function can be implemented and tested simultaneously, as shown in Figure 17.21. When multiple BIST circuits are implemented, the test time is significantly reduced. In this case, more than one function and more than one block of the device should be selected at any time. Appropriate methods based on meeting placement and routing constraints can be devised for such selections.
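As a sketch of how such parallel testing might be orchestrated (hypothetical names throughout; a real flow must also honor the placement and routing constraints, which are ignored here), the mapping loop could look like:

```python
def map_functions(functions, blocks, test, n_bist=4):
    """Greedy SCT-style mapping with up to n_bist BIST engines active
    per round (sketch).  `test(f, b)` returns True if function f,
    configured into block b, passes its exhaustive test."""
    placement, free = {}, list(blocks)
    pending = list(functions)
    while pending:
        # Select up to n_bist functions to configure and test this round.
        batch, pending = pending[:n_bist], pending[n_bist:]
        for f in batch:
            while free:
                b = free.pop(0)        # failed blocks are simply discarded
                if test(f, b):
                    placement[f] = b   # f passed on block b
                    break
            else:
                raise RuntimeError("out of defect-free blocks")
    return placement

# Example: block 0 is defective, so f1 is remapped to block 1.
place = map_functions(["f1", "f2"], [0, 1, 2, 3],
                      lambda f, b: b != 0, n_bist=2)
```

With `n_bist` engines the wall-clock test time drops roughly by that factor, at the cost of replicating the small CMOS BIST circuit.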

Parallel use of multiple BIST circuits for testing nanodevices.

Figure 17.21. Parallel use of multiple BIST circuits for testing nanodevices.

Carbon Nanotube (CNT) Field Effect Transistors

Carbon nanotubes (CNTs) have been the subject of much research in recent years because of their unique electrical properties. In particular, their fine pitch and ballistic transport conduction mechanism enables fabrication of carbon nanotube field effect transistors (CNFETs) with excellent CV/I device performance and high transition speeds [Wong 2002, 2003, 2006]. Consequently, CNFETs are regarded as a promising extension to CMOS to facilitate IC scaling beyond the limitations currently projected by the International Technology Roadmap for Semiconductors (ITRS) [SIA 2006].

For CNFETs to be used for mainstream VLSI circuits, the impact of imperfections must be understood and, ideally, controlled. Two major sources of imperfections dominate CNFET circuit design: (1) misaligned carbon nanotubes that can result in incorrect logic implementations, and (2) metallic carbon nanotubes that can result in incorrect logic implementations or variations [Patil 2007].

In this section, we address the topic of reliable and robust carbon nanotube circuits in the presence of these imperfections. First, we present a robust CNFET logic design technique in the presence of a large number of misaligned CNTs. Next, we discuss modeling and analysis of CNFET circuits in the presence of metallic nanotubes.

Imperfection-Immune Circuits for Misaligned CNTs

Many imperfections inhibit the proper functionality of CNFET gates. For example, a carbon nanotube may terminate before reaching the contacts, or the nanotube itself may have a point break, electrically severing the tube into two. Both of these defects only vary the effective number of semiconducting CNTs; thus, by simply synthesizing an appropriately higher density of CNTs, the yield loss caused by these defects can be reduced.
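This density argument can be made concrete with a simple binomial model (our own illustration, not from the cited work): if each grown CNT independently survives with probability 1 − p_defect, the sketch below computes the probability that enough tubes survive, and the minimum grown density needed to reach a target yield.

```python
from math import comb

def p_at_least(n_needed, n_grown, p_defect):
    """P(at least n_needed of n_grown CNTs are functional), assuming
    each tube independently survives with probability 1 - p_defect."""
    p_ok = 1 - p_defect
    return sum(comb(n_grown, k) * p_ok**k * p_defect**(n_grown - k)
               for k in range(n_needed, n_grown + 1))

def min_density(n_needed, p_defect, target=0.999):
    """Smallest number of grown CNTs so that the chance of having
    n_needed functional tubes meets the target yield."""
    n = n_needed
    while p_at_least(n_needed, n, p_defect) < target:
        n += 1
    return n
```

For example, with a 10% per-tube defect rate, growing 10 tubes gives roughly a 93% chance of ending up with at least 8 functional ones; the independence assumption is, of course, an idealization.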

On the other hand, one of the most problematic imperfections commonly found in CNFET gates is misaligned CNTs. In this case, CNTs may not run straight from transistor source to drain under the gate. The transistor gate is used as the mask for source/drain doping, so any CNT segment not under a gate is heavily doped and thus highly conductive (see Figure 17.22). Consequently, a misaligned CNT may bend so that it establishes a path that passes under the wrong gate (resulting in possibly incorrect logic) or under no gate at all (causing a short between the nodes, because the fully doped path always conducts). As an example, Figure 17.22 illustrates a NAND gate with ideal (aligned) CNTs, whereas Figure 17.23a shows the same NAND layout with a critically misaligned CNT. This misaligned CNT shorts the output to power and causes the NAND gate to implement faulty logic: the output is always 1. Thus, misalignment can significantly increase the defect rate of CNFETs and exponentially decrease the yield of a VLSI chip.

CNFET NAND cell example showing lithographically defined features as well as sublithographic CNTs with doped and intrinsic segments.

Figure 17.22. CNFET NAND cell example showing lithographically defined features as well as sublithographic CNTs with doped and intrinsic segments.

(a) A misaligned-vulnerable NAND cell with a critically misaligned CNT. (b) The overlay grid applied to the layout and (c) the conduction condition table for each node are used to create (d) the full path graph, and (e) its equivalent, reduced graph for the pullup network (pulldown network graph omitted).

Figure 17.23. (a) A misaligned-vulnerable NAND cell with a critically misaligned CNT. (b) The overlay grid applied to the layout and (c) the conduction condition table for each node are used to create (d) the full path graph, and (e) its equivalent, reduced graph for the pullup network (pulldown network graph omitted).

A layout design can be assessed as misaligned-CNT immune (MCI) or misaligned-CNT vulnerable (MCV). An MCI design is guaranteed to implement the correct logic function regardless of misaligned CNTs, whereas an MCV design may implement incorrect logic in the presence of misaligned CNTs. In [Patil 2007], a method for verifying an MCI design is demonstrated using concepts from graph theory. Fundamentally, the problem is to examine all possible paths (as if there were misaligned CNTs) from power or ground to the output. Each path has a corresponding logical (Boolean) expression consisting of the inputs or their complements, which represents the conduction condition for the path. For an MCI design, the Boolean OR of all path expressions must be equivalent to the intended conduction expression of the network; otherwise, the design is MCV. A sufficient condition to prove that a design is MCV is to find a path whose conduction expression is true under some input assignment for which the intended network conduction expression is false. The design is proven MCV as soon as the first such path is found, because the Boolean OR of all path expressions then cannot be equivalent to the intended conduction expression.

Consider the layout in Figure 17.22. The pullup network has CNTs, which pass under gate A only and under gate B only. This corresponds to the logical terms A and B, respectively. The desired logic function of the NAND pullup network is A OR B, and this particular cell instance implements the logic correctly. However, this layout is MCV, as shown in Figure 17.23a. There is a path that does not pass under any gate because of a misaligned CNT. This causes a short from Vdd to Out, which represents a logical 1 because the path is always conducting. Because 1 is always true even when the intended pullup network expression is false, this design is MCV and this path is said to induce the vulnerability. A similar analysis can be done with the pulldown network to ensure correct functionality.

Using graph theory, all possible paths can be formalized and algorithmically tested. Figure 17.23d shows the path graph, representing all possible paths from Vdd to Out, of the NAND pullup network. The graph is generated by dividing the layout into a fine grid and determining the Boolean function representing the conduction condition at each square (see Figures 17.23b and 17.23c). The graph edges connect to all the neighbors in eight directions, as a CNT can pass through a square into any of the eight adjacent squares. Also shown is the reduced, equivalent graph in Figure 17.23e. This graph can then be used to derive all possible paths between the supply node (Vdd) and the output (Out). The Boolean expression corresponding to each path is then the Boolean AND of each of the Boolean functions for the path nodes. For example, from the reduced graph, there is a potential path from node Vdd to node Out, which only passes through a 1 node (heavily doped source/drain region that always conducts). This corresponds to the same misalignment path shown in Figure 17.23a and indicates that the layout design is MCV. This formal derivation of paths and logical expressions using graph theory allows for an automated MCI design checker for future VLSI.
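The path-enumeration check can be sketched algorithmically. The following is an illustrative encoding (the node names and the reduced NAND pullup graph are our own simplification of Figure 17.23e, not code from [Patil 2007]): each node carries a conduction condition, and the layout is flagged MCV as soon as some path conducts under an input assignment for which the intended expression is false.

```python
from itertools import product

def is_mcv(graph, cond, src, dst, intended, inputs):
    """Enumerate all simple src->dst paths by DFS; a path conducts when
    the AND of its node conditions holds.  The layout is MCV if some
    path conducts while the intended conduction expression is false."""
    def dfs(node, path):
        if node == dst:
            yield path
            return
        for nxt in graph.get(node, ()):
            if nxt not in path:
                yield from dfs(nxt, path + [nxt])
    for path in dfs(src, [src]):
        for assign in product((0, 1), repeat=len(inputs)):
            env = dict(zip(inputs, assign))
            if all(cond[n](env) for n in path) and not intended(env):
                return True   # extra conducting path induces vulnerability
    return False

# Reduced pullup graph for the NAND of Figure 17.23: a misaligned CNT
# can reach Out through an always-conducting doped region ("ONE").
graph = {"Vdd": ["A", "B", "ONE"], "A": ["Out"], "B": ["Out"], "ONE": ["Out"]}
cond = {"Vdd": lambda e: True, "Out": lambda e: True,
        "A": lambda e: not e["A"],   # pullup pFET gated by input A
        "B": lambda e: not e["B"],
        "ONE": lambda e: True}       # heavily doped region, always conducts
intended = lambda e: (not e["A"]) or (not e["B"])   # NAND pullup: A' + B'
print(is_mcv(graph, cond, "Vdd", "Out", intended, ["A", "B"]))  # True: MCV
```

Removing the `"ONE"` node from the graph (the effect of the etched-region technique described below in this section's source) makes the same check return `False`, i.e., the design becomes MCI with respect to this network.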

Not all misaligned CNTs lead simply to shorts from power to output. Consider the layout of a complex logic cell in Figure 17.24. The intended conduction expression of the network is (A AND B) OR (C AND D). However, a misaligned CNT has created a new path gated by gates A and D, so the logic function actually implemented is (A AND B) OR (C AND D) OR (A AND D). This layout thus contains an extra (potential) conduction path and hence is MCV. Other forms of incorrect logic implementation caused by misaligned CNTs are also possible in various other logic circuit layouts.

An example of an MCV pullup network layout design resulting in incorrect logic implementation.

Figure 17.24. An example of an MCV pullup network layout design resulting in incorrect logic implementation.

A robust design method for eliminating gate defects caused by misaligned CNTs has been developed by [Patil 2007]. By following its design rules, circuit layouts are guaranteed to be MCI. The insight gained from the graph theory analysis presented earlier is that an MCI design must yield a graph that has no additional paths contributing extra minterms (extra 1's that are not in the desired truth table) for the network. Thus, by designing for a graph that inherently removes such extra paths (and only such extra paths), an MCI layout design can be achieved.

The design method described in [Patil 2007] uses lithographically etched regions to achieve a robust MCI design. Figure 17.25a shows an MCI NAND cell. A region in the pullup network is etched by lithographically patterning an etch mask. This causes any and all misaligned nanotubes in this region to be etched and thus nonconducting through the associated node (see Figure 17.23c). Figure 17.25a also shows the corresponding reduced path graph for the pullup network, which no longer has the logical 1 path from Vdd to Out seen in Figures 17.23a and 17.23e. Figure 17.25b generalizes the MCI design methodology for any logic network.

(a) A misaligned-tolerant (MCI) NAND cell and the corresponding reduced path graph for its pullup network (pulldown network graph omitted). (b) An MCI layout design illustrating how to use “etched regions” for a general logic network in Sums of Products (SoP) or Product of Sums (PoS) form.

Figure 17.25. (a) A misaligned-tolerant (MCI) NAND cell and the corresponding reduced path graph for its pullup network (pulldown network graph omitted). (b) An MCI layout design illustrating how to use “etched regions” for a general logic network in Sums of Products (SoP) or Product of Sums (PoS) form.

By developing a robust MCI design and verification methodology, logical defects that result from misaligned CNTs can be completely eliminated in layout design and rigorously checked. Although an MCV design may fail in the presence of misaligned nanotubes, an MCI design is guaranteed to implement the correct logic functionality. Such an MCI design significantly increases the yield and reliability of CNFET VLSI circuits.

Robust Circuits for Metallic CNTs

Variations in CNFETs arise mainly from variations in the physical properties of the carbon nanotubes, which constitute the semiconducting transistor channel. For example, the CNTs may have varying source/drain doping levels or diameters, leading to varying drive current. However, the largest source of variation comes from the varying numbers of semiconducting CNTs in the device caused by the random fraction of undesired metallic nanotubes from growth. In theory, roughly a third of the CNTs will be metallic if the growth process does not have preferential selectivity [Saito 1998]; however, [Li 2004] has reported a preferential growth technique that yields as low as 10% metallic nanotubes.

Metallic nanotubes, unlike semiconducting nanotubes, are highly conductive, regardless of the applied gate voltage. Consequently, they behave like resistors in parallel with the transistors, rather than semiconducting channels controlled by the gate. Metallic nanotubes will short the source and drain of the transistor, so they must be removed for proper gate functionality; for example, via plasma etching and electrical breakdown techniques presented in [Collins 2001] and [Zhang 2006]. Assuming metallic nanotubes can be completely removed via a perfected removal process, then for a given number of starting nanotubes the resulting number of semiconducting nanotubes forming the channel will vary depending on the number of metallic nanotubes.

In a study by [Deng 2007], this metallic nanotube-related variation was found to be the dominant form of performance variation compared to CNT doping or diameter variation. The study used the Stanford University CNFET model [Stanford 2007] presented in [Deng 2006], which includes practical nonidealities such as parasitics and screening effects. Using this model allowed them to compare the performance of a CMOS inverter versus a CNFET inverter in the presence of CNFET variation. An example CNFET inverter layout and the simulation results of the study are shown in Figures 17.26a and 17.26b, respectively. In the ideal case without variations, the CNFET inverter (eight nanotubes) exhibits 2.6X energy per cycle advantage and 5.1X delay advantage over 32 nm Si CMOS. However, with metallic tube variation, the CNFET inverter advantages reduce to 2.3X and 3.7X, respectively, compared to 2.5X and 4.6X from doping and diameter variations only [Deng 2007].

Figure 17.26b illustrates an important conclusion. Although CNT doping and diameter variations can result in performance degradation, the variation resulting from 32% metallic nanotubes causes far more degradation in both delay and energy. In addition, if preferential semiconducting CNT growth techniques can be improved to yield only 8% metallic nanotubes, then the performance degradation from metallic nanotube variation becomes comparable to other sources of variation, such as CNT diameter variation. Consequently, this study hints that CNT growth and the metallic nanotube removal processes deserve great attention for future CNFET VLSI. In addition, if it is assumed that one third of the nanotubes are metallic, the authors in [Deng 2007] noted that devices cannot be reliably designed if too few CNTs are used per device. The study recommends designing for 8 or more semiconducting CNTs (i.e., 12 or more total CNTs) per transistor as a design guideline to reduce the probability of no semiconducting nanotubes in a transistor to about one in a million. Following these design guidelines, a 2.4X energy per cycle advantage and 4.5X delay advantage can be achieved from CNFET inverters, which is considerably closer to the ideal case.
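These guidelines follow directly from a binomial model of tube growth. As an illustrative sketch (our own arithmetic, not code from [Deng 2007]): with a metallic probability of 1/3 per tube, the chance that a 12-tube transistor ends up with no semiconducting tube at all is (1/3)^12 ≈ 1.9 × 10⁻⁶, matching the "about one in a million" guideline.

```python
from math import comb

def p_semiconducting_count(k, n_total, p_metallic=1/3):
    """Binomial probability that exactly k of n_total grown CNTs are
    semiconducting, when each tube is metallic with prob. p_metallic."""
    p_s = 1 - p_metallic
    return comb(n_total, k) * p_s**k * p_metallic**(n_total - k)

def p_no_semiconducting(n_total, p_metallic=1/3):
    """Probability that ALL n_total tubes are metallic, i.e., the
    transistor is left with no semiconducting channel at all."""
    return p_metallic ** n_total

print(p_no_semiconducting(12))   # (1/3)**12 = 1/531441, about 1.9e-6
```

The same model gives the spread in the number of surviving semiconducting tubes per device, which is the source of the drive-current variation discussed above.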

Simulating CNFET variation: (a) the layout of a CNFET inverter with multiple CNTs and (b) energy per cycle and FO4 delay improvement of CNFET inverter at 3σ points (bars indicate 6σ variation) compared to the 32-nm CMOS FO4 inverter.

Figure 17.26. Simulating CNFET variation: (a) the layout of a CNFET inverter with multiple CNTs and (b) energy per cycle and FO4 delay improvement of CNFET inverter at 3σ points (bars indicate 6σ variation) compared to the 32-nm CMOS FO4 inverter.

Lastly, practical metallic CNT removal techniques bring about an additional issue for CNFET circuit design. An ideal removal technique etches out all metallic CNTs while leaving all semiconducting CNTs electrically intact to provide drive current. In practice, however, the selectivity of the removal process is not perfect: even a good technique removes most, but not all, metallic CNTs and leaves most, but not all, semiconducting CNTs. The few remaining metallic CNTs will likely contribute significant leakage current, as they are highly conductive, making leakage power a dominant issue in CNFET circuits; this leakage is governed by the metallic CNT removal rate (the percentage of metallic CNTs successfully removed). Imperfect removal processes also affect other circuit performance metrics, such as the on-off current ratio. The exact requirements on selectivity and other removal process metrics for reliable CNFET circuit design are currently an area of ongoing research. A complete understanding of the imperfections and variations in CNFETs will aid in creating design guidelines. Ideally, these guidelines will limit the detrimental effects of imperfections and variations on performance to an acceptable level and thus help propel CNFETs forward as a promising extension to CMOS scaling into the nanometer era.

Concluding Remarks

Emerging nanoscale devices will enable an extremely high density of components to be integrated onto a single substrate. In this chapter, we reviewed some of the most promising devices, namely resonant tunneling diodes (RTDs), quantum-dot cellular automata (QCA), carbon nanotubes/silicon nanowires, and carbon nanotube field effect transistors (CNFETs). We discussed some test challenges and presented test generation techniques for these devices. In particular, we presented defect characterization, fault modeling, and test generation of circuits based on RTDs and QCA. We discussed built-in self-test (BIST) of carbon-nanotubes-based crossbar array architectures. We also presented imperfections and variations tolerance in logic circuits implemented by CNFETs.

Regardless of which devices make their way into the mainstream of nanoscale computing, there is a general consensus that testing will be a key issue, as these devices are expected to have high defect rates. Consequently, some sort of defect and fault tolerance schemes will have to be built into nanoscale circuits, systems, and architectures. Ultimately, the testability and reliability of these devices will need to become an additional constraint during system design.

Acknowledgments

The authors wish to thank Professor Pallav Gupta of Villanova University for contributing a portion of the Resonant Tunneling Diodes and Quantum-Dot Cellular Automata section, Professor Mohammad H. Tehranipoor of University of Connecticut for contributing a portion of the Crossbar Array Architectures section, Albert Lin and Nishant Patil of Stanford University for contributing the Carbon Nanotube Field Effect Transistors section, Professor Krishnendu Chakrabarty of Duke University for contributing to the overall structure of the chapter, and Professor Subhasish Mitra and Professor H.-S. Philip Wong of Stanford University for reviewing the Carbon Nanotube Field Effect Transistors section.

References

Books

Introduction

Resonant Tunneling Diodes and Quantum-Dot Cellular Automata

Crossbar Array Architectures

Carbon Nanotube (CNT) Field Effect Transistors
