List of Figures

Figure 1.1	History of semiconductor transistors and logic styles	2
Figure 1.2	Evolutionary paths of the microelectronics industry	3
Figure 1.3	Interconnect system composed of groups of local, semi-global, and global layers. The metal layers in each group are typically of different thickness	4
Figure 1.4	Repeaters are inserted at specific distances to improve the interconnect delay	5
Figure 1.5	Interconnect shielding to improve signal integrity, (A) single-sided shielding, and (B) double-sided shielding. The shield and signal lines are, respectively, illustrated by the gray and white color	5
Figure 1.6	Cross-section of a joint MOS (JMOS) inverter	7
Figure 1.7	Reduction in wirelength where the original 2-D circuit is composed of two and four tiers	8
Figure 1.8	Heterogeneous 3-D SoC comprising sensors and processing tiers	9
Figure 2.1	Three-dimensional stacked inverter	16
Figure 2.2	Examples of SiP technologies. (A) Wire-bonded SiP, (B) solder balls at the perimeter of the tiers, (C) area array vertical interconnects, and (D) interconnects on the faces of the SiP	17
Figure 2.3	Different communication schemes for 3-D ICs. (A) Short TSVs, (B) inductive coupling, and (C) capacitive coupling	18
Figure 2.4	Typical SoP, which can include both SiP and SoCs	19
Figure 2.5	Manufacturing and design challenges for 3-D integration	20
Figure 2.6	System miniaturization through the integration of sophisticated 3-D ICs	21
Figure 2.7	Wire-bonded SiP. (A) Dissimilar dies with multiple row bonding, (B) wire-bonded stack delimited by spacer, (C) SiP with die-to-die and die-to-package wire bonding, and (D) top view of wire-bonded SiP	22
Figure 2.8	SiP with peripheral connections. (A) Solder balls, (B) through hole via and spacers, and (C) through hole via in a PCB frame structure	24
Figure 2.9	Basic manufacturing phases of an SiP. (A) Interposer bumping and solder ball deposition, (B) die attachment, (C) tier stacking, and (D) epoxy underfill for enhanced reliability	25
Figure 2.10	Cross-section of the SiP after removing the mold. (A) The SiP encapsulated in epoxy resin, (B) sawing to expose the metal traces, and (C) sawing to expose the bonding wires	28
Figure 2.11	Interposer-based 2.5-D systems where (A) ICs are mounted on only one face of the interposer (single side), and (B) ICs are attached to both sides of the interposer (double side)	29
Figure 2.12	Embedded interposer within a package substrate enabling multi-IC integration	31
Figure 2.13	An SiP system comprising (A) a top IC with Cu pillars and a bottom IC with solder bumps on a TSH interposer, and (B) the dimensions of several components (not shown to scale)	33
Figure 3.1	Typical interconnect paths for (A) wire-bonded SiP, (B) SiP with solder balls, and (C) 3-D IC with TSVs	38
Figure 3.2	Cross-section of a stacked 3-D IC with a planarized heat shield to avoid degradation of the transistor characteristics on the first layer due to the temperature of the fabrication processes	40
Figure 3.3	Cross-section of a device level stacked 3-D IC with a PMOS device on the bottom layer and an NMOS device in recrystallized silicon on the second layer	40
Figure 3.4	Processing steps for laterally crystallized TFT based on Ge-seeding. (A) Deposition of amorphous silicon, (B) creating seeding windows, (C) deposition of seeding materials, (D) producing silicon islands, and (E) processing of TFTs	41
Figure 3.5	Processing steps for vertical and lateral growth of 3-D SOI devices. (A) Definition of SOI islands, (B) silicon dioxide deposition, (C) formation of SEG window, (D) silicon growth within the SEG window, (E) etching of redundant silicon with CMP, (F) definition of upper device layer, (G) deposition of upper layer, (H) formation of SOI islands on the upper tier	43
Figure 3.6	Basic processing steps for a 3-D inverter utilizing the local clustering approach. (A) Oxide deposition, (B) wafer patterning and active area definition, (C) low temperature oxide deposition, (D) deposition of nitride film, (E) via formation at the drain side, (F) etching of the nitride film, (G) boron doping, (H) active area definition, (I) gate oxide growth by thermal oxidation, (J) deposition of doped polysilicon	44
Figure 3.7	Sequential process for fabricating monolithic 3-D circuits, where (A) the SOI devices in the first layer are manufactured with standard SOI processes, (B) molecular bonding allows the transfer of a high quality substrate, (C) the devices in the upper layers are formed, and (D) metal contacts connect the device layers	45
Figure 3.8	Monolithically stacked devices where the interlayer contacts (3-D contact) and the standard metal contacts (tungsten plug) connecting the devices are illustrated. The 3-D contact has similar traits to a standard contact connecting two metal layers	46
Figure 3.9	Typical fabrication steps for a 3-D IC process. (A) Wafer preparation, (B) TSV etching, (C) wafer thinning, bumping, and handle wafer attachment, (D) wafer bonding, and (E) handle wafer removal	49
Figure 3.10	The cavity alignment method, (A) the cavity template is aligned and bonded to the substrate, (B) the individual tiers are placed in the cavity through compression, (C) the 3-D stack is assembled through thermal compression, and (D) the cavity template is removed	52
Figure 3.11	Process for face-to-face bonding and substrate assembly, removing the need for TSVs	53
Figure 3.12	Metal-to-metal bonding; (A) square bumps, and (B) conic bumps for improved bonding quality	54
Figure 3.13	Capacitively coupled 3-D IC. The large plate capacitors are utilized for power transfer, while the small plate capacitors provide signal propagation	56
Figure 3.14	Inductively coupled 3-D ICs. Galvanic connections may be used for power delivery	57
Figure 3.15	Basic steps of a via-last manufacturing process (not to scale)	59
Figure 3.16	Basic steps of a via-first manufacturing process (not to scale)	59
Figure 3.17	Basic steps of a via-middle manufacturing process (not to scale)	60
Figure 3.18	TSV formation and filling after FEOL (wafer thinning) and BEOL (via-last approach)	61
Figure 3.19	TSV shapes. (A) Straight and (B) tapered	62
Figure 3.20	The scallops formed due to the time multiplexed nature of the BOSCH process	62
Figure 3.21	Poor TSV filling resulting in void formation, (A) large void at the bottom, and (B) seam void	63
Figure 3.22	Structure of partial TSV and related materials	64
Figure 4.1	Early TSV from patents filed by (A) William Shockley, and (B) Merlin Smith and Emanuel Stern of IBM	68
Figure 4.2	Equivalent π-model of a TSV	69
Figure 4.3	Top view of a TSV in silicon depicting the oxide layer, TiCu seed layer, and copper TSV	71
Figure 4.4	3-D via structure. (A) 3-D via with top and bottom copper landings, and (B) equivalent structure without metal landings	72
Figure 4.5	Current profile due to the proximity effect for (A) currents propagating in opposite directions, and (B) currents flowing in the same direction	76
Figure 4.6	Cross-sectional view of different CMOS technologies with TSVs depicting the formation of a depletion region around the TSV in (A) bulk CMOS, and (B) bulk CMOS with a p+ buried layer. The TSVs in either PD-SOI (shown in (C) top) or FD-SOI (shown in (C) bottom) reveal minimal formation of a depletion region	80
Figure 4.7	Ratio of the total TSV capacitance to the oxide capacitance as a function of applied voltage V_g	81
Figure 4.8	Physical parameters and materials used in the compact models of a TSV for a single device layer, as listed in Table 4.6. (A) Side view, and (B) top view	83
Figure 4.9	Resistance of a cylindrical 3-D via at DC, 1 GHz, and 2 GHz	86
Figure 4.10	Per cent error as a function of frequency for the resistance of a 3-D via (a.r.=aspect ratio)	87
Figure 4.11	Self-inductance L₁₁ of a cylindrical 3-D via	93
Figure 4.12	Mutual inductance L₂₁ of a cylindrical 3-D via with a 20 µm diameter	94
Figure 4.13	Mutual inductance L₂₁ between two 3-D vias with different lengths (D=10 µm, and $3 ℒ_{g} = ℒ_{y}$)	95
Figure 4.14	Capacitance of a cylindrical 3-D via over a ground plane	101
Figure 4.15	Coupling capacitance between two 3-D vias over a ground plane (D=20 µm)	103
Figure 4.16	Frequency range applicable to Q3D models and closed-form inductance expressions	106
Figure 4.17	Critical dimensions of a 3-D via over a ground plane for the MITLL 3-D process	108
Figure 4.18	Circuit model for RLC extraction of (A) two 3-D vias, and (B) two 3-D vias with a shield via between two signal vias	111
Figure 4.19	Effect of a return path on the loop inductance. (A) Return path placed on 3-D via 2, and (B) return path placed on 3-D via 3	114
Figure 5.1	Heterogeneous 3-D integrated circuit	120
Figure 5.2	Model of noise coupling from TSV to a victim device through a silicon substrate. (A) General model, and (B) reduced model	121
Figure 5.3	Noise coupling from a TSV to a victim device. (A) Short-circuit Ge substrate model, and (B) open circuit GaAs substrate model	123
Figure 5.4	Isolation efficiency of a noise coupled system for different substrate materials. (A) Silicon, (B) germanium, and (C) gallium arsenide	125
Figure 5.5	Equivalent small-signal model of a noise coupled system	126
Figure 5.6	Resistance and inductance versus line width of the ground network. The ground network is composed of copper-based interconnects	128
Figure 5.7	Isolation efficiency of a noise coupled system as a function of the line width of the ground network for different substrate materials. (A) Silicon, (B) germanium, and (C) gallium arsenide	129
Figure 5.8	Distance from aggressor module “A” on tier m to victim module “V” on tier n	130
Figure 5.9	Effect of distance between an aggressor and victim on the isolation efficiency for a Ge substrate. The resonant frequency is observed at the peak isolation efficiency due to the increasing reactance of the ground network	130
Figure 5.10	Keep out region around an aggressor TSV. The victim modules (Victim) should be placed outside this region	131
Figure 5.11	Isolation efficiency versus frequency and radius of keep out region for different substrate materials. (A) Si, (B) Ge, and (C) GaAs	132
Figure 5.12	Keep out region around aggressor TSV for $N_{\max} = -40\,\mathrm{dB}$. The victim modules should be placed on the isolation efficiency surface below the base surface	133
Figure 5.13	Comparison between SPICE model and extracted transfer function for different substrate materials. (A) Si, (B) Ge, and (C) GaAs	134
Figure 6.1	An inductive link between the transmitter and receiver circuits including the coupled on-chip inductors	139
Figure 6.2	Equivalent circuit of an inductive link including the parasitic resistance and capacitance of the on-chip inductors	139
Figure 6.3	Model for pulse modulation. (A) Current of the transmitter modeled as a Gaussian pulse and (B) voltage induced on the receiver	140
Figure 6.4	Coupling efficiency for distance X and decreasing outer diameter d_out	141
Figure 6.5	Square spiral on-chip inductor with n=7 turns, illustrating the geometric parameters	142
Figure 6.6	Flow diagram for the design of the coils in an inductive link under power, performance, and area constraints	144
Figure 6.7	Transceiver circuit of a synchronous inductive link	146
Figure 6.8	Transceiver circuit of an asynchronous inductive link	147
Figure 6.9	Block diagram of an inductive coupling scheme with burst transmission	148
Figure 6.10	Efficiency of TSV and inductive interfaces with higher multiplexing density	151
Figure 6.11	Top view of a structure comprising an inductive link and the return path through a power delivery network placed in different locations	154
Figure 6.12	Noise induced by an inductive pair for varying δ_c	154
Figure 6.13	Array of inductive links and P/G loops connected to C4 supply pads. The power and ground lines are depicted, respectively, by solid and dashed lines	156
Figure 6.14	Parasitic noise induced on a power wire depending upon the distance of the wire from the inductor	157
Figure 6.15	Power delivery network topologies. (A) Interdigitated P/G–P/G topology, (B) paired type-I P/G–P/G topology, and (C) paired type-II P/P–G/G topology	157
Figure 6.16	Wireless power transmission for standard inductive coupling	160
Figure 7.1	An example of the method used to determine the distribution of the interconnect length. Group N_A includes one gate, group N_B includes the gates located at a distance smaller than l (encircled by the dashed curve), and N_C is the group of gates at distance l from group N_A (encircled by the solid curve). In this example, l=4 (the distance is measured in gate pitches)	164
Figure 7.2	An example of the method used to determine the interconnect length distribution in 3-D circuits. (A) Partial Manhattan hemisphere, and (B) cross-section of the partial Manhattan hemisphere along e-e′. The gates in N_B and N_C are shown, respectively, with light and dark gray tones	167
Figure 7.3	Example of starting and nonstarting gates. Gates P and Q can be starting gates while S is a nonstarting gate	167
Figure 7.4	Possible vertical interconnections for two cells with each cell containing n gates	169
Figure 7.5	Interconnect length distribution for a 2-D and 3-D IC	170
Figure 7.6	Variation of gate pitch, total interconnect length, and interconnect power consumption with the number of tiers	171
Figure 8.1	Cross-sectional view. (A) TSV middle, and (B) TSV last	176
Figure 8.2	Processing flow for TSV middle and TSV last considered in terms of cost and complexity	177
Figure 8.3	Comparison of TSV lithography cost for different TSV process flows and geometries. The difference in cost between TSV middle and TSV last is due to different process equipment	178
Figure 8.4	Comparison of processing cost of the TSV etching step for different TSV geometries. The processing cost is normalized to the cost of etching a 5×50 TSV middle structure	179
Figure 8.5	Comparison of processing cost to deposit the TSV oxide liner for different TSV geometries. The processing cost is normalized to the cost of the liner deposition for a 5×50 TSV middle structure. In the case of TSV middle, the oxide liner at the field of the wafer is removed by CMP. For TSV last, no CMP polishing of the liner is necessary	180
Figure 8.6	Comparison of in-via oxide liner etch processing cost for different TSV last geometries. The TSVs with a smaller diameter require longer liner etch processing time. The processing cost is normalized to the process cost of a 5×50 TSV last flow	181
Figure 8.7	Cost comparison of barrier seed process for TSV middle geometries. Non-PVD deposition approaches can be applied to TSV sizes of 3×50 and 2×40. The processing cost is normalized to the process cost of a 5×50 TSV middle flow	182
Figure 8.8	Cost comparison of barrier seed process for TSV last geometries. PVD processing is applied for all TSV last sizes. The processing cost is normalized to the process cost of 5×50 TSV middle flow	182
Figure 8.9	Cost comparison of TSV Cu plating process for both TSV middle and TSV last flows for different TSV geometries. The cost comparison considers processing and material costs and is normalized to the process cost of the 5×50 TSV middle (POR) flow	183
Figure 8.10	Cost of Cu CMP for different Cu overburden thicknesses. The fine Cu polish step is the dominant cost component for Cu thicknesses up to 2,000 nm	183
Figure 8.11	CMP benchmark of deposition materials used in TSV processing. For each material, a thickness of 100 nm is considered to estimate polishing time and slurry consumption	184
Figure 8.12	Cost benchmark of backside processing steps for TSV middle and TSV last flows	186
Figure 8.13	Benchmark of overall processing costs for different TSV geometries for both TSV middle and TSV last flows	186
Figure 8.14	Cost benchmark of the 5×50 and 10×100 TSV flows. For the 5×50 TSV middle, polishing the oxide liner increases the cost of the CMP step. For the 5×50 TSV last process, the liner deposition, liner etch steps, and backside CMP are the primary cost differentiators. For the 10×100 TSV middle process, polishing the oxide liner increases the overall processing cost by up to 9% as compared to 10×100 TSV last flow	187
Figure 8.15	Cost benchmark of TSV middle processing flows for different TSV geometries. Note that the TSV size and pitch are scaled while maintaining the TSV processing cost	188
Figure 8.16	System integrated on top of an interposer substrate. The TSVs connect the interposer to the package substrate	189
Figure 8.17	Comparison of processing costs per wafer for different features of an interposer substrate. All of the processing costs are normalized to the wafer cost of processing a 10 µm × 100 µm TSV middle flow	190
Figure 8.18	Different interposer configurations: (A) single metal layer over a power metal plane, (B) two thick metal layers, and (C) MIM capacitor between power and ground metal planes with two thick metal layers	192
Figure 8.19	Comparison of wafer-level processing cost for different interposer structures. The cost of each component is normalized to the cost of the 10 µm × 100 µm TSV	192
Figure 8.20	3-D stacking approaches: vertical stack of three active dice, (A) D2W or W2W stacking, and (B) 2.5-D interposer-based stacking	193
Figure 8.21	3-D integration technologies. (A) Three die stack. The stacking interface between the dice is microbumps. TSVs are fabricated on die 1 and die 2 to enable vertical signal propagation. (B) Three active dice on an interposer substrate. The active dice are stacked using microbumps. The TSVs are fabricated within the interposer die to provide access to the backside. The interposer is connected to the package substrate (not shown) by Cu pillars	194
Figure 8.22	Comparison of processing cost per wafer to enable 3-D stacking. The features are processed either on the active dice and/or on the interposer substrate. For the die pick and place step, processing of 541 die/wafer is assumed	195
Figure 8.23	Cost comparison of different stacking approaches and components. The cost of the compound yield losses is illustrated for each stacking approach. An area of 10 mm×10 mm is considered for the active dice	196
Figure 8.24	Process cost and total cost of a 3-D system per die area as a function of active die size. Three different 3-D integration approaches are considered: D2W, W2W, and 2.5-D interposer	197
Figure 8.25	Effect of interposer processing yield and test fault coverage on the cost of an interposer-based 2.5-D system	198
Figure 8.26	Cost of 2.5-D interposer-based system in terms of the size of the stacked active die, interposer die processing yield (Y_INT), and fault coverage (FC) of interposer prestack testing. (A) Y_INT=99%, FC=100%, (B) Y_INT=99%, FC=50%, (C) Y_INT=99%, FC=0%, (D) Y_INT=90%, FC=100%, (E) Y_INT=90%, FC=50%, (F) Y_INT=90%, FC=0%, (G) Y_INT=80%, FC=100%, (H) Y_INT=80%, FC=50%, and (I) Y_INT=80%, FC=0%	199
Figure 9.1	Example of positive (shown with solid lines) and negative (shown with dashed lines) step lines for block b	205
Figure 9.2	Example of SP representation, where (A) is a group of blocks comprising a floorplan, (B) positive step lines for these blocks, and (C) negative step lines for the blocks	206
Figure 9.3	Example of a net bounding box connecting pins from blocks a and c. The HPWL metric is the half length of the perimeter of the net bounding box. The solid line shows a possible net route to connect pins of blocks a and c marked by the solid squares	207
Figure 9.4	Example of computing slack where (A) the blocks are floorplanned in left-to-right and top-to-bottom manner, and (B) the blocks are floorplanned in right-to-left and bottom-to-top mode	208
Figure 9.5	Upper bound of area and volume for two- and three-dimensional slicing floorplans (F) depicted, respectively, by the solid and dashed curve for different shape aspect ratios. V_total (A_total) and V_max(A_max) are, respectively, the total and maximum volume (area) of a 3-D (2-D) system	209
Figure 9.6	Floorplanning strategies for 3-D ICs. (A) Single step approach, and (B) multistep approach	211
Figure 9.7	Different metrics to determine the length of a 3-D net, (A) the classic HPWL metric including only the pins of the net in all tiers, (B) an extended bounding box including the TSV locations, (C) the bounding box of the segments of the net within tier 2, and (D) the bounding box of the segment of the net belonging to tier 3	215
Figure 9.8	Flow of two stage floorplanning methods considering the TSV locations	216
Figure 9.9	Whitespace within the bounding box of the intertier net can be used for placing a TSV without increasing the wirelength. This whitespace defines the candidate TSV islands. The whitespace outside the bounding box describes noncandidate TSV islands, as placing a TSV into these regions increases the wirelength	218
Figure 9.10	A two tier floorplan with three intertier connected nets, (A) the blocks and pins, (B) the virtual die with the projection of the bounding box of each net, and (C) the routed nets and corresponding TSV island are shown. The notation p_i,j is the pin of net i in tier j. The pins connected by each net are also indicated in the figure	220
Figure 9.11	A three tier circuit, (A) the independent feasible region for a two pin net starting from tier 1 and terminating in tier 3 is shown by the dashed rectangle, (B) the allowed row (intertier) and column (intratier) connections are depicted with dashed lines, and (C) a potential route for this net is shown by the solid line. The dots illustrate available locations for buffers in each row (tier)	222
Figure 9.12	Design flow of microarchitectural floorplanning process for 3-D microprocessors	225
Figure 9.13	Two force directed placement processes, (A) the TSVs and circuit cells are placed simultaneously, and (B) the TSVs are placed prior to the circuit cells and behave as placement obstacles	232
Figure 9.14	TSV assignment based on the MST of a net, where the closest TSV to the shortest edge of the net is inscribed by the dotted ellipse	233
Figure 9.15	Analytic placement process for 3-D circuits considering number of TSVs and wirelength	238
Figure 9.16	Process for determining available whitespace (WS), which is illustrated by the white regions	240
Figure 9.17	Block placement of an SOP. (A) Initial placement, and (B) increase in the total area in the x and y directions to extend the area of the whitespace	240
Figure 9.18	Layout of supercells. Supercells have the same height and varying width. The space around the supercells is used for buffers and TSVs	243
Figure 9.19	An example of computing the matrices of a two tier grid, (A) route counts, and (B) routing density	243
Figure 9.20	Channel alignment procedure to create intertier routing channels	246
Figure 9.21	Pseudocode of 3-D routing algorithm targeting reductions in both performance and temperature	247
Figure 9.22	An SOP consisting of n tiers. The vertical dashed lines correspond to vias between the routing layers, and the thick vertical solid lines correspond to through silicon vias that penetrate the device layers	248
Figure 9.23	Stages of a 3-D global routing algorithm	249
Figure 9.24	Layout windows with different area markers; (A) layout window for tier 1, and (B) layout window for tier 2 (the windows are not on the same scale)	251
Figure 10.1	Global interconnect structures for impedance extraction. (A) Three parallel metal lines over a ground plane in a 2-D circuit, and (B) three parallel metal lines sandwiched between two ground planes in a 3-D circuit	254
Figure 10.2	A three-tier FDSOI 3-D circuit. Tiers one and two are face-to-face bonded, while tiers two and three are face-to-back bonded	255
Figure 10.3	Capacitance extraction for an intertier via structure, (A) intertier via surrounded by orthogonal metal layers, and (B) capacitance values for different via sizes and spacing values. The same dielectric material is assumed for all of the layers (i.e., ε_d = ε_i = ε_SiO₂)	256
Figure 10.4	Capacitance extraction for an intertier via structure, (A) intertier via through layers of dielectric and the bonding interface, surrounded by eight intertier vias, and (B) capacitance values for different via sizes and spacings	257
Figure 10.5	Capacitance extraction for an intertier via structure, (A) intertier via through silicon substrate, surrounded by a thin insulator layer, and (B) capacitance for different via sizes and thicknesses of the insulator layer	258
Figure 10.6	Two terminal intertier interconnect with single via and corresponding electrical model	258
Figure 10.7	An example of interconnect sizing. (A) An interconnect of minimum width, W_min, (B) uniform interconnect sizing W>W_min, and (C) nonuniform interconnect sizing W=f(l)	260
Figure 10.8	SPICE measurements of the 50% propagation delay of a 600 μm line versus the via location l₁ for different values of r₂₁. The interconnect parameters are r₁=79.5 Ω/mm, r_v1=5.7 Ω/mm, c_v1=6 pF/mm, c₂=439 fF/mm, c₁₂=1.45, l_v=20 μm, and n=2. The driver resistance and load capacitance are, respectively, R_S=50 Ω and C_L=50 fF	262
Figure 10.9	SPICE measurements of the 50% propagation delay for a 600 μm line versus the via location l₁ for different values of r₂₁. The interconnect parameters are r₁=79.5 Ω/mm, r_v1=5.7 Ω/mm, c_v1=6 pF/mm, c₂=439 fF/mm, c₁₂=0.46, l_v=20 μm, and n=2. The driver resistance and load capacitance are, respectively, R_S=50 Ω and C_L=50 fF	263
Figure 10.10	Decrease in the delay improvement caused by the nonoptimal placement of the intertier via for a 500 μm interconnect. The interconnect parameters are r₁=23.5 Ω/mm, r_v1=270 Ω/mm, c_v1=270 fF/mm, c₂=287 fF/mm, l_v=15 μm, and n=2. The driver resistance and load capacitance are, respectively, R_S=30 Ω and C_L=100 fF	265
Figure 10.11	Decrease in the delay improvement due to the nonoptimal placement of the intertier via for a 500 μm interconnect. The interconnect parameters are r₁=23.5 Ω/mm, r_v1=6.7 Ω/mm, c_v1=270 fF/mm, c₂=287 fF/mm, l_v=15 μm, and n=2. The driver resistance and load capacitance are, respectively, R_S=100 Ω and C_L=100 fF	266
Figure 10.12	Intertier interconnect consisting of m segments connecting two circuits located n tiers apart	267
Figure 10.13	Intertier interconnect model composed of a set of nonuniform distributed RC segments	268
Figure 10.14	Case (iii) of the two terminal net heuristic. The allowed interval is iteratively decreased ensuring the optimum via location is eventually determined	271
Figure 10.15	A subset of interconnect instances depicted by the dashed lines for case (iv) of the via placement heuristic. The interconnect traverses eight tiers and has a length L=1.455 mm. The resistance r_j and capacitance c_j of each interconnect segment range, respectively, from 10 to 50 Ω/mm and 100 to 500 fF/mm	271
Figure 10.16	Pseudocode of the proposed two terminal net via placement algorithm	273
Figure 10.17	Average and maximum improvement in delay for different ranges of interconnect segment resistance and capacitance ratios. The vias are placed either at the center of the allowed intervals or randomly, as explained in the legend of the diagram	276
Figure 10.18	Comparison of the average Elmore delay based on wire sizing and optimum via placement techniques. The instance where the optimum via placement outperforms wire sizing (and vice versa) is also depicted	277
Figure 10.19	NAPC for minimum width and wire segments of equal length, wire sizing, and wire segments of equal length, and minimum width and optimum via placement, yielding segments of different length	278
Figure 11.1	Intertier interconnect tree. (A) Typical intertier interconnect tree, and (B) intervals and directions that the intertier via can be placed	282
Figure 11.2	Different intertier via moves. (A) Type-1 move (allowed), (B) type-2 move (allowed), and (C) type-3 move (prohibited)	283
Figure 11.3	Simple interconnect tree, illustrating a critical path (w₃=1) and on path and off path intertier vias	286
Figure 11.4	Pseudocode of the Interconnect Tree Via Placement Algorithm (ITVPA)	287
Figure 11.5	Pseudocode of the near-optimal Single Critical Sink interconnect tree Via Placement Algorithm (SCSVPA)	288
Figure 11.6	A symmetric tree including two intertier vias. The interconnect parameters per tier are r₁=10.98 Ω/mm, r₂=11.97 Ω/mm, r₃=96.31 Ω/mm, c₁=147.89 fF/mm, c₂=202 fF/mm, and c₃=388.51 fF/mm, and the allowed interval l_di,_v₂=75 μm	290
Figure 12.1	Cross-section of a 3-D stack illustrating (A) the variety of materials, including the package and heat sink, which increase the complexity of the thermal analysis process, and (B) a thermal circuit used to model the flow of heat along the z-direction	297
Figure 12.2	Schematic of a cross-section of a 3-D system with intertier liquid cooling through (A) microfluidic channels, and (B) through a micropin array	300
Figure 12.3	Example of the duality of thermal and electrical systems	304
Figure 12.4	Thermal model of a 3-D circuit where 1-D heat transfer is assumed. Each layer is assumed homogeneous with a single thermal conductivity	305
Figure 12.5	Increase in temperature in a 3-D circuit for different number of tiers and power densities	307
Figure 12.6	Different vertical heat transfer paths within a 3-D IC	308
Figure 12.7	Maximum temperature versus power density for 3-D ICs, SOI, and bulk CMOS. The difference among the curves for the 3-D ICs is that the first curve (3-D horizontal and vertical) includes thermal paths with a horizontal interconnect segment, while the second curve includes only continuous vertical flow of heat through the wires	310
Figure 12.8	Unit tile (or cell) including a thermal resistor in each x, y, z-direction. A thermal capacitor models the heat capacity of the tile and a heat source q_x,y,z for the power consumed by the devices or the joule heating of the wires within this cell	311
Figure 12.9	Thermal model of a 3-D IC. (A) A 3-D tile stack, (B) one pillar of the stack, and (C) an equivalent thermal resistive network. R₁ and R_p correspond, respectively, to the thermal resistance of the thick silicon substrate of the first tier and the thermal resistance of the package	312
Figure 12.10	Cross-section of a cell including a TSV within the silicon substrate	314
Figure 12.11	Simulation setup for determining the thermal conductivity of the cell shown in Fig. 12.10 along (A) the xy-plane, and (B) along the z-direction	314
Figure 12.12	A segment of a three tier 3-D IC with a TTSV, where (A) is the geometric structure, and (B) is the cross-section of a TTSV of this segment. The area of the circuit is denoted by A₀. The three main paths of heat transfer are depicted by the dashed lines	316
Figure 12.13	Thermal model of a TTSV in a three tier circuit, extendible to n tiers, where double notation is used to demonstrate that the model can be extended to a 3-D stack of n tiers	318
Figure 12.14	Maximum rise in temperature in a three tier 3-D circuit for different dielectric liner thicknesses, where D_TSV=10 μm. The other parameters are $t_{\mathrm{SiO}_2}$=7 μm, t_b=1 μm, t_Si2=t_Si3=45 μm, k₁=1.3, and k₂=0.55	319
Figure 12.15	Neighboring cells bending the isothermal curves due to the TSVs	319
Figure 12.16	Schematic of a tapered TSV	320
Figure 12.17	Thermal model of microchannel with conductive and convective thermal resistances	322
Figure 12.18	Schematic illustration of the thermal wake effect, which leads to an exponential decay of the temperature downstream from the channel due to the heated cells located upstream. The transfer of heat occurs both downstream and transverse to the flow within the channel	324
Figure 12.19	A four tier 3-D circuit discretized into a mesh	325
Figure 12.20	Traditional V-cycles of multigrid methods with coarsening and refining stages	326
Figure 12.21	Coarsening process excluding the BEOL layers in the z-direction to ensure that valuable physical information is not lost, improving the overall efficiency and accuracy of the multigrid technique	327
Figure 12.22	Principle of power blurring method	328
Figure 13.1	Cost function of the temperature	335
Figure 13.2	A bucket structure example for a two tier circuit consisting of 12 blocks. (A) A two tier 3-D IC, (B) a 2×2 bucket structure imposed on a 3-D IC, and (C) the resulting bucket index	335
Figure 13.3	Intertier moves. (A) An initial placement, (B) a z-neighbor swap between blocks a and h, and (C) a z-neighbor move for block l from the first tier to the second tier	336
Figure 13.4	Three stage floorplanning process based on the force directed method	340
Figure 13.5	Transition from a continuous 3-D space to discrete tiers. Block 2 is assigned to either the lower or upper tier, which results in different overlaps	342
Figure 13.6	Mapping of a task graph onto physical PEs within a 3-D NoC	343
Figure 13.7	Temperature balancing heuristic where (A) the tasks are sorted in descending power and assigned to super-tasks, (B) the temperature of each core, and (C) the super-tasks assigned to the super-cores	348
Figure 13.8	First order thermal model, where each core is thermally modeled by a node with power P_i, specific heat C_i, and inter- and intratier thermal resistances	349
Figure 13.9	3-D CMP consisting of a single four core tier with three tiers of SRAM and one tier of MRAM	354
Figure 13.10	Dynamic thermal management schemes for a 3-D CMP employing a mixture of SRAM, MRAM, and DVFS, (A) SRAM-1 GHz core, (B) SRAM-3 GHz core, (C) SRAM-core DVFS, (D) hybrid-3 GHz core, and (E) hybrid-core DVFS	355
Figure 13.11	Cross-sectional view of a 3-D ultra-thin system with peripheral copper TSVs	358
Figure 13.12	Cross-sectional view of a two tier structure with a spatial heat source to evaluate the effects of the metal grid/plate and thickness of the adhesive materials on the thermal behavior of the structure (not to scale)	359
Figure 13.13	Average temperature of a circuit surrounded by resistors used as heating elements where different means such as a TSV or metal ring are used to thermally isolate the circuit	360
Figure 13.14	Thermal conductivity versus thermal via density	366
Figure 13.15	Multi-level routing process with thermal via planning	368
Figure 13.16	Heat propagation paths within a 3-D grid	369
Figure 13.17	Routing grid for a two tier 3-D IC. Each horizontal edge of the grid is associated with a horizontal wire capacity. Each vertical edge is associated with an intertier via capacity	372
Figure 13.18	Effect of a thermal wire on the routing capacity of each grid cell. v_i and v_j denote the capacity of the intertier vias for, respectively, cell i and j. The horizontal cell capacity is equal to the width of the cell boundary	373
Figure 13.19	Flowchart of a temperature aware 3-D global routing technique	373
Figure 13.20	Floorplan of a 3-D MPSoC, (A) cores and L2 caches are placed in separate tiers, and (B) cores and caches share the same tier	375
Figure 14.1	Heat propagation from one tier spreading into a second stacked tier	382
Figure 14.2	Physical layout, (A) on-chip resistive heater, (B) on-chip four-point resistive thermal sensor, and (C) overlay of the resistive heater and resistive thermal sensor	384
Figure 14.3	Physical layout, (A) back metal resistive heater and (B) back metal four-point resistive thermal sensor	385
Figure 14.4	Microphotograph of the test circuit depicting the back metal pattern with an overlay indicating the location of the on-chip thermal test sites	386
Figure 14.5	Placement of thermal heaters and sensors, respectively, in metals 2 and 3 in the two stacked device planes. The placement of the back metal heaters and sensors is also shown	387
Figure 14.6	Calibration of (A) on-chip thermal sensors, and (B) back metal thermal sensors	389
Figure 14.7	Experimental results for the different test conditions. Each label describes the device plane, site location of the heater, and whether active cooling is applied	390
Figure 14.8	Structure of the 3-D test circuit consisting of two silicon tiers and one back metal layer. Each tier has two separately controlled heaters (H1 and H2). The back metal is connected to WTop using thermal through silicon vias	403
Figure 14.9	Comparison of temperatures for a horizontal path (length=1,300 μm)	404
Figure 14.10	Comparison of temperatures for a vertical path (length=10 μm)	404
Figure 14.11	Comparison of temperatures for a diagonal path (length=1,300 μm)	405
Figure 14.12	Comparison of thermal resistance per unit length for a horizontal path (length=1,300 μm)	405
Figure 14.13	Comparison of thermal resistance per unit length for a vertical path (length=1,300 μm)	406
Figure 14.14	Comparison of thermal resistance per unit length for a diagonal path (length=1,300 μm)	406
Figure 14.15	Simulated temperature at the WTop site 1 sensor for four densities of TSVs placed between the WTop site 1 heater/sensor pair and the back metal	407
Figure 15.1	A data path depicting a pair of sequentially-adjacent registers	411
Figure 15.2	Simple example of the MMM clock synthesis method where a clock tree is generated. (A) Without look-ahead, and (B) with look-ahead. An xy-cut leads to larger skew in (A) than a yx-cut in (B)	412
Figure 15.3	TRR where the core of the region is a Manhattan arc and the boundary points are at a radius distance from the core	414
Figure 15.4	Merging segment ms(u) for node u that is the parent node of nodes a and b based on TRR_a and TRR_b	415
Figure 15.5	TRR with the core point at the placement location of the parent node p, pl(p) (which is known from the previous iteration), and a radius equal to the wirelength of edge e_u. The segment of ms(u) within the TRR is the thick line and represents the set of valid placement locations for node u	416
Figure 15.6	Example of the DME method for a tree with eight sinks. (A) to (C) Bottom-up phase where the recursive derivation of the merging segments is accomplished and (D) to (F) top-down phase where the exact placement of each internal node is determined	416
Figure 15.7	Two-dimensional four level H-tree	418
Figure 15.8	Buffered and symmetric clock tree that drives a grid, where each unit grid constitutes a local clock network modeled as a lumped capacitor C_{l_seg}	418
Figure 15.9	Global 3-D clock distribution networks based on planar symmetric H-trees, where during normal operation (A) one H-tree and multiple TSVs distribute the clock signal, and (B) two H-trees and a root TSV distribute the clock signal	419
Figure 15.10	Cross-section of a 3-D stack of five tiers with one dedicated clock tier and four logic tiers	420
Figure 15.11	Two clock delivery networks, (A) the networks are shorted only at the initial stages of the clock distribution, and (B) TSVs connect the clock networks at the lower levels of the clock network hierarchy	422
Figure 15.12	Multi-TSV clock tree with 13 sinks and three TSVs spanning two tiers	424
Figure 15.13	Several abstract trees for a set of eight sinks generated by the MMM-TB algorithm for different bounds of TSVs, (A) is the 2-D view of these trees and the dashed lines denote TSVs, (B) a 3-D view of the same trees, and (C) the resulting connection topologies where the gray rectangles refer to a TSV	425
Figure 15.14	Pseudocode of the z-cut procedure for the MMM-TB algorithm	426
Figure 15.15	A set of sinks S={a, b, c} where the effect of the recursive z-cuts in the MMM-TB algorithm is exemplified. (A) Two z-cuts are successively applied, (B) the source is in tier 3 and z-cut¹ is followed by z-cut², (C) the source is in tier 2 and the sinks in this tier are first extracted, and (D) the source is in tier 1 and z-cut² is followed by z-cut¹	427
Figure 15.16	Examples of merging segments for two intertier nodes u, v merged with node p. (A) An unbuffered tree, and (B) a buffered tree	428
Figure 15.17	A tree with four sinks embedded in two tiers for different cases of embedding the internal nodes x₁ and x₂ and the root node s_r and the resulting number of TSVs for each case. The notation x_i,j (s_i,j) implies the placement of node x_i (s_i) in tier j. (A) TSV=2, (B) TSV=3, (C) TSV=3, (D) TSV=4, (E) TSV=4, (F) TSV=3, (G) TSV=3, and (H) TSV=2	429
Figure 15.18	Different cases to determine the number of embedding tiers for node x, where the children nodes x₁ and x₂ are (A) clock sinks, and (C) and (E) are internal nodes. The minimum number of TSVs for (A), (C), and (E) are shown, respectively, in (B), (D), and (F)	430
Figure 15.19	Three tier clock tree using a single TSV for the intertier connections. This topology is pre-bond testable as each tier includes a network connecting all of the sinks	432
Figure 15.20	Pre-bond testable clock tree with multiple TSVs. The buffers are inserted before the TSVs, thereby not changing the capacitance of the tree in tier 1. TGs in tier 2 connect the redundant tree (shown as a dashed line) with the subtrees during pre-bond test. The TGs are switched off after bonding, disconnecting the redundant tree	433
Figure 15.21	Portion of a 3-D clock tree consisting of several subtrees ST_i. The TSVs in (A) are replaced with TSV buffers in (B) to decouple the clock tree in tier 1 from the clock tree in tier 2	434
Figure 15.22	Different cases where TSV and/or clock buffers are inserted. (A) A clock buffer is inserted to balance the delay between the two branches where t_dA<t_dB, (B) multiple clock buffers are inserted due to long wires or high downstream capacitance, and (C) a TSV buffer is inserted to decouple the downstream clock tree, and a clock buffer is added to counterbalance the delay imbalance caused by the TSV buffer	434
Figure 15.23	Self-configured circuit controlling the operation of the TG (N₅ and P₅)	439
Figure 15.24	A two tier clock tree, (A) the initial TSV locations are shown, and (B) TSV₁ and TSV₃ are relocated within the whitespace. The relocation adds wirelength (shown by the dashed lines) which degrades the performance of the clock tree topology shown in (A)	440
Figure 15.25	Pre-clustering stage of the whitespace-aware CTS method, (A) a set of sinks and whitespaces are projected onto a plane, (B) the sinks per tier are located beyond distance β⋅HPWL_tier from whitespaces, (C) those sinks within a cluster belong to the same tier, and (D) the root of the subtrees from the clustered sinks in each tier and some non-clustered sinks is depicted	441
Figure 15.26	Reconstruction of the merging segments	442
Figure 15.27	Different TSV redundancy schemes. (A) Double (N-times) redundancy, (B) 4:2 shared spare topology with two spare TSVs, (C) 4:1 shared spare topology with one spare TSV, and (D) 4:2 shared spare topology with no spare TSVs	444
Figure 15.28	Operation of a TSV TFC, (A) a TFC pair, (B) in pre-bond operation, the redundant tree is connected (shown with solid lines) while the TSVs are not present, (C) in post-bond operation with no defects, the clock signal is transferred by the TSVs, and (D) TSV2 is defective and part of the redundant tree is used to propagate the clock signal to an adjacent subtree through TG2 and MUX2	445
Figure 15.29	Example of fault tolerant CTS from adjacent TSVs. TSV_A and TSV_B are within distance r_p and form a TFC pair	446
Figure 16.1	Three wafers are individually fabricated with an FDSOI process	450
Figure 16.2	The second wafer is face-to-face bonded with the first wafer	450
Figure 16.3	The 3-D vias are formed and the surface is planarized with chemical mechanical polishing	451
Figure 16.4	The backside vias are etched, and the backside metal is deposited on the second wafer	451
Figure 16.5	The third wafer is face-to-back bonded with the second wafer and the 3-D vias for that tier are formed	451
Figure 16.6	Backside metal is deposited and glass layers are cut to create openings for the pads	452
Figure 16.7	Layer thicknesses in the 3-D IC MITLL technology	453
Figure 16.8	Block diagram of the 3-D test IC. Each block has an area of approximately 1 mm². The remaining area is reserved for the I/O pads (the gray shapes)	455
Figure 16.9	Block diagram of the logic circuit included in each tier of each block	455
Figure 16.10	Physical layout of a pseudorandom number generator	456
Figure 16.11	Physical layout of 6×6 crossbar switch with 16-bit wide ports	456
Figure 16.12	Cascoded current mirror with an additional control transistor	457
Figure 16.13	Four stage cascoded current mirrors	458
Figure 16.14	Physical layout of the test circuit. Some decoupling capacitors are highlighted	459
Figure 16.15	Two-dimensional H-trees constituting a clock distribution network for a 3-D IC	460
Figure 16.16	Different 3-D clock distribution networks within the test circuit. (A) H-trees, (B) H-tree and local rings/meshes, (C) H-tree and global rings, and (D) trunk based	461
Figure 16.17	Physical layout of the clock distribution networks in the 3-D IC. (A) H-trees, (B) H-tree and local rings/meshes, (C) H-tree and global rings, and (D) trunk based	462
Figure 16.18	Clock signal probes with RF pads	463
Figure 16.19	Open drain transistor and circuit model of the probe (includes impedance of RF pads)	463
Figure 16.20	Structure of clock signal path from Fig. 16.16A to model the clock skew. The number within each oval represents the number of parallel TSVs between device tiers	464
Figure 16.21	Equivalent electrical model of a TSV	466
Figure 16.22	Top view of fabricated 3-D test circuit	467
Figure 16.23	Magnified view of one block of the fabricated 3-D test circuit	468
Figure 16.24	Die assembly of the 3-D test circuit with RF probes	469
Figure 16.25	Clock signal input and output waveform from the topology with global rings, as illustrated in Fig. 16.16C	470
Figure 16.26	Maximum measured clock skew between two tiers within the different clock distribution networks	471
Figure 16.27	Part of the clock distribution networks illustrated in Figs. 16.16A and B. (A) The local clock skew is individually adjusted within each tier for the H-tree topology, and (B) the local skew is simultaneously adjusted for all of the tiers for the local mesh topology	471
Figure 16.28	Measured power consumption at 1 GHz of the different circuit blocks	472
Figure 17.1	Classification of process variations and an illustration of the physical scale of the disparate sources of variations	476
Figure 17.2	Example of intratier and intertier paths. (A) One random variable is required to model D2D variations, and (B) two random variables (one for each tier) are used to model D2D variations for the entire path	477
Figure 17.3	Notation used in the delay variability model for 2-D and 3-D circuits. (A) 2-D circuit comprising two critical paths each with three logic gates, and (B) two-tier 3-D circuit containing three critical paths each with three stages, where two paths are intratier paths and one path is an intertier path. Two random variables are required in (B) to model the D2D variations of each tier	479
Figure 17.4	Cdf of a 2-D circuit (dashed line), a 3-D circuit with uneven critical path distribution between the two tiers (dashed dotted line), and a 3-D circuit with the same number of critical paths in each tier (dotted line)	481
Figure 17.5	3-D H-tree spanning four tiers. (A) Notation for all of the 64 sinks, and (B) certain sinks used to evaluate clock skew	483
Figure 17.6	Elemental circuit to measure the distribution of delay due to variations in the buffer characteristics	483
Figure 17.7	Electrical model of a segment of an intertier clock path	485
Figure 17.8	Clock paths to sinks u and v where the paths share n_u,v buffers	489
Figure 17.9	A single via 3-D clock H-tree	490
Figure 17.10	σ of skew for increasing number of tiers (tiers) and uncorrelated WID variations for both the multi and single via topologies, (A) between sinks in the first tier, and (B) between sinks in the first and topmost tiers	492
Figure 17.11	Example multi-group 3-D clock topology	493
Figure 17.12	σ of skew for 3-D clock tree topologies. (A) Intratier skew of sink pairs s_1,2 and s_1,3, and (B) intertier skew of sink pairs s_1,6 and s_1,7 within a group of data related tiers	494
Figure 17.13	Simplified 1-D model of a power distribution network to evaluate global power noise. R_ti and C_ti denote, respectively, the TSV resistance and capacitance of tier i	497
Figure 17.14	Amplitude and frequency of the resonant noise versus the switching current in different tiers	498
Figure 17.15	Resonant supply noise and IR drop versus the total resistance of the TSVs	499
Figure 17.16	Resonant noise versus the number of tiers	499
Figure 17.17	Clock uncertainty between 3-D clock paths. (A) Two paths and flip flops, and (B) corresponding clock signals	501
Figure 17.18	Skitter versus length of 3-D clock paths	504
Figure 17.19	Skitter for V_n1=90 mV and different V_n2	505
Figure 17.20	Setup skitter versus (V_n2, V_n1). (A) 3-D plot of μ_JA, (B) contour of μ_JA, (C) 3-D plot of μ_JB, (D) contour of μ_JB, (E) contour of σ_JA, and (F) contour of σ_JB	506
Figure 17.21	Hold skitter versus (V_n1 and V_n2). (A) Contours for σ_SA, and (B) contours for σ_SB	507
Figure 17.22	Tradeoff between power and maximum allowed setup skitter max(J_1,2)	508
Figure 17.23	Skitter versus different ϕ (ϕ₁=ϕ₂). (A) change in μ_J1,2, (B) change in σ_J1,2, and (C) change in σ_S1,2	509
Figure 17.24	Skitter J_1,2 versus shifted ϕ₁ and ϕ₂. (A) 3-D plot of σ_J1,2 versus (ϕ₂=ϕ₁) for distribution (A), (B) contour map of σ_J1,2 versus (ϕ₂=ϕ₁) for distribution (A), and (C) contour map of σ_J1,2 for distribution (B)	510
Figure 17.25	Skitter versus f_n. (A) Change in J_1,2, and (B) change in S_1,2	511
Figure 17.26	Change of f_n on delay variations. (A) Mean and standard deviation of buffer delay versus V_dd, and (B) supply voltage to the clock path during propagation of a clock edge	512
Figure 17.27	Synthesized 3-D clock tree. (A) Majority of clock buffers in the first tier, (B) majority of clock buffers in the third tier, and (C) regions where the skitter is measured	513
Figure 17.28	Normalized number of TSVs and power dissipation for Cases 2 to 4	516
Figure 18.1	Cross-sectional view of power distribution system where several levels of the hierarchy, motherboard, PCB, package, and integrated circuit are shown. The VRM and the decoupling capacitors placed at all levels of the hierarchy are also illustrated	520
Figure 18.2	A three tier circuit where DC–DC conversion is integrated in the upper tiers to reduce losses within the power delivery system	522
Figure 18.3	Buck converter integrated within a separate tier and connected to the logic tier with TSVs	522
Figure 18.4	3-D power delivery system. (A) DC–DC buck converters are integrated within only one tier, and (B) DC–DC converters are integrated in the tiers at both ends of the stack. Two different types of TSVs are noted, those TSVs that distribute a high (off-chip) voltage (V_DDH) to the converters and those TSVs which distribute a low (on-chip) voltage (V_DDL) downstream from the output of the converters	523
Figure 18.5	Equivalent circuit of the on-chip power distribution network of an n tier 3-D circuit, where the total IR drop across the tiers is denoted as V_drop. Only one buck converter is integrated in one tier and the on-chip power distribution network is modeled as a 1-D network	524
Figure 18.6	Equivalent circuit of the on-chip power distribution network of an n tier 3-D circuit, where the total IR drop across the tiers is denoted as V′_drop. Two buck converters are integrated within the tiers at both ends of the circuit, each supplying current to half of the tiers of the stack	524
Figure 18.7	A converter providing current within a prototype 2-D circuit used to emulate a 3-D system comprising eight tiers, where the TSVs and active loads are connected in a daisy chain	526
Figure 18.8	A multi-level power distribution network applied to a three tier circuit where each pair of power levels is mapped to a single tier	527
Figure 18.9	Equivalent circuit diagram of a power distribution network of a 3-D circuit, (A) supplied by a single V_dd, and (B) supplied by several pairs of V_dd supplies	527
Figure 18.10	A 3-D circuit consisting of two memory tiers and one processor tier	528
Figure 18.11	Multi-level power delivery system where two pairs of voltage levels are employed in each tier. (A) All of the circuits are active, (B) right half of the circuit in each tier is inactive (shown in gray), and (C) left half of the circuit in each tier is inactive (shown in gray)	529
Figure 18.12	Multi-level power delivery system where each tier is supplied by one pair of voltage levels. (A) All of the circuits are active, (B) the processor is inactive (shown in gray), and (C) the memory tiers are inactive (shown in gray)	530
Figure 18.13	A 3-D power distribution network (not to scale), (A) the power (ground) meshes are connected by power (ground) TSVs, and (B) the equivalent circuit model of a package pin, TSV, and unit cell including the decoupling capacitance and current source	531
Figure 18.14	The segmentation method linking successive unit cells to model an entire power distribution network	534
Figure 18.15	Decomposition of a unit cell including both power and ground lines along the x and y directions. The different structures formed by the decomposition process are also illustrated. Two metal layers are utilized for the power distribution network	534
Figure 18.16	Decomposed structures and equivalent RLGC lumped sections. The notation of the physical parameters used in Table 18.2 is also defined	535
Figure 18.17	Iterative process for electro-thermal analysis	537
Figure 18.18	Overview of power grid, (A) a small segment of a power grid, and (B) corresponding electrical model including the parasitic impedance of the package	538
Figure 18.19	Cross-sectional view of a TSV. (A) A standard solid TSV, and (B) a CTSV with two layers of metal separated by a dielectric layer	541
Figure 18.20	Current paths within a 3-D circuit. (A) Where the TSV is connected to the power lines on both the uppermost (MT) and the first (M1) metal layers, and (B) where the TSV is connected only to the topmost (MT) metal layer	542
Figure 18.21	Equivalent circuit of the current flow paths illustrated in Fig. 18.20. (A) The TSV locally distributes current, and (B) only stacks of metal vias supply current to the load	542
Figure 18.22	Voltage drop at the current source as a function of the current drawn by the power supply	543
Figure 18.23	Voltage drop as a function of distance of the current source from the TSV	544
Figure 18.24	Resistive grid to model a segment of a power distribution system. (A) In the uppermost (M6) metal layer, and (B) in the lowest (M1) metal layer	546
Figure 18.25	SPICE simulation of the voltage drop on the M1 grid for different nodes with (solid curves) and without (dashed curves) the TSV path. No stacked vias are removed (d=0)	547
Figure 18.26	SPICE simulation of the maximum voltage drop on the M1 grid by successively removing the stacked vias (i.e., increasing d) with (dashed curves) and without (solid curves) the TSV path	548
Figure 18.27	SPICE simulation of the voltage drop on the M1 grid for different nodes and with no stacked vias removed (d=0) with (solid curves) and without (dashed curves) the TSV path. Only three current sources switch	548
Figure 18.28	Nonuniform TSV tapering to address both power supply noise and temperature. (A) Opposite tapering is required to individually satisfy the power supply noise and temperature objectives, and (B) adapting the size of the TSVs across tiers to ensure that both objectives are satisfied	550
Figure 18.29	Power supply noise from employing one tier of decoupling capacitance. (A) A 2-D system, (B) a four tier system with no tier for the decoupling capacitance, (C) a decoupling capacitance tier close to the package, and (D) a decoupling capacitance tier on top of the 3-D system	552
Figure 18.30	Power supply noise from employing two tiers of decoupling capacitance. (A) A 2-D system, (B) one decoupling capacitance tier is placed next to the package and the second tier between tiers two and three, (C) one decoupling capacitance tier is placed on top of the stack and the second tier between tiers two and three, and (D) both decoupling capacitance tiers are placed on top of the stack	553
Figure 18.31	Reconfigurable decoupling capacitance topology where the decoupling capacitor is connected to the power rail even if the sleep transistors are switched off	554
Figure 18.32	Always on decoupling capacitance topology. The charge provided to the local circuit blocks flows through the sleep transistors	555
Figure 18.33	A daisy chain of buffers switches the sleep transistors on, subsequently ensuring that the current gradually increases, limiting the abrupt current changes within the power grid	556
Figure 18.34	Current flow within a three tier stack. Note the current flowing through the TSVs of each tier	558
Figure 18.35	Optimization framework for 3-D power distribution networks where both power supply noise and temperature constraints are considered. (A) Optimal sizing process for the middle tier(s) is initially determined, (B) the flowchart of the algorithm, and (C) step by step description of the algorithm	560
Figure 19.1	Power distribution network topologies. (A) interdigitated power network on all tiers with the 3-D vias distributing current on the periphery and through the middle of the circuit, (B) interdigitated power network on all tiers with the 3-D vias distributing current on the periphery, and (C) interdigitated power network on tiers 1 and 3 and power/ground planes on tier 2 with the 3-D vias distributing current on the periphery and through the middle of the circuit	567
Figure 19.2	Layout of the power distribution network test circuit	569
Figure 19.3	Layout of the test circuit containing three interdigitated power and ground networks and test circuits for generating and measuring noise. (A) Overlay of all three device planes, (B) power and ground networks of the bottom tier (tier 1), (C) power and ground networks of the middle tier (tier 2), and (D) power and ground networks of the top tier (tier 3)	570
Figure 19.4	Layout of the pattern sequence source for the noise generation circuits. (A) All three device planes, (B) noise generation circuits on the bottom tier (tier 1), (C) noise generation circuits on the middle tier (tier 2), and (D) noise generation circuits on the top tier (tier 3)	572
Figure 19.5	Pattern sequence source for the noise generation circuits. (A) Ring oscillator, (B) buffer used for the RO and PRNG, (C) 5-bit PRNG, (D) 6-bit PRNG, (E) 9-bit PRNG, and (F) 10-bit PRNG	574
Figure 19.6	Individual components in Figs. 19.5C–F with the corresponding transistor sizes. (A) Inverter, (B) AND gate, (C) OR gate, (D) XNOR gate, (E) 2-to-1 MUX, and (F) D flip-flop	576
Figure 19.7	Layout of the noise generation circuits. (A) All three device planes, (B) noise generation circuits on the bottom tier (tier 1), (C) noise generation circuits on the middle tier (tier 2), and (D) noise generation circuits on the top tier (tier 3)	580
Figure 19.8	Schematic view of the (A) current mirror, and (B) switches that vary the total current through the current mirror	581
Figure 19.9	Layout of the power and ground noise detection circuits including the control circuit. (A) All three device planes, (B) power and ground sense circuits for the bottom tier (tier 1), (C) power and ground sense circuits for the middle tier (tier 2) and control circuit for all three tiers, and (D) power and ground sense circuits for the top tier (tier 3)	582
Figure 19.10	Rotating control logic to manage the RF output pads among the three device planes. The control signals to the RF pads are provided for both the power and ground detection signals for each device plane	586
Figure 19.11	Block and I/O pin diagram of the DC and RF pad layout. The numbered rectangles are DC pads providing power and ground, and DC bias points for the current mirrors, reset signals, and electrostatic discharge protection. The light colored squares and rectangles are RF pads used to calibrate the sense circuits (internal to the labeled blocks) and measure noise on the power/ground networks (external to the labeled blocks)	587
Figure 19.12	Microphotograph of the wire bonded test circuit	589
Figure 19.13	Block level schematic of noise generation and detection circuits	589
Figure 19.14	Source follower noise detection circuits detect noise on both the digital (A) power lines, and (B) ground lines	590
Figure 19.15	Fabricated test circuit examining noise propagation within three different power distribution networks, and a distributed DC-to-DC rectifier. (A) Microphotograph of the 3-D test circuit, and (B) an enlarged image of Block 1	591
Figure 19.16	S-parameter characterization of the power and ground noise detection circuits	592
Figure 19.17	Spectral analysis of the noise generated on the power line of Block 2, (A) with board level decoupling capacitance, and (B) without board level decoupling capacitance	593
Figure 19.18	Time domain measurement of the generated noise on the power line of Block 2 without board level decoupling capacitance for a voltage bias on the current mirrors of (A) 0 volts, (B) 0.5 volts, (C) 0.75 volts, and (D) 1 volt	594
Figure 19.19	Average noise voltage on the power and ground distribution networks with and without board level decoupling capacitance. (A) Average noise of power network without decoupling capacitance, (B) average noise of power network with decoupling capacitance, (C) average noise of ground network without decoupling capacitance, and (D) average noise of ground network with decoupling capacitance. A total of 4,096 data points are used to calculate the average noise for each topology at each current mirror bias voltage	595
Figure 19.20	Peak noise voltage on the power and ground distribution networks with and without board level decoupling capacitance. (A) Peak noise of power network without decoupling capacitance, (B) peak noise of power network with decoupling capacitance, (C) peak noise of ground network without decoupling capacitance, and (D) peak noise of ground network with decoupling capacitance. A single peak data point (from 4,096 points) is determined for each topology at each current mirror bias voltage	597
Figure 19.21	Equivalent electrical model of the cables, board, wirebonds, on-chip DC pads, power distribution networks, and TSVs	601
Figure 20.1	Taxonomy of 3-D architectures for wire limited circuits	606
Figure 20.2	Popular interconnection network topologies, (A) 3-D mesh, and (B) 2-D torus	607
Figure 20.3	Different partitioning levels and related design complexity vs the architectural granularity for 3-D microprocessors	608
Figure 20.4	An example of different partition levels for a 3-D microprocessor system at the (A) core, (B) functional unit block (FUB), (C) macrocell, and (D) transistor levels	608
Figure 20.5	2-D organization of a cache memory with additional circuitry	611
Figure 20.6	2-D and 3-D organization of a 32 Kb cache memory array. N_spd is the number of sets connected to a word line	611
Figure 20.7	Word line partitioning onto two tiers of the 2-D cache memory shown in Fig. 20.5	612
Figure 20.8	Bit line partitioning onto two tiers of the 2-D cache memory shown in Fig. 20.5	613
Figure 20.9	Different organizations of a microprocessor system, (A) 2-D baseline system, (B) a second tier with 8 MB SRAM cache memory, (C) a second tier with 32 MB SRAM cache memory, and (D) a second tier with 64 MB DRAM cache memory	614
Figure 20.10	Several NoC topologies (not to scale), (A) 2-D IC–2-D NoC, (B) 2-D IC–3-D NoC, (C) 3-D IC–2-D NoC, and (D) 3-D IC–3-D NoC	617
Figure 20.11	Typical interconnect structure for intermediate metal layers	624
Figure 20.12	Zero-load latency for several network sizes. (A) A_PE=0.81 mm² and c_h=332.6 fF/mm, and (B) A_PE=4 mm² and c_h=332.6 fF/mm	626
Figure 20.13	Zero-load latency for various network sizes. (A) A_PE= 0.64 mm² and c_h= 192.5 fF/mm, (B) A_PE= 2.25 mm² and c_h= 192.5 fF/mm	627
Figure 20.14	Improvement in zero-load latency for different network sizes and PE areas (i.e., bus lengths). (A) 2-D IC–3-D NoC, and (B) 3-D IC–2-D NoC	628
Figure 20.15	Zero-load latency for various network sizes. (A) A_PE=1 mm² and c_h=332.6 fF/mm, and (B) A_PE=4 mm² and c_h=332.6 fF/mm	628
Figure 20.16	n₃ and n_p values for minimum zero-load latency for various network sizes. (A) A_PE=1 mm² and c_h=332.6 fF/mm, and (B) A_PE=4 mm² and c_h=332.6 fF/mm	629
Figure 20.17	Power consumption with delay constraints for several network sizes. (A) A_PE=1 mm², c_h=332.6 fF/mm, and T₀=500 ps, and (B) A_PE=4 mm², c_h=332.6 fF/mm, and T₀=500 ps	630
Figure 20.18	Power consumption with delay constraints for several network sizes. (A) A_PE=0.64 mm², c_h=192.5 fF/mm, and T₀=1000 ps, and (B) A_PE=2.25 mm², c_h=192.5 fF/mm, and T₀=1000 ps	631
Figure 20.19	Power consumption with delay constraints for various network sizes. (A) A_PE=1 mm², c_h=332.6 fF/mm, and T₀=500 ps, and (B) A_PE=4 mm², c_h=332.6 fF/mm, and T₀=500 ps	632
Figure 20.20	An overview of the 3-D NoC simulator	633
Figure 20.21	Position of the vertical interconnection links for each tier within a 3-D NoC (each tier is a 6 × 6 mesh), (A) fully connected 3-D NoC, (B) uniform distribution of vertical links, (C) vertical links at the center of the NoC, and (D) vertical links at the periphery of the NoC	635
Figure 20.22	Effect of traffic load on the latency of a 2-D and 3-D torus NoC for each type of traffic and XYZ routing	638
Figure 20.23	Latency of 64 node 2-D and 3-D meshes and tori NoCs under uniform traffic, XYZ routing, and several traffic loads	639
Figure 20.24	Different performance metrics under uniform traffic and a normal traffic load of a 3-D NoC for alternative interconnection topologies with XYZ-OLD routing, (A) 64 network nodes, and (B) 144 network nodes	640
Figure 20.25	Several performance metrics under uniform traffic and a low traffic load of a 3-D NoC for alternative interconnection topologies with XYZ routing, (A) a 4×4×4 3-D mesh, and (B) a 6×6×4 3-D mesh	641
Figure 20.26	Typical FPGA architecture, (A) 2-D FPGA, (B) 2-D switch box, and (C) 3-D switch box. A routing track can connect three outgoing tracks in a 2-D SB, while in a 3-D SB, a routing track can connect five outgoing routing tracks	642
Figure 20.27	Interconnects that span more than one logic block. L_i denotes the length of these interconnects and i is the number of LBs traversed by these wires	643
Figure 20.28	Interconnect delay for different numbers of physical tiers, (A) average length wires, and (B) die edge length interconnects	645
Figure 20.29	Power dissipated by 2-D and 3-D FPGAs	646
Figure C.1	Intertier interconnect consisting of m segments connecting two circuits located n tiers apart	658
Figure D.1	Portion of an interconnect tree	660
Figure E.1	Modeling spatial correlations using quad-tree partitioning	662
