Physical Fault Tolerance of Nanoelectronics

The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters.

<table>
<thead>
<tr>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>As Published</td>
<td><a href="http://dx.doi.org/10.1103/PhysRevLett.106.176801">http://dx.doi.org/10.1103/PhysRevLett.106.176801</a></td>
</tr>
<tr>
<td>Publisher</td>
<td>American Physical Society</td>
</tr>
<tr>
<td>Version</td>
<td>Final published version</td>
</tr>
<tr>
<td>Accessed</td>
<td>Sun Jun 18 03:56:43 EDT 2017</td>
</tr>
<tr>
<td>Citable Link</td>
<td><a href="http://hdl.handle.net/1721.1/66166">http://hdl.handle.net/1721.1/66166</a></td>
</tr>
<tr>
<td>Terms of Use</td>
<td>Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.</td>
</tr>
</tbody>
</table>


Thomas Szkopek
Department of Electrical and Computer Engineering, McGill University, Montréal, Québec H3A 2A7, Canada

Vwani P. Roychowdhury
Department of Electrical Engineering, University of California Los Angeles, Los Angeles, California 90095, USA

Dimitri A. Antoniadis
Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

John N. Damoulakis
Information Sciences Institute, University of Southern California, Marina Del Rey, California 90292, USA

(Received 23 January 2011; published 25 April 2011)

The error rate in complementary transistor circuits is suppressed exponentially in electron number, arising from an intrinsic physical implementation of fault-tolerant error correction. Contrariwise, explicit assembly of gates into the most efficient known fault-tolerant architecture is characterized by a subexponential suppression of error rate with electron number, and incurs significant overhead in wiring and complexity. We conclude that it is more efficient to prevent logical errors with physical fault tolerance than to correct logical errors with fault-tolerant architecture.

DOI: 10.1103/PhysRevLett.106.176801 PACS numbers: 85.35.–p, 85.40.Qx, 85.65.+h, 89.70.–a

Great effort has been devoted to realizing Moore’s law [1], whereby the number of components on a single integrated circuit doubles approximately every 2 years. With a greater number of components switching at higher rates, a greater amount of information can be processed per unit of space and time. The most fundamental limit to information processing is set by the spacetime density of physically distinguishable state transitions admitted by quantum mechanics and general relativity [2]. In the more immediate future, computers are unlikely to be very different from contemporary integrated circuits: an essentially planar technology that uses electromagnetic interactions between electrons at room temperature and with energy supplied at an electrochemical potential of \( \sim 1 \) eV. Logical gates approaching the molecular scale have been proposed [3], where a bit might be stored on a single electron charge or spin. Limitations are imposed by heat dissipation, wiring, and reliability [4,5]. The logic error rate is presently \( \sim 10^{-27} \) per gate operation according to data from the International Technology Roadmap for Semiconductors (ITRS) [6]. Entropy is introduced into the physical state of charge carriers by thermal fluctuations and structural disorder, both of which are increasingly important in devices of reduced physical dimension. The connection between thermodynamics and computation has been well elaborated [7,8], but, nonetheless, the minimum physical system size required for fault-tolerant logic has received comparatively little attention. The related question of minimum system size for storing information has been investigated [9], where the particle number emerges as a natural, dimensionless size parameter. We likewise adopt the electron number as a dimensionless size parameter.

von Neumann proposed [10] fault-tolerant architectures for computing with faulty components, a concept now extended to triplicate modular redundancy at the system level [11] and gate-level quantum error correction [12–14]. Fault-tolerant architectures are effective in correcting physical errors in nanoscale devices [15–17], but is the fault-tolerant architecture approach optimal for maximizing functional density? It has been postulated that topological excitations of many-body systems provide inherent physical fault tolerance for quantum computation [18,19] in lieu of architectural redundancies. Inspired by this development, we compare the error suppression performance of inherent physical fault tolerance versus architectural fault tolerance for classical computation. The error-rate scaling laws that we derive for complementary transistor logic subject to thermal fluctuations, and ballistic gates subject to atomic placement disorder, are compared in Table I to that of a fault-tolerant architecture.

Complementary transistor logic gates can have an inherently low error rate. CMOS is the most prevalent form of complementary transistor logic, but new materials such as carbon nanotubes [20,21] and semiconductor nanowires [22,23] can also be used for complementary logic. All operate by carrier transport over potential barriers. Physical redundancy and dissipation, associated with entropy removal, stabilize circuit outputs against the effects of thermal noise and material disorder. The information theoretic origin of robust transistor gate behavior can be explained by considering transistor gate operation in terms of charge, rather than voltage. A representative, complementary transistor implementation of the universal NAND gate is illustrated in Fig. 1.

Each input charge carrier’s presence on an input gate capacitance can be associated with a physical bit value of 1 or 0, while the output is similarly composed of physical bits associated with the presence of charge carriers on the output node. The logical value of an input (output) is represented by the net polarity of the input (output) charge. At the sacrifice of speed, the greatest suppression of error can be achieved with transistors working at subthreshold. The electron and hole channel conductances will ideally be of the form $G_{n/p} \propto \exp(\pm eV_{GS}/k_BT)$, and the relation between input charge and output charge for the NAND gate is

$$N_{out} = \frac{N}{2} \frac{G_p - G_n}{G_p + G_n} = -\frac{N}{2} \tanh \left( \frac{N_{in}}{k_BT/e^2} \right), \tag{1}$$

where $N_{in}$ is an effective input charge defined by $\exp(-e^2 N_{in}/k_BT) = \exp(-e^2 N_A/k_BT) + \exp(-e^2 N_B/k_BT)$, $N/2$ is the charge of a fully polarized node, $G_{n/p}$ is the conductance of the complementary transistor networks, and $k_BT/e^2$ is the charge number equivalent of the thermal voltage. The NAND gate suppresses input charge fluctuation from the output charge provided the input fluctuation remains below the noise margin where $|\partial N_{out}/\partial N_{in}| < 1$. With the characteristic of Eq. (1), the noise margin is $\gamma \cdot N/2$ where $\gamma = 1 - (k_BT/eV) \ln(eV/k_BT) \sim 0.90$ for a typical operating voltage $V = 1$ V at room temperature. Charge fluctuations below $\sim 0.45N$ are thus suppressed, as compared to the $N/2$ threshold of an ideal majority vote. The probability $P$ that a thermal charge fluctuation $\langle \delta N_{in}^2 \rangle = k_BT/e^2$ at the input of a circuit induces a fault at the input to a subsequent circuit will be the probability that the noise margin is exceeded. Considering a fully polarized input, $P = \int_{-\infty}^{(1-\gamma)N/2} P(N_{in})\, dN_{in}$ where $P(N_{in}) = (2\pi k_BT/e^2)^{-1/2} \exp\left(-\tfrac{1}{2} e^2 (N_{in} - N/2)^2/k_BT\right)$, evaluated in Table I.

The complementary transistor gate error probability $P$ scales as an ideal majority vote of $N$ samples with error rate $p = \epsilon^2/4 = \exp(-\gamma^2 eV/4k_BT)/4$ per sample. The majority vote is physically implemented by the competition to polarize the output charge per Eq. (1), with electron and hole channel transistors undergoing opposing metal-insulator phase transitions. The same physical fault-tolerance mechanism applies to other complementary transistor gates such as the buffer and more complex logic functions. This suppression of fault probability from input to output is the error correcting process that allows one to construct extremely complex networks of complementary logic without the destructive growth of correlated fluctuations, and thus logic errors, from one circuit to the next. Once operating voltage and temperature are determined, it is the logic element size (i.e., number of charges) that has the most profound effect on suppressing the propagation of fluctuations. The metal-insulator phase transition in the transistor channel is more sharply defined with increasing electron number [24], and the gate is consequently better able to discern the polarization of the input charge.
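The figures in this passage can be checked with a short script. The sketch below is an illustration, not the authors' calculation: it evaluates $\gamma$ from the expression in the text, the per-sample error rate $\exp(-\gamma^2 eV/4k_BT)/4$, and then uses the leading-order majority-vote estimate $P \approx (4p)^{N/2}$ to find the smallest $N$ that reaches a target error. That estimate is an assumption standing in for the Table I expression, which is not reproduced here. The function names and the room-temperature value $k_BT = 25.852$ meV are also choices made for this example.

```python
import math

K_B_T_EV = 0.025852  # assumed room-temperature thermal energy (300 K), in eV


def noise_margin_factor(ev: float, kbt: float = K_B_T_EV) -> float:
    """gamma = 1 - (kBT/eV) * ln(eV/kBT), as given in the text."""
    return 1.0 - (kbt / ev) * math.log(ev / kbt)


def per_sample_error(ev: float, kbt: float = K_B_T_EV) -> float:
    """p = exp(-gamma^2 eV / 4 kBT) / 4, the equivalent per-sample error rate."""
    gamma = noise_margin_factor(ev, kbt)
    return math.exp(-gamma**2 * ev / (4.0 * kbt)) / 4.0


def min_electrons(target: float, ev: float, kbt: float = K_B_T_EV) -> int:
    """Smallest N with (4p)^(N/2) <= target (assumed leading-order estimate)."""
    p = per_sample_error(ev, kbt)
    # ln(4p) is negative, so dividing by it flips the inequality.
    return math.ceil(2.0 * math.log(target) / math.log(4.0 * p))


gamma = noise_margin_factor(1.0)
print(f"gamma        = {gamma:.3f}")
print(f"noise margin = {gamma / 2:.3f} N")
print(f"p per sample = {per_sample_error(1.0):.2e}")
print(f"N for 1e-27  = {min_electrons(1e-27, 1.0)}")
```

At $V = 1$ V this gives $\gamma \approx 0.91$ and, under the stated estimate, $N = 16$ electrons for a target of $10^{-27}$.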

![FIG. 1 (color online).](176801-2) Universal NAND gate. (a) The output charge $N_{out}$ versus input charges $N_A$ and $N_B$ for a complementary transistor technology. (b) A majority vote physically implemented by competing transistor conductances $G_A$ and $G_B$ suppresses fluctuations in input charge from the output. The entropy of suppressed fluctuations leaves the system through a coupled heat bath, such as substrate phonons.

For device dimensions well below ~100 nm, the ballistic regime of device operation is approached. We consider the scaling of error in the representative ballistic system of a single electron spin [25] and more generally $N = 2j$ coupled electrons forming a single magnetic moment with total spin $j$ [26–28]. The spin orientation along a reference axis $z$ can be used to represent a classical logical bit. We consider just one way in which errors may arise: the precision of spin placement to a distance $\delta r$. Ballistic interaction of two neighboring spins could be implemented by the dipole-dipole interaction $V_i \propto \mu^2/r^3$. The deviation $\delta r$ in spin position leads to a deviation in the interaction $\delta V_i = -3V_i\,\delta r/r$. Unitary evolution by Schrödinger’s equation dictates that a spin rotation error $\delta \varphi = -\delta V_i t/\hbar$ will develop over an interaction time $t$. The time $t$ must be sufficiently long to accumulate at least $\pi = V_i t/\hbar$ radians of spin rotation, corresponding to a classical bit inversion. The spin rotation error incurred in this operation is $\delta \varphi = 3\pi\,\delta r/r$ radians. With a single electron spin, the probability that the deviation $\delta \varphi J_x$ causes an erroneous spin flip is $P = |\langle 1/2|\exp(i \delta \varphi J_x)|{-1/2}\rangle|^2 \approx \delta \varphi^2/4$. The larger Hilbert space of a spin $\hbar N/2$, compared with that of a single spin $\hbar/2$, permits a greater tolerance for interaction errors without inducing a reversal of spin polarization.

With $N = 2j$ electron spins, the deviation $\delta \varphi J_x$ causes an erroneous spin flip with probability $P = \sum_{m>0} |\langle m|\exp(i \delta \varphi J_x)|{-j}\rangle|^2$, approximated in Table I for $N \gg 1$.

Surprisingly, the error rate in the $N = 2j$ electron spin system is equal to that of a majority vote taken on $N$ independent spins with $\delta \varphi^2/4$ error probability per spin. With an error in spin placement of $\delta r = 2.5$ Å and a spin separation as large as $r = 100$ nm, the phase $\delta \varphi = 0.0236 = 1.3^\circ$ and the error probability for a single spin $\delta \varphi^2/4 = 1.4 \times 10^{-4}$. A minimum of $N = 17$ spins would be required to reach an error probability of $P \sim 10^{-27}$. Similar error estimates $P \sim (\delta r/r)^N$ arise for most other interactions, e.g., the Coulomb interaction $V_i = e^2/r$ between electrons in charge based devices. Errors arising from a coupling to a thermal bath can be similarly modeled, where the effective deviation in potential $\delta V_i$ is ascribed to thermal fluctuations.
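The spin-placement numbers above can be reproduced in a few lines. The sketch below is a check under stated assumptions rather than the paper's own calculation: it takes $|\delta\varphi| = 3\pi\,\delta r/r$ for the dipole-dipole case, treats the $N$ spins as independent samples in a majority vote (the picture the text invokes), and uses the leading-order estimate $P \approx (4p)^{N/2} = \delta\varphi^N$ in place of the Table I expression, which is not reproduced here. All function names are hypothetical.

```python
import math


def rotation_error(dr: float, r: float, power: int = 3) -> float:
    """|delta_phi| = power * pi * dr / r for an interaction V ~ 1/r**power
    accumulated over a pi rotation (power=3 for dipole-dipole)."""
    return power * math.pi * dr / r


def majority_error_exact(n: int, p: float) -> float:
    """Probability that more than half of n independent samples are wrong."""
    return sum(math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))


def majority_error_bound(n: int, p: float) -> float:
    """Leading-order estimate (4p)^(n/2); an upper bound on the exact sum."""
    return (4.0 * p) ** (n / 2.0)


def min_spins(target: float, p: float) -> int:
    """Smallest n with (4p)^(n/2) <= target (assumed estimate)."""
    return math.ceil(2.0 * math.log(target) / math.log(4.0 * p))


dphi = rotation_error(dr=2.5e-10, r=100e-9)  # 2.5 angstrom over 100 nm
p_single = dphi**2 / 4.0
n_min = min_spins(1e-27, p_single)
print(f"delta_phi = {dphi:.4f} rad = {math.degrees(dphi):.2f} deg")
print(f"p_single  = {p_single:.2e}")
print(f"n_min     = {n_min}")
print(f"exact tail at n_min = {majority_error_exact(n_min, p_single):.1e}")
```

With the quoted values this gives $\delta\varphi \approx 0.0236$ rad, a single-spin error near $1.4 \times 10^{-4}$, and 17 spins under the $(4p)^{N/2}$ estimate. The exact binomial tail is smaller than that estimate, so the estimate is conservative.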

We consider now the error scaling of architectural fault tolerance for universal computation. A fault-tolerant architecture is characterized by [13,14] (i) encoding of logical bits into physical bits, (ii) detection and correction of physical errors in the encoded logical bits, and (iii) a universal set of logic operations performed directly on encoded bits. The last condition implies that only the repetition code is suitable for classical universal computation. The simplest and most efficient architecture is a concatenated repetition encoding with the universal majority gate [29] of Fig. 2. A hierarchy of concatenated encoding is recursively defined, with the final concatenation level $L$ determined by the required error probability. The logical error rate of an architecturally protected bit is given in Table I, where $p$ is the unprotected gate error rate and $e_r$ is a threshold error rate, estimated to be $1/108$ [29,30]. The bound in error rate of Table I can be turned to an equality through numerical refinement of $e_r$, while preserving the characteristic subexponential scaling of error with physical bit number $N$ [31]. The subexponential scaling of logical error rate with physical bit number, and thus electron number, distinguishes architectural fault tolerance from physical fault tolerance. The number of physical bits required to protect a level $L$ logical bit is $N = 3^L$ neglecting error correction ancillae, or $N = 9^L$ if error correction ancillae are included.
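To make the subexponential scaling concrete, the sketch below assumes the standard concatenation recursion $p_{l+1} = p_l^2/e_r$, which gives $P_L = e_r\,(p/e_r)^{2^L}$. Table I is not reproduced here, so this closed form is an assumption of the example and not necessarily the paper's stated bound. The function names are hypothetical, and the gate error rate $2.9 \times 10^{-3}$ is the thermally limited value the text quotes for the architectural model.

```python
import math

THRESHOLD = 1.0 / 108.0  # threshold error rate quoted in the text


def logical_error(p: float, level: int, threshold: float = THRESHOLD) -> float:
    """Assumed concatenation law: P_L = threshold * (p / threshold)^(2^L)."""
    return threshold * (p / threshold) ** (2**level)


def physical_bits(level: int, with_ancillae: bool = False) -> int:
    """N = 3^L without error-correction ancillae, 9^L with them (from the text)."""
    return (9 if with_ancillae else 3) ** level


def min_level(target: float, p: float, threshold: float = THRESHOLD) -> int:
    """Smallest concatenation level whose assumed logical error is <= target."""
    if p >= threshold:
        raise ValueError("gate error rate must be below the threshold")
    level = 0
    while logical_error(p, level, threshold) > target:
        level += 1
    return level


p_gate = 2.9e-3  # thermally limited gate error rate quoted in the text
level_needed = min_level(1e-27, p_gate)
print(f"L = {level_needed}, "
      f"N = {physical_bits(level_needed)} (no ancillae), "
      f"{physical_bits(level_needed, with_ancillae=True)} (with ancillae)")
```

Under this assumed law, a gate error of $2.9 \times 10^{-3}$ needs $L = 6$, that is 729 physical bits before ancillae, to pass $10^{-27}$. Because the exponent grows as $2^L = N^{\log_3 2}$, the error falls more slowly than exponentially in $N$.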

The logical error rates for physical and architectural fault-tolerant schemes are compared in Fig. 3 for device operating voltages anticipated by the ITRS [6]. The architectural model was applied to physical gates with a thermally limited error rate, with $eV = 0.97$ eV, $k_BT = 26$ meV, and thus $p = (1/2)\,\text{erfc}[\gamma (eV/8k_BT)^{1/2}] = 2.9 \times 10^{-3}$, and to a ballistic single-electron gate limited by atomic disorder $\delta r/r = 2.5$ Å/100 nm in the Coulomb interaction ($V_i \propto 1/r$), with corresponding error rate $p = \delta \varphi^2/4 = 1.5 \times 10^{-5}$. The physical fault tolerance intrinsic to subthreshold complementary transistors gives superior error suppression per electron number as compared to architectural fault tolerance. In short, constructing a fault-tolerant majority gate from faulty majority gates is not as efficient as majority voting implemented directly at the physical level in subthreshold complementary logic. Moreover, the overhead associated with architectural fault tolerance can be significant, particularly the wiring for addressing individual bits. We also conclude that $N = 16–24$ electrons are minimally required to suppress thermally induced errors below the error rate of $P \sim 10^{-27}$ at room temperature. Sources of additional charge fluctuation such as elevated operating temperature, charged defects, and parasitic signal coupling will require an increased electron number. Similar scaling of logical error with particle number will apply to any logic element with a single particle error rate of order $\exp(-eV/4k_BT)$ resulting from thermal excitation across a potential barrier.
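The two gate error rates fed into the architectural model can be reproduced numerically. The sketch below is a check under stated assumptions: the erfc argument is read as $\gamma (eV/8k_BT)^{1/2}$, the Gaussian tail beyond the noise margin for a single charge, which is an assumption that reproduces the quoted $2.9 \times 10^{-3}$. The ballistic case takes $|\delta\varphi| = \pi\,\delta r/r$ for a $1/r$ interaction. The function names are hypothetical.

```python
import math


def noise_margin_factor(ev: float, kbt: float) -> float:
    """gamma = 1 - (kBT/eV) * ln(eV/kBT), as given earlier in the text."""
    return 1.0 - (kbt / ev) * math.log(ev / kbt)


def thermal_gate_error(ev: float, kbt: float) -> float:
    """p = (1/2) erfc[gamma * sqrt(eV / 8 kBT)] (assumed form of the argument)."""
    gamma = noise_margin_factor(ev, kbt)
    return 0.5 * math.erfc(gamma * math.sqrt(ev / (8.0 * kbt)))


def coulomb_gate_error(dr_over_r: float) -> float:
    """p = delta_phi^2 / 4 with |delta_phi| = pi * dr/r for V_i ~ 1/r."""
    dphi = math.pi * dr_over_r
    return dphi**2 / 4.0


p_thermal = thermal_gate_error(ev=0.97, kbt=0.026)
p_ballistic = coulomb_gate_error(2.5e-10 / 100e-9)  # 2.5 angstrom over 100 nm
print(f"thermal   p = {p_thermal:.2e}")
print(f"ballistic p = {p_ballistic:.2e}")
```

Both results agree with the quoted values to the two significant figures given in the text.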

Universal computation thus appears to abide by the adage “a stitch in time saves nine,” where it is more efficient to prevent errors than to correct errors. Since subthreshold logic effectively implements an ideal majority vote, the question naturally arises as to whether a more efficient architectural fault-tolerant scheme approaching the ideal majority vote can be devised. The answer notwithstanding, fabrication imperfections render the thermal limit difficult to achieve, and the question remains as to the physical limits to structural order, such as dopant atom location, for a system with sufficient structural complexity to permit universal computation.

---