A 6-bit, 0.2 V to 0.9 V Highly Digital Flash ADC With Comparator Redundancy

Citation
Daly, Denis C., and A.P. Chandrakasan. “A 6-bit, 0.2 V to 0.9 V Highly Digital Flash ADC With Comparator Redundancy.” Solid-State Circuits, IEEE Journal of 44.11 (2009): 3030-3038. © 2009 IEEE

As Published
http://dx.doi.org/10.1109/jssc.2009.2032699

Publisher
Institute of Electrical and Electronics Engineers

Version
Final published version

Accessed
Mon Oct 29 14:00:49 EDT 2018

Citable Link
http://hdl.handle.net/1721.1/52494

Terms of Use
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.


Denis C. Daly, Student Member, IEEE, and Anantha P. Chandrakasan, Fellow, IEEE

Abstract—A 6-bit highly digital flash ADC is implemented in a 0.18 μm CMOS process. The ADC operates in the subthreshold regime down to 200 mV and employs comparator redundancy and reconfigurability to improve linearity. The low-voltage sampling switch employs voltage boosting, stacking and feedback to reduce leakage. Common-mode rejection is implemented digitally via an IIR filter. The minimum FOM of the ADC is 125 fJ/conversion-step at a 0.4 V supply, where it achieves an ENOB of 5.05 at 400 kS/s. The clocked comparators’ switching thresholds are adjusted through a combination of device sizing and stacking. A quadratic relationship between the amount of device stacking and the strength of an input network in the subthreshold regime is derived, demonstrating an advantage of stacking over device width scaling to adjust comparator thresholds.

Index Terms—ADC, analog-digital conversion, calibration, comparators (circuits), low-power electronics, reassignment, redundancy, ultra-low-voltage operation.

I. INTRODUCTION

Microsensor wireless networks and implanted biomedical devices have emerged as exciting new application domains. These applications are highly energy constrained and require flexible, integrated, energy-efficient analog-to-digital converter (ADC) modules that can ideally operate at the same supply voltage as digital circuits. In many applications, the performance requirements are quite modest (∼100 kS/s). In systems with extensive digital signal processing, an additional demand faced by these ADCs is that they be compatible with advanced digital CMOS processes. As CMOS processes advance, digital switching energy reduces and scaling allows for increasingly complex algorithms with minimal energy overhead, but key challenges emerge, including increased leakage and device variation.

In recent years, highly digital ADC architectures like successive approximation register (SAR) and ΣΔ modulators have gained popularity due to their compatibility with advanced CMOS processes. In [1], a frequency-to-digital ΣΔ modulator is presented that uses only inverters and digital logic gates, operating at a supply voltage of 0.2 V. In many of these ADCs, the overall digital (CV²) power consumption is greater than analog power consumption, allowing for significant digital energy savings through voltage scaling. Voltage scaling can also be applied to analog circuits to reduce power consumption, particularly in low-resolution ADCs where thermal noise is not a limiting constraint; however, care must be taken to minimize the impact of power supply noise. Moreover, when operating analog circuits at low supply voltages, device leakage and variation, already serious concerns in advanced CMOS processes, become increasingly severe and traditional circuits and architectures are often impractical. To overcome these challenges, highly digital architectures must be employed and combined with techniques like redundancy and reconfigurability.

Inspired by the aforementioned scaling trends, much research has focused on realizing highly digital ADCs with the ultimate goal of a synthesizable ADC. Imagine, for instance, a highly digital ADC consisting solely of a sea of many redundant and reconfigurable inverter-based comparators combined with digital backend logic for calibration, as shown in Fig. 1. Here, reconfigurability is defined as allowing any comparator to be assigned to any ADC threshold. If, after calibration, only a subset of inverters is enabled such that their switching thresholds are linearly spaced, an energy-efficient, highly digital ADC can be realized.

This paper presents a highly digital, voltage-scalable flash ADC inspired by the vision of an inverter-based ADC [2]. Section II describes the ADC architecture, highlighting how redundancy and reconfigurability are used to improve linearity and how extensive processing is moved to the digital domain. Section III presents the key ADC circuit blocks, including the front-end sampling switch and the clocked comparator array. Transistor sizing and stacking are used to vary comparator switching thresholds, and a mathematical analysis of the relationship between transistor stacking and comparator switching thresholds in the subthreshold regime is presented. Finally, measurement results are presented in Section IV.

II. ADC ARCHITECTURE

A. Background and Theory

To achieve energy efficiency, the ADC presented in this paper is designed to operate at low voltages, where the energy per conversion is minimized. This operating voltage is akin to the minimum energy point for digital circuits [3]. For ADCs, the energy per conversion is minimized when the sum of leakage energy and active energy is minimized, which for the ADC presented in this paper occurs at supply voltages near MOSFET threshold voltages. Low-voltage operation allows for improved energy efficiency but causes many analog design challenges that must be addressed. Two key architectural challenges are that increased variation in the subthreshold regime causes significant comparator offsets, and that traditional differential architectures are impractical.

A key block in flash ADCs is the comparator network, including the peripheral circuitry that ensures each comparator has an appropriate switching threshold. In traditional flash ADCs, where there is a 1:1 correspondence between comparator and output code, the combined comparator and reference voltage offset must be significantly less than 1 least significant bit (LSB) to ensure reasonable linearity. For example, assuming a Gaussian distribution, a 6-bit ADC requires the comparator offset, $\sigma_{\text{offset}}$, to be smaller than 0.2 LSB to achieve a 99% yield of INL $< 1$ LSB [4]. Maintaining low offsets requires large transistors, resulting in significant parasitic capacitance and area. Alternatively, offsets can be cancelled through analog and mixed-signal techniques such as a feedback digital-to-analog converter (DAC) [5], [6] or correlated double sampling (CDS) [7]. In [6], large offsets in a flash ADC preamplifier are tolerated by embedding a 5-bit DAC within each preamplifier.

As it is difficult to realize analog offset compensation at low supply voltages, the ADC architecture leverages digital calibration combined with redundancy [8]. Many redundant digital regenerative comparators with large offsets are used in place of a small number of precise comparators and reference voltages. Any comparator can be assigned to any specific threshold, and there are many more comparators available than thresholds required. By increasing the number of redundant comparators, the ADC can achieve the required yield even in the presence of very large comparator offsets, much larger than one LSB. When $\sigma \ll 1$ LSB, the probability distributions of individual comparator thresholds are narrow around their respective mean thresholds, whereas when $\sigma \gg 1$ LSB, the distribution of each comparator threshold significantly overlaps those of comparators with nearby thresholds. In this scenario ($\sigma \gg 1$ LSB), the number of comparators within a given voltage range is proportional to the size of the voltage range, ignoring edge effects at the boundaries of the input range. Thus, the thresholds are Poisson distributed. If we assume $N$ comparator thresholds over an input range of $V_{FS}$ and a redundancy factor of $R$, so that $NR$ comparators are spread over the input range, the probability that there are no thresholds within a voltage range of $x$ can be calculated to be

$$\text{Probability} = e^{-\frac{N R x}{V_{FS}}}. \quad (1)$$

From this equation, we can calculate the expected probability that INL $< 1$ LSB, assuming no correction for gain and offset errors. Here, INL is defined as the maximum difference between the ideal and actual code transition levels after correcting for gain and offset errors.
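As an illustration of (1), the following Python sketch evaluates the probability that a window one LSB wide contains no comparator threshold. It assumes the Poisson model described above with $NR$ thresholds spread over the input range. The function name, the window width of one LSB, and the comparison against a redundancy factor of 8 are illustrative assumptions, not values taken from the measurements.

```python
import math

def prob_no_threshold(x, v_range, n_thresholds, redundancy):
    """Poisson probability that a window of width x holds no comparator threshold."""
    expected_count = n_thresholds * redundancy * x / v_range
    return math.exp(-expected_count)

# Illustrative values: 6-bit converter (63 thresholds) and a window of one LSB,
# with the input range normalized to 1.
lsb = 1.0 / 64
p_gap_r2 = prob_no_threshold(lsb, 1.0, 63, 2)   # redundancy factor of 2
p_gap_r8 = prob_no_threshold(lsb, 1.0, 63, 8)   # hypothetical higher redundancy
print(f"R = 2: {p_gap_r2:.4f}   R = 8: {p_gap_r8:.6f}")
```

Under this model the probability of an empty window falls exponentially with the redundancy factor, which is the mechanism by which redundancy improves yield.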

B. Overview

Fig. 3 shows a block diagram of the ADC. The ADC can be configured in either a single-ended or pseudo-differential configuration. It consists of a sampling network, two arrays of 127 dynamic digital clocked comparators and a digital backend. The digital backend consists of two 127-bit Wallace tree encoders, two on-chip 127 by 9-bit memories with calibration logic, and an infinite-impulse response (IIR) common-mode rejection filter. The Wallace tree encoders sum the individual thermometer encoded comparator outputs and generate binary values.

The ADC is designed for a maximum of 6 bits of resolution, so in nominal mode no more than 63 comparators are enabled, and 64 comparators are disabled. For this implementation, a redundancy factor of 2 was used to reduce area overhead at the cost of degraded linearity when compared to an ADC with a higher redundancy factor. Before nominal operation can commence, the ADC must be calibrated by applying an input with a known distribution, such as a triangle wave. In single-ended mode, calibration can be applied in a ping-pong process, whereas in pseudo-differential mode calibration must be foreground. While calibrating, the Wallace tree encoder is bypassed and each comparator is assigned to a specific 9-bit accumulator. An estimate of the cumulative distribution function (CDF) of the input is generated in on-chip memory, and the comparator thresholds are back-calculated from this data off-chip. Based on these thresholds, an off-chip algorithm determines which comparators to enable. Once the appropriate subset of comparators is enabled, the ADC can operate in nominal mode with the output code taken at the output of the Wallace tree encoders.
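The calibration flow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the off-chip algorithm used for the prototype, which the text does not specify. The comparator thresholds are drawn at random, the triangle wave is modeled as 511 uniformly spaced ramp samples so that each count fits in a 9-bit accumulator, and the selection step is a hypothetical greedy nearest-threshold assignment. All function names and the 100 mV input range endpoints are assumptions.

```python
import random

def estimate_thresholds(true_thresholds, v_min, v_max, num_samples=511):
    """Back-calculate thresholds from per-comparator counts on a uniform ramp."""
    step = (v_max - v_min) / num_samples
    ramp = [v_min + (k + 0.5) * step for k in range(num_samples)]
    estimates = []
    for threshold in true_thresholds:
        # Number of ramp samples for which this comparator fires (0..511, 9 bits).
        count = sum(1 for v in ramp if v > threshold)
        # For a uniformly distributed input the count maps linearly to a voltage.
        estimates.append(v_max - count * step)
    return estimates

def select_comparators(estimates, v_min, v_max, bits=6):
    """Greedy choice (an assumption): nearest unused comparator per ideal level."""
    lsb = (v_max - v_min) / 2 ** bits
    available = list(range(len(estimates)))
    enabled = []
    for code in range(1, 2 ** bits):
        target = v_min + code * lsb
        best = min(available, key=lambda i: abs(estimates[i] - target))
        available.remove(best)
        enabled.append(best)
    return enabled

random.seed(0)
V_MIN, V_MAX = 0.15, 0.25  # hypothetical 100 mV input range
true_thresholds = [random.uniform(V_MIN, V_MAX) for _ in range(127)]
estimates = estimate_thresholds(true_thresholds, V_MIN, V_MAX)
enabled = select_comparators(estimates, V_MIN, V_MAX)
print(len(enabled), "of", len(true_thresholds), "comparators enabled")
```

With 511 ramp samples each threshold estimate is accurate to within half a ramp step, which is the sense in which nine bits per comparator bound the threshold accuracy in this sketch.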

As true differential architectures are not amenable to low-voltage operation, the ADC attempts to mimic the advantages of differential circuits through digital signal processing. Low-frequency common-mode rejection is implemented in pseudo-differential mode with an IIR filter and a 5-bit capacitive feedback DAC, which injects charge on the sampling capacitor to cancel common-mode offsets. The two single-ended ADC outputs are averaged and compared to the desired midscale code. This technique is advantageous for full-swing inputs where common-mode offsets can result in clipping and reduced performance. In an integrated system with a differential amplifier driving the ADC input, the feedback DAC can be removed and instead the IIR filter output can directly vary the common-mode output of the amplifier.
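A behavioral sketch of this common-mode loop is given below in Python. The filter order, its gain, the sign convention and the mapping from DAC codes to output codes are not given in the text, so the sketch assumes a first-order IIR accumulator, a gain of 1/16, and a DAC whose LSB shifts both single-ended outputs by exactly one code. The class name and the simple plant model are hypothetical.

```python
import math

class CommonModeLoop:
    """First-order IIR accumulator driving a 5-bit feedback DAC (assumed form)."""

    def __init__(self, midscale=31.5, gain=1.0 / 16, dac_bits=5):
        self.midscale = midscale
        self.gain = gain
        self.dac_max = 2 ** dac_bits - 1
        self.dac_center = 2 ** (dac_bits - 1)
        self.state = 0.0

    def update(self, code_p, code_n):
        # Average the two single-ended outputs and compare to the midscale code.
        error = 0.5 * (code_p + code_n) - self.midscale
        # IIR accumulator: y[n] = y[n-1] + gain * error[n].
        self.state += self.gain * error
        dac_code = self.dac_center + math.floor(self.state + 0.5)
        return max(0, min(self.dac_max, dac_code))

# Hypothetical plant: a constant common-mode offset of five output codes.
loop = CommonModeLoop()
cm_offset = 5.0
dac_code = loop.dac_center
for n in range(200):
    correction = dac_code - loop.dac_center      # assumed: 1 DAC LSB = 1 output code
    signal = 10 * ((n % 5) - 2)                  # differential test signal
    code_p = 31.5 + signal + cm_offset - correction
    code_n = 31.5 - signal + cm_offset - correction
    dac_code = loop.update(code_p, code_n)
residual = cm_offset - (dac_code - loop.dac_center)
print("residual common-mode offset (codes):", residual)
```

Because the differential signal cancels in the average, only the common-mode term drives the accumulator. The small loop gain is what restricts the cancellation to low-frequency components.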

Instead of a traditional reference ladder that draws static current, the ADC uses dynamic comparators with static voltage offsets to generate comparator thresholds. The digital dynamic comparators are based on a sense-amplifier flip-flop and are described in detail in Section III.

An alternate architecture that does not require large on-chip memories or significant calibration computation is described in [10], whereby the inherent Gaussian variation in comparator thresholds is used to achieve linearity over an input-range. The stochastic ADC in [10] is fundamentally different from this work, as variation is leveraged in [10], whereas in this work variation is tolerated.

III. ADC CIRCUITS

To achieve good ADC performance at low supply voltages, there are several circuit challenges that must be addressed in the sampling network, comparator array and digital backend. This section describes the ADC circuit blocks in detail.

A. Sampling Network

At low supply voltages, it becomes challenging to realize good sampling switches due to the degraded ratio of ‘on’ conductance to ‘off’ current. The sampling switch must have a sufficiently high ‘on’ conductance and/or linearity such that it does not introduce distortion, and the ‘off’ current must not result in input-dependent ADC errors. To improve the linearity of the ‘on’ conductance, one can use resistor-based sampling techniques [11] and constant $V_{GS}$ bootstrapping techniques [12]. As these techniques can be challenging to implement in combination with extreme voltage and frequency scaling, in this work we focus on techniques solely to increase the ‘on’ conductance.

To improve the ratio of ‘on’ conductance to ‘off’ current, device stacking [13], voltage boosting [14], and leakage feedback cancellation can be employed. To compare these techniques, Fig. 4 presents four sampling switch circuit implementations. The four implementations are all sized for equal ‘on’ conductance. Fig. 4(a) presents a simple, single transistor sampling switch. At low supply voltages, the gate overdrive can be as low as a few hundred millivolts and thus the switch must be sized very large, resulting in large ‘off’ leakage current and significant switching energy. If the ‘off’ current is sufficiently large, it can result in ADC errors while the comparators are resolving. Voltage boosting can be employed to increase the ‘on’ conductance while not increasing the ‘off’ current, as shown in Fig. 4(b), as long as device reliability is not a problem. Additionally, connecting devices in series can be employed to reduce leakage, as shown in Fig. 4(c). Connecting devices in series has been shown to result in significant leakage reduction compared to a single device [15]. While this results in only a minimal improvement in the ratio of ‘on’ conductance to ‘off’ current, when combined with a feedback amplifier as shown in Fig. 4(d), a substantial reduction in leakage can be achieved. The feedback amplifier serves to actively drive the internal node to the same voltage as the sampling capacitor, thus reducing the $V_{DS}$ and $I_{DS}$ of the sampling switch closest to the sampling capacitor. The feedback amplifier consists of self-biased nMOS and pMOS source followers and consumes only leakage current. The transient plot in Fig. 5 shows how these techniques reduce the leakage on the sampling capacitor when the sampling switch is open. Voltage boosting results in a dramatic decrease in leakage, and the feedback amplifier reduces leakage by approximately a further 40%.

In this work, the sampling switch of Fig. 4(d) is implemented. In parallel with the load capacitor $C_L$ is a 5-bit capacitive DAC used to cancel low frequency common-mode offsets. Fig. 6 presents the voltage boosting circuit that drives the sampling switches. The final stage inverter of the voltage boosting circuit is designed so that the clock output can never drop below $V_{DD}$ due to leakage when it should be held high. Due to parasitic capacitances, the output voltage is simulated to reach a maximum of 510 mV when $V_{DD}$ equals 300 mV.
B. Comparator Array

The digital dynamic comparators used in the 0.18 μm ADC are based on a sense-amplifier flip-flop [16]. A simplified schematic of the flip-flop is shown in Fig. 7. The sampled analog voltage is applied to one input of the flip-flop, and a reference voltage of 0 V is applied to the other input. Comparator thresholds are varied by adjusting the effective strength of the input pMOS devices. A variable number of minimum sized pMOS input devices are connected in parallel and series. To reduce kick-back, the gates of dummy pMOS devices are connected to the sampling capacitor and their drain and source nodes are driven in counterphase to the internal flip-flop voltages. The single stage flip-flop uses positive feedback to achieve a superior power-delay product compared to a linear amplifier. Even though regenerative amplifiers are subject to large input-referred offsets, these offsets are acceptable given the redundancy and reconfigurability.

The comparator structure is designed to operate at supply voltages both above and below $V_T$. At low supply voltages, the comparator threshold range decreases and it becomes increasingly difficult to realize a large threshold range through device sizing. In the subthreshold regime, due to the exponential dependence of current on gate voltage, to achieve a threshold range of 200 mV solely by varying device width, a device must be varied in width by over 100 times. Stacking devices in series is preferred to linear width scaling, as the device strength decreases quadratically rather than linearly in proportion to the number of stacked devices in series. This allows for a smaller comparator implementation and consumes less power than setting comparator thresholds by scaling device widths or by adding capacitors at the drain or source nodes of transistors M1 and M3 in Fig. 7 [17]. For example, when the comparator operates at a supply voltage of 300 mV, the switching threshold changes by 108 mV when increasing from one device to six stacked devices. Alternatively, if the width of a device is increased by six times, the switching threshold changes by only 65 mV. By using many instances of a single device of minimum size rather than varying its width or length, the comparator thresholds can be estimated by only characterizing a single device. A numerical proof of the quadratic relationship between stacking and effective device strength in the subthreshold regime is presented in the following sub-section.
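The two quoted threshold shifts can be put on a common footing with a short calculation. The Python sketch below assumes that a change in effective device strength by a factor $k$ shifts the switching threshold by $nV_{th} \ln k$, which is the logarithmic dependence arrived at in the following sub-section. The 65 mV and 108 mV figures are those quoted above; the variable names and the assumption that the same slope applies to both cases are illustrative.

```python
import math

# Quoted above: a 6x width change shifts the switching threshold by 65 mV.
shift_width_mv = 65.0
n_vth_mv = shift_width_mv / math.log(6.0)   # implied n*V_th under the assumed model

# Quoted above: stacking six devices shifts the switching threshold by 108 mV.
shift_stack_mv = 108.0
equivalent_width_ratio = math.exp(shift_stack_mv / n_vth_mv)

# Width ratio needed to cover a 200 mV threshold range by width scaling alone.
width_ratio_200mv = math.exp(200.0 / n_vth_mv)

print(f"implied n*V_th: {n_vth_mv:.1f} mV")
print(f"six stacked devices act like a {equivalent_width_ratio:.1f}x width change")
print(f"200 mV range by width alone: {width_ratio_200mv:.0f}x")
```

Under these assumptions, six stacked devices change the effective strength by roughly a factor of 20, well beyond the factor of six obtained from width scaling, and a 200 mV range would need a width ratio in the hundreds, consistent with the statement that the width must vary by over 100 times.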

C. Analysis of Device Stacking in the Subthreshold Regime

The effect of stacking transistors in digital CMOS logic has been well studied in the literature at supply voltages above $V_T$. In this regime, transistors that are ‘on’ can be modeled as resistors [18] and stacking transistors results in a quadratic increase in propagation delay. However, in the subthreshold regime, transistors are not accurately modeled by resistors, and this relationship must be re-evaluated.

For the comparator shown in Fig. 7, the switching threshold is determined by what input voltage causes the input pull-up network to be equal in strength to the reference pull-up network. As an approximation, the switching threshold can be estimated as when the two pull-up networks have equal propagation delay if the positive feedback load is removed and the pull-up network is analyzed as if it were a digital gate. Such a structure is shown in Fig. 8, but with nMOS input devices instead of pMOS devices. By characterizing the effect of input voltage, stacking, and device width on propagation delay, one can estimate the switching threshold of the associated comparator.

For the mathematical analysis, we first assume that we have $N$ stacked nMOS devices as shown in Fig. 8. All internal nodes are initially precharged to $V_{DD}$. $C_P$ represents the parasitic capacitance seen at internal nodes, and $C_L$ represents the capacitance at the load node.

We can represent the circuit in Fig. 8 with the following set of differential equations:

\[
\frac{dV_1}{dt} = \frac{1}{C_P}\left(I_{D,M2} - I_{D,M1}\right) \quad (\text{3a})
\]

\[
\vdots
\]

\[
\frac{dV_{N-1}}{dt} = \frac{1}{C_P}\left(I_{D,MN} - I_{D,M(N-1)}\right) \quad (\text{3b})
\]

\[
\frac{dV_N}{dt} = \frac{1}{C_L}\left(-I_{D,MN}\right) \quad (\text{3c})
\]

In the subthreshold regime, these equations can be expanded by using the following equation for subthreshold current [19]:

\[
I_{D,Mi} = I_S e^{\frac{V_{in} - V_{i-1} - V_T}{nV_{th}}}\left(1 - e^{-\frac{V_i - V_{i-1}}{V_{th}}}\right) \quad (\text{4})
\]

Here $V_{in}$ is the gate voltage, $V_{i-1}$ and $V_i$ are the source and drain voltages of device $M_i$ (with $V_0 = 0$), $V_T$ is the threshold voltage, $V_{th}$ is the thermal voltage, and $n$ is the subthreshold slope factor. For additional accuracy, $V_T$ can be modified to include the body effect. Although (3) cannot be easily analyzed analytically, it can be analyzed using an ordinary differential equation (ODE) numerical solver. As an example, we examine the scenario with $N = 10$, when all nMOS devices are minimum sized with a gate voltage of 300 mV, a $V_T$ of 400 mV, and a supply voltage of 400 mV. $C_P$ is assumed to be 1.5 fF and the load capacitance $C_L$ is assumed to be 5 fF. The delay is calculated to be the time when the load voltage, $V_{10}$, equals half of the supply voltage (i.e., 200 mV).

A transient solution of the ODE is shown in Fig. 9(a). An interesting characteristic of the transient plot is that only one node appears to be discharging at a time. Moreover, each node appears to be discharging at a different but constant rate, with the rate decreasing as later nodes are discharged. To simplify analysis, this system can be represented by a piecewise-linear (PWL) approximation as shown in Fig. 9(b) and derived in the Appendix. The PWL approximation achieves a very good match to the ODE solution.
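The numerical experiment can be reproduced in normalized form with a few lines of Python. The sketch below integrates (3) with a forward-Euler step, using the subthreshold current model of (4). It is not the solver used for Fig. 9. Voltages are expressed in units of the thermal voltage, capacitances in units of $C_P$, and time in units of $C_P V_{th} / I_0$, where $I_0 = I_S e^{(V_{in} - V_T)/(nV_{th})}$, so that the unknown $I_S$ drops out. The slope factor of 1.5, the thermal voltage of 26 mV, the step size and the function name are assumptions.

```python
import math

def stack_delay(num_devices, n=1.5, vdd_mv=400.0, vth_mv=26.0,
                cp_ff=1.5, cl_ff=5.0, dt=0.01):
    """Normalized time for the load node of an N-device stack to reach VDD/2."""
    v0 = vdd_mv / vth_mv
    v = [v0] * num_devices                       # all nodes precharged to VDD
    caps = [1.0] * (num_devices - 1) + [cl_ff / cp_ff]
    t = 0.0
    while v[-1] > v0 / 2:
        # Current through each device: source is the node below (ground for M1).
        currents = []
        for k in range(num_devices):
            v_source = v[k - 1] if k > 0 else 0.0
            currents.append(math.exp(-v_source / n)
                            * (1.0 - math.exp(-(v[k] - v_source))))
        # Each node is charged by the device above it and discharged by its own.
        for k in range(num_devices):
            i_from_above = currents[k + 1] if k + 1 < num_devices else 0.0
            v[k] += dt * (i_from_above - currents[k]) / caps[k]
        t += dt
    return t

delays = {k: stack_delay(k) for k in (1, 2, 4)}
for k, d in delays.items():
    print(f"N = {k}: normalized delay {d:.1f}")
```

The delay grows faster than linearly with the number of stacked devices, which is the behavior the piecewise-linear model captures.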

Based on the PWL mathematical model, the following expression for the total propagation delay is derived in the Appendix:

\[
\text{Delay} \approx \sum_{i=1}^{N} C_i \left( \frac{n + i - 1}{n I_S}\, e^{-\frac{V_{in} - V_T}{nV_{th}}} \right) \left( V_{DD} - nV_{th} \ln \left( \frac{n + i - 1}{n} \right) \right). \quad (\text{5})
\]

Based on (5), if we assume the effect of the logarithm is negligible, we can model the total delay with the following second-order equation:

\[
\text{Total Delay} = e^{-\frac{V_{in}}{nV_{th}}}\left(C_1N^2 + C_2N + C_3\right) \quad (\text{6})
\]

where $C_1$, $C_2$, and $C_3$ are constants. This agrees with existing analysis of above-threshold logic elements [18]. At the switching threshold of the flip-flop, the delay of the input side of the flip-flop can be approximately assumed to equal the delay of the reference side ($T_{ref}$). Thus, one can calculate the relationship between the comparator switching threshold, $V_{SW}$, and the amount of stacking, $N$:

\[
V_{SW} = nV_{th} \log \left( \frac{C_1N^2 + C_2N + C_3}{T_{ref}} \right) \quad (\text{7})
\]

As $N$ increases, the $N^2$ term in (7) will dominate the numerator of the logarithm and thus the switching threshold will vary twice as quickly compared to adjusting the input device width.
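The origin of the quadratic term can be checked with a short Python sketch. It assumes the simplifications stated above: the logarithm is neglected, every node has the same capacitance, and the discharge current of the $i$th node is reduced by the factor $n/(n + i - 1)$ derived in the Appendix. Each node then contributes a delay proportional to $(n + i - 1)/n$. The function names and the slope factor of 1.5 are illustrative.

```python
def normalized_stack_delay(num_devices, n=1.5):
    """Sum of per-node discharge times, each proportional to (n + i - 1) / n."""
    return sum((n + i - 1) / n for i in range(1, num_devices + 1))

def quadratic_form(num_devices, n=1.5):
    """Closed form of the same sum: N^2 / (2n) + N * (1 - 1 / (2n))."""
    return num_devices ** 2 / (2 * n) + num_devices * (1 - 1 / (2 * n))

for stack in (1, 2, 6, 10):
    print(stack, normalized_stack_delay(stack), quadratic_form(stack))
```

The sum reduces to a second-order polynomial in $N$ with a leading coefficient of $1/(2n)$, matching the form of (6).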

D. Wallace Tree Encoder and Memory

The 127-bit thermometer output of each comparator array must be encoded to a 7-bit binary value to generate the digital output code. The encoder is realized with a Wallace tree that allows any combination of comparators to be enabled and guarantees ADC monotonicity. Comparators are not assigned to any specific code and can be reassigned arbitrarily. The Wallace tree implements an energy efficient encoder; however, it is not suitable for generating an estimated CDF as it breaks the link between comparators and their associated thresholds. To generate the estimated CDF, the comparator outputs are directly fed in parallel into a 127 by 9-bit memory. Nine bits of memory are associated with each comparator to allow sufficient threshold accuracy. Each block of memory has an associated counter that is used for CDF generation. The memory is realized with CMOS latches to enable operation down to 0.2 V and operates off an independent power supply so that it can be power gated when calibration is complete.
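The encoder's behavior can be illustrated with a small Python model that counts ones using 3:2 compressors (full adders), the building block of a Wallace tree. This is a functional sketch only. It reduces each column serially rather than in the parallel layers of a hardware tree, and the function names are hypothetical.

```python
def full_adder(a, b, c):
    """3:2 compressor: three equal-weight bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)

def count_ones(bits):
    """Encode comparator outputs to a binary count using full adders only."""
    columns = [list(bits)]              # columns[w] holds bits of weight 2**w
    total = 0
    w = 0
    while w < len(columns):
        column = columns[w]
        while len(column) > 1:
            a = column.pop()
            b = column.pop()
            c = column.pop() if column else 0   # acts as a half adder for two bits
            s, carry = full_adder(a, b, c)
            column.append(s)
            if w + 1 == len(columns):
                columns.append([])
            columns[w + 1].append(carry)
        if column:
            total |= column[0] << w
        w += 1
    return total

# A thermometer code with one bubble still encodes to the number of ones.
outputs = [1] * 40 + [0] + [1] * 2 + [0] * 84
print(count_ones(outputs))
```

Because the output is simply the number of comparators that fired, it does not depend on which comparators are enabled or on their physical order, and it cannot decrease as more comparators trip.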

IV. MEASUREMENT RESULTS

The ADC is fabricated in a 0.18 μm 5M2P CMOS process and occupies 2 mm² (Fig. 10). It was packaged in a 0.5 mm pitch TQFP package. The ADC operates from 2 kS/s at 0.2 V to 17.5 MS/s at 0.9 V, as shown in Fig. 11(a). The ADC can operate above 0.9 V, but the voltage boosting circuit must be disabled, the ADC speed plateaus, and $CV^2$ losses significantly degrade energy efficiency. Near 0.9 V, ADC performance is limited by the sampling switch, whereas at lower voltages, ADC performance is limited by the digital logic, including the adder and comparators. The remainder of this section describes how the prototype was tested and its measured performance. A summary of results is presented in Table I.

A. Static and Dynamic Performance

Static linearity ADC measurements were conducted at a supply voltage of 400 mV and a sampling frequency of 400 kS/s. The code density test was conducted using a full-swing, differential sinusoidal input with amplitude of 110 mV and frequency of 1.52625 kHz [20]. In single-ended mode, the maximum DNL and INL are +1.23/-0.91 LSB and +0.72/-0.90 LSB, respectively (Fig. 12). In pseudo-differential mode, the maximum DNL and INL are +0.98/-0.78 LSB and +0.73/-0.61 LSB, respectively. To improve the DNL and INL, additional redundancy is required.

<table>
<thead>
<tr>
<th colspan="2">TABLE I: Table of Results for ADC</th>
</tr>
</thead>
<tbody>
<tr>
<td>Active Die Area</td>
<td>1.4 mm by 1.4 mm</td>
</tr>
<tr>
<td>Supply Voltage</td>
<td>0.2 V to 0.9 V</td>
</tr>
<tr>
<td>Sampling Frequency</td>
<td>2 kS/s to 17.5 MS/s</td>
</tr>
<tr>
<td colspan="2">Performance at 0.4 V, single-ended, post-calibration</td>
</tr>
<tr>
<td>Dynamic Performance</td>
<td>5.05b ENOB</td>
</tr>
<tr>
<td>Power Consumption</td>
<td>1.66 μW</td>
</tr>
<tr>
<td>FoM</td>
<td>125 fJ/conversion-step</td>
</tr>
<tr>
<td>DNL</td>
<td>+1.23/-0.91 LSB</td>
</tr>
<tr>
<td>INL</td>
<td>+0.72/-0.90 LSB</td>
</tr>
</tbody>
</table>

The signal-to-noise-plus-distortion ratio (SNDR) and effective number of bits (ENOB) of the ADC were derived using tone testing at supply voltages from 0.2 V to 0.9 V. As the comparator thresholds vary at different supply voltages, the ADC is recalibrated at each supply voltage. The FFT of the ADC in single-ended and pseudo-differential mode at a supply voltage of 0.4 V is shown in Fig. 13. ENOBs of 5.05 and 5.56 are achieved in single-ended and pseudo-differential modes, respectively. The ENOB in pseudo-differential mode improves due to reduced harmonic distortion and also likely due to averaging of the two single-ended ADC outputs. The THD in pseudo-differential mode is 6 dB better than in single-ended mode, most likely due to the matching of the two signal paths and cancellation of even-order harmonics, potentially in the sampling switch.

B. Power Consumption

The total power consumption of the ADC at 0.4 V, 400 kS/s is 2.84 μW and 1.66 μW in pseudo-differential and single-ended mode, respectively, of which 135 nW is leakage power. In single-ended mode with a high frequency sinusoidal input, the adder consumes 0.93 μW, the ADC state machine consumes 0.40 μW, the comparators consume 0.28 μW, and the sampling network consumes 0.05 μW. In pseudo-differential mode, when common-mode feedback is enabled, the power consumption increases by approximately 15%. As the Wallace tree encoder is primarily combinational logic, as the ADC input frequency decreases, the power consumption of the adder also decreases. A widely used figure of merit (FOM) normalizes the ADC power consumption to the input bandwidth it can digitize and the dynamic range it achieves

\[
\text{FOM} = \frac{P}{2 f_{\text{in}} \cdot 2^{\text{ENOB}}}. \tag{8}
\]

Shown in Fig. 11(b) is the FOM of the ADC in single-ended mode versus supply voltage. At low voltages, the leakage current degrades the FOM due to low sampling rates, whereas at high voltages, $CV^2$ losses degrade the FOM, leading to the emergence of a minimum FOM supply voltage of 0.4 V [3]. At this voltage, the ADC achieves an FOM of 125 fJ/conversion-step in single-ended mode (5.05 ENOB) and 150 fJ/conversion-step in pseudo-differential mode (5.56 ENOB). The highly digital flash ADC has no bias currents and thus energy is only dissipated through switching events ($CV^2$) and by leakage currents.
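The quoted figures of merit can be reproduced from the measured power and ENOB values with the short Python check below. It assumes a Nyquist-rate input, so that $2 f_{\text{in}}$ equals the 400 kS/s sampling rate. The function name is illustrative.

```python
def figure_of_merit(power_w, sample_rate_hz, enob):
    """Energy per conversion step, assuming 2 * f_in equals the sampling rate."""
    return power_w / (sample_rate_hz * 2 ** enob)

fom_single = figure_of_merit(1.66e-6, 400e3, 5.05)   # single-ended mode
fom_pseudo = figure_of_merit(2.84e-6, 400e3, 5.56)   # pseudo-differential mode
print(f"single-ended: {fom_single * 1e15:.1f} fJ/conversion-step")
print(f"pseudo-differential: {fom_pseudo * 1e15:.1f} fJ/conversion-step")
```

Both results fall within 1 fJ of the stated 125 fJ and 150 fJ per conversion step.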

C. Calibration and Common-Mode Rejection

The comparators have a measured offset standard deviation of approximately 8 mV, which is larger than 1 LSB, as the input range is approximately 100 mV at a 400 mV supply voltage (so that 1 LSB is approximately 1.6 mV at 6-bit resolution). Fig. 14 presents statistical measurements of the ENOB for the ADC, before and after redundancy calibration. In pseudo-differential mode with a total of 126 comparators enabled, the ADC nominally achieves an average ENOB of 5.56 at 400 kS/s. If redundancy calibration is not used and the same comparators are enabled on all chips, the average ENOB reduces to 3.84. The comparator thresholds vary with temperature and ADC recalibration is required to maintain linearity. In single-ended 6-bit mode, the ADC ENOB degrades from 5.05 at 25 °C to 4.28 at 75 °C without recalibration. After recalibration the ENOB returns to 5.08.

When a full-scale sinusoid input is in the presence of a -12 dBFS common-mode signal at $0.005 F_S$, the ENOB degrades by 0.5b compared to a 1.3b degradation when the common-mode rejection is disabled. Due to latency of the digital circuits, the common-mode feedback is only capable of cancelling low-frequency components and improving ENOB at common-mode frequencies less than approximately $0.04 F_S$.

V. CONCLUSION

A highly digital flash ADC has been presented that can operate from supply voltages of 200 mV to 900 mV. The architecture can tolerate large comparator and reference voltage offsets due to redundancy and reconfigurability of the comparator array. This allows for the use of a sense-amplifier based flip-flop with embedded offsets introduced through device stacking and sizing. Device stacking has been analyzed in the subthreshold regime and shown to result in a quadratic change in effective device strength.

APPENDIX

This Appendix derives an analytical expression for total propagation delay of the circuit shown in Fig. 8 when biased in the subthreshold regime. This expression is then used in Section III to estimate the switching threshold of a clocked comparator depending on the amount of device stacking.

As discussed in Section III, the ODE numerical solution [Fig. 9(a)] can be approximated with a piecewise-linear model [Fig. 9(b)]. A key observation is that once $V_2$ of Fig. 8 has discharged, $V_1$ is slightly reduced from the voltage it originally discharged to. This is expected, as the current through $M_3$ is assumed to be equal to the discharge current, which decreases with time.

For the following analysis we consider the situation when the $L^{th}$ node is discharging ($L \leq N$). In this scenario, as only $V_L$ is being discharged, the current through devices $M_1$ through $M_L$ is equal and there is no current through devices $M_{L+1}$ through $M_N$. We will refer to this current as $I_{M_L}$. Thus, we have the following set of equations:

$\begin{align*}
I_{M_L} &= I_s e^{\frac{V_G-V_{L-1}-V_T}{nV_{th}}} \left( 1 - e^{-\left( \frac{V_L-V_{L-1}}{V_{th}} \right)} \right) \tag{9a} \\
&= I_s e^{\frac{V_G-V_{L-2}-V_T}{nV_{th}}} \left( 1 - e^{-\left( \frac{V_{L-1}-V_{L-2}}{V_{th}} \right)} \right) \tag{9b} \\
&\;\;\vdots \\
&= I_s e^{\frac{V_G-V_T}{nV_{th}}} \left( 1 - e^{-\left( \frac{V_1}{V_{th}} \right)} \right) \tag{9c}
\end{align*}$

where $V_G$ is the gate voltage shared by the stacked devices, $V_T$ is the threshold voltage, and $V_{th}$ is the thermal voltage.

Initially $V_L$ is precharged to $V_{DD}$. As $V_{L-1}$ has already discharged, it is at a voltage much less than $V_{DD}$ and thus we can assume that

$$1 - e^{-\left( \frac{V_L-V_{L-1}}{V_{th}} \right)} = 1.$$  (10)

To simplify (9), we substitute $a_{i} = e^{-V_i/nV_{th}}$. After dividing out the common factor of $I_s e^{\frac{V_G-V_T}{nV_{th}}}$, we are left with the following set of equations:

$\begin{align*}
I_{M_L} &\propto a_{L-1} \tag{11a} \\
&= a_{L-2} \left( 1 - \left( \frac{a_{L-1}}{a_{L-2}} \right)^n \right) \tag{11b} \\
&= a_{L-3} \left( 1 - \left( \frac{a_{L-2}}{a_{L-3}} \right)^n \right) \tag{11c} \\
&\;\;\vdots \\
&= 1 - a_{1}^n. \tag{11d}
\end{align*}$

We can manipulate the above equations as follows:

$\begin{align*}
\frac{a_{L-1}}{a_{L-2}} &= 1 - \left( \frac{a_{L-1}}{a_{L-2}} \right)^n \tag{12a} \\
\frac{a_{L-1}}{a_{L-2}} \cdot \frac{a_{L-2}}{a_{L-3}} &= 1 - \left( \frac{a_{L-2}}{a_{L-3}} \right)^n \tag{12b} \\
&\;\;\vdots \\
\frac{a_{L-1}}{a_{L-2}} \cdot \frac{a_{L-2}}{a_{L-3}} \cdots \frac{a_{2}}{a_{1}} \cdot a_{1} &= 1 - a_{1}^n. \tag{12c}
\end{align*}$

As $a_{i}/a_{i-1} \approx 1$ for $i < L$, we can use the first-order approximation $(a_{i}/a_{i-1})^n \approx 1 - n (1 - a_{i}/a_{i-1})$. Thus:

$\begin{align*}
\frac{a_{L-1}}{a_{L-2}} &= n \left( 1 - \frac{a_{L-1}}{a_{L-2}} \right) \tag{13a} \\
\frac{a_{L-1}}{a_{L-2}} \cdot \frac{a_{L-2}}{a_{L-3}} &= n \left( 1 - \frac{a_{L-2}}{a_{L-3}} \right) \tag{13b} \\
&\;\;\vdots \\
\frac{a_{L-1}}{a_{L-2}} \cdot \frac{a_{L-2}}{a_{L-3}} \cdots \frac{a_{2}}{a_{1}} \cdot a_{1} &= n \left( 1 - a_{1} \right). \tag{13c}
\end{align*}$

We need to solve for $a_{L-1}$ to determine $I_{M_L}$. Solving (13) one ratio at a time, starting from (13a), we arrive at the solution $a_{L-1} = n/(n + L - 1)$. Thus, $I_{M_L} = \frac{n}{n + L - 1} I_s e^{\frac{V_G-V_T}{nV_{th}}}$.
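The iterative solution can be checked numerically. The following is a minimal sketch, not taken from the paper: it solves the linearized system (13) ratio by ratio, solves the unapproximated system (12) by bisection, and compares both with the closed form $n/(n + L - 1)$. The function names and the value $n = 1.5$ are illustrative assumptions.

```python
def linearized_factor(n: float, L: int) -> float:
    """Solve the linearized system (13), one ratio at a time.

    Each step solves  product * r = n * (1 - r)  for the next ratio r and
    updates the running product, which equals a_{L-1} after L - 1 steps.
    """
    product = 1.0
    for _ in range(L - 1):
        ratio = n / (n + product)
        product *= ratio
    return product


def exact_factor(n: float, L: int, iters: int = 60) -> float:
    """Solve the unapproximated system (12), using bisection for each ratio."""
    product = 1.0
    for _ in range(L - 1):
        lo, hi = 0.0, 1.0
        # f(r) = product * r + r**n - 1 is increasing on [0, 1],
        # with f(0) = -1 < 0 and f(1) = product > 0.
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if product * mid + mid ** n - 1.0 < 0.0:
                lo = mid
            else:
                hi = mid
        product *= 0.5 * (lo + hi)
    return product


def closed_form(n: float, L: int) -> float:
    """Closed-form solution a_{L-1} = n / (n + L - 1)."""
    return n / (n + L - 1)


if __name__ == "__main__":
    n = 1.5  # assumed subthreshold slope factor, for illustration only
    for L in range(1, 7):
        print(L, closed_form(n, L), linearized_factor(n, L), exact_factor(n, L))
```

With these assumed values, the linearized recursion reproduces the closed form to within rounding error, while the bisection result shows how much the first-order approximation shifts $a_{L-1}$ for each stack height.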

Now that we have solved for the current as each node discharges, we can calculate the total delay for all nodes to discharge:

$$\Delta t_{tot} = \sum_{i=1}^{N} \frac{C_i}{\bar{I}_{M_i}} |\Delta V_i|$$  (14)

$$\approx \sum_{i=1}^{N} \frac{C_i}{\bar{I}_{M_i}} (V_{DD} - V_{i,low}).$$  (15)

A good approximation for $V_{i,low}$ is the source voltage of the top-most ‘on’ transistor (i.e., $V_{L-1}$). Thus:

$$V_{i,low} = V_{L-1} \mid _{L = i}.$$  (16)

As $a_{L-1} = e^{-V_{L-1}/nV_{th}} = n/(n + L - 1)$, we obtain

$$V_{i,low} \approx nV_{th} \ln \left( \frac{i}{n} + 1 - \frac{1}{n} \right).$$  (17)

This can be substituted back into (14) to obtain an expression for the total propagation delay, shown in (5).

Fig. 15 compares data based on this expression with ODE simulation results. Equation (5) closely matches the ODE simulation and can also be accurately represented by a second-order equation. Thus, a quadratic relationship exists between the amount of device stacking and the propagation delay in the subthreshold regime. The system was resimulated taking into account the body effect, and results were found to be consistent.
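The near-quadratic growth of the delay with stack height can be illustrated numerically. The sketch below evaluates the sum in (15) with (17) substituted for $V_{i,low}$ and with the average discharge current of node $i$ taken to be $n/(n + i - 1)$ times the common exponential factor. All parameter values are assumptions chosen for illustration and are not taken from the measurements: $n = 1.5$, $V_{DD} = 0.4$ V, a thermal voltage of 26 mV, equal unit node capacitances, and a common current factor normalized to 1.

```python
import math


def node_delay(i: int, n: float = 1.5, v_dd: float = 0.4,
               v_thermal: float = 0.026, c: float = 1.0,
               i0: float = 1.0) -> float:
    """Delay contribution of node i, from (15) with (17) substituted.

    i0 stands for the common exponential current factor (normalized to 1
    here) and c for the node capacitance (assumed equal for all nodes).
    """
    current = i0 * n / (n + i - 1)                        # current while node i discharges
    v_low = n * v_thermal * math.log(i / n + 1 - 1 / n)   # (17)
    return c / current * (v_dd - v_low)


def total_delay(num_stacked: int, **params: float) -> float:
    """Total delay for all nodes of an N-device stack to discharge."""
    return sum(node_delay(i, **params) for i in range(1, num_stacked + 1))


def second_differences(max_n: int) -> list:
    """Second differences of total_delay(N); constant for an exact quadratic."""
    d = [total_delay(k) for k in range(1, max_n + 1)]
    return [d[k + 1] - 2 * d[k] + d[k - 1] for k in range(1, len(d) - 1)]


if __name__ == "__main__":
    for k in range(1, 9):
        print(k, total_delay(k))
    print(second_differences(8))
```

Under these assumptions the second differences stay positive and change only slowly with the number of stacked devices, which is the behavior expected of a delay that is well approximated by a second-order polynomial.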

Denis C. Daly (S’02) received the B.A.Sc. degree in
engineering science from the University of Toronto,
Toronto, Ontario, Canada, in 2003, and the S.M.
and Ph.D. degrees from the Massachusetts Institute
of Technology, Cambridge, MA, in 2005 and 2009,
respectively.

Since June 2009, he has been with Energi Semiconductor, Cambridge, MA. His research interests include low-power wireless transceivers, ultra-low-power systems, and highly digital RF and analog circuits. From May 2005 to August 2005, he worked in the Wireless Analog Technology Center at Texas Instruments, Dallas, TX, designing transceiver modules. From May 2003 to August 2003, he worked on high-speed signaling systems at Intel Laboratories, Hillsboro, OR.

Dr. Daly was awarded a Student Paper Prize at the 2006 RFIC Symposium and won third place in the 2006 DAC/ISSCC Student Design Contest (Operational System Category). He received Natural Sciences and Engineering Re-

Anantha P. Chandrakasan (M’95–SM’01–F’04) received the B.S., M.S., and Ph.D. degrees in electrical engineering and computer sciences from the University of California, Berkeley, in 1989, 1990, and 1994, respectively.

Since September 1994, he has been with the Massachusetts Institute of Technology, Cambridge, where he is currently the Joseph F. and Nancy P. Keithley Professor of Electrical Engineering and the Director of the MIT Microsystems Technology Laboratories. His research interests include low-power digital integrated circuit design, wireless microsensors, ultra-wideband radios, and emerging technologies. He is a coauthor of Low Power Circuits (Pearson Prentice-Hall, 2003, 2nd edition), and Sub-threshold Design for Ultra-Low Power Systems (Springer 2006). He is also a co-editor of Low Power CMOS Design (IEEE Press, 1998), Design of High-Performance Microprocessor Circuits (IEEE Press, 2000), and Leakage in Nanometer CMOS Technologies (Springer, 2005).

Dr. Chandrakasan was a co-recipient of several awards including the 1993 IEEE Communications Society’s Best Tutorial Paper Award, the IEEE Electron Devices Society’s 1997 Paul Rappaport Award for the Best Paper in an EDS publication during 1997, the 1999 DAC Design Contest Award, the 2004 DAC/ISSCC Student Design Contest Award, the 2007 ISSCC Beatrice Winner Award for Editorial Excellence and the 2007 and 2008 ISSCC Jack Kilby Award for Outstanding Student Paper. He has served as a technical program co-chair for the 1997 International Symposium on Low Power Electronics and Design (ISLPED), VLSI Design ’98, and the 1998 IEEE Workshop on Signal Process-

Fig. 15. Propagation delay versus number of stacked nMOS devices for ODE simulation and mathematical approximation given in (5).