

# Università della Calabria

Dipartimento di Informatica, Modellistica, Elettronica e Sistemistica

Doctoral Programme (Dottorato di Ricerca)

in

Information and Communication Technologies

XXXIII Cycle, Disciplinary sector: ING-INF/01

# Design of Ultra-Low Power and Voltage circuits for energy autonomous System on Chip

Coordinator: Prof. Felice Crupi


Supervisor: Prof. Marco Lanuzza




Doctoral candidate: Luigi Fassio




# Università della Calabria

Dipartimento di Informatica, Modellistica, Elettronica e Sistemistica

Thesis for the degree of Doctor of Philosophy

in

Information and Communication Technologies

33rd Cycle, Disciplinary sector: ING-INF/01

## Design of Ultra-Low Power and Voltage circuits for energy autonomous System on Chip

Coordinator: Prof. Dr. Felice Crupi

Email: felice.crupi@unical.it

Supervisor: Prof. Dr. Marco Lanuzza

Email: m.lanuzza@dimes.unical.it

Student: Luigi Fassio

Email: luigi.fassio@unical.it, I.fassio.92@gmail.com

To my family,

to Tiara, my new family

I thank Professor Lanuzza, Professor Crupi, and Professor Alioto for the opportunity given and the work done, and I offer special thanks to Raffaele for his patience and great help.

## **Table of Contents**

- List of Figures
- List of Tables
- List of Abbreviations
- Sommario
- Abstract
- Introduction
- Chapter 1: Energy Harvesting for System-on-Chip and Design of Ultra-low Power/Voltage Circuits
  - 1.1 Energy-Harvested System on Chip
    - 1.1.1 Energy harvesting sources
    - 1.1.2 Super-micro capacitors
  - 1.2 Design of Ultra-low Power/Voltage Circuits
    - 1.2.1 MOSFET operation in subthreshold region
    - 1.2.2 Impact of process variability in subthreshold circuits
- Chapter 2: Design of CMOS Voltage and Current References
  - 2.1 Voltage Reference down to 0.25-V, 5.4-pW Operation
    - 2.1.1 Proposed architecture and operating principle
    - 2.1.2 Measurement results in 180-nm process
    - 2.1.3 Comparison with the state of the art
  - 2.2 From a Voltage Reference to a Current Reference
    - 2.2.1 Proposed architecture and operating principle
    - 2.2.2 Measurement results in 180-nm process
    - 2.2.3 Comparison with the state of the art
- Chapter 3: Design of a Corner-Aware CMOS Voltage Reference for Purely-Harvested Systems
  - 3.1 Circuit Architecture and Operating Principle
  - 3.2 Oscillator-Based On-Chip Process Sensor
    - 3.2.1 Measurement results in 180-nm process
  - 3.3 Basic ULP/ULV Voltage Reference Circuit
    - 3.3.1 Measurement results in 180-nm process with and without corner-aware replica combination
  - 3.4 Comparison with the state of the art
- Chapter 4: Design of an Ultra-Low Voltage Level Shifter
  - 4.1 Prior Art on Level Shifter Designs
  - 4.2 Ultra-Low Voltage Level Shifter with High-Speed and Energy-Efficient Operation
    - 4.2.1 Proposed solution and operating principle
    - 4.2.2 Measurement results in 180-nm process
    - 4.2.3 Comparison with the state of the art
- Chapter 5: Design of Ultra-Low Power Dynamic Voltage Comparators
  - 5.1 Dynamic Voltage Comparator (DVC)
  - 5.2 Dynamic Leakage Suppression (DLS) Logic
  - 5.3 Ultra-Low Power Single-Stage DLS-Based DVC
    - 5.3.1 Measurement results in 180-nm process
  - 5.4 Ultra-Low Power Dual-Stage DLS-Based DVC
    - 5.4.1 Measurement results in 180-nm process
  - 5.5 Comparison with the state of the art
- Conclusion
- Bibliography
- List of Publications

### **List of Figures**

- 1 Number of connected Internet of Things (IoT) devices worldwide between 2015 and 2025. The source projects 75.44 billion connected objects in circulation worldwide by 2025 [4].
- 1.1 A System-on-Chip integrates on die all the blocks needed to solve its tasks.
- 1.2 Energy harvesting sources and conversion methods.
- 1.3 The common structure of MSCs: (a) standard sandwich MSCs, (b) wire-shaped fibrous MSCs, and (c) interdigital MSCs. Figure from [29].
- 1.4 Low minimum voltage ($V_{min}$) and minimum power ($P_{min}$) are desirable in energy-harvested SoCs to prolong operation under unfavorable conditions.
- 1.5 MOSFET current-voltage characteristics highlighting the subthreshold and superthreshold operating regions. In the superthreshold region, the current is fairly linear, whereas in the subthreshold regime it is exponentially dependent on V<sub>GS</sub>.
- 1.6 Structure of an n-channel MOSFET with different current contributions.
- 1.7 Measurement results for an nMOSFET (regular threshold voltage RVT device, $L = 1$ µm, $W = 2$ µm, T = 21 °C, $V_{BS} = 0$ V and $V_{DS} = 0.5$ V) from corner wafer chips.
- 2.1 Schematic of the proposed voltage reference.
- 2.2 Die photo and layout of the proposed voltage reference.
- 2.3 (a) Measured reference voltage vs. supply voltage at several temperatures, (b) color map of reference voltage across supply voltages and temperatures.
- 2.4 Measured line sensitivity vs. minimum supply voltage  $V_{min}$  (evaluated in the voltage range from  $V_{min}$  up to the 1.8 V nominal voltage) at 25°C and 120°C.
- 2.5 Measured power consumption versus (a) the supply voltage at 25 °C, (b) temperatures at  $V_{DD} = 0.25$  V for a typical sample.
- 2.6 Measured power supply rejection ratio (PSRR) vs. frequency at 25°C for a typical sample.
- 2.7 Measured start-up waveform of $V_{REF}$ for different temperatures at $V_{DD} = 0.25$ V. The inset shows the 5% settling time vs. temperature.
- 2.8 Measured  $V_{REF}$  vs. temperature across 30 die samples at  $V_{DD} = 0.25$  V.
- 2.9 Measurements histogram across 30 die samples: (a)  $V_{REF}$  at 25 °C and  $V_{DD}$ = 0.25 V, (b) temperature coefficient at  $V_{DD}$ =0.25 V, (c) power consumption at 25 °C and  $V_{DD}$ =0.25 V, (d) line sensitivity at 25°C.

- 2.10 Benefit of body biasing compensation on line sensitivity at 25 °C: (a) $V_B$ normalized to the value at $V_{DD} = 0.25$ V vs. $V_{DD}$ for a typical sample, (b) cumulative distribution function (CDF) of line sensitivity across die samples.
- 2.11 Benefit of body biasing compensation on temperature coefficient at  $V_{DD}$ =0.25 V: (a)  $V_B$  normalized to the value at 0°C vs. temperature, (b)  $V_{REF}$  normalized to the value at 0 °C vs. temperature for a typical sample.
- 2.12 Benefit of body biasing compensation on process sensitivity, as shown by the cumulative distribution push towards the mean (25 °C,  $V_{DD}$ =0.25 V).
- 2.13 (a) Power and (b) absolute  $V_{REF}$  accuracy vs.  $V_{min}$  in state-of-the-art voltage references fabricated in the same 180-nm technology generation.
- 2.14 (a)  $I_D$ - $V_{GS}$  characteristics of a typical nMOSFET at different temperatures, (b) temperature coefficient *TC* versus  $V_{GS}$ , and (c) required  $\Delta V_{GS}$  at different  $V_{GS}$  for temperature compensation [39].
- 2.15 (a) Conceptual diagram and (b) schematic of the proposed current reference.
- 2.16 Die photo and layout of the proposed current reference.
- 2.17 (a) Measured reference current vs. supply voltage at several temperatures, (b) color map of reference current across supply voltages and temperatures ( $V_{OUT} = 0.6$  V).
- 2.18 Measured supply current versus (a) the supply voltage at 25 °C, (b) temperature at $V_{DD} = 0.6$ V for a typical sample ($V_{OUT} = 0.6$ V in both cases).
- 2.19 Measured reference current vs.  $V_{OUT}$ , i.e., load sensitivity for (a) a typical sample and (b) across 15 die samples at  $V_{DD} = 0.6$  V and 25 °C.
- 2.20 Measurements histogram across 15 die samples: (a)  $I_{REF}$  at 25 °C and  $V_{DD} = V_{OUT} = 0.6$ V, (b) temperature coefficient at  $V_{DD} = V_{OUT} = 0.6$  V, (c) supply current at 25 °C and  $V_{DD} = V_{OUT} = 0.6$  V, (d) line sensitivity at 25°C and  $V_{OUT} = 0.6$  V.
- 2.21 Benefit of body biasing on line sensitivity at 25 °C and $V_{OUT} = 0.6$ V: (a) $V_B$ normalized to the value at $V_{DD} = 0.6$ V vs. $V_{DD}$ for a typical sample, (b) cumulative distribution function (CDF) of line sensitivity across die samples.
- 2.22 Benefit of body biasing on temperature coefficient at  $V_{DD} = V_{OUT} = 0.6$  V: (a)  $V_B$  normalized to the value at 0°C vs. temperature, (b) cumulative distribution function (CDF) of temperature coefficient across die samples.
- 2.23 (a) Power excluding  $I_{REF}$  contribution vs. area and (b)  $I_{TOT}/I_{REF}$  vs.  $V_{min}$  in state-of-the-art current references fabricated in the same 180-nm technology generation.
- 3.1 Scheme and operating principle of the proposed NMOS-only corner-aware architecture.

- 3.2 Process sensor architecture.
- 3.3 Operating principle of the process sensor: process corners can be detected with a pair of circuits sized as A and B by reading out a circuit parameter that depends on V<sub>TH</sub> (i.e., the oscillator frequency).
- 3.4 Schematic of implemented fast and slow oscillators.
- 3.5 Layout of the implemented oscillators and chip micrograph.
- 3.6 Frequency of the fast oscillator as a function of the active area for $V_{DD} = 1.8$ V and 25 °C from corner simulation analysis.
- 3.7 Measured frequency ratio between the two oscillators as a function of  $V_{DD}$  at 25 °C across corner dice.
- 3.8 Measured statistical distribution of the frequency ratio between the two oscillators across dice, temperature, and voltage variations.
- 3.9 Schematic of the basic voltage reference circuit used for each corner replica.
- 3.10 Simplified circuit analysis of the basic reference circuit in Fig. 3.9 with M1-M3 lumped into a single transistor M13.
- 3.11 Layout of the voltage reference replicas and die micrograph.
- 3.12 (a) Measured reference voltage vs. supply voltage at room temperature in the case of typical behavior.
- 3.13 Measured (a) reference voltage and (b) supply current as a function of temperature in the case of typical behavior.
- 3.14 Measured statistical distribution of (a) reference voltage (at 0.2 V and 25 °C) and (b) temperature coefficient (at 0.2 V) for the TT circuit replica across 15 TT test chips.
- 3.15 Measured statistical distribution of reference voltage (at 0.2 V and 25 °C) w/o and w/ replica selection across corners.
- 3.16 Measured statistical distribution of temperature coefficient (at 0.2 V) w/o and w/ replica selection across corners.
- 3.17 Measured statistical distribution of line sensitivity (at 25 °C) w/o and w/ replica selection across corners.
- 3.18 Absolute $V_{REF}$ accuracy vs. $V_{min}$ in state-of-the-art voltage references fabricated in the same 180-nm technology generation. Star symbols denote results from corner wafer measurements.
- 4.1 Schematic of conventional (a) cross-coupled (CC) and (b) current mirror (CM) based level shifter topologies.
- 4.2 Schematic of the proposed level shifter.

- 4.3 Simulated transient behavior of the proposed level shifter for a voltage up-conversion from 200 mV up to 1.8 V. (a) Input (A) and output (OUT) voltages. (b) Voltages at internal nodes (NN, NP, NDP, and NDN). (c)-(d) Details of NN, NP, NDP, and NDN signals during L $\rightarrow$ H and H $\rightarrow$ L transitions. (e) Left- and right-branch currents (*I*<sub>L</sub> and *I*<sub>R</sub>). (f) Current of the output stage (*I*<sub>OUT</sub>).
- 4.4 Micrograph of the fabricated test chip and layout of the proposed level shifter.
- 4.5 (a) Statistical distribution of measured minimum V<sub>DDL</sub> for successful up-conversion to the nominal supply voltage (1.8 V) at different temperatures (-25 °C, 25 °C, and 80 °C), and (b) measured input (A) and output (OUT) waveforms for 70 mV→1.8 V conversion at 25 °C.
- 4.6 Measured static power as a function of  $V_{DDL}$  for  $V_{DDH} = 1.8$  V at 25 °C. 1- $\sigma$  error bars and mean ( $\mu$ ) plot are also shown.
- 4.7 Measured (a) delay and (b) energy per transition (100-kHz input signal) as a function of  $V_{DDL}$  for  $V_{DDH} = 1.8$  V at 25 °C. 1- $\sigma$  error bars and mean ( $\mu$ ) plot are also shown.
- 4.8 Measured delay as a function of  $V_{DDL}$  for a typical sample at  $V_{DDH} = 1.8$  V and three different temperatures (-25 °C, 25 °C, and 80 °C).
- 4.9 Measured (a) delay and (b) energy per transition (100-kHz input signal) as a function of  $V_{DDH}$  for  $V_{DDL} = 0.3$  V at 25 °C. 1- $\sigma$  error bars and mean ( $\mu$ ) plot are also shown.
- 4.10 Energy-delay comparison against state-of-the-art level shifters fabricated in 180-nm CMOS.
- 5.1 Schematic of a conventional dynamic voltage comparator.
- 5.2 Basic structure of DLS logic gates.
- 5.3 DLS NOT gate with (a) logic 'zero' and (b) logic 'one' at the input, highlighting the voltage across the transistors.
- 5.4 Static transfer characteristics of a DLS NOT gate.
- 5.5 Schematic of the single-stage DLS-based DVC with pre-discharge phase.
- 5.6 Layout and chip micrograph of the proposed ULP single-stage DLS-based DVC.
- 5.7 Measured delay and offset voltage vs.  $V_{CM}$  for  $V_{DD} = 0.4$  V, 25 °C, and 50-Hz clock frequency for one sample.
- 5.8 Schematic of single-stage DVCs: (left) pre-discharge based and (right) pre-charge based structures.
- 5.9 Interconnection circuit between the output nodes of the two single-stage structures.
- 5.10 Layout of the proposed DVC and die micrograph with area occupancy.

- 5.11 Measured delay vs.  $V_{CM}$  for  $V_{DD} = 0.6$  V, 25 °C, and 10-Hz clock frequency in the three test chips (TT, SS and FF corner wafers).
- 5.12 Measured (a) current consumption and (b) offset voltage as a function of $V_{CM}$ for $V_{DD} = 0.6$ V, 25 °C, and 10-Hz clock frequency in the three test chips (TT, SS and FF corner wafers).
- 5.13 Measured input and output signals of the proposed dual-stage DVC for $V_{DD} = 0.6$ V, 25 °C, and 10-Hz clock frequency (TT corner chip).

## **List of Tables**

- 1.1 Comparison of the available output powers from different energy harvesting sources.
- 1.2 Comparison of recent Micro Super-Capacitors technologies.
- 2.1 Adopted transistor flavor and sizing for the proposed voltage reference.
- 2.2 Performance summary and comparison with state-of-the-art voltage references (best performance in bold).
- 2.3 Adopted transistor flavor and sizing for the proposed current reference.
- 2.4 Performance summary and comparison with state-of-the-art current references (only measured data and best performance in bold).
- 3.1 Transistor sizing for the implemented oscillators.
- 3.2 Adopted transistor sizing for the three replicas of the voltage reference circuit, each optimized at different corners for minimum process sensitivity and temperature coefficient (in bold sizes varying across different optimizations).
- 3.3 Performance summary and comparison with state-of-the-art voltage references (best performance in bold).
- 4.1 Transistor type and sizing for the proposed level shifter.
- 4.2 Comparison of the proposed level shifter with the state of the art.
- 5.1 Transistor flavor and sizing of the single-stage ULP DVC.
- 5.2 Transistor flavor and sizing for the dual-stage DLS-based DVC.
- 5.3 Comparison of two proposed DLS-based DVCs against state-of-the-art designs.

## List of Abbreviations

| Abbreviation    | Meaning                                           |
|-----------------|---------------------------------------------------|
| ADC             | Analog-to-Digital Converter                       |
| Bio-Z           | Bioelectrical Impedance                           |
| BTBT            | Band-To-Band Tunnelling                           |
| CDF             | Cumulative Distribution Function                  |
| CMOS            | Complementary MOS                                 |
| CMR             | Common Mode Range                                 |
| CTAT            | Complementary to Absolute Temperature             |
| DC              | Direct Current                                    |
| DIBL            | Drain-Induced Barrier Lowering                    |
| DLS             | Dynamic Leakage Suppression logic                 |
| DNW             | Deep n-well                                       |
| DVC             | Dynamic Voltage Comparator                        |
| ECG             | Electrocardiogram                                 |
| EM              | Electromagnetic                                   |
| ETR             | Energy per Transition                             |
| GIDL            | Gate-Induced Drain Leakage                        |
| IC              | Integrated Circuit                                |
| I<sub>D</sub>   | Drain Current                                     |
| IoT             | Internet of Things                                |
| IP              | Intellectual Property                             |
| Li-ion          | Lithium-ion                                       |
| LP              | Low Power                                         |
| LS              | Level Shifter                                     |
| LV              | Low Voltage                                       |
| MCU             | Micro Controller Unit                             |
| MEMS            | Micro Electro-Mechanical Systems                  |
| MOSFET          | Metal Oxide Semiconductor Field Effect Transistor |
| MPPT            | Maximum Power Point Tracker                       |
| MSC             | Micro Super Capacitor                             |
| MTC             | Minimum Temperature Coefficient                   |
| MVT             | Medium/Low Threshold Voltage                      |
| NiCd            | Nickel-Cadmium                                    |
| NiMH            | Nickel-Metal Hydride                              |
| PDN             | Pull-Down Network                                 |
| PLL             | Phase-Locked Loop                                 |
| PSRR            | Power Supply Rejection Ratio                      |
| PTAT            | Proportional to Absolute Temperature              |
| PUN             | Pull-Up Network                                   |
| PVC             | Photovoltaic Cell                                 |
| PVT             | Process Voltage Temperature                       |
| RF              | Radio Frequency                                   |
| RFID            | Radio Frequency Identification                    |
| RVT             | Regular Threshold Voltage                         |
| SAR             | Successive Approximation Register                 |
| SoC             | System-on-Chip                                    |
| TC              | Temperature Coefficient                           |
| ULP             | Ultra-Low Power                                   |
| ULV             | Ultra-Low Voltage                                 |
| V<sub>BS</sub>  | Body to Source Voltage                            |
| V<sub>DS</sub>  | Drain to Source Voltage                           |
| V<sub>GS</sub>  | Gate to Source Voltage                            |
| VLSI            | Very Large Scale Integration                      |
| VR              | Voltage Reference                                 |
| V<sub>TH</sub>  | Threshold Voltage                                 |
| ZTC             | Zero Temperature Coefficient                      |

### Sommario

Ultra-low power and voltage (ULP/ULV) integrated circuits, both digital and analog, have attracted considerable interest from the scientific community in recent years. The advent of the Internet of Things (IoT) era has also increased market interest in the development of ULP/ULV circuits for building energy-autonomous Systems-on-Chip (SoCs) of extremely small size. Wireless sensor networks, implantable biomedical devices, wearable sensors, intelligent environmental control systems, air quality monitoring, and monitoring of conditions in warehouses and agricultural fields are just some of the application areas that can benefit from ULP/ULV circuit design.

The design of ULP/ULV circuit blocks for energy-autonomous SoCs is a broad topic and requires some knowledge of the different elements that can make up these SoCs. In this regard, this thesis first provides a general overview of energy-autonomous SoCs, with a focus on the available energy harvesting sources and on the different energy storage solutions.

The availability of energy harvesting and storage solutions that can be integrated on chip paves the way for the development of battery-less IoT sensor nodes and shifts the challenge towards the design of ULP/ULV circuits that keep the node operating even with very small amounts of available energy. Among the key building blocks of SoCs, this thesis presents the design of voltage/current reference circuits aimed at generating a precise and stable DC bias over a wide range of operating conditions, a level shifter to interface blocks operating in different voltage domains, and comparators to interface the analog world with the digital one.

First, a low-area voltage reference circuit is presented that can operate at a supply voltage of only 250 mV with 5.4 pW of power consumption at room temperature. The proposed circuit exploits a body biasing scheme to counteract the effect of voltage/temperature fluctuations and thus to ensure good accuracy of the generated output voltage, as demonstrated through measurements on a test chip fabricated in a 180-nm CMOS technology. The design of a current reference circuit based on a voltage generator that exploits the structure used for the voltage reference is also presented and validated through measurements on a 180-nm prototype. The proposed circuit operates correctly down to a minimum voltage of 0.6 V to generate a current in the nA range with only 4,000 µm<sup>2</sup> of occupied area, while achieving high energy efficiency thanks to the pW-range power consumption of the voltage generator sub-block. Next, the design of a voltage reference based on an on-chip process sensor is proposed, with the goal of obtaining low sensitivity to process variations and good overall accuracy against process-voltage-temperature variations, while ensuring ULP/ULV operation (i.e., a minimum supply voltage of 200 mV and power consumption of only 3.2 pW at room temperature). Experimental results in a 180-nm CMOS technology on corner wafers demonstrate the effectiveness of the proposed solution. In addition, the design of a robust level shifter is presented that can convert input voltages from the subthreshold regime (about 100 mV) up to the nominal supply voltage (1.8 V). The proposed circuit is based on a self-biased low-voltage cascode current mirror topology that includes diode-connected PMOS and NMOS transistors to effectively drive the buffer used as the output stage, so as to achieve high energy efficiency. Experimental results obtained in a 180-nm CMOS technology and across corner wafers demonstrate good robustness and good performance of the proposed level shifter compared with circuits proposed in the literature. Finally, the design of a ULP/ULV comparator implemented using the DLS logic family is proposed. In particular, two different comparator topologies are presented, namely a single-stage structure and a dual-stage architecture based on the combination of two single-stage comparators. Experimental results on test chips fabricated in a 180-nm technology demonstrate power consumption on the order of a few tens of pW.

### Abstract

Ultra-low power/voltage (ULP/ULV) circuits, both analog and digital, have gained considerable interest from the scientific community in the last few years. The advent of the Internet of Things (IoT) era has also increased market interest in ULP/ULV circuits targeting energy-autonomous and extremely small-sized Systems-on-Chip (SoCs). Wireless sensor networks, implantable biomedical devices, wearable computing, intelligent ambient control, air quality monitoring, and warehouse and agriculture monitoring are just some of the fields that can benefit from ULP/ULV circuits.

The design of ULP/ULV circuit blocks for energy-autonomous SoCs is a broad topic and requires some knowledge of the various elements that can make up these SoCs. In this regard, this thesis first provides a general overview of energy-autonomous SoCs, with a focus on available energy harvesting sources and energy storage solutions.

The availability of on-chip energy harvesting/storage opens the route to the development of battery-less IoT sensor nodes and shifts the challenge towards the design of ULP/ULV circuits that keep the node working even with a small amount of harvested energy. Among the key building blocks of SoCs, this thesis presents the design of voltage/current reference circuits to provide a precise and stable DC bias under a wide range of environmental conditions, a level shifter to interface blocks in different voltage domains, and comparators to interface the analog world with the digital one.

More specifically, a low-area voltage reference circuit able to operate at a supply voltage as low as 250 mV with 5.4 pW of power consumption at room temperature is first presented. The proposed circuit exploits a body biasing scheme to deal with the effect of voltage/temperature fluctuations and hence to ensure good accuracy of the generated output voltage, as demonstrated through measurements on a test chip fabricated in 180-nm CMOS technology. The design of a current reference circuit based on a voltage generator exploiting the structure used for the voltage reference is also presented and validated by means of silicon measurements on a 180-nm prototype. The proposed circuit works properly down to 0.6 V to generate a current in the nA range with only 4,000-µm<sup>2</sup> area occupancy, while reaching high power efficiency as guaranteed by the pW-range power consumption of the voltage generator sub-block. Then, the design of a global variation-aware voltage reference based on an on-chip process sensor is proposed with the aim of achieving low sensitivity to process variations and overall good accuracy against process-voltage-temperature (PVT) variations, while also ensuring ULP/ULV operation, i.e., a minimum supply voltage of 200 mV and power consumption of only 3.2 pW at room temperature. Experimental results in 180-nm CMOS technology across corner wafers demonstrate the effectiveness of the proposed solution. In addition, the design of a robust level shifter able to convert input voltages from the subthreshold regime (around 100 mV) up to the nominal supply voltage (1.8 V) is presented. The proposed circuit is based on a self-biased low-voltage cascode current mirror topology that features diode-connected PMOS and NMOS transistors to drive the split-input inverting buffer used as output stage with high energy efficiency. Measurement results obtained in 180-nm CMOS technology and across corner wafers demonstrate good robustness and performance of the proposed level shifter as compared to prior art. Finally, the design of a ULP/ULV comparator is proposed by using the dynamic leakage suppression (DLS) logic family. In particular, two different topologies, i.e., a single-stage structure and a dual-stage architecture based on the combination of two single-stage comparators, are presented and validated through silicon measurements on 180-nm test chips, which demonstrate a power consumption of a few tens of pW.

My research activity during the PhD concerned the design of innovative ULP/ULV circuits and their validation through silicon measurements. First, a low-area voltage reference circuit able to operate at a supply voltage as low as 250 mV with 5.4 pW of power consumption at room temperature was designed and fabricated in 180-nm CMOS technology. The proposed circuit exploits a body biasing scheme to deal with the effect of voltage/temperature fluctuations and hence to ensure good accuracy of the generated output voltage. A current reference circuit based on a voltage generator exploiting the structure used for the voltage reference was also designed and validated by means of silicon measurements on a 180-nm prototype. The proposed current reference works properly down to 0.6 V to generate a current in the nA range with only 4,000-µm<sup>2</sup> area occupancy, while reaching high power efficiency as guaranteed by the pW-range power consumption of the voltage generator sub-block. Then, a global variation-aware voltage reference based on an on-chip process sensor was designed with the aim of achieving competitive sensitivity to process variations and overall accuracy against process-voltage-temperature (PVT) variations, while also ensuring ULP/ULV operation (minimum supply voltage of 200 mV and power consumption of only 3.2 pW at room temperature). Experimental results in 180-nm CMOS technology across corner wafers demonstrate the effectiveness of the proposed solution. The research activity also addressed the interfacing of blocks in different voltage domains in multiple-voltage systems. In this regard, a robust level shifter able to convert input voltages from the subthreshold regime (around 100 mV) up to the nominal supply voltage (1.8 V) was designed. The proposed circuit is based on a self-biased low-voltage cascode current mirror topology that features diode-connected PMOS and NMOS transistors to drive the split-input inverting buffer used as output stage with high energy efficiency. Measurement results obtained in 180-nm CMOS technology and across corner wafers demonstrate good robustness and performance of the proposed level shifter as compared to prior art. Finally, to interface the analog world with the digital one, a ULP/ULV comparator was designed by using the dynamic leakage suppression (DLS) logic family. Two different topologies, i.e., a single-stage structure and a dual-stage architecture based on the combination of two single-stage comparators, were fabricated and validated through silicon measurements on 180-nm test chips, which demonstrated a power consumption of a few tens of pW.

## Introduction

The design of ultra-low power/voltage (ULP/ULV) circuits for energy-autonomous Systems-on-Chip (SoCs) is gaining great interest in view of the ever-increasing demand for Internet of Things (IoT) sensor nodes with low area/power/voltage budgets.

The main motivation for realizing ULP circuits and systems is to increase the battery lifetime of SoCs. Indeed, battery technology has evolved much more slowly than CMOS technology. A considerable research effort has therefore recently been dedicated to reducing the power consumption of sensor nodes, which today has to be below one microwatt for the above applications. Voltage scaling is an attractive and very effective lever to reduce power consumption. ULP design thus often translates into ULV design. For this reason, energy-autonomous sensor nodes typically operate at supply voltages below the MOSFET threshold voltage ($V_{TH}$) [1]. This represents a key challenge for circuit designers since it introduces several limitations. Among these, the reduced supply voltage limits the use of conventional design techniques, thus calling for new circuit topologies [2]. For example, bandgap voltage references typically require a high supply voltage, which gives rise to the need for new voltage reference structures in energy-autonomous SoCs.

An aggressive power consumption scaling makes it possible to have SoCs that do not require any battery. Indeed, batteries in IoT devices are a major issue due to their large area and high cost. They can be up to three times more expensive than the single chip they supply [3]. Their size is determined by the lifetime of the sensor node, which directly affects how often they need to be replaced. This has a significant impact on maintenance cost. To extend the overall lifetime, the battery is usually recharged slowly by harvesting some limited power from the environment, for example through a solar cell or radio frequency harvesting. Hence, battery miniaturization often results in highly discontinuous operation of IoT devices, as they stop operating every time the battery runs out of energy.



Fig.1: Number of Internet of Things (IoT) connected devices worldwide between 2015 and 2025. The source predicted that 75.44 billion connected objects would be in circulation worldwide in 2025 [4].

The statistics on worldwide connected IoT devices (Fig.1) show that the number of IoT devices will keep growing in the coming years. The use of batteries will increase correspondingly, following a similar trend, which in turn will raise the environmental issue of battery disposal.

However, operation without any energy storage system may not be a feasible solution for self-powered systems due to unstable or intermittent harvesting sources. Moreover, modern IoT devices are frequently switched between stand-by and active modes, where the time spent in stand-by mode is typically longer than the time spent in active mode. Such duty-cycled operation gives rise to the need for energy storage to accumulate the energy surplus during the stand-by mode. The stored energy can then be used during the active mode or for the RF block (which requires a large amount of energy) to transmit information outside the IoT node.
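The duty-cycled energy balance described above can be illustrated with a minimal Python sketch. All numerical values (active power, stand-by power, harvested power, duty cycle) are hypothetical and chosen only for illustration, and the storage element is assumed to be ideal (no leakage, no conversion losses).

```python
def average_power(p_active_w, p_standby_w, duty_cycle):
    """Time-averaged power of a node that is active for a fraction
    `duty_cycle` of the time and in stand-by for the rest."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in [0, 1]")
    return duty_cycle * p_active_w + (1.0 - duty_cycle) * p_standby_w


def standby_surplus_energy(p_harvest_w, p_standby_w, t_standby_s):
    """Energy (J) accumulated during one stand-by interval, assuming
    ideal storage. No surplus exists if harvesting is below the load."""
    return max(p_harvest_w - p_standby_w, 0.0) * t_standby_s


# Hypothetical operating point, for illustration only.
P_ACTIVE = 10e-6    # 10 uW while sensing/transmitting
P_STANDBY = 10e-9   # 10 nW in stand-by
P_HARVEST = 50e-9   # 50 nW harvested continuously
DUTY = 0.001        # active for 0.1% of the time

p_avg = average_power(P_ACTIVE, P_STANDBY, DUTY)
surplus = standby_surplus_energy(P_HARVEST, P_STANDBY, t_standby_s=10.0)

print(f"average power: {p_avg * 1e9:.2f} nW")
print(f"surplus stored in a 10 s stand-by interval: {surplus * 1e9:.1f} nJ")
```

With these assumed numbers the harvested power exceeds the average power, so the node could in principle sustain itself, but only if the stand-by surplus is stored and released during the active bursts.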

Electrochemical super-capacitors, as one of the energy storage solutions, offer several advantages over rechargeable batteries in such applications. This is because they exhibit a much longer cycle life, which is equivalent to a longer service time, and a much higher power density, for higher efficiency of the whole power system. In the last few years, researchers have developed new materials and techniques to fabricate on-chip micro super-capacitors for micro-scale energy storage [5].

Recent technology advancements also make it possible to realize on-chip photovoltaic cells on the top level of the MOSFET structure while achieving a considerable power (1.22 mW/cm<sup>2</sup>) [6]. Similarly, antennas for RF harvesting can also be integrated on-chip, with the capability of harvesting 0.5 µW of power at a distance of 7 cm from a relatively low source power of +20 dBm (less than 1 W) at 4 GHz [7].
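As a quick sanity check of the RF figures quoted above, the following Python sketch converts the +20 dBm source power to watts and expresses the ratio between transmitted and harvested power in dB. The helper names are illustrative, and the resulting ratio lumps together path loss, antenna mismatch and rectifier losses, which the text does not separate.

```python
import math


def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 1e-3 * 10 ** (p_dbm / 10.0)


def watt_to_dbm(p_w):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_w / 1e-3)


p_source_w = dbm_to_watt(20.0)   # source power quoted in the text
p_harvested_w = 0.5e-6           # harvested power quoted in the text

# Overall ratio between source and harvested power, in dB.
total_loss_db = 10.0 * math.log10(p_source_w / p_harvested_w)

print(f"source power: {p_source_w * 1e3:.0f} mW")
print(f"harvested power: {watt_to_dbm(p_harvested_w):.1f} dBm")
print(f"source-to-harvester ratio: {total_loss_db:.1f} dB")
```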

The availability of micro super-capacitors and on-chip energy harvesting thus enables the development of battery-less IoT sensor nodes, which in turn requires the design of ULP/ULV circuits to ensure that the node is able to work even with the small amount of energy available from harvesting.

In the above context, this thesis mainly concerns the design of innovative ULP/ULV circuits for energy-autonomous SoCs and their validation through silicon measurements. A low-area voltage reference circuit able to operate at a supply voltage as low as 250 mV with a power consumption of 5.4 pW at room temperature is first proposed and validated in 180-nm CMOS technology. The proposed circuit exploits a body biasing scheme to deal with the effect of voltage/temperature fluctuations and hence to ensure good accuracy of the generated output voltage. A current reference circuit based on a voltage generator exploiting the structure used for the voltage reference is also presented and validated by means of silicon measurements on a 180-nm prototype. The proposed current reference works properly down to 0.6 V to generate a current in the nA range with only 4,000-$\mu$m<sup>2</sup> area occupancy, while reaching high power efficiency as guaranteed by the pW-power consumption of the voltage generator sub-block. Then, a global variation-aware voltage reference based on an on-chip process sensor is proposed to reach competitive sensitivity to process variations and overall accuracy against process-voltage-temperature (PVT) variations, while also ensuring ULP/ULV operation (minimum supply voltage of only 200 mV and power consumption of only 3.2 pW at room temperature). The proposed circuit is validated through measurements on a 180-nm test chip across corner wafers. The research activity also addressed interfacing blocks between different voltage domains in multiple-voltage systems. In this regard, a robust level shifter able to convert input voltages from the subthreshold regime (around 100 mV) up to the nominal supply voltage (1.8 V) is proposed. This circuit is based on a self-biased low-voltage cascode current mirror topology that includes diode-connected PMOS and NMOS transistors to drive the split-input inverting buffer used as output stage with high energy efficiency. Measurement results in 180-nm CMOS technology and across corner wafers demonstrate good robustness and performance of the proposed level shifter as compared to prior art. Finally, to interface the analog world with the digital one, a ULP/ULV comparator is designed by using the dynamic leakage suppression (DLS) logic family. In particular, two different topologies, i.e., a single-stage structure and a dual-stage architecture based on the combination of two single-stage comparators, are proposed and validated through silicon measurements on 180-nm test chips, which demonstrate a power consumption of a few tens of pW.

A summary of the content of each chapter of this thesis is presented below:

Chapter 1 provides an overview of SoCs and of the energy harvesting sources that can power them. The technology used to realize micro super-capacitors for on-chip energy storage systems is also explored. In addition, the design of ULP/ULV circuits is introduced with a focus on MOSFET operation in the subthreshold regime and the impact of process variability in subthreshold circuits.

Chapter 2 presents the design of CMOS voltage and current reference circuits for highly uncertain harvesting systems. More specifically, the design of a ULP/ULV voltage reference that exploits body biasing feedback to improve its figures of merit is presented, along with a detailed circuit analysis. Measurement results on a 180-nm test chip are provided to demonstrate the effectiveness of the proposed approach. Then, a CMOS current reference circuit exploiting the structure used for the voltage reference is presented and validated by means of silicon measurements on a 180-nm prototype.

Chapter 3 introduces the design of a global variation-aware voltage reference with ULP/ULV operation. The architecture is based on design replicas optimized at different process corners and an on-chip process sensor, which allows selecting the best combination of replicas. Experimental results in 180-nm CMOS technology across corner wafers are provided to demonstrate the effectiveness of the proposed solution.

Chapter 4 presents the design of a robust level shifter able to convert input voltages from the subthreshold regime (100 mV) up to the nominal supply voltage (1.8 V). The proposed circuit is based on a self-biased low-voltage cascode current mirror (CM) topology that features diode-connected PMOS and NMOS transistors to drive the split-input inverting buffer used as output stage with high energy efficiency. Measurement results in 180-nm CMOS technology and across corner wafers demonstrate the effectiveness of the proposed level shifter as compared to prior art.

Chapter 5 presents the design of ULP dynamic voltage comparators (DVCs) exploiting Dynamic Leakage Suppression (DLS) logic. Two different DVC topologies, i.e., a single-stage structure and a dual-stage architecture based on the combination of two single-stage DVCs, are presented and validated through silicon measurements on 180-nm test chips.

## Chapter 1: Energy Harvesting for System-on-Chip and Design of Ultra-low Power/Voltage Circuits

This chapter firstly gives an overview of energy-harvested Systems-on-Chip (SoCs) and some of their recent applications. Particular attention is paid to the power levels that can be generated on a chip level. Available harvesting sources are presented, while comparing their power density data. This overview on energy harvesting sources provides assistance in choosing the type of harvester when designing a ULP/ULV SoC. Along with the selection of the energy harvesting source, a storage system that can be integrated and able to supply the energy for the circuits on the SoC has to be defined. Accordingly, technologies for realizing micro super-capacitors are also presented. Then, the chapter introduces the design of ULP/ULV circuits with a focus on the MOSFET operation in the subthreshold regime and the impact of process variability in subthreshold circuits.

#### **1.1 Energy-Harvested System-on-Chip**

A System-on-Chip (SoC) is a microchip that can integrate sensors, a power supply block with maximum power point tracking (MPPT), analog blocks for signal conditioning, ADCs, a microprocessor, memories, and an RF block for communication (Fig.1.1). The main difference between an SoC and other computing systems is that an SoC requires parts designed for the specific function and environment in which it is placed.



Fig.1.1: A System-on-Chip integrates on die all the blocks needed to solve its tasks.

This device typically exploits energy harvesting systems for the power supply and is usually paired with a rechargeable battery (e.g., Li-ion, NiCd, NiMH) which stores the excess power from harvesting. Recent works show how it is possible to integrate a great number of sensors and computing blocks on a single chip supplied by integrated energy harvesting systems. This is possible by implementing ULP/ULV analog and mixed-signal circuits.

An example of an SoC power supply block used for a biomedical implant is reported in [6]. A charge pump that receives the harvested solar energy from parallel-connected photodiodes is implemented to achieve an output power of 1.65 $\mu$W in a 1.54-mm<sup>2</sup> active area with 1.22 mW/cm<sup>2</sup> light input. The device under test is placed under pork skin to simulate the biomedical implant. Under lower light conditions, the power supply block generates a lower output power.
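The figures above allow a rough estimate of the end-to-end efficiency of this harvesting chain. The Python sketch below assumes, purely for illustration, that the whole 1.54-mm<sup>2</sup> active area is uniformly illuminated at the quoted light intensity; this assumption is not stated in [6], so the result should be read as an order-of-magnitude indication only.

```python
MM2_PER_CM2 = 100.0  # 1 cm^2 = 100 mm^2


def incident_power_w(intensity_w_per_cm2, area_mm2):
    """Optical power falling on a surface of `area_mm2` square millimetres."""
    return intensity_w_per_cm2 * (area_mm2 / MM2_PER_CM2)


# Values quoted in the text for the power supply block of [6].
LIGHT_INTENSITY = 1.22e-3   # W/cm^2
ACTIVE_AREA_MM2 = 1.54      # mm^2 (assumed fully illuminated)
P_OUT_W = 1.65e-6           # W delivered by the charge pump

p_in_w = incident_power_w(LIGHT_INTENSITY, ACTIVE_AREA_MM2)
efficiency = P_OUT_W / p_in_w

print(f"incident optical power: {p_in_w * 1e6:.2f} uW")
print(f"light-to-output efficiency: {efficiency * 100:.1f} %")
```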

In order to solve the problem of the discontinuous energy supplied by harvesting, a recent work [8] shows a dual-mode architecture comprising a microcontroller and a power management module, which can operate both in normal mode (NM) and in leakage suppression mode (LSM). These two operating modes allow the microcontroller (MCU) to run with a fast clock frequency and low energy per operation when the system is powered by the battery (NM), and with a low-frequency, low-power configuration when the energy comes purely from harvesting (LSM). The MCU can work under poor light conditions: in detail, the system is fully running at 55-lux light intensity with a 0.54-mm<sup>2</sup> on-chip solar cell.

An even more interesting option is harvesting from biological sources. In particular, recent works that adopt microbial fuel cells demonstrate that harvesting energy from bacteria is a feasible solution. In this regard, the paper-based microbial fuel cell realized in [9] shows the possibility of harvesting bio-power from bacteria-containing liquid derived from renewable and sustainable wastewater. An intrinsic feature of the paper allows for rapid adsorption of the bacteria-containing solution through capillary action, thus leading to a very short start-up time. SoCs designed to analyse water quality, or underground sensor systems for agriculture [10], could benefit from the use of a biological harvesting source. Basically, this kind of harvesting is useful in all environments where there is humidity or water and no light or RF sources.

Concerning the sensing blocks, several solutions have been proposed to integrate on-chip temperature sensors. An ultra-low power CMOS temperature sensor in a 0.13-$\mu$m standard CMOS process with an area of 0.0014 mm<sup>2</sup> and a power consumption of 0.15 $\mu$W is presented in [11]. In addition, a pW relaxation oscillator is proposed in [12]. Implementing pW oscillators makes it possible to save energy for the clock generation and to design temperature sensors with power consumption in the order of pW (by using ULP logic design for the control block). Micro Electro-Mechanical Systems (MEMS) allow realizing gyroscope sensors, force sensors for accelerometers or air pressure measurement, humidity and temperature sensors [13], and others. Fabrication of MEMS has recently evolved from the process technology used in semiconductor device fabrication [14], thus making a complete on-chip integration of this kind of sensors feasible. A wide variety of sensors can be integrated in SoCs for biomedical applications. A 2-channel ECG monitor, a Bio-Z readout channel for respiration analysis and a thermistor-based temperature sensor for body temperature monitoring are implemented in [15]. Bio-Z is the bioelectrical impedance, i.e., a commonly used method for estimating body composition, more specifically body fat and muscle mass.

Overall, the research effort in designing ULP/ULV circuits, along with the research on exploiting alternative energy sources and storage systems at the micrometer scale, will allow realizing energy-autonomous SoCs that can solve multiple tasks, being ideally perpetual and working until something physically breaks inside.

#### 1.1.1 Energy harvesting sources

The goal of energy harvesting is to convert energy from one form to another that can be used to power electronic devices or SoCs. When implemented in environmental monitoring nodes, energy-harvested solutions can directly extract energy from the environment and use it to feed such nodes. Our surroundings offer plenty of opportunities to take advantage of (Fig.1.2), from which a small amount of energy can be scavenged and used for the power supply of specific sensor nodes. The difference between energy harvesting and energy scavenging is related to the energy source dependency. When waste energy is used (e.g., indoor light or heat from an air conditioning unit), the harvesting of energy can be defined as energy scavenging [16].



Fig.1.2: Energy harvesting sources and conversion methods.

Energy harvesting sources can be categorized as ambient or external. Ambient sources are accessible within an environment without any external energy supply. External sources emit energy in the environment, with the intent for this energy to be harvested by the sensor nodes [17].

The most commonly used source for energy harvesting is the Sun, i.e., an affordable energy source that presents high power density in outdoor environments. A photovoltaic cell (PVC) generates a DC voltage that can be directly used by the circuits. Operation under low-light conditions is also possible by using a DC-DC converter, which boosts the supply voltage to the level needed for SoC operation. It is well known that a solar cell is not a constant voltage and current source: its output power depends on the sunlight intensity and the ambient temperature. A DC-DC converter with MPPT is used to deal with this. Correct positioning of the PVC in the environment is also important to achieve its maximum possible efficiency. Solar harvesting is therefore a discontinuous energy source whose discontinuity is predictable (night and day cycle). Solar harvesting solutions, e.g., PVCs, can be easily placed on a chip, as demonstrated in [18].

Thermal energy can be converted to electricity by thermoelectric transducers, which depend on spatial variations in temperature, or by pyroelectric transducers, which depend on temporal variations in temperature. Thermoelectric generators are based on the Seebeck effect [19] and are implemented with series-connected p-type and n-type semiconductor blocks. The open-circuit voltage and the maximum power point of thermoelectric transducers depend on the temperature difference between the cold side and the hot side of the transducer. Pyroelectric converters use pyroelectric materials, i.e., a class of non-centrosymmetric polar crystals that exhibit an inherent coupling between electrical polarization and temperature, such that a change in temperature results in a change in the electric dipole moment, known as the pyroelectric effect [20]. The open-circuit voltage and generated power mainly depend on the rate of the temperature change.

Wind-based energy harvesting systems convert kinetic energy into electricity using turbines, rotors, and in general the principles of electromagnetic induction. Alternatively, piezoelectric material allows exploiting micro piezoelectric strips moved by the wind to generate energy [21]. Wind-based harvesting solutions are thus classified into electromagnetic and piezoelectric types. In particular, piezoelectric-based solutions allow compact systems, easy operation under low wind speed conditions, higher efficiency, instant start-up with no dead time, smaller size, and light weight. However, they can be easily damaged when high pressure is applied to the piezoelectric wind energy harvester because of the brittleness of the piezoelectric device [22]. Research on the effect of global warming shows an increase in the average global wind speed in the last ten years. Decadal-scale variations of near-surface wind are probably determined by internal decadal ocean-atmosphere oscillations, rather than by vegetation growth and/or urbanization as hypothesized previously. Such strengthening has increased potential wind energy by $17 \pm 2\%$ from 2010 to 2017 [23]. The main issues of wind-based harvesters come from their size and the discontinuity of the source. The increase in average wind speed, combined with research on miniaturized wind-based harvesters, can allow the use of piezoelectric-based wind energy harvesting in SoCs.

Piezoelectric material is also used for vibration energy harvesters, where a vibration or movement is transduced into a force applied to the piezoelectric material, which finally converts it into electric energy. Vibration or movement harvesting systems can also exploit MEMS. For instance, a MEMS-based vibration harvester can consist of an electroplated copper planar spring, a permanent magnet and a copper planar coil [24], generating a maximum output power of 700 nW with an input vibration frequency of 94.5 Hz and an input acceleration of 4.94 m/s<sup>2</sup>.

Radio frequency (RF) energy harvesting is an energy conversion method that converts energy from the electromagnetic (EM) field into a voltage source able to supply a current that depends on the EM field intensity. A modern house has a Wi-Fi router, mobile phones, Bluetooth devices and other devices that transmit RF EM waves. Outside the house, mobile phone repeaters can also supply electronic devices through RF harvesting. RF-based energy harvesting/scavenging can be used to supply the power required by wearable electronic devices, RFID, implantable medical devices, wireless sensor networks and Internet of Things (IoT) nodes [25]. RF energy harvesting is a continuous source in an environment that is surrounded by transmitting devices. Furthermore, RF-based harvesting systems are simple and can be easily placed on a chip, like solar harvesting solutions. They typically consist of an antenna, a matching circuit, and a rectifier. The antenna can be made using the metal layers inside the chip and is usually placed at the border of the chip. The matching circuit is made with capacitors and inductors. The former can be implemented by using the MOSFET capacitance, while the latter are available in the libraries of different IC manufacturers or can be realized with spirals implemented in a metal layer. The rectifier can be implemented using p-n diodes or MOSFETs [26].

Finally, Table 1.1 [27] compares the different energy harvesting sources in terms of the provided power density. Solar energy shows the highest power density in outdoor applications, while the power decreases for indoor applications, where light is less intense. Vibration and wind energy become comparable to solar energy for indoor applications. Thermal energy sources are good candidates for harvesting from human activities and can be combined with vibration harvesting. RF harvesting shows the smallest power density, and it is strongly dependent on the distance from the RF sources.

| Harvesting source | Type / conditions | Power density |
|-------------------|-------------------|---------------|
| Solar energy | Outdoors | 15-0.15 mW/cm<sup>2</sup> |
| Solar energy | Indoors | 10-100 µW/cm<sup>2</sup> |
| Vibration | Piezoelectric (in the shoes) | 330 µW |
| Vibration | Electrostatic conversion | 21 nW/mm<sup>3</sup> |
| Vibration | Electromagnetic conversion | 184 µW/mm<sup>3</sup> |
| Thermal | Thermoelectric (5 °C gradient) | 40 µW/cm<sup>2</sup> |
| Thermal | Pyroelectric (temperature rate 8.5 °C/s) | 8.6 µW/cm<sup>2</sup> |
| RF | GSM 900/1800 MHz | 100 nW/cm<sup>2</sup> |
| RF | Wi-Fi 2.4 GHz | 10 nW/cm<sup>2</sup> |
| Wind | Wind speed 5 m/s | 380 µW/mm<sup>3</sup> |
| Acoustic (similar to vibration harvesting) | At 75 dB | 30 nW/mm<sup>3</sup> |

Table 1.1: Comparison of the available output powers from different energy harvesting sources.
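To give a feeling for what the area-based entries of Table 1.1 imply at the chip scale, the following Python sketch estimates the harvester area needed to sustain a given load. The 1-µW load, the dictionary keys and the function name are hypothetical; the densities are taken from the table (using the lower bound where a range is given), and conversion losses are ignored.

```python
# Area-based power densities from Table 1.1, in W/cm^2.
# Where the table gives a range, the lower bound is used (worst case).
POWER_DENSITY_W_PER_CM2 = {
    "solar_outdoor": 0.15e-3,
    "solar_indoor": 10e-6,
    "thermoelectric_5C": 40e-6,
    "pyroelectric": 8.6e-6,
    "rf_gsm": 100e-9,
    "rf_wifi": 10e-9,
}


def required_area_mm2(p_load_w, density_w_per_cm2):
    """Harvester area (mm^2) needed to supply `p_load_w`, assuming an
    ideal (lossless) conversion chain."""
    area_cm2 = p_load_w / density_w_per_cm2
    return area_cm2 * 100.0  # 1 cm^2 = 100 mm^2


P_LOAD_W = 1e-6  # hypothetical 1 uW sensor node

areas_mm2 = {
    source: required_area_mm2(P_LOAD_W, density)
    for source, density in POWER_DENSITY_W_PER_CM2.items()
}

# Print sources from the smallest to the largest required area.
for source, area in sorted(areas_mm2.items(), key=lambda item: item[1]):
    print(f"{source:>18s}: {area:10.2f} mm^2")
```

Under these assumptions, a worst-case indoor solar harvester would need about 10 mm<sup>2</sup> for a 1-µW load, whereas RF sources would need areas well beyond a typical die, which is consistent with the need to push the load power towards the nW range.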

#### **1.1.2 Super-micro capacitors**

The term "micro-supercapacitors" (MSCs) normally refers to miniaturized super-capacitors that range from microns to centimetres and can be integrated on-chip with circuits and microelectronic components. MSCs with stable performance can be integrated as energy storage and power supply units. In general, micro-batteries are the primary choice for self-powered systems since they provide an energy density that can ensure a stable supply current. However, the charge/discharge mechanism of batteries typically results in limited lifetime and power density. On the other hand, MSCs ensure longer operating lifetime (>100,000 cycles), faster charge/discharge rates as well as higher power density [28]. The reduced number of charge/discharge cycles also translates into a smaller capacitance degradation.



Fig.1.3: The common structure of MSCs: (a) standard sandwich MSCs, (b) wire-shaped fibrous MSCs, and (c) interdigital MSCs. Figure from [29].

At present, there are three types of MSC: standard sandwich-structure MSCs, interdigital-structure MSCs and fibrous MSCs [29], as shown in Fig.1.3. The conventional sandwich MSC is a vertical structure composed of two electrodes with the electrolyte sandwiched in the middle. In MSCs with in-plane interdigital electrodes, the electrodes are separated by an insulating gap. In particular, the performance of an interdigital-structure MSC mainly depends on the electrode width, thickness and gap size [30]. The fibrous MSC is a pseudo-capacitor relying on fast surface faradaic redox reactions, where a fibre material is used as support. This type of MSC allows flexible designs for wearable devices. A large number of studies have shown that a reasonable design of the materials (in particular, the electrode material) is fundamental to improve the electrochemical performance of MSCs.

Overall, MSCs suffer from self-discharge, which can consist of a relatively fast diffusion process and a slower leakage current. The open-circuit voltage decay due to charge losses can be caused by side reactions, which may be due to over-potential decomposition of the electrolyte, redox reactions caused by impurities, or possible functional groups on the carbon surface. Another cause of the observed self-discharge is production flaws, which may result in micro short circuits between the anode and the cathode [31].

Research on new materials for the electrode and electrolyte can mitigate the problem of self-discharge, while increasing the capacitance of MSCs. Table 1.2 compares different materials and technologies used for MSCs, also in terms of the energy density that can be achieved. MSCs can have a three-dimensional or a two-dimensional structure. The first type is presented in [32], [33] and [36]; although it shows a large capacitance density, it cannot be used in SoCs because it cannot be integrated on-chip. Among on-chip MSCs, [35] shows the highest capacitance compared to [34] and [37].

| References | Electrode material | Electrolyte material | Flexibility | On-chip | Capacitance density | Energy density |
|------------|--------------------|----------------------|-------------|---------|---------------------|----------------|
| [32] 2020 | Carbon/Vanadium disulphide nanosheets (C/VS<sub>2</sub>) | H<sub>2</sub>SO<sub>4</sub> gel | No | No | 86.5 F/cm<sup>3</sup> | 15.6 mWh/cm<sup>3</sup> |
| [33] 2017 | Micropatterned multi-walled carbon nanotube | Polyvinyl alcohol-H<sub>3</sub>PO<sub>4</sub> gel | Yes | No | 2.02 F/cm<sup>3</sup> | N/A |
| [34] 2017 | Aluminium and Silicon nanowires | 1-Ethyl-3-Methylimidazolium Bismide | No | Yes | 13 µF/cm<sup>2</sup> | 108 µWh/cm<sup>2</sup> |
| [35] 2018 | RuO<sub>2</sub> | H<sub>2</sub>SO<sub>4</sub> gel | Yes | Yes | 6.5 mF/cm<sup>2</sup> | N/A |
| [36] 2020 | Ti<sub>3</sub>C<sub>2</sub>T<sub>x</sub> MXenes with sodium ascorbate | H<sub>2</sub>SO<sub>4</sub>/PVA gel | Yes | No | 322 F/cm<sup>3</sup> | 100 mWh/cm<sup>3</sup> |
| [37] 2019 | MnO<sub>2</sub> and Silicon nanotube | 1 M Na<sub>2</sub>SO<sub>4</sub> | Yes | Yes | 2.1 mF/cm<sup>2</sup> | N/A |

Table 1.2: Comparison of recent Micro Super-Capacitors technologies.
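The capacitance densities in Table 1.2 can be translated into stored energy with the standard capacitor relation $E = \frac{1}{2} C V^2$. The Python sketch below does this for the on-chip density reported for [35]; the 1-mm<sup>2</sup> footprint, the 0.5 V to 0.2 V usable voltage window and the 1-nW load are hypothetical values chosen for illustration, and self-discharge is ignored.

```python
def capacitance_f(density_f_per_cm2, area_mm2):
    """Capacitance of a planar MSC from its areal capacitance density."""
    return density_f_per_cm2 * (area_mm2 / 100.0)  # 1 cm^2 = 100 mm^2


def usable_energy_j(c_f, v_start, v_stop):
    """Energy released when the capacitor voltage drops from v_start to
    v_stop, from E = 0.5 * C * V^2 evaluated at both voltages."""
    return 0.5 * c_f * (v_start ** 2 - v_stop ** 2)


DENSITY_RUO2 = 6.5e-3   # F/cm^2, on-chip MSC of [35] in Table 1.2
AREA_MM2 = 1.0          # hypothetical footprint
V_START, V_STOP = 0.5, 0.2   # hypothetical usable voltage window (V)
P_LOAD_W = 1e-9         # hypothetical 1 nW load

c_f = capacitance_f(DENSITY_RUO2, AREA_MM2)
e_j = usable_energy_j(c_f, V_START, V_STOP)
runtime_s = e_j / P_LOAD_W  # ideal runtime without harvesting or leakage

print(f"capacitance: {c_f * 1e6:.1f} uF")
print(f"usable energy: {e_j * 1e6:.3f} uJ")
print(f"ideal runtime at 1 nW: {runtime_s / 3600:.2f} h")
```

In practice the self-discharge mechanisms discussed above would shorten this runtime, so the figure should be taken as an upper bound under the stated assumptions.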

### 1.2 Design of Ultra-low Power/Voltage Circuits

The fast-increasing demand for IoT sensor nodes with aggressive form factor, cost and lifetime targets requires relentless power/voltage reductions to fit the capabilities of smaller and low-cost energy harvesting sources [38]-[39]. Such miniaturized sensor nodes with millimeter-scale energy harvesters tightly constrain the system power [40]-[43], due to the fluctuating and uncertain nature of the harvested power and frequent unavailability of miniaturized batteries [12], [44]-[50]. Indeed, whenever the harvested power (voltage) drops below the minimum system power  $P_{min}$  (and minimum voltage  $V_{min}$ ), the operation is inevitably interrupted. Therefore, reducing both  $P_{min}$  and  $V_{min}$  is fundamental to prevent forced system shutdowns or interruptions, whenever unfavorable environmental conditions challenge the regulation limits of the intermediate power conversion [44], as depicted in Fig. 1.4.



Fig. 1.4: Low minimum voltage ($V_{min}$) and minimum power ($P_{min}$) are desirable in energy-harvested SoCs to prolong operation under unfavorable conditions.

The same considerations apply to directly-harvested systems, where the intermediate power conversion between the harvester and the system is eliminated altogether to further reduce the system power floor $P_{min}$ [12], [40]. More quantitatively, the harvested power density in most energy sources is in the nW/mm<sup>2</sup> range (see Table 1.1) and hence constrains $P_{min}$ to the scale of nWs. Concerning the harvested voltage, its range strictly depends on the operating principle and the environmental conditions [49]. For instance, solar harvesting typically generates voltages below 0.5 V, and down to less than 0.3 V at dim light conditions [42].

Taking all this into account, the design of ULP (i.e., power consumption in the order of few nW or below) and ULV (i.e., minimum operating voltage below 0.5 V) circuits to be integrated in energy-harvested SoCs becomes crucial.

#### **1.2.1 MOSFET operation in subthreshold region**

In general, the total power dissipation of a CMOS circuit is given by the sum of static and dynamic power, as expressed by

$$P_{tot} = V_{DD} I_{stat} + \alpha C V_{DD}^2 f$$
(1.1)

where $V_{DD}$ is the supply voltage, $I_{stat}$ is the static supply current, $\alpha$ is the switching activity rate, $C$ is the load capacitance, and $f$ is the clock frequency. According to (1.1), low-power operation can be achieved by reducing the load capacitance or the frequency, while an even more drastic reduction in power consumption can be achieved by reducing $V_{DD}$, since the dynamic term scales with $V_{DD}^2$. Aggressive voltage scaling typically leads the transistors to work in the subthreshold region, with a significant reduction of their static current and hence of the standby power consumption.
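The effect of supply scaling on Equation (1.1) can be checked numerically with a short Python sketch. The parameter values below (static current, activity factor, load capacitance, clock frequency and the two supply voltages) are hypothetical. For simplicity, $I_{stat}$ and $f$ are held constant while $V_{DD}$ is scaled, although in a real circuit both also depend on the supply voltage.

```python
def static_power(vdd, i_stat):
    """Static term of Eq. (1.1): V_DD * I_stat."""
    return vdd * i_stat


def dynamic_power(vdd, alpha, c_load, f_clk):
    """Dynamic term of Eq. (1.1): alpha * C * V_DD^2 * f."""
    return alpha * c_load * vdd ** 2 * f_clk


def total_power(vdd, i_stat, alpha, c_load, f_clk):
    """Total power of Eq. (1.1) as the sum of static and dynamic terms."""
    return static_power(vdd, i_stat) + dynamic_power(vdd, alpha, c_load, f_clk)


# Hypothetical operating point, for illustration only.
I_STAT = 1e-9     # 1 nA static current
ALPHA = 0.1       # switching activity
C_LOAD = 1e-12    # 1 pF switched capacitance
F_CLK = 10e3      # 10 kHz clock

p_nominal = total_power(1.8, I_STAT, ALPHA, C_LOAD, F_CLK)
p_scaled = total_power(0.45, I_STAT, ALPHA, C_LOAD, F_CLK)
dyn_ratio = (dynamic_power(1.8, ALPHA, C_LOAD, F_CLK)
             / dynamic_power(0.45, ALPHA, C_LOAD, F_CLK))

print(f"P_tot at 1.8 V : {p_nominal * 1e9:.3f} nW")
print(f"P_tot at 0.45 V: {p_scaled * 1e9:.4f} nW")
print(f"dynamic power reduction: {dyn_ratio:.1f}x")
```

Scaling the supply by a factor of four reduces the dynamic term by a factor of sixteen, while the static term decreases only linearly under the constant-current assumption made here.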

The operation of MOSFETs in the subthreshold region is typically exploited when designing ULP/ULV circuits. A CMOS circuit is said to operate in subthreshold if all the transistors work with a gate-source voltage ( $V_{GS}$ ) lower than their threshold voltage ( $V_{TH}$ ), as shown in Fig.1.5.



Fig.1.5: MOSFET current-voltage characteristics highlighting the subthreshold and superthreshold operating regions. In the superthreshold region, the current is fairly linear, whereas in the subthreshold regime it is exponentially dependent on  $V_{GS}$ .

The basic four-terminal nMOSFET channel structure is depicted in Fig. 1.6. The substrate (connected to the Body terminal) is composed of p-type silicon, where two n<sup>+</sup>-type regions are formed and connected to the drain and source terminals. The gate consists of heavily doped or silicided polysilicon, and is separated from the substrate by a thin silicon dioxide film, i.e., the gate oxide. The main device parameters are the gate oxide thickness ($T_{ox}$) and its dielectric constant ($\varepsilon_r$), the channel length ($L$), the substrate doping concentration ($N_{sub}$), and the channel width ($W$).



Fig. 1.6: Structure of an n-channel MOSFET with different current contributions.

When the MOSFET works in subthreshold region ( $V_{GS} < V_{TH}$ ), the drain current is given by:

$$I_{sub} = \frac{W}{L} I_0 \exp\left(\frac{V_{GS} - V_{TH}}{nkT/q}\right) \left(1 - \exp\left(\frac{-V_{DS}}{kT/q}\right)\right) \tag{1.2}$$

where $I_0$ is the intrinsic subthreshold current at zero gate-source voltage, $n$ is the subthreshold slope factor, $k$ is the Boltzmann constant, $T$ is the absolute temperature, $q$ is the electron charge with $V_T = kT/q$ being the thermal voltage, and $V_{DS}$ is the drain-source voltage.

By including the body effect (i.e., body-source voltage $V_{BS} \neq 0$), Equation (1.2) becomes:

$$I_{sub} = \frac{W}{L} I_0 \exp\left(\frac{V_{GS} - V_{TH0} + \lambda_{BB} V_{BS}}{nkT/q}\right) \left(1 - \exp\left(\frac{-V_{DS}}{kT/q}\right)\right) \tag{1.3}$$

where  $V_{TH0}$  is the zero-bias threshold voltage and  $\lambda_{BB}$  is the threshold voltage body coefficient, considering that the threshold voltage  $V_{TH}$  is expressed by  $V_{TH0} - \lambda_{BB}V_{BS}$ .

Finally, considering  $V_{DS} > 4V_T$  (with  $V_T \cong 26 \, mV$  at room temperature), the term  $\exp\left(\frac{-V_{DS}}{V_T}\right)$  approaches zero, thus leading to the following simplified expression:

$$I_{sub} = \frac{W}{L} I_0 \exp\left(\frac{V_{GS} - V_{TH0} + \lambda_{BB} V_{BS}}{nkT/q}\right) \tag{1.4}$$
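The error introduced by dropping the $V_{DS}$ term can be checked numerically. The sketch below implements (1.3) and its simplified form (1.4); all device parameters (`i0`, `v_th0`, `n`, `lambda_bb`, `w_over_l`) are hypothetical placeholders rather than values extracted from the adopted technology.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C

def thermal_voltage(temp_k):
    """Thermal voltage V_T = kT/q in volts."""
    return K_B * temp_k / Q_E

def i_sub(v_gs, v_ds, v_bs, temp_k, w_over_l, i0, v_th0, n, lambda_bb, full=True):
    """Subthreshold current per (1.3); full=False drops the V_DS term as in (1.4)."""
    v_t = thermal_voltage(temp_k)
    current = w_over_l * i0 * math.exp((v_gs - v_th0 + lambda_bb * v_bs) / (n * v_t))
    if full:
        current *= 1.0 - math.exp(-v_ds / v_t)
    return current

VT_300K = thermal_voltage(300.0)
# Hypothetical device parameters (assumption).
dev = dict(temp_k=300.0, w_over_l=2.0, i0=1e-7, v_th0=0.45, n=1.3, lambda_bb=0.2)

full = i_sub(0.2, 4 * VT_300K, 0.0, full=True, **dev)
simplified = i_sub(0.2, 4 * VT_300K, 0.0, full=False, **dev)
rel_error = 1.0 - full / simplified   # equals exp(-4), i.e. below 2 %

# Gate voltage change needed for a tenfold change in current.
swing = dev["n"] * VT_300K * math.log(10.0)
```

At $V_{DS} = 4V_T$ the simplified expression overestimates the current by $\exp(-4)$, which is why the approximation is considered acceptable above this drain-source voltage.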

Analysing the different current contributions in the MOSFET (highlighted in Fig. 1.6) gives the circuit designer a better insight, especially when designing subthreshold circuits.

The current I1 is the subthreshold current between source and drain, which depends on the applied gate voltage that creates a weak inversion layer in the channel. This current is additionally affected by the Drain-Induced Barrier Lowering (DIBL), which occurs when a high drain voltage is applied to a short-channel device.

The current I2 is the channel punch-through. When the channel is short and the drain voltage is high, the depletion regions of drain and source approach each other. As a result, the gate voltage loses control over the channel.

The current I3 represents the Gate-Induced Drain Leakage (GIDL). This is the result of a high electric field in the gate-drain overlap region. As a consequence, the depletion width of the drain-to-substrate n-p junction is reduced [50]. Carriers are generated in the substrate and drain by direct band-to-band tunnelling. This phenomenon is more prominent in thin-oxide technologies operating at high gate voltage.

The current I4 is the reverse n-p junction leakage, which can also affect the source-body n-p junction if the source voltage is higher than the body voltage. It has two components: the minority carrier diffusion/drift near the edge of the depletion region and the electron-hole pair generation in the depletion region of the reverse-biased junction. If both n- and p-regions are heavily doped, Band-To-Band Tunnelling (BTBT) can also be present.

The current I5 refers to oxide leakage tunnelling with two contributions. The first one is due to the high electric field which results in the tunnelling of electrons from the inverted substrate-to-gate and also from the gate-to-substrate through the oxide. The second contribution is the gate current due to hot carrier injection. If a region with a high electric field is located near the oxide-channel interface, some of the electrons or holes can gain sufficient energy from the field to cross the interface potential barrier and enter the oxide layer.

#### 1.2.2 Impact of process variability on subthreshold circuits

Due to the relentless miniaturization of MOSFET devices, process variability has become a critical issue in the design of VLSI circuits. Within a chip, the main source of variability comes from the MOSFETs, but interconnections, resistors, capacitors and diodes also contribute to performance variability. Increasing process variations are a primary obstacle to voltage scaling and limit the achievable power reduction.

CMOS technology includes two major types of process variability: local (intra-die) and global (inter-die). Local variability is related to the differences of identical MOSFETs across a short distance. Global variability refers to changes for identical MOSFETs separated by a longer distance (i.e., on a different die) or fabricated at a different time [51].

A different classification is given by [52], where process variability is categorized into within-die, die-to-die and wafer-to-wafer. Within-die and die-to-die classifications reflect some of the spatial characteristics of the variations. Those that vary rapidly over small distances (< die size) are called within-die, whereas variations that change gradually over the wafer cause die-to-die variations. Wafer-to-wafer variations reflect both the spatial and temporal characteristics of the process and cause different wafers to have different properties.

From the designer's point of view, process variability is typically divided into two categories. The local (within-die) variability is treated as mismatch, which causes nominally identical MOSFETs within a circuit to exhibit different characteristics from each other. On the other hand, the global (process) variability is typically described as the same threshold voltage shift for all the MOSFETs within a circuit.

The local variation or mismatch ($\sigma$) of MOSFET parameters such as channel dopant concentration, mobility, and gate oxide thickness is typically modelled with an area dependence according to Pelgrom's law [53], as given by

$$\sigma \propto \frac{1}{\sqrt{WL}} \tag{1.5}$$

Qualitatively, local variations decrease as the device size increases, since the parameters average out over a greater distance or area. According to (1.5), the impact of mismatch can be reduced by increasing the MOSFET size, at the cost of greater area occupation.
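The area scaling in (1.5) can be illustrated with a short sketch. The matching coefficient `A_VT` is a hypothetical proportionality constant introduced here only to turn (1.5) into an equality; its value is an assumption.

```python
import math

def mismatch_sigma(a_coeff, width_um, length_um):
    """Area scaling of (1.5): sigma = A / sqrt(W * L)."""
    return a_coeff / math.sqrt(width_um * length_um)

A_VT = 5.0  # mV*um, hypothetical matching coefficient (assumption)

sigma_small = mismatch_sigma(A_VT, 1.0, 1.0)   # 1 um x 1 um device
sigma_large = mismatch_sigma(A_VT, 2.0, 2.0)   # four times the gate area
```

Quadrupling the gate area halves the mismatch standard deviation, which quantifies the area penalty paid for better matching.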

Conversely, the global variability is modelled by different worst-case process corners. Such corners are generated considering slow (S), i.e., less conductive, and fast (F), i.e., more conductive, nMOSFETs and pMOSFETs. On one hand, when a MOSFET is defined as fast, its threshold voltage is reduced by the variability. On the other hand, a slow MOSFET has a higher threshold voltage with respect to the nominal one. As a result, we have four different corners with respect to the nominal one given by typical (T) nMOSFETs and pMOSFETs (i.e., TT corner): (*i*) the SS corner given by slow nMOSFETs and pMOSFETs to model the worst-case condition for the speed, (*ii*) the FF corner given by fast nMOSFETs and pMOSFETs to model the worst-case condition for the power consumption, (*iii*) the SF corner given by slow nMOSFETs and fast pMOSFETs to model the worst-case condition for the logic "zero" in digital circuits, and (*iv*) the FS corner given by fast nMOSFETs and slow pMOSFETs to model the worst-case condition for the logic "one" in digital circuits. Note that the SF and FS corners are also considered in the design and analysis of analog circuits employing both MOSFET types.



Fig. 1.7: Measurement results for an nMOSFET (regular threshold voltage RVT device,  $L = 1 \mu m$ ,  $W = 2 \mu m$ , T = 21°C,  $V_{BS} = 0$  V and  $V_{DS} = 0.5$  V) from corner wafer chips.

As an example, Fig. 1.7 shows the experimental current-voltage characteristics of an RVT (regular threshold voltage) nMOSFET with  $L = 1 \mu m$  and  $W = 2 \mu m$  extracted from corner wafer chips implemented in a commercial 180-nm CMOS technology.

According to (1.2)-(1.4), in the subthreshold regime, the drain current depends exponentially on $V_{TH}$. Therefore, a $V_{TH}$ change owing to process variations translates into a significant change in the MOSFET conductivity. This requires specific precautions and circuit-level solutions when designing subthreshold circuits for ULP/ULV applications. In general, the impact of process variations can be mitigated by using MOSFETs of the same type [54]. Another solution concerns the use of post-silicon trimming to adjust the circuit performance after fabrication. In addition, automatic trimming solutions can also be implemented by exploiting a process sensor (as discussed in Chapter 3) or implementing feedback/feedforward control in the circuit design (as discussed in Chapter 2).
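The exponential sensitivity to $V_{TH}$ can be quantified with a minimal sketch based on the exponential term of (1.4). The slope factor and the ±30-mV threshold shift are hypothetical values chosen for illustration and do not correspond to the measured corners of Fig. 1.7.

```python
import math

def current_ratio(delta_v_th, n, v_t):
    """I(V_TH + delta) / I(V_TH) in subthreshold, from the exponential in (1.4)."""
    return math.exp(-delta_v_th / (n * v_t))

V_T = 0.026   # thermal voltage near room temperature, V
N = 1.3       # hypothetical slope factor (assumption)

slow = current_ratio(+0.030, N, V_T)   # higher V_TH, lower current
fast = current_ratio(-0.030, N, V_T)   # lower V_TH, higher current
spread = fast / slow                   # fast-to-slow current ratio
```

Under these assumptions a threshold spread of only 60 mV between the slow and fast cases changes the subthreshold current by a factor close to 6.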

# Chapter 2: Design of CMOS Voltage and Current References for Highly-Uncertain Harvesting

Voltage and current references are key building blocks in IoT sensor nodes, often expected to be always-on to ensure proper circuit biasing and robust system operation at all times, regardless of process, voltage and temperature (PVT) variations. To align with the requirements of millimetre-scale harvested systems, always-on reference circuits have to exhibit a power consumption below the nW range, to avoid eating up a significant fraction of the requested minimum power. Similarly, their minimum supply voltage has to be kept as low as possible. This chapter firstly presents the design of a compact NMOS-only voltage reference that is able to operate down to a 0.25-V supply voltage and 5.4-pW power consumption at room temperature. The presented voltage reference is based on a body biasing scheme assisted by replica well biasing to compensate voltage and temperature fluctuations. Detailed circuit analysis is provided, along with trimming-less measurement results on a 180-nm test chip to demonstrate the effectiveness of the proposed approach. Then, the design of a current reference circuit exploiting the structure used for the voltage reference is also presented and validated by means of silicon measurements on a 180-nm prototype.

## 2.1 CMOS Voltage Reference down to 0.25-V, 5.4-pW Operation

In prior art, voltage references are categorized into CMOS and bandgap (BGR) topologies [54]-[67]. Under subthreshold operation, CMOS voltage references [54]-[61] typically allow sub-1 V operation, lower power and smaller area. As a drawback, their accuracy is limited by the susceptibility of the transistor threshold voltage to PVT variations. Conversely, BGR references [62]-[66] exhibit better accuracy at the cost of higher minimum operating voltage, area occupation, process requirements (e.g., availability of BJT transistors), and power in the tens of nWs. Hybrid designs [67]-[68] were also proposed with the aim of covering the intermediate gap between CMOS and BGR solutions, but with  $V_{DD,min}$  (~1 V) higher than CMOS references.

Here, an NMOS-only voltage reference is presented [C1] that enables both ultra-low $V_{min}$ and $P_{min}$ for adoption in resource-constrained sensor nodes, along with compact area and trimming suppression for low cost, while achieving a competitive absolute accuracy. The proposed circuit employs body biasing for the compensation of environmental (i.e., temperature and voltage) fluctuations, as assisted by replica deep n-well biasing for the temperature-dependent leakage suppression of p-well/deep n-well parasitic diodes. Experimental results on 180-nm test chips demonstrate operation from 1.8 V down to 0.25 V, 5.4-pW power consumption and 2,200-$\mu$m<sup>2</sup> area, while achieving a 2.8-mV absolute accuracy that improves on prior art.

#### 2.1.1 Proposed architecture and operating principle

The proposed voltage reference is based on an 8-transistor circuit comprising a body bias generation block, a deep n-well replica bias, and a core reference generation block, as shown in Fig. 2.1. As generated by the latter block, the output reference voltage $V_{REF}$ is set by the strength ratio of transistor M4 and the diode-connected stacked transistors M1-M3. Since $V_{REF}$ is positive and the gate of M4 is connected to ground, transistor M4 is reverse gate-biased (i.e., it has a negative gate-source voltage) and hence conducts a current that is well below the regular transistor leakage. This leads to sub-leakage power consumption, as targeted for highly-uncertain harvesting systems.



Fig. 2.1: Schematic of the proposed voltage reference.

Concerning the body bias generation block in Fig. 2.1, the p-well body enclosing transistors M1-M3 is biased through a voltage $V_B$ that is generated by the body bias generation transistors M5-M6. This two-transistor block mimics the structure of the core reference generation block, to track its die-to-die and environmental variations. Indeed, transistor M6 is reverse gate-biased, and hence draws sub-leakage power. As the main difference between the two blocks, M1-M3 are stacked to reduce the strength of their equivalent transistor M1-3 in Fig. 2.1, hence pushing $V_{REF}$ closer to the mid-supply point compared to $V_B$. In other words, the number of stacked transistors can be used as a knob to control the $V_{REF}$ amplitude.

With regard to the well replica bias block in Fig. 2.1, transistors M7-M8 replicate the voltage $V_B$ and drive the deep n-well of transistors M1-M3. As a consequence, the p-well/deep n-well parasitic diode D<sub>PW-DNW</sub> at the body terminal of M1-M3 is subject to zero voltage, thus eliminating its leakage current. This makes its loading effect on $V_{REF}$ negligible and suppresses its exponential temperature dependence, which is different from the subthreshold leakage temperature dependence in M1-M4.

Overall, the body biasing of M1-M3 through the generation of $V_B$ improves the robustness of $V_{REF}$ against voltage and temperature fluctuations. To gain a deeper insight, the core reference generation block can be simplified by lumping the stacked transistors M1, M2 and M3 into an equivalent transistor M1-3, as shown in Fig. 2.1. From the above considerations, all transistors work in the subthreshold region with a drain-source voltage $V_{DS} > 4V_T$. Therefore, according to Equation (1.4) (see Chapter 1), the resulting drain current of M1-3 including the body effect is given by:

$$I_{M1-3} = \frac{W_{M1-3}}{L_{M1-3}} I_{0,M1-3} \exp\left(\frac{V_{GS} - V_{TH}}{nkT/q}\right) = \frac{W_{M1-3}}{L_{M1-3}} I_{0,M1-3} \exp\left(\frac{V_{REF} - V_{TH0,M1-3} + \lambda_{BB,M1-3}V_B}{n_{M1-3}kT/q}\right) \tag{2.1}$$

Similarly, the current delivered by M4 can be expressed by:

$$I_{M4} = \frac{W_{M4}}{L_{M4}} I_{0,M4} \exp\left(\frac{-V_{REF} - V_{TH0,M4} - \lambda_{BB,M4} V_{REF}}{n_{M4} V_T}\right) \tag{2.2}$$

By equating (2.1) and (2.2) due to the series connection of M1-3 and M4, we obtain the following  $V_{REF}$  expression:

$$V_{REF} = \frac{V_{TH0,M1-3} - \frac{n_{M1-3}}{n_{M4}} V_{TH0,M4} + n_{M1-3} \frac{kT}{q} \ln\left(\frac{I_{0,M4} W_{M4} L_{M1-3}}{I_{0,M1-3} W_{M1-3} L_{M4}}\right) - \lambda_{BB,M1-3} V_B}{1 + \frac{n_{M1-3}}{n_{M4}}\left(1 + \lambda_{BB,M4}\right)} \tag{2.3}$$

where the term  $\lambda_{BB,M1-3}V_B$  quantifies the body biasing compensation on M1-3 with  $V_B$  being expressed by:

$$V_B = \frac{V_{TH0,M5} - \frac{n_{M5}}{n_{M6}} V_{TH0,M6} + n_{M5} \frac{kT}{q} \ln\left(\frac{I_{0,M6} W_{M6} L_{M5}}{I_{0,M5} W_{M5} L_{M6}}\right)}{1 + \frac{n_{M5}}{n_{M6}} \left(1 + \lambda_{BB,M6}\right)} \tag{2.4}$$
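Since (2.3) and (2.4) share the same structure, they can be evaluated with a single helper, as in the minimal sketch below. All numerical parameters (threshold voltages, slope factors, body coefficient, and the $I_0 W/L$ strength ratio) are hypothetical placeholders and are not extracted from the test chip; threshold voltages are also taken as temperature-independent for simplicity, which is an assumption of this sketch.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C

def stack_voltage(v_th0_low, v_th0_up, n_low, n_up, strength_ratio,
                  lambda_up, temp_k, body_term=0.0):
    """Common form of (2.3) and (2.4).

    strength_ratio is (I0*W/L of the upper device) / (I0*W/L of the lower one);
    body_term is lambda_BB,M1-3 * V_B for (2.3) and zero for (2.4).
    """
    v_t = K_B * temp_k / Q_E
    num = (v_th0_low - (n_low / n_up) * v_th0_up
           + n_low * v_t * math.log(strength_ratio) - body_term)
    den = 1.0 + (n_low / n_up) * (1.0 + lambda_up)
    return num / den

# Hypothetical device parameters (assumptions).
RVT = dict(v_th0=0.45, n=1.3)
LVT = dict(v_th0=0.30, n=1.3)
LAMBDA_BB = 0.2
RATIO = 10.0
TEMP_K = 298.15

def v_b():
    """Body bias voltage according to (2.4)."""
    return stack_voltage(RVT["v_th0"], LVT["v_th0"], RVT["n"], LVT["n"],
                         RATIO, LAMBDA_BB, TEMP_K)

def v_ref(body_bias):
    """Reference voltage according to (2.3) for a given body bias."""
    return stack_voltage(RVT["v_th0"], LVT["v_th0"], RVT["n"], LVT["n"],
                         RATIO, LAMBDA_BB, TEMP_K,
                         body_term=LAMBDA_BB * body_bias)

bias = v_b()
with_bias = v_ref(bias)
without_bias = v_ref(0.0)
# Sensitivity of V_REF to V_B, expected to equal -lambda_BB / denominator.
slope = (v_ref(bias + 1e-3) - with_bias) / 1e-3
```

The negative slope of $V_{REF}$ versus $V_B$ is the mechanism exploited by the body biasing compensation: any increase in $V_B$ pulls $V_{REF}$ down.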

From (2.3)-(2.4), the circuit architecture shown in Fig. 2.1 introduces a feedforward control path to compensate the effect of voltage and temperature fluctuations through body biasing, as discussed below.

The feedforward control path in Fig. 2.1 mitigates the effect of $V_{DD}$ fluctuations by deriving a fraction of the fluctuation in $V_{DD}$ via transistors M5-M6, and then modulating the body voltage $V_B$ of M1-M3 to oppose the change in $V_{REF}$ due to the supply voltage fluctuation. Indeed, if $V_{DD}$ increases, $V_B$ ($V_{REF}$) also increases due to the DIBL effect in M5-M6 (M4-M1-3), which effectively behave like a voltage divider under small-signal analysis. Note that this second-order effect is not explicitly taken into account in (2.3) and (2.4), which capture only the dominant effects on $V_{REF}$ to preserve their simplicity. The secondary dependence of $V_B$ and $V_{REF}$ on $V_{DD}$ via DIBL could be straightforwardly derived, although this would not add much insight. The increase in $V_B$ makes transistors M1-M3 stronger due to forward body biasing, as expressed by the $\lambda_{BB,M1-3}V_B$ term in (2.3). Overall, this induces a decrease in $V_{REF}$, which counteracts its initial increase due to $V_{DD}$.

Similarly, the feedforward path in Fig. 2.1 also enables the compensation of the effect of temperature fluctuations on $V_{REF}$ via body biasing of M1-M3. This can be understood by comparing (2.3) and (2.4): the terms that are independent of $V_B$ have the same temperature dependence as $V_B$, and their change is counteracted by the negative sign of the term $-\lambda_{BB,M1-3}V_B$ that depends on $V_B$. Under the adopted transistor sizing strategy, the sum of the terms independent of $V_B$ turns out to be proportional to absolute temperature (PTAT), whereas $-\lambda_{BB,M1-3}V_B$ is complementary to absolute temperature (CTAT). In addition, the suppression of the p-well/deep n-well junction leakage current via replica well biasing eliminates the related temperature dependence.

In prior art, body biasing has been used in a voltage reference as a knob to compensate the effect of temperature fluctuations [54], although based on a different circuit. As fundamental differences, in the proposed reference, body biasing is applied to diode-connected transistors, as opposed to zero-$V_{GS}$ transistors as in [54]. Also, the proposed reference adopts a reverse gate-biased active load M4 to reduce the bias current below leakage and achieve sub-leakage power consumption, as opposed to [54]. Furthermore, the reference in Fig. 2.1 generates the $V_B$ voltage through a replica circuit M5-M6 of the core reference generation block M1-M4 for transistor process tracking, unlike [54], which also uses a different topology adopting inverted transistor connections. As a further difference over [54], the proposed circuit improves line sensitivity thanks to the suppression of the voltage-dependent p-well/deep n-well leakage current through replica well biasing, as well as the reduced small-signal impedance towards ground of the diode-connected transistors M1-M3 (lower than that of zero-$V_{GS}$ transistors in [54]). Finally, the circuit in [54] requires a minimum body bias voltage of 0.4 V for PTAT-CTAT compensation, thus setting a hard lower bound that prohibits further $V_{min}$ reductions.

#### 2.1.2 Measurement results in 180-nm process

A 180-nm test chip based on the proposed circuit was designed using the transistor sizing and flavors reported in Table 2.1. To achieve reference voltages in the targeted 100-mV range, the upper transistors M4, M6 and M8 need to be made significantly stronger than the lower transistors M1-M3, M5 and M7. To avoid the area increase and the effect of layout-dependent variations that would come with skewed strength ratios, a dual-threshold design approach was adopted. Low-$V_{TH}$ (LVT) devices were then used for the upper transistors, whereas regular-$V_{TH}$ (RVT) devices were adopted for the other transistors. Overall, the sizing in Table 2.1 makes M4 130× stronger than the stacked transistors M1-M3. Also, the channel length of M4 is set to the maximum value allowed by the design rules, in order to minimize its DIBL coefficient and hence improve the line sensitivity of the reference generation block. Finally, a 1.8-pF MIM (metal-insulator-metal) capacitance with 30 $\mu$m × 30 $\mu$m area was connected at the reference output node to improve the power supply rejection ratio (PSRR) of the circuit.

| Transistor | Type | W/L          |
|------------|------|--------------|
| M1-M3      | RVT  | 3.4µm/1.28µm |
| M4         | LVT  | 2×17µm/10µm  |
| M5, M7     | RVT  | 1.5µm/7µm    |
| M6, M8     | LVT  | 5µm/5µm      |

Table 2.1: Adopted transistor flavor and sizing for the proposed voltage reference.

The fabricated test chip occupies a silicon area of 2,200  $\mu$ m<sup>2</sup> (48 $\mu$ m×46 $\mu$ m), as shown by the micrograph and the layout in Fig. 2.2. Wafer-level characterization across 30 dice was carried out by using a Cascade SUMMIT 11861B probe equipped with a Temptronic chuck temperature controller. Static and dynamic measurements were carried out with a Keithley 4200-SCS parameter analyzer and a Tektronix TDS74004B digital oscilloscope.



Fig. 2.2: Die photo and layout of the proposed voltage reference.

The reference voltage measured across voltages and temperatures is plotted in Figs. 2.3(a)-(b) for a typical sample. In the test chip characterization, the supply voltage is swept from 0 V to 1.8 V, and the temperature from 0 °C to 120 °C. Fig. 2.3(a) shows the reference voltage  $V_{REF}$  versus the supply voltage  $V_{DD}$  at several temperatures. From Fig. 2.3(a), we can observe the proposed voltage reference starts working properly at supply voltages as low as 250 mV, regardless of the operating temperature. The resulting output voltage is  $V_{REF} \approx 91$  mV at room temperature, and has an absolute temperature coefficient of 25  $\mu$ V/°C at 0.25 V and across the above temperature range. This corresponds to a relative temperature coefficient of 274 ppm/°C, when normalized to the mean value. Furthermore, the effect of voltage fluctuations in the wide 0.25-1.8 V range is minor, as shown by the vertical patterns in the color map across voltages and temperatures in Fig. 2.3(b).



Fig. 2.3: (a) Measured reference voltage vs. supply voltage at several temperatures, (b) color map of reference voltage across supply voltages and temperatures.

Fig. 2.4 shows the line sensitivity dependence on the minimum supply voltage $V_{min}$, as evaluated across the voltage range from $V_{min}$ up to the nominal voltage of 1.8 V at 25 °C and 120 °C. This plot shows that the line sensitivity below 0.25 V increases significantly, whereas it is consistently below 140 $\mu$V/V at any voltage above. This sets $V_{min}$ to 0.25 V and results in a competitive relative line sensitivity of 0.15%/V, regardless of the operating temperature.
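The relative figures quoted for the typical sample follow from normalizing the absolute metrics to the reference voltage. The minimal sketch below reproduces this normalization using the rounded values reported in the text (91 mV, 25 µV/°C, 140 µV/V); the function name is introduced here only for illustration.

```python
def relative_ppm(abs_value, v_ref):
    """Normalize an absolute sensitivity (V per unit) to v_ref, in ppm per unit."""
    return abs_value / v_ref * 1e6

V_REF = 91e-3                                     # typical-sample reference voltage, V
tc_ppm = relative_ppm(25e-6, V_REF)               # temperature coefficient, ppm/°C
ls_percent = relative_ppm(140e-6, V_REF) / 1e4    # line sensitivity, %/V
```

With these rounded inputs the results land within rounding of the reported 274 ppm/°C and 0.15 %/V.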



Fig. 2.4: Measured line sensitivity vs. minimum supply voltage  $V_{min}$  (evaluated in the voltage range from  $V_{min}$  up to the 1.8 V nominal voltage) at 25°C and 120°C.

The measured voltage and temperature dependence of the power consumption is reported in Figs. 2.5(a)-(b). In particular, the power consumption versus $V_{DD}$ at room temperature is shown in Fig. 2.5(a). The power consumption at $V_{min}$ is 5.1 pW, and it increases nearly linearly over the entire voltage range at a 23.6 pW/V rate. This is expected from the nearly voltage-independent current in (2.2), and leads to a power consumption increase of less than one order of magnitude (8.2×) even at 1.8 V. Conversely, the current drawn from the supply has a stronger dependence on the temperature, given the transistor operation in the subthreshold region. In detail, from Fig. 2.5(b) the power consumption increases exponentially with temperature at a rate of 1.049×/°C, which is expectedly close to the 1.051×/°C increase rate of the transistor leakage.
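The reported trends can be cross-checked with a simple fit-based sketch. It assumes that the power is linear in $V_{DD}$ starting from the 5.1-pW value at 0.25 V, and that the temperature dependence is a pure exponential with the quoted per-degree rate; both are simplifications of the measured curves.

```python
import math

def power_vs_vdd(v_dd, p_min=5.1e-12, v_min=0.25, slope=23.6e-12):
    """Assumed linear model: p_min at v_min plus slope * (v_dd - v_min), in watts."""
    return p_min + slope * (v_dd - v_min)

def doubling_interval(rate_per_degc):
    """Temperature increase (°C) that doubles an exponentially growing quantity."""
    return math.log(2.0) / math.log(rate_per_degc)

ratio_1v8 = power_vs_vdd(1.8) / power_vs_vdd(0.25)   # power increase at 1.8 V
delta_t_double = doubling_interval(1.049)            # °C per doubling of power
```

The linear model reproduces the 8.2× increase at 1.8 V, and the 1.049×/°C rate corresponds to the power doubling roughly every 14.5 °C.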



Fig. 2.5: Measured power consumption versus (a) the supply voltage at 25 °C, (b) temperatures at  $V_{DD}$ = 0.25 V for a typical sample.

The dynamic characterization of the voltage reference is presented in Figs. 2.6-2.7. The PSRR at room temperature is plotted versus frequency in Fig. 2.6, which shows that the low-frequency (up to 100 Hz) PSRR is approximately -70 dB. The PSRR further improves at higher frequencies, reaching -83.5 dB at 10 kHz. Fig. 2.7 illustrates the measured start-up waveform of $V_{REF}$ for $V_{DD} = 0.25$ V and different temperatures. The resulting settling time of the output reference voltage quantifies how quickly the steady state is reached after a harvester power outage or system wake-up takes place [67]. The settling time to reach 95% of the final value of $V_{REF}$, reported in the inset of Fig. 2.7, decreases at higher temperatures, owing to the larger current from transistors M1-M4 in Fig. 2.1 that charges the output capacitance. Again, the temperature dependence of the settling time is expectedly exponential, in view of the subthreshold operation of M1-M4.
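To give a feel for what these PSRR values imply, the following sketch converts them into the residual ripple on $V_{REF}$ for a given supply ripple. The 10-mV supply ripple is a hypothetical value, and the 20·log10 voltage-ratio convention for PSRR is assumed.

```python
def psrr_to_gain(psrr_db):
    """Supply-to-output voltage gain for a PSRR in dB (20*log10 convention assumed)."""
    return 10.0 ** (psrr_db / 20.0)

ripple_in = 10e-3                                 # hypothetical supply ripple, V (assumption)
ripple_low_f = ripple_in * psrr_to_gain(-70.0)    # up to about 100 Hz
ripple_10k = ripple_in * psrr_to_gain(-83.5)      # at 10 kHz
```

Under these assumptions a 10-mV supply disturbance appears at the output as a few microvolts at low frequency, and less than one microvolt at 10 kHz.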



Fig. 2.6: Measured power supply rejection ratio (PSRR) vs. frequency at 25°C for a typical sample.



Fig. 2.7: Measured start-up waveform of  $V_{REF}$  for different temperatures at  $V_{DD} = 0.25$  V. The inset shows the 5% settling time vs. temperature.

The robustness of the proposed voltage reference against process variations was also investigated by characterizing 30 die samples from the same lot. The reference voltage at  $V_{DD}$ =0.25 V is plotted versus temperature in Fig. 2.8, which shows a consistent trend across dice with a standard deviation from 0.44 mV to 0.63 mV within the entire temperature range. At room temperature, the mean value  $\mu$  of  $V_{REF}$  and its standard deviation  $\sigma$  are respectively 91.4 mV and 0.51 mV, leading to a process sensitivity  $\sigma/\mu$  of 0.56% as reported in the histogram in Fig. 2.9(a).



Fig. 2.8: Measured  $V_{REF}$  vs. temperature across 30 die samples at  $V_{DD} = 0.25$  V.



Fig. 2.9: Measurements histogram across 30 die samples: (a)  $V_{REF}$  at 25 °C and  $V_{DD}$ = 0.25 V, (b) temperature coefficient at  $V_{DD}$ =0.25 V, (c) power consumption at 25 °C and  $V_{DD}$ =0.25 V, (d) line sensitivity at 25°C.

The impact of process variations on the temperature coefficient *TC* across the 0-120 °C temperature range is reported in Fig. 2.9(b) for  $V_{DD}$ =0.25 V. The histogram shows a mean value of 24.2  $\mu$ V/°C corresponding to 265 ppm/°C. The standard deviation of *TC* is 4.1  $\mu$ V/°C, which corresponds to the rather limited variability of 17%.

The effect of process variations on the power consumption at room temperature is illustrated in Fig. 2.9(c). The power shows a mean value of 5.4 pW and a standard deviation of 0.2 pW, which leads to a very limited variability of 3.7%. Hence, the low power consumption of the proposed reference is highly consistent across dice.

The line sensitivity dependence on process variations across the entire 0.25-1.8 V range is quantified in Fig. 2.9(d) at 25 °C. The resulting mean and standard deviation are respectively 144.5  $\mu$ V/V and 49.2  $\mu$ V/V, which correspond to 0.16 %/V and 0.05 %/V in relative terms, and a fairly pronounced 34% variability. However, the resulting 3- $\sigma$  worst-case line sensitivity of 292.1  $\mu$ V/V is still very competitive across references with comparable power, as will be shown in the following.
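The statistical figures above follow directly from the measured mean and standard deviation, as the short sketch below shows. The function names are introduced here only for illustration; the input values are those reported in the text.

```python
def three_sigma_worst_case(mean, sigma):
    """Upper 3-sigma bound of a distribution with the given mean and sigma."""
    return mean + 3.0 * sigma

def variability(mean, sigma):
    """Coefficient of variation sigma/mean."""
    return sigma / mean

ls_worst = three_sigma_worst_case(144.5, 49.2)   # line sensitivity, uV/V
ls_var = variability(144.5, 49.2)                # line sensitivity spread
vref_var = variability(91.4, 0.51)               # process sensitivity of V_REF
```

The results match the 292.1-µV/V worst case, the 34% line sensitivity variability and the 0.56% process sensitivity quoted in this section.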

Finally, the effectiveness of the proposed body biasing compensation on line sensitivity, *TC* and process sensitivity was assessed by characterizing the test chips while selectively activating or suppressing the feedforward control path in the circuit of Fig. 2.1.

The effect of body biasing on the line sensitivity is shown in Figs. 2.10(a)-(b), where measurements across voltages were carried out by driving the body bias voltage  $V_B$  of M1-M3 in Fig. 2.1 by either the output of the body bias generator M5-M6 or the constant voltage corresponding to  $V_B$  that is generated by M5-M6 at  $V_{DD}$ =0.25 V. In the former case, the enablement of body biasing adaptation at every supply voltage leads to an increase in  $V_B$  by up to 0.3% from Fig. 2.10(a). The increase in  $V_B$  is linear as a function of the  $V_{DD}$ , thus confirming that M5-M6 essentially act like a linear voltage divider. Due to the operation in deep subthreshold regime, such small  $V_B$  increase with  $V_{DD}$  is sufficient to improve the mean value of the line sensitivity by 1.7×, and the worst case by 2.4×, as shown in the cumulative distribution function in Fig. 2.10(b).



Fig. 2.10: Benefit of body biasing compensation on line sensitivity at 25 °C: (a) $V_B$ normalized to the value at $V_{DD}$=0.25 V vs. $V_{DD}$ for a typical sample, (b) cumulative distribution function (CDF) of line sensitivity across die samples.

Concerning the effect of the body biasing feedforward compensation on the temperature coefficient, Fig. 2.11(a) shows that a temperature increase leads to an increase in $V_B$ by up to 6% across the considered temperature range. Above room temperature, $V_B$ has a relatively linear trend with a rate of 44 $\mu$V/°C. Such temperature adaptation of $V_B$ reduces the temperature coefficient by 1.5× compared to the case with constant $V_B$, as shown in Fig. 2.11(b).



Fig. 2.11: Benefit of body biasing compensation on temperature coefficient at $V_{DD}$=0.25 V: (a) $V_B$ normalized to the value at 0 °C vs. temperature, (b) $V_{REF}$ normalized to the value at 0 °C vs. temperature for a typical sample.

Finally, the feedforward body biasing compensation has also a beneficial effect on the process sensitivity. As shown in Fig. 2.12, the cumulative distribution is more concentrated around the center and the mean value, thanks to body biasing. The resulting process sensitivity is reduced by  $1.2 \times$  with respect to the case without body biasing compensation.



Fig. 2.12: Benefit of body biasing compensation on process sensitivity, as shown by the cumulative distribution push towards the mean (25 °C,  $V_{DD}$ =0.25 V).

#### **2.1.3** Comparison with the state of the art

The comparison of the proposed voltage reference with state-of-the-art voltage references is summarized in Table 2.2 and Figs. 2.13(a)-(b). The latter include only measured data within the same lot (i.e., no measurements across corner wafers) and without trimming for a fair comparison.

The absolute accuracy of the reference voltage is evaluated in Table 2.2, as it is more appropriate than the relative accuracy for the targeted harvesting applications. Indeed, voltage references traditionally have comparable $V_{REF} \sim 1$ V, in which case the relative and the absolute accuracy are equivalent. However, the relative accuracy is no longer a fair metric when dealing with ULV applications, as the baseline voltage varies substantially across references. The absolute accuracy is generally more relevant and representative of the reference requirements in non-ratiometric sensing, signal thresholding/comparison, off-chip sensor readout, subthreshold biasing (where, due to the exponential I-V transistor characteristics, the relative error in the transistor current is actually set by the absolute bias voltage error), energy source monitoring (e.g., harvesting, battery), and data converters (as full-scale LSB accuracy is set by the absolute reference voltage) [38]. Accordingly, Table 2.2 reports both absolute and relative metrics, where the accuracy of $V_{REF}$ is evaluated by considering 3-$\sigma$ process variations, a 0.3-V harvested voltage fluctuation (e.g., a solar cell from low to intense light), and a 50-°C temperature deviation.
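As a rough cross-check of how these three contributions add up, the minimal sketch below sums them linearly using the mean measured figures of Section 2.1.2 (0.51-mV process standard deviation, 144.5-µV/V line sensitivity, 24.2-µV/°C temperature coefficient). The linear worst-case sum is an assumption made here for illustration; the exact combination rule behind Table 2.2 is not restated in this section.

```python
def absolute_accuracy_mv(sigma_mv, line_sens_uv_per_v, tc_uv_per_c,
                         delta_v=0.3, delta_t=50.0, n_sigma=3.0):
    """Assumed linear sum of process, voltage and temperature errors, all in mV."""
    process = n_sigma * sigma_mv
    voltage = line_sens_uv_per_v * delta_v * 1e-3
    temperature = tc_uv_per_c * delta_t * 1e-3
    return process, voltage, temperature, process + voltage + temperature

process, voltage, temperature, total = absolute_accuracy_mv(0.51, 144.5, 24.2)
```

Under this assumption the total comes out close to 2.8 mV, with process and temperature dominating and the supply voltage contribution being nearly negligible.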

For fair comparison, the tradeoff between power, $V_{min}$ and the reference voltage accuracy in prior art in the same 180-nm technology generation is summarized in Figs. 2.13(a)-(b). From Fig. 2.13(a), the proposed reference exhibits the lowest power consumption of 5.4 pW, which is nearly the same as [55], 5.8-6,852× lower than other prior art in the same technology, and 12.9× higher than [60] in a different technology from Table 2.2. At the same time, the proposed voltage reference shows the lowest $V_{min}$ of 250 mV, which is lower than prior art by 1.6-5.6×. The unique combination of pW-range consumption and low $V_{min}$ allows reliable operation in energy-harvested systems, even under highly-uncertain environmental conditions. As a further consideration related to the supply, the PSRR is competitive and the second best from Table 2.2, with an advantage ranging from 11 dB to 29 dB over [54]-[64], [67] and [68], improving the resilience against supply noise by one to three orders of magnitude. Overall, the ability to operate across the entire 0.25-1.8 V voltage range with power in the pW range (or a few tens of pW at 1 V or above) also simplifies power management, eliminating any voltage regulation at the system level.

Table 2.2 also shows that the proposed reference exhibits a lower silicon area than other 180-nm demonstrations except [55] and [58]. The low silicon area, the suppression of the voltage regulator and pW operation thus make the proposed reference well suited for low-cost energy-harvested and directly-harvested systems able to run under a wide range of environmental conditions.

Table 2.2 also reports that the proposed reference has the lowest absolute process sensitivity of 0.51 mV, which corresponds to a 2.6-18.8× improvement over prior art. From the same table, the absolute temperature coefficient *TC* of 24.2 $\mu$V/°C is equivalent to or better than most CMOS references, except for the slightly better [55] (which did not report it for the 180-nm test chip) and [59]. As expected, the achieved *TC* is worse than that achieved by some bandgap references in the 10-15 $\mu$V/°C range [64], [66]. Regarding the inaccuracy contribution due to supply voltage fluctuations, the absolute line sensitivity of 144.5 $\mu$V/V is also lower than most prior demonstrations including some bandgap and hybrid references [54], [56]-[58], [60]-[64], with an improvement of 2.3-68.9×. The line sensitivity is equivalent to [55] and [68] at iso-technology, and $11.8\times$ worse than the best-in-class bandgap reference [66], which in turn exhibits a power consumption four orders of magnitude higher than the proposed reference.

Finally, an overall  $V_{REF}$  absolute accuracy of 2.8 mV is achieved across PVT variations from Table 2.2, which outperforms prior art in 180 nm by 1.7-11.6× from Fig. 2.13(b). Accordingly, the proposed reference achieves a favorable tradeoff between power, minimum voltage and accuracy, preserving accuracy in low-cost and pW-power directly-harvested systems.

|                                                       | This<br>work | [54]<br>VLSI<br>2016 | [55]<br>JSSC<br>2012 | [56]<br>JSSC<br>2011 | [57]<br>JSSC<br>2017 | [58]<br>TCAS-I<br>2017 | [59]<br>TCAS-I<br>2019 | [60]<br>ESSCIRC<br>2017 | [61]<br>JSSC<br>2019 | [62]<br>ISSCC<br>2015 | [63]<br>ISSCC<br>2017 | [64]<br>ISSCC<br>2015 | [66]<br>VLSI<br>2017 | [67]<br>VLSI<br>2019 | [68]<br>JSSC<br>2019 |
|-------------------------------------------------------|---------------------|-------------------|-----------------------------|----------------------------|----------------------------|-------------------------------|-------------------------------|-------------------|-----------------|-----------------------------------|-----------------|--------------------------|------------------------------------|--------------------------------|-----------------------------------|
| tech.<br>[nm]                                         | 180                 | 180               | 130/180                     | 180                        | 180                        | 180                           | 180                           | 65                | 65              | 130                               | 180             | 350                      | 180                                | 180                            | 180                               |
| type                                                  | CMOS                | CMOS              | CMOS                        | CMOS                       | CMOS                       | CMOS                          | CMOS                          | CMOS              | CMOS            | BGR                               | BGR             | BGR                      | sub-BGR                            | hybrid                         | hybrid                            |
| transistor<br>flavor(s)                               | N (RVT),<br>N (LVT) | P (RVT)           | N (native),<br>N* (HVT)     | N, P<br>(RVT),<br>N* (HVT) | N<br>(native),<br>P* (HVT) | N*<br>(HVT),<br>N, P<br>(RVT) | N*<br>(HVT),<br>N, P<br>(RVT) | N, P<br>(RVT)     | N, P<br>(RVT)   | N, P<br>(RVT),<br>N (LVT),<br>BJT | N, P<br>(RVT)   | N, P<br>(RVT),<br>BJT    | N* (HVT),<br>N, P<br>(RVT),<br>BJT | N (RVT),<br>N*(native),<br>BJT | N<br>(native),<br>P (RVT),<br>BJT |
| active area<br>[µm <sup>2</sup> ]                     | 2,200               | 4,880             | 1,350/1,425                 | 43,000                     | 2,500                      | 2,000                         | 55,000                        | 104               | 3,256           | 26,400                            | 55,000          | 480,000                  | 34,000                             | 7,600                          | 4,500                             |
| $V_{min}$ [V]                                         | 0.25                | 1.2               | 0.5/0.5                     | 0.45                       | 1.4                        | 0.45                          | 0.7                           | 0.4               | 0.5             | 0.5                               | 1.3             | 1.25                     | 0.8                                | 0.9                            | 1.0                               |
| power<br>[pW]                                         | 5.4                 | 114               | 2.2/5.5                     | 2,600                      | 33.6                       | 54.8                          | 28,000                        | 0.42              | 12.2            | 32,000                            | 9,300           | 28,700                   | 37,000                             | 31.4                           | 192                               |
| $V_{REF}$ [mV]                                        | 91.4                | 986.2             | 176.8/328.4                 | 257.5                      | 1,250                      | 225.3                         | 368.4                         | 342.8             | 210.1           | 498                               | 1,238           | 1,176                    | 240                                | 736                            | 692.6                             |
| PSRR<br>[dB]                                          | -70<br>@100Hz       | -42<br>@100Hz     | -53/-49<br>@100Hz           | -45<br>@100Hz              | -41<br>@100Hz              | -44<br>@100Hz                 | -59<br>@10Hz                  | NA                | NA              | -52<br>in DC                      | -46<br>@100Hz   | NA                       | -81<br>@50Hz                       | NA                             | -55<br>@100Hz                     |
| process<br>sensitivity <sup>**</sup><br>[mV]<br>([%]) | <b>0.51</b> (0.56)  | 2.56<br>(0.26)    | 1.27/N.A.<br>(0.72/N.A.)    | 10<br>(3.88)               | 10<br>(0.8)                | 1.42<br>(0.63)                | 1.31<br>(0.36)                | 16.8<br>(4.9)     | 9.6<br>(4.6)    | 3.3<br>(0.66)                     | 5.32<br>(0.43)  | 2.35<br>( <b>0.2</b> )   | 2.04<br>(0.85)                     | 4.17<br>(0.57)                 | 1.45<br>(0.21)                    |
| <i>TC</i><br>[μV/°C]<br>([ppm/°C])                    | 24.2<br>(265)       | 84.8<br>(86)      | 10.9/N.A<br>(62/N.A)        | 42.5<br>(165)              | 28.8<br>(23)               | 23.4<br>(104)                 | 15.9<br>(43.1)                | 86.5<br>(252.2)   | 26.0<br>(123.7) | 37.4<br>(75)                      | 32.2<br>(26)    | 15.0<br>( <b>12.75</b> ) | <b>10.1</b> (42.1)                 | 19.9<br>(27)                   | 22.9<br>(33)                      |
| temperature<br>range<br>(°C)                          | 0-120               | -40 - 85          | -20 - 80                    | 0 - 125                    | 0 - 100                    | 0-120                         | -40 -125                      | -40 - 60          | 0 - 100         | 0-80                              | 0-110           | -10 - 110                | -20 - 120                          | 0-170                          | -20 -<br>100                      |
| line<br>sensitivity<br>[µV/V]<br>([%/V])              | 144.5<br>(0.16)     | 3,747.6<br>(0.38) | 58.2/144.5<br>(0.033/0.044) | 1,143.3<br>(0.444)         | 3,875<br>(0.31)            | 338<br>(0.15)                 | 99.5<br>(0.027)               | 1,611.2<br>(0.47) | 630.3<br>(0.3)  | 9,960<br>(2)                      | 990.4<br>(0.08) | 2,328.5<br>(0.198)       | 12.2<br>(0.0051)                   | 1,987.2<br>(0.27)              | 138.5<br>(0.02)                   |
| overall<br>accuracy***<br>[mV]<br>([%])               | <b>2.8</b> (3.0)    | 13.1<br>(1.3)     | 4.4/NA<br>(2.5/NA)          | 32.5<br>(12.6)             | 32.6<br>(2.6)              | 5.5<br>(2.5)                  | 4.8<br>(1.3)                  | 55.2<br>(16.1)    | 30.3<br>(14.4)  | 14.7<br>(3.0)                     | 17.9<br>(1.4)   | 8.5<br>( <b>0.7</b> )    | 6.6<br>(2.8)                       | 14.1<br>(1.9)                  | 5.5<br>(0.8)                      |

\* thick-oxide transistor

\*\* within-lot measurements for fair comparison

\*\*\* accuracy (mV) = process sensitivity (mV)·3 + line sensitivity (mV/V)·0.3 V + TC (mV/°C)·50 °C (i.e., 3$\sigma$ within-lot variations, 0.3-V voltage change, and 50 °C temperature deviation)

Table 2.2: Performance summary and comparison with the state-of-the-art voltage references (best performance in bold).
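The overall accuracy metric defined in the footnote of Table 2.2 can be checked with a few lines of arithmetic. The following is a minimal sketch that applies that formula to three 180-nm entries of the table; the function and variable names are illustrative choices and are not taken from the original work.

```python
def overall_accuracy_mv(process_mv, line_uv_per_v, tc_uv_per_c,
                        n_sigma=3.0, delta_v=0.3, delta_t=50.0):
    """Absolute V_REF accuracy in mV, following the footnote of Table 2.2.

    process_mv    : 1-sigma process sensitivity [mV]
    line_uv_per_v : absolute line sensitivity [uV/V]
    tc_uv_per_c   : absolute temperature coefficient [uV/degC]
    """
    process_term = n_sigma * process_mv            # 3-sigma within-lot spread
    line_term = line_uv_per_v * 1e-3 * delta_v     # uV/V -> mV/V, 0.3-V change
    temp_term = tc_uv_per_c * 1e-3 * delta_t       # uV/degC -> mV/degC, 50 degC
    return process_term + line_term + temp_term


# (process sensitivity [mV], line sensitivity [uV/V], TC [uV/degC]) from Table 2.2
references = {
    "this work": (0.51, 144.5, 24.2),
    "[58]": (1.42, 338.0, 23.4),
    "[59]": (1.31, 99.5, 15.9),
}

for name, params in references.items():
    print(f"{name}: {overall_accuracy_mv(*params):.2f} mV")
```

For the proposed reference, the process and temperature terms (about 1.5 mV and 1.2 mV) dominate, while the line sensitivity contributes less than 0.05 mV to the 2.8-mV total.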



Fig. 2.13: (a) Power and (b) absolute  $V_{REF}$  accuracy vs.  $V_{min}$  in state-of-the-art voltage references fabricated in the same 180-nm technology generation.

## 2.2 From a Voltage Reference to a Current Reference

Current references are key building blocks of analog and mixed-signal circuits to be used in IoT sensor nodes. For instance, one of their main tasks is to fix the bias point of the amplifier stages. Therefore, the aforementioned constraints for energy-harvested IoT systems (i.e., low standby power consumption, low voltage operation, small area) are clearly transferred to the design specifications of current references. Here, an NMOS-only current reference exploiting the structure used for the above-described voltage reference is presented and validated by means of silicon measurements on a 180-nm test chip. Experimental results demonstrate operation from 1.8 V down to 0.6 V and 72-pW static power consumption, while ensuring 4,000- $\mu$ m<sup>2</sup> area occupancy.

### 2.2.1 Proposed architecture and operating principle

A feasible and quite intuitive solution to design a current reference circuit is to exploit a voltage reference architecture as voltage generator to bias an output load transistor, while compensating to first order the temperature dependence of its drain current ($I_D$) [39]. As shown in Fig. 2.14(a), the latter exhibits two opposite trends. On one hand, at low $V_{GS}$, $I_D$ increases with the temperature, which is mainly ascribed to the decrease of $V_{TH}$ and the increase of the thermal voltage ($V_T$). Conversely, at high $V_{GS}$, $I_D$ decreases with the temperature owing to the decrease of the charge carrier mobility. The transition point between these two operating regions is usually referred to as the zero-temperature-coefficient (ZTC) point [39], where the temperature dependence of the current is ideally zero due to the compensation of the above opposite effects. In practice, as shown in Fig. 2.14(b), the *TC* reaches a minimum that is not exactly zero at a specific $V_{GS}$. Accordingly, this point can also be referred to as the minimum-TC (MTC) voltage point ($V_{MTC}$), instead of the ZTC point.



Fig. 2.14: (a)  $I_D$ - $V_{GS}$  characteristics of a typical nMOSFET at different temperatures, (b) temperature coefficient *TC* versus  $V_{GS}$ , and (c) required  $\Delta V_{GS}$  at different  $V_{GS}$  for temperature compensation [39].

A standard approach consists of designing a current reference circuit able to reach and maintain the MTC point as operating point for the output load transistor to ensure a *TC* as low as possible. Typically, this approach to implement the temperature compensation relies on the use of a voltage reference with an output voltage equal to $V_{MTC}$, while the load transistor converts this $V_{MTC}$ at its gate terminal into the reference current (*I*<sub>REF</sub>) at its drain terminal. This design approach requires a voltage reference circuit to generate the precise bias point for the output MOSFET close to $V_{MTC}$. An alternative approach consists of replacing the voltage reference at $V_{MTC}$ with a circuit that generates an output voltage (*V*<sub>X</sub>) whose temperature dependence compensates to first order that of the drain current of the load transistor [39]. To this aim, for $V_{GS} < V_{MTC}$ a complementary-to-absolute-temperature (CTAT) voltage has to be used, as shown in Fig. 2.14(c). On the contrary, for $V_{GS} > V_{MTC}$, a proportional-to-absolute-temperature (PTAT) voltage is required. By adopting this alternative approach, the load transistor can ideally be biased at any $V_{GS}$, but in practice it has to be biased in the proximity of $V_{MTC}$ to reach low *TC* values.
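The sign of the required compensating voltage can be made explicit with a first-order expansion. The sketch below assumes that $I_{REF}$ depends on temperature only through the explicit device dependence at fixed gate bias and through $V_X(T)$; the symbol $g_m$ (the transconductance of the load transistor at the bias point) is introduced here for illustration and does not appear in the original derivation.

$$\frac{dI_{REF}}{dT} = \left.\frac{\partial I_D}{\partial T}\right|_{V_{GS}=V_X} + g_m\,\frac{dV_X}{dT} = 0 \quad\Rightarrow\quad \frac{dV_X}{dT} = -\frac{1}{g_m}\left.\frac{\partial I_D}{\partial T}\right|_{V_{GS}=V_X}$$

Since $g_m > 0$, a bias below $V_{MTC}$ (where $\partial I_D/\partial T > 0$) calls for $dV_X/dT < 0$, i.e., a CTAT voltage, whereas a bias above $V_{MTC}$ (where $\partial I_D/\partial T < 0$) calls for $dV_X/dT > 0$, i.e., a PTAT voltage, consistent with Fig. 2.14(c).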

By following the above-described design approach, Fig. 2.15 shows the conceptual diagram and the schematic of the proposed NMOS-only current reference circuit. It consists of a voltage generator block, whose output voltage ($V_X$) is used as gate bias for a load transistor M<sub>LOAD</sub> (see Fig. 2.15(a)). As shown in Fig. 2.15(b), the voltage reference presented in Section 2.1 is exploited for the voltage generator, i.e., M1-M7 transistors, with the only difference of using two stacked transistors (M1-M2) instead of three in the core reference generation block. Indeed, the pW power consumption of the voltage reference of Section 2.1 along with the small area and excellent absolute accuracy make it a good candidate for a precise bias block in a current reference design. More specifically, since the proposed current reference is based on the voltage reference of Section 2.1, it inherits its intrinsic advantages, such as reduced area occupation, low voltage and low power consumption. From Fig. 2.15(b), it is also worth noting that M<sub>LOAD</sub> is implemented by a stack of 9 series-connected transistors, i.e., M8(1)-(9). This choice aims at overcoming the limitation in the maximum value of the channel length allowed by the adopted technology in order to minimize the sensitivity of $I_{REF}$ to the drain voltage of M<sub>LOAD</sub> (i.e., $V_{OUT}$), which represents the load sensitivity of the current reference.



Fig. 2.15: (a) Conceptual diagram and (b) schematic of the proposed current reference.

According to the circuit of Fig. 2.15(b), $V_X$ and $V_B$ are analytically expressed by Equations (2.3) and (2.4) of Section 2.1, respectively, where for $V_X$ (corresponding to $V_{REF}$ of the voltage reference circuit) the two stacked transistors M1 and M2 are lumped into an equivalent transistor M1-2. With a proper choice of transistor sizing and flavors, it is then possible to achieve a quite precise $V_X$ value (by exploiting the body biasing control through $V_B$) close to the $V_{MTC}$ of the load transistor and with a temperature dependence (CTAT or PTAT) which compensates to first order that of the drain current of M<sub>LOAD</sub>.

The static power consumption ( $P_{static}$ ) of the current reference of Fig. 2.15 is given by the sum of two contributions, as expressed by [39]

$$P_{static} = V_{DD}I_{DD} + V_{OUT}I_{REF}$$
(2.5)

where $I_{DD}$ is the supply current of the voltage generator block in Fig. 2.15. From (2.5), it is possible to define as a FoM the ratio $I_{TOT}/I_{REF}$ [69]-[70], where $I_{TOT} = I_{DD} + I_{REF}$, which represents a measure of the power efficiency of a current reference. Indeed, the closer $I_{TOT}/I_{REF}$ is to unity, the lower the current absorbed from the supply to generate the reference current, thus minimizing the power overhead of the reference.
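As a numerical illustration of Equation (2.5) and of the $I_{TOT}/I_{REF}$ FoM, the following minimal sketch plugs in the measured values reported for the 180-nm test chip in Section 2.2.2 and Table 2.4 (mean $I_{DD}$ of 120 pA, $I_{REF}$ of 190.7 nA, $V_{DD} = V_{OUT} = 0.6$ V at 25 °C). The function names are illustrative only.

```python
V_DD = 0.6          # supply voltage [V]
V_OUT = 0.6         # drain voltage of the load transistor [V]
I_DD = 120e-12      # mean measured supply current of the voltage generator [A]
I_REF = 190.7e-9    # measured reference current [A]


def static_power(v_dd, i_dd, v_out, i_ref):
    """Equation (2.5): P_static = V_DD * I_DD + V_OUT * I_REF."""
    return v_dd * i_dd + v_out * i_ref


def current_efficiency(i_dd, i_ref):
    """FoM I_TOT / I_REF with I_TOT = I_DD + I_REF (ideal value: 1)."""
    return (i_dd + i_ref) / i_ref


p_generator = V_DD * I_DD                          # power excluding I_REF
p_total = static_power(V_DD, I_DD, V_OUT, I_REF)   # total static power
fom = current_efficiency(I_DD, I_REF)

print(f"power excluding I_REF: {p_generator * 1e12:.1f} pW")
print(f"total power:           {p_total * 1e9:.1f} nW")
print(f"I_TOT / I_REF:         {fom:.4f}")
```

With these inputs, the generator term is 72 pW and the total is about 114.5 nW, so almost all of the static power is delivered as reference current.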

### 2.2.2 Measurement results in 180-nm process

A 180-nm prototype based on the current reference circuit of Fig. 2.15 was designed using the transistor sizing and flavors reported in Table 2.3. In order to enable the above design approach (i.e., $V_X$ close to the $V_{MTC}$ of the load transistor), a dual-threshold design technique was adopted [39]. More specifically, a native-$V_{TH}$ (NVT) device with low $V_{MTC}$ (about 300 mV) was used for the load transistor to achieve low-voltage operation. At the same time, to obtain $V_X \approx V_{MTC}$, in the voltage generator block the upper transistors M3, M5 and M7 need to be significantly stronger than the lower transistors. Accordingly, NVT devices were used for M3, M5 and M7, whereas RVT devices were adopted for the other transistors (see Table 2.3). Note that, under the adopted transistor sizing strategy, the voltage generator block generates $V_X$ slightly larger than the $V_{MTC}$ of M<sub>LOAD</sub>, with a PTAT behavior for first-order compensation of the temperature dependence of $I_{REF}$ at such bias (according to Fig. 2.14). In addition, the channel length of M8(1)-(9) is set to the maximum value allowed by the design rules, in order to improve the load sensitivity. Finally, two 1.8-pF MIM capacitances with 30 $\mu$m × 30 $\mu$m area were connected at the $V_X$ and $V_B$ nodes to improve the dynamic response of the circuit.

| Transistor | Type | W/L       |
|------------|------|-----------|
| M1-M2      | RVT  | 1µm/20µm  |
| M3         | NVT  | 35µm/10µm |
| M4, M6     | RVT  | 30µm/2µm  |
| M5, M7     | NVT  | 5µm/10µm  |
| M8(1)-(9)  | NVT  | 1µm/20µm  |

Table 2.3: Adopted transistor flavor and sizing for the proposed current reference.

The fabricated test chip occupies a silicon area of about 4,000  $\mu$ m<sup>2</sup> (41 $\mu$ m×97 $\mu$ m), as shown by the die micrograph and the layout in Fig. 2.16. Wafer-level characterization across 15 dice from the same lot was carried out by using a Cascade SUMMIT 11861B probe (equipped with a Temptronic chuck temperature controller) and a Keithley 4200-SCS parameter analyzer.



Fig. 2.16: Die photo and layout of the proposed current reference.

The reference current measured across voltages and temperatures is plotted in Figs. 2.17(a)-(b) for a typical sample. In the test chip characterization, the supply voltage is swept from 0 V up to 1.8 V, and the temperature from 0 °C up to 100 °C, while setting $V_{OUT} = 0.6$ V (corresponding to the minimum operating voltage $V_{min}$). In particular, Fig. 2.17(a) shows the reference current $I_{REF}$ versus the supply voltage $V_{DD}$ at several temperatures. From Fig. 2.17(a), we can note that the proposed circuit starts working properly at supply voltages as low as 0.6 V, regardless of the operating temperature. The resulting output current is $I_{REF} \approx 190$ nA at room temperature, and has a temperature coefficient of 1473 ppm/°C at 0.6 V and across the above temperature range. In addition, we can observe that the effect of voltage fluctuations in the wide 0.6-1.8 V range is minor (line sensitivity of 0.153 %/V at room temperature), as shown by the vertical patterns in the color map across voltages and temperatures of Fig. 2.17(b).



Fig. 2.17: (a) Measured reference current vs. supply voltage at several temperatures, (b) color map of reference current across supply voltages and temperatures ( $V_{OUT}$  = 0.6 V).

Figs. 2.18(a)-(b) show the measured voltage and temperature dependence of the supply current $I_{DD}$ for a typical sample (at $V_{OUT}$ = 0.6 V). More specifically, $I_{DD}$ versus $V_{DD}$ at room temperature is shown in Fig. 2.18(a). The $I_{DD}$ at $V_{min}$ is about 123 pA, and it is nearly voltage-independent over the entire voltage range, with an increase of only 2.4% at 1.8 V. On the contrary, the current drawn from the supply has a stronger dependence on the temperature owing to transistor sub-threshold operation. In particular, from Fig. 2.18(b) the power consumption increases exponentially with temperature, leading to an increase of about 154× from 0 °C to 100 °C.
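To put the 154× increase into perspective, the short sketch below converts it into an equivalent doubling interval. It assumes a purely exponential dependence with a constant rate over the whole 0-100 °C range, which is a simplification introduced here rather than a fitted model from the measurements.

```python
import math


def doubling_interval(ratio, delta_t):
    """Temperature step [degC] over which the current doubles.

    Assumes I_DD(T) = I_DD(T0) * exp(k * (T - T0)) with a constant rate k,
    where `ratio` is the total increase observed over `delta_t` degC.
    """
    k = math.log(ratio) / delta_t   # exponential rate [1/degC]
    return math.log(2.0) / k


# Measured increase of the supply current from 0 degC to 100 degC
step = doubling_interval(154.0, 100.0)
print(f"I_DD doubles roughly every {step:.1f} degC")
```

Under this assumption, the supply current doubles roughly every 14 °C.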



Fig. 2.18: Measured supply current versus (a) the supply voltage at 25 °C and (b) the temperature at $V_{DD} = 0.6$ V, for a typical sample ($V_{OUT} = 0.6$ V in both cases).

The dependence of the reference current on $V_{OUT}$, ranging from 0.6 V up to 1.8 V at $V_{DD}$ = 0.6 V and room temperature (i.e., the load sensitivity of the reference circuit), is shown in Figs. 2.19(a) and (b) for a typical sample and across 15 dice, respectively. Thanks to the 9 series-connected devices composing the load transistor (see Fig. 2.15), the load sensitivity for all tested samples is quite good, with an average value of 0.11 %/V and a standard deviation of 0.01 %/V, thus corresponding to a variability of 9%.



Fig. 2.19: Measured reference current vs.  $V_{OUT}$ , i.e., load sensitivity for (a) a typical sample and (b) across 15 die samples at  $V_{DD} = 0.6$  V and 25 °C.

The robustness of the proposed current reference against process variations was also investigated by characterizing the main FoMs across 15 dice, as reported in the histograms of Figs. 2.20(a)-(d). From Fig. 2.20(a), the mean value  $\mu$  of  $I_{REF}$  and its standard deviation  $\sigma$  are respectively 190.8 nA and 2.7 nA, leading to a process sensitivity  $\sigma/\mu$  of 1.4%. The impact of process variations on the temperature coefficient *TC* across the 0-100 °C temperature range is reported in Fig. 2.20(b). The histogram shows a mean value of 1480 ppm/°C and a standard deviation of 102 ppm/°C, which correspond to the rather limited variability of 6.9%. The effect of process variations on the supply current at room temperature is illustrated in Fig. 2.20(c). The measured  $I_{DD}$  shows a mean value of 120 pA and a standard deviation of 22 pA, which leads to a fairly pronounced variability of about 18%. The line sensitivity dependence on process variations across the 0.6-1.8 V range is quantified in Fig. 2.20(d) at 25 °C. The resulting mean and standard deviation are respectively 0.15 %/V and 0.02 %/V, which correspond to a 13% variability.
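The variability figures quoted above are the ratio $\sigma/\mu$ of each measured distribution. The following minimal sketch recomputes them from the reported means and standard deviations across the 15 dice; the dictionary keys and the function name are illustrative.

```python
def variability_percent(mean, std):
    """Relative spread sigma/mu expressed in percent."""
    return 100.0 * std / mean


# (mean, standard deviation) as reported for Figs. 2.20(a)-(d)
measurements = {
    "I_REF [nA]": (190.8, 2.7),
    "TC [ppm/degC]": (1480.0, 102.0),
    "I_DD [pA]": (120.0, 22.0),
    "line sensitivity [%/V]": (0.15, 0.02),
}

for name, (mean, std) in measurements.items():
    print(f"{name}: sigma/mu = {variability_percent(mean, std):.1f} %")
```

The supply current shows the largest relative spread, but since $I_{DD}$ is about three orders of magnitude smaller than $I_{REF}$, its variability has a negligible effect on the total current drawn.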



Fig. 2.20: Measurements histogram across 15 die samples: (a)  $I_{REF}$  at 25 °C and  $V_{DD} = V_{OUT} = 0.6$  V, (b) temperature coefficient at  $V_{DD} = V_{OUT} = 0.6$  V, (c) supply current at 25 °C and  $V_{DD} = V_{OUT} = 0.6$  V, (d) line sensitivity at 25°C and  $V_{OUT} = 0.6$  V.

Finally, similar to the voltage reference of Section 2.1, the influence of the body biasing control implemented in the voltage generator block of Fig. 2.15 (i.e., M4-M5 generating the voltage $V_B$) on the line sensitivity and *TC* was also investigated by characterizing the test samples while selectively activating or suppressing the feedforward control path through $V_B$.

The effect of body biasing on the line sensitivity at  $V_{OUT} = 0.6$  V is shown in Figs. 2.21(a)-(b), where measurements across voltages were carried out by driving the body bias voltage  $V_B$  of M1-M2 in Fig. 2.15 by either the output of the body bias generator M4-M5 or the constant voltage corresponding to  $V_B$  that is generated by M4-M5 at  $V_{DD} = 0.6$  V. In the former case, the enablement of body biasing adaptation at every supply voltage leads to a linear increase in  $V_B$  by up to 0.6% from Fig. 2.21(a). Such  $V_B$  increase with  $V_{DD}$  allows improving the mean value of the line sensitivity by  $1.4\times$ , and the worst case by  $1.3\times$ , as shown in the cumulative distribution function in Fig. 2.21(b).



Fig. 2.21: Benefit of body biasing on line sensitivity at 25 °C and $V_{OUT} = 0.6$ V: (a) $V_B$ normalized to the value at $V_{DD} = 0.6$ V vs. $V_{DD}$ for a typical sample, (b) cumulative distribution function (CDF) of line sensitivity across die samples.

The effect of body biasing on the temperature coefficient at $V_{DD} = V_{OUT} = 0.6$ V is shown in Figs. 2.22(a)-(b). In particular, Fig. 2.22(a) shows a CTAT behavior for the $V_B$ generated by M4-M5, with a linear decrease of up to 6.5% at a rate of 193 $\mu$V/°C. According to Equation (2.3) of Section 2.1, this favors a PTAT behavior for $V_X$, as required for temperature compensation when $V_X$ is larger than the $V_{MTC}$ of M<sub>LOAD</sub> (see Fig. 2.14). Such temperature adaptation of $V_B$ reduces the temperature coefficient by about 9% (for both mean value and worst case) as compared to the case with constant $V_B$, as shown in Fig. 2.22(b).



Fig. 2.22: Benefit of body biasing on temperature coefficient at  $V_{DD} = V_{OUT} = 0.6$  V: (a)  $V_B$  normalized to the value at 0°C vs. temperature, (b) cumulative distribution function (CDF) of temperature coefficient across die samples.

### 2.2.3 Comparison with the state of the art

The comparison of the proposed current reference against recent state-of-the-art designs is summarized in Table 2.4 and Figs. 2.23(a)-(b), considering only measured data. Unlike Table 2.4, Figs. 2.23(a)-(b) only refer to solutions implemented in 180-nm CMOS technology for a fair comparison. Note that Table 2.4 reports power consumption data both excluding and including (i.e., total power) the contribution of the generated reference current, while also evaluating the ratio $I_{TOT}/I_{REF}$. From Fig. 2.23(a), the proposed current reference shows the best trade-off between power consumption (excluding the contribution of the generated reference current) and area. From Fig. 2.23(b), the proposed reference circuit exhibits the lowest $I_{TOT}/I_{REF}$ ratio (very close to unity). At the same time, the proposed current reference shows a low $V_{min}$ of 0.6 V, which is only higher than [39] by 1.3× considering designs in the same technology. Low-voltage operation combined with the best line sensitivity in Table 2.4 (i.e., 0.15 %/V, corresponding to a 3.8-50× improvement over prior art) makes the proposed reference circuit a good solution for SoCs powered by rechargeable batteries (e.g., 2 Ni-Cd series batteries vary from 2.6 V at full charge down to 1.8 V when discharged) without the need of voltage regulation at the system level.

Table 2.4 also shows that the proposed reference exhibits a low load sensitivity of only 0.11 %/V, which corresponds to a $2.3 \times$ improvement as compared to [74].

Table 2.4 also reports that the proposed circuit has a low process sensitivity of 1.4%, which is only $1.1 \times$ higher than [77] and $1.2-12.5 \times$ better than the other prior art references. From the same table, the measured *TC* of 1460 ppm/°C is only better than [72], with a 2.5-9.7× degradation with respect to the other designs.

Overall, the proposed reference thus provides a favorable tradeoff among power, minimum operating voltage, power efficiency and area occupation, at the only cost of an increased temperature coefficient.

|                              | This<br>work | [39]<br>IJCTA<br>2018 | [69]<br>ISCAS<br>2019 | [71]<br>TCASII<br>2016 | [72]<br>TCASII<br>2005 | [73]<br>TCASII<br>2020 | [74]<br>ESSCIRC<br>2014 | [75]<br>TCASII<br>2020 | [76]<br>IET<br>2018 | [77]<br>JSSC<br>2020 |
|------------------------------|--------|---------|--------|--------|--------|---------|---------|---------|--------|--------|
| Technology [nm]              | 180    | 180     | 180    | 180    | 1,500  | 180     | 180     | 180     | 65     | 180    |
| $V_{min}$ [V]                | 0.6    | 0.45    | 1.2    | 1.25   | 1.1    | 0.8     | 1.2     | 0.7     | 0.4    | 1.5    |
| Power excluding $I_{REF}$<br>@($V_{DD}$, T) [nW] | 0.072<br>(0.6 V, 25°C) | 10.2<br>(0.6 V, 20°C) | 650<br>(1.2 V, 25°C) | 500<br>(1.8 V, 25°C) | 1.55<br>(1.1 V, 25°C) | 39.3<br>(0.8 V, 25°C) | 0.023<br>(1.2 V, 25°C) | 28<br>(0.7 V, 25°C) | 0.0032<br>(0.4 V, 20°C) | 3<br>(1.5 V, 25°C) |
| Total power<br>@($V_{DD}$, T) [nW] | 114.5<br>(0.6 V, 25°C) | 213<br>(0.6 V, 20°C) | 820<br>(1.2 V, 25°C) | 670<br>(1.8 V, 25°C) | 2<br>(1.1 V, 25°C) | 48.6<br>(0.8 V, 25°C) | 0.047<br>(1.2 V, 25°C) | 35<br>(0.7 V, 25°C) | 0.0037<br>(0.4 V, 20°C) | 4.5<br>(1.5 V, 25°C) |
| $I_{REF}$<br>@($V_{DD}$, T) [nA] | 190.7<br>(0.6 V, 25°C) | 338<br>(0.6 V, 20°C) | 142.5<br>(1.2 V, 25°C) | 92.3<br>(1.5 V, 25°C) | 0.41<br>(1.1 V, 25°C) | 11.6<br>(0.8 V, 25°C) | 0.020<br>(1.2 V, 25°C) | 9.97<br>(0.7 V, 25°C) | 0.0012<br>(0.4 V, 20°C) | 1<br>(1.5 V, 25°C) |
| Process sensitivity [%]      | 1.4    | 2.7     | 9.4    | 6.1    | N.A.   | N.A.    | 1.9     | 1.6     | 17.5   | 1.26   |
| <i>TC</i> [ppm/°C]           | 1460   | 578     | N.A.   | 177    | 2500   | 169     | 780     | 150     | 469    | 289    |
| <i>T</i> range [°C]          | 0:100  | 0:80    | -40:85 | -40:85 | -20:70 | -40:120 | 0:80    | -40:125 | -20:60 | -20:80 |
| Line sensitivity<br>[%/V]    | 0.15   | 4.4     | 1.45   | 7.5    | 6      | 1.08    | 0.58    | 0.6     | 2.5    | 1.4    |
| Load sensitivity<br>[%/V]    | 0.11   | N.A.    | N.A.   | N.A.   | N.A.   | N.A.    | 0.25    | N.A.    | N.A.   | N.A.   |
| $I_{TOT}/I_{REF}$ (×)        | 1.01   | 1.05    | 4.80   | 1.35   | 1.29   | 5.26    | 1.96    | 5.01    | 7.69   | 3.00   |
| Area [mm<sup>2</sup>]        | 0.004  | 0.00075 | 0.02   | 0.0013 | 0.046  | 0.054   | 0.0382  | 0.055   | 0.008  | 0.332  |

Table 2.4: Performance summary and comparison with the state-of-the-art current references (only measured data and best performance in bold).



Fig. 2.23: (a) Power excluding  $I_{REF}$  contribution vs. area and (b)  $I_{TOT}/I_{REF}$  vs.  $V_{min}$  in state-of-the-art current references fabricated in the same 180-nm technology generation.

# Chapter 3: Design of a Corner-Aware CMOS Voltage Reference for Purely-Harvested Systems

Purely-harvested sensor nodes require low-cost building blocks with a small form factor that are able to operate down to low voltages and power by solely relying on harvesters as energy source. Unfortunately, operation at very low voltages and power enhances the sensitivity to process variations and the inherently conflicting design goals across process corners. These issues are typically mitigated by using post-fabrication trimming techniques. In this regard, this chapter introduces the design of a global variation-aware voltage reference with ULP/ULV operation, competitive sensitivity to process variations, and overall accuracy against PVT variations. The circuit is based on design replicas optimized at different corners and an on-chip process sensor, which allows selecting the best combination (i.e., selection or merge) of replicas. Experimental results in 180-nm CMOS technology across corner wafers are provided to demonstrate the effectiveness of the proposed solution. Compared to a conventional single-replica design, replica selection/combination leads to $4 \times$ lower process sensitivity across corner wafers.

## **3.1 Circuit Architecture and Operating Principle**

ULP/ULV circuit design often involves trimming techniques to deal with the exacerbated process variations and the conflicting nature of design targets across process corners. Conventional trimming methods require post-fabrication testing and circuit adjustment to calibrate circuit parameters, thus increasing the chip cost. As an example, a common solution exploits parallel-connected transistors with different strengths, which can be activated or deactivated through a switching network controlled by the so-called trimming bits [55], [78]. Alternative solutions (e.g., used in ADCs) are based on automatic trimming algorithms [79]. Typically, built-in self-calibration procedures are carried out using additional circuitry at the cost of larger area occupation. Furthermore, automatic self-calibration algorithms are usually application-specific [80].

Here, the design of an NMOS-only corner-aware ULP/ULV voltage reference (VR) is presented for purely-harvested operation. Fig. 3.1 illustrates the scheme and the operating principle of the proposed architecture. Instead of adopting a fixed design as a compromise across corners and conflicting design targets (e.g., process sensitivity, temperature coefficient, etc.), the circuit selects/combines VR design replicas optimized at three different process corners (i.e., VR1 for SS corner, VR2 for TT corner, and VR3 for FF corner) to relax design conflicts and improve performance across global variations. To this aim, an on-chip process sensor is used to select the best combination (i.e., selection or merge) of replicas at boot time, avoiding any post-fabrication testing effort and trimming for low-cost applications. As shown in Fig. 3.1, short-circuiting the outputs of VR1 and VR2 (thus obtaining their average) allows covering the intermediate sub-corner between SS and TT corners. Similarly, the intermediate sub-corner between TT and FF corners is covered by short-circuiting the outputs of VR2 and VR3. Note also that the proposed approach is general and applicable to different basic reference circuits and process sensors.



Fig. 3.1: Scheme and operating principle of the proposed NMOS-only corner-aware architecture.

## 3.2 Oscillator-Based On-Chip Process Sensor

Fig. 3.2 shows the architecture of the proposed process sensor. It comprises a slow NMOS oscillator that counts the cycles of a fast NMOS oscillator. Transistor size in the two oscillators is purposely differentiated (size A and B for the fast and slow oscillators, respectively) to induce a threshold voltage difference across the whole range of global variations (from SS to FF corner). In particular, such *V*<sub>TH</sub> difference monotonically depends on global variations, as illustrated in Fig. 3.3 (the different sizing raises the threshold voltage of the slow oscillator above that of the fast oscillator by 12.5 mV at SS, 33 mV at TT and 53.5 mV at FF). The threshold shift is thus translated into a global variation-dependent frequency ratio (i.e., count) between the fast small-sized and the slow larger-sized oscillator, namely $f_H/f_L$, where $f_H$ and $f_L$ are the frequencies of the fast and slow oscillators, respectively. The ratio is evaluated through a counter (see Fig. 3.2), which ultimately identifies the global variation bin that the chip lies in.
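As a minimal sketch of the readout described above, the snippet below models the counter as the number of fast-oscillator cycles accumulated during a fixed number of slow-oscillator periods, so that the count is proportional to $f_H/f_L$. The function name, the window length and the frequencies used in the example are illustrative assumptions, not values taken from the measurements.

```python
def sensor_count(f_fast_hz: float, f_slow_hz: float, slow_cycles: int = 1) -> int:
    """Fast-oscillator cycles counted during `slow_cycles` slow-oscillator periods.

    The result is the integer part of (f_H / f_L) * slow_cycles, which is how an
    ideal counter gated by the slow oscillator would behave (assumed model).
    """
    if f_fast_hz <= 0 or f_slow_hz <= 0 or slow_cycles <= 0:
        raise ValueError("frequencies and window length must be positive")
    return int(f_fast_hz * slow_cycles / f_slow_hz)


# Hypothetical frequencies: a common scaling of both oscillators (e.g., with
# temperature) leaves the count unchanged, because only the ratio matters.
print(sensor_count(50e6, 1e6, slow_cycles=4))
print(sensor_count(100e6, 2e6, slow_cycles=4))
```

Because only the ratio enters the count, no accurate time base is needed in this model, which is in line with the motivation given in the text.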



Fig. 3.2: Process sensor architecture.



Fig. 3.3: Operating principle of the process sensor: process corners can be detected with a pair of circuits sized as A and B by reading out a circuit parameter that depends on  $V_{TH}$  (i.e., the oscillator frequency).

Both NMOS-only oscillators were designed using ratioed logic and a ring oscillator structure, as shown in Fig. 3.4. More specifically, the fast oscillator consists of a 7-stage circuit, where each stage is implemented with a stack of 3 NMOS for both pull-down and pull-up networks.

Conversely, the slow oscillator adopts a 41-stage architecture, where each stage is implemented with one NMOS and a stack of 2 NMOS for pull-down and pull-up networks, respectively. The use of different stage topologies is aimed at achieving a similar temperature dependence for the two oscillators, while not implying any influence on the dependence of the frequency on process corners. The analogous temperature dependence of the two oscillators makes the count robust against environmental changes and allows correct replica selection/combination during in-field operation (e.g., boot time), suppressing any testing or trimming effort, and the need for any accurate time basis for corner identification.



Fig. 3.4: Schematic of implemented fast and slow oscillators.

In general, the replica count $N$ is set to cover $2N-1$ variation bins as a tradeoff between complexity and adaptability to variations. The proposed approach allows mitigating the impact of process variations (particularly beneficial at low power and $V_{min}$), improving performance by breaking the conventionally rigid and conflicting tradeoffs across corners, and can be progressively adapted to different levels of process maturity via simple count threshold reprogramming.
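The mapping from sensor count to enabled replicas can be sketched as follows. This is an illustrative model only: it assumes that the count grows monotonically from the SS to the FF corner, that bins are indexed from 0 (slowest) to $2N-2$ (fastest), and that the count thresholds are programmable values. The function names and the threshold values in the example are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple


def replicas_for_bin(bin_index: int, n_replicas: int) -> Tuple[int, ...]:
    """Map a variation bin (0 .. 2N-2) to the replica indices (0 .. N-1) to enable.

    Even bins select a single replica; odd bins short-circuit two adjacent
    replicas so that their outputs are averaged.
    """
    n_bins = 2 * n_replicas - 1
    if not 0 <= bin_index < n_bins:
        raise ValueError(f"bin_index must be in [0, {n_bins - 1}]")
    lower = bin_index // 2
    if bin_index % 2 == 0:
        return (lower,)
    return (lower, lower + 1)


def bin_from_count(count: int, thresholds: List[int]) -> int:
    """Locate the bin of a sensor count given 2N-2 ascending count thresholds."""
    return bisect_right(thresholds, count)


# Hypothetical thresholds for N = 3 (replica 0 = VR1, 1 = VR2, 2 = VR3).
thresholds = [100, 110, 120, 130]
for count in (95, 105, 115, 125, 135):
    print(count, replicas_for_bin(bin_from_count(count, thresholds), 3))
```

With three replicas this yields the five configurations of Fig. 3.1: VR1, VR1+VR2, VR2, VR2+VR3 and VR3. Adapting to a different level of process maturity only requires changing the `thresholds` list in this model.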

### 3.2.1 Measurement results in 180-nm process

Both fast and slow oscillators were implemented in a 1.8-V 180-nm CMOS technology using only RVT NMOS devices to validate the proposed process sensor design. Table 3.1 reports the adopted transistor sizing, while Fig. 3.5 shows the circuit layout and the chip micrograph. The resulting fast and slow oscillators occupy a silicon area of about 960  $\mu$ m<sup>2</sup> (26 $\mu$ m×37 $\mu$ m) and 40,000  $\mu$ m<sup>2</sup> (500 $\mu$ m×80 $\mu$ m), respectively. Measurements were performed on 45 samples coming from three different wafers of process corners (SS, TT, and FF), i.e., 15 samples per corner wafer. Measurements on corner wafers allow demonstrating the effectiveness of the proposed solution.

| Transistor | <b>W/L</b> [μm] | Transistor | <b>W/L</b> [μm] |
|------------|-----------------|------------|-----------------|
| M1         | 3×3.6/6.3       | M4         | 0.48/0.18       |
| M2         | 12.6/2.8        | Capacitor  |                 |
| M3         | 0.48/1.2        | C1         | 10/10 (MIM cap) |

Table 3.1: Transistor sizing for the implemented oscillators.



Fig. 3.5: Layout of the implemented oscillators and chip micrograph.

Regarding the design of the fast oscillator, Fig. 3.6 shows its frequency $f_H$ for $V_{DD} = 1.8$ V and 25°C at different sizes from corner simulation analysis. For all corners, the frequency decreases as the active area increases due to increased parasitic capacitance. In addition, with increasing size the observed frequencies at different corners move closer to each other, making the process corner harder to detect. Accordingly, the sizing corresponding to the minimum value of the active area reported in Fig. 3.6 (i.e., 6.5 $\mu$m<sup>2</sup>) was adopted to enable easier corner detection. With this sizing, the statistical distributions at different corners do not overlap with >97.7% confidence level (more than 2 standard deviations).
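The link between "2 standard deviations" and the 97.7% figure can be checked with a short calculation. The sketch below assumes a Gaussian distribution and a one-sided bound, which is one plausible reading of the statement above; the function name is illustrative.

```python
import math


def one_sided_confidence(n_sigma: float) -> float:
    """Standard normal CDF evaluated at n_sigma (one-sided confidence level)."""
    return 0.5 * (1.0 + math.erf(n_sigma / math.sqrt(2.0)))


# Probability that a Gaussian sample stays below mean + 2 sigma.
print(f"{100.0 * one_sided_confidence(2.0):.2f} %")
```

Under these assumptions, a separation of more than two standard deviations corresponds to a confidence level slightly above 97.7%.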



Fig. 3.6: Frequency of the fast oscillator as function of the active area for  $V_{DD} = 1.8$  V and 25°C from corner simulation analysis.

Fig. 3.7 shows the measured frequency ratio $f_H/f_L$ between the two oscillators as a function of $V_{DD}$ (ranging from 1.6 V up to the nominal value of 1.8 V) at 25 °C across corner chips, while Fig. 3.8 reports the measured $f_H/f_L$ across dice, temperature and voltage variations. From Fig. 3.7, we can observe that the resulting ratio $f_H/f_L$ allows differentiating the three process corners across voltages around 1.8 V at room temperature. The data reported in Fig. 3.8 then show that the measured ratio $f_H/f_L$ is well discriminated among corners across the 45 dice and the considered 0-70°C temperature range at 1.8 V, whereas the corner detection is expectedly less robust at lower voltages. More specifically, mismatch induces a 0.7-2% variability in the frequency ratio across voltages and temperatures, which still allows discriminating corners with >97.7% confidence level at any given voltage or temperature. Regarding the power consumption, the fast oscillator exhibits an average current of 85 $\mu$A for the TT corner, 105 $\mu$A for the FF corner, and 70 $\mu$A for the SS corner. Higher current consumption is expectedly observed in the 41-stage slow oscillator structure, i.e., 4 mA for the TT corner, 7.3 mA for the FF corner, and 5.8 mA for the SS corner. Nevertheless, it is worth pointing out that the process sensor runs only once (i.e., at boot time), thus not affecting the power dissipation of the whole system during normal operation.



Fig. 3.7: Measured frequency ratio between the two oscillators as a function of  $V_{DD}$  at 25 °C across corner dice.



Fig. 3.8: Measured statistical distribution of the frequency ratio between the two oscillators across dice, temperature and voltage variations.

## 3.3 Basic ULP/ULV Voltage Reference Circuit

The implemented voltage reference has an NMOS-only structure similar to the one presented in Chapter 2. Indeed, as shown in Fig. 3.9, it is based on an 8-transistor circuit comprising a body bias generation block, a deep n-well replica bias, and a core reference generation block. Therefore, all the considerations about the operating principle of the circuit of Chapter 2 are also valid for the circuit of Fig. 3.9. The only difference concerns the gate connection of M4, M6, and M8 (i.e., the upper transistors in the three blocks). In the design presented in Chapter 2, the gate of these devices is grounded (see Fig. 2.1). Here, the gate of M4, M6, and M8 is connected to the source, thus leading to a zero  $V_{GS}$ . This topological variant was introduced to enable a single-threshold design approach, i.e., using only RVT devices, which is essential when dealing with a corner-aware design (transistors with different flavors typically exhibit different  $V_{TH}$  shifts across corners). In particular, using RVT devices (i.e., with higher  $V_{TH}$  as compared to LVT ones) for M4, M6, and M8 in a reverse gate-biased configuration as in Chapter 2 would require a notable increase of their size to provide the current needed for proper circuit operation. This would translate into a significant increase of the area occupation, along with an increased effect of layout-dependent variations coming from skewed strength ratios between upper and bottom transistors in the circuit blocks.

For a detailed circuit analysis, we can refer to the simplified schematic shown in Fig. 3.10, where M1-M3 are lumped into a single transistor M13. Considering all transistors working in the subthreshold region with a drain-source voltage  $V_{DS} > 4V_T$  and following the same analytical approach used in Chapter 2, the following  $V_{REF}$  expression can be derived from Fig. 3.10:

$$V_{REF} = \frac{V_{TH0,M13} - \frac{n_{M13}}{n_{M4}} V_{TH0,M4} + n_{M13} \frac{kT}{q} \ln \left(\frac{I_{0,M4} W_{M4} L_{M13}}{I_{0,M13} W_{M13} L_{M4}}\right) - \lambda_{BB,M13} V_B}{1 + \frac{n_{M13}}{n_{M4}} \lambda_{BB,M4}}$$
(3.1)

where the term  $\lambda_{BB,M13}V_B$  quantifies the body biasing compensation on M13 with  $V_B$  given by:

$$V_B = \frac{V_{TH0,M5} - \frac{n_{M5}}{n_{M6}} V_{TH0,M6} + n_{M5} \frac{kT}{q} \ln \left( \frac{I_{0,M6} W_{M6} L_{M5}}{I_{0,M5} W_{M5} L_{M6}} \right)}{1 + \frac{n_{M5}}{n_{M6}} \lambda_{BB,M6}}.$$
(3.2)
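Eqs. (3.1) and (3.2) share the same structure, so they can be evaluated numerically with a single helper, as sketched below. All device parameters in the example are placeholders chosen only to exercise the formula: they are not extracted from the 180-nm process, and the function and argument names are hypothetical.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant [J/K]
Q_E = 1.602176634e-19  # elementary charge [C]


def stacked_reference(vth0_bot, vth0_top, n_bot, n_top, log_arg,
                      lambda_bb_top, temp_k=298.15, body_term=0.0):
    """Evaluate the common form of Eqs. (3.1) and (3.2), in volts.

    bot/top are (M13, M4) for Eq. (3.1) and (M5, M6) for Eq. (3.2).
    log_arg is (I0_top * W_top * L_bot) / (I0_bot * W_bot * L_top).
    body_term is lambda_BB,M13 * V_B for Eq. (3.1) and zero for Eq. (3.2).
    """
    if log_arg <= 0:
        raise ValueError("log_arg must be positive")
    v_thermal = K_B * temp_k / Q_E
    slope_ratio = n_bot / n_top
    numerator = (vth0_bot - slope_ratio * vth0_top
                 + n_bot * v_thermal * math.log(log_arg) - body_term)
    return numerator / (1.0 + slope_ratio * lambda_bb_top)


# Placeholder parameters (assumptions for illustration only).
v_b = stacked_reference(0.50, 0.48, 1.3, 1.3, 4.0, 0.05)        # Eq. (3.2)
v_ref = stacked_reference(0.50, 0.47, 1.3, 1.3, 2.0, 0.05,
                          body_term=0.1 * v_b)                  # Eq. (3.1)
print(f"V_B = {v_b * 1e3:.1f} mV, V_REF = {v_ref * 1e3:.1f} mV")
```

Sweeping `temp_k` in this helper shows how the thermal-voltage term and the body-bias term $\lambda_{BB,M13}V_B$ act against each other in Eq. (3.1), although meaningful numbers require the actual process parameters.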



Fig. 3.9: Schematic of the basic voltage reference circuit used for each corner replica.



Fig. 3.10: Simplified circuit analysis of the basic reference circuit in Fig. 3.9 with M1-M3 lumped into a single transistor M13.

### 3.3.1 Measurement results in 180-nm process with and without corner-aware replica combination

The basic voltage reference circuit of Fig. 3.9 was designed in a 180-nm CMOS technology using RVT devices. To exemplify the proposed approach, three replicas of the circuit were sized differently and optimized around the SS, TT and FF corners, as reported in Table 3.2. This allows covering five global variation bins, as described in Fig. 3.1. Indeed, global variation bins between SS and TT require an intermediate sizing and design tradeoff compared to the SS and TT replicas, which is simply obtained by enabling and short-circuiting the two replicas, suppressing the need for two additional intermediate replicas (similar considerations hold for intermediate variations between TT and FF). All three designed replicas occupy a similar silicon area of about 6,000 $\mu$m<sup>2</sup> (124$\mu$m×48$\mu$m), as shown in Fig. 3.11 illustrating the circuit layout and the die photo. Again, measurements were performed on 45 samples coming from three different wafers of process corners (SS, TT, and FF), i.e., 15 samples per corner wafer, to prove the effectiveness of the proposed solution.

| Transistor | <b>W/L</b> [µm] <b>SS</b> | <b>W/L</b> [µm] <b>TT</b> | <b>W/L</b> [µm] <b>FF</b> |
|------------|---------------------------|---------------------------|---------------------------|
| M1-M3      | <b>6.7/1.2</b>            | <b>6.6/1.2</b>            | <b>4.9/1.1</b>            |
| M4         | 7×14/20                   | 7×14/20                   | 7×14/20                   |
| M5, M7     | <b>1.6</b>/7              | <b>2</b>/7                | <b>1</b>/7                |
| M6, M8     | 2×10/8                    | 2×10/8                    | 2×10/8                    |

Table 3.2: Adopted transistor sizing for the three replicas of the voltage reference circuit, each optimized at different corners for minimum process sensitivity and temperature coefficient (in bold sizes varying across different optimizations).



Fig. 3.11: Layout of the voltage reference replicas and die micrograph.

Figs. 3.12-3.14 report measurement results for the circuit replica optimized for the TT corner only, on 15 dice coming from the TT corner wafer, to provide circuit performance in the case of typical behavior. More specifically, Fig. 3.12 shows the measured $V_{REF}$ versus the supply voltage $V_{DD}$ at 25 °C for one sample. Figs. 3.13(a) and (b) show the measured $V_{REF}$ and supply current $I_{V_{DD}}$, respectively, versus the temperature at $V_{DD} = 0.2$ V for one sample. Figs. 3.14(a) and (b) show the measured statistical distribution of the $V_{REF}$ (at 25°C) and temperature coefficient TC, both at $V_{DD} = 0.2$ V, across 15 TT dice. In the test chip characterization, the supply voltage is swept from 0 V to 1.8 V, and the temperature from 0 °C up to 70 °C. From Fig. 3.12, we can observe that the voltage reference shows reliable operation at supply voltages down to 200 mV. At room temperature, the resulting output voltage is $V_{REF} \approx 43$ mV, while the absolute line sensitivity is $143 \mu V/V$. From Fig. 3.13(a), the absolute temperature coefficient is 43 $\mu$V/°C at the minimum operating voltage. The power consumption is only 3.2 pW at room temperature and $V_{DD} = 0.2$ V, with a notable increasing trend with the temperature owing to subthreshold operation, as shown in Fig. 3.13(b). The $V_{REF}$ process sensitivity of the TT replica across 15 TT corner dice is 1.4%, owing to a mean value of 42.7 mV and a standard deviation of 0.6 mV (see Fig. 3.14(a)). The measured average TC is 33.1 $\mu$V/°C with a standard deviation of 4.1 $\mu$V/°C, thus corresponding to a variability of about 12.4%.
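The percentages quoted above follow from the ratio between standard deviation and mean. A minimal check, using only the measured values reported in the text (the function name is illustrative):

```python
def relative_spread_percent(mean: float, std_dev: float) -> float:
    """Return sigma/mu expressed in percent."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * std_dev / abs(mean)


vref_ps = relative_spread_percent(42.7, 0.6)   # V_REF in mV across 15 TT dice
tc_var = relative_spread_percent(33.1, 4.1)    # TC in uV/degC across 15 TT dice
print(f"V_REF: {vref_ps:.1f} %, TC: {tc_var:.1f} %")
```

Rounded to one decimal, the two ratios reproduce the 1.4% process sensitivity and the 12.4% TC variability.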



Fig. 3.12: Measured reference voltage vs. supply voltage at room temperature in the case of typical behavior.



Fig. 3.13: Measured (a) reference voltage and (b) supply current as a function of temperature in the case of typical behavior.



Fig. 3.14: Measured statistical distribution of (a) reference voltage (at 0.2 V and 25 °C) and (b) temperature coefficient (at 0.2 V) for the TT circuit replica across 15 TT test chips.

Then, Figs. 3.15-3.17 compare measurement results obtained across all 45 test chips in the cases when the proposed global variation-aware replica selection is not considered (i.e., TT replica measured across corners) and when it is employed (i.e., TT, SS, and FF replicas measured on the corresponding 15 corner test chips). More specifically, Figs. 3.15, 3.16, and 3.17 show the measured statistical distributions without and with replica selection for the $V_{REF}$ (at 0.2 V and 25 °C), *TC* (at 0.2 V), and line sensitivity *LS* (at 25 °C), respectively. From Fig. 3.15, we can observe that the process sensitivity of the TT replica across 45 dice is 4%. The proper selection of the replicas across corners significantly narrows the overall distribution of $V_{REF}$ with a 2.5× improvement compared to the single replica. As shown in Figs. 3.16 and 3.17, both the average *TC* and *LS* across corners also benefit from the replica selection with 1.2× and 1.4× improvements, respectively, enabling more consistent performance across dice. Interestingly, the proposed approach improves performance and variability without detriment to the other measured parameters, as it fundamentally relaxes the underlying design tradeoffs.



Fig. 3.15: Measured statistical distribution of reference voltage (at 0.2 V and 25 °C) w/o and w/ replica selection across corners.



Fig. 3.16: Measured statistical distribution of temperature coefficient (at 0.2 V) w/o and w/ replica selection across corners.



Fig. 3.17: Measured statistical distribution of line sensitivity (at 25 °C) w/o and w/ replica selection across corners.

## 3.4 Comparison with the state of the art

The comparison of the proposed circuit with state-of-the-art voltage references is summarized in Table 3.3, whereas Fig. 3.18 shows the tradeoff between $V_{min}$ and the $V_{REF}$ absolute accuracy considering only prior art designs in the same 180-nm technology for fair comparison. Data referred to corner wafer measurements are highlighted both in Table 3.3 (refer to column within the orange square) and Fig. 3.18 (refer to star symbols). From Table 3.3, the power consumption of the proposed reference is 3.2 pW, which is close to the lowest power reported in prior art (i.e., 2.2 pW in 130-nm technology [55]) and 1.7× better than the voltage reference presented in Chapter 2. The power consumption is 9.8-11,562× lower than other prior art from Table 3.3. The proposed reference also exhibits the lowest $V_{min}$ of 200 mV, which is 2.25-6.5× lower than [63]-[57], [55].

The absolute accuracy of the reference voltage is evaluated in Table 3.3, which is more appropriate than the relative accuracy for the targeted harvesting applications, as explained in Chapter 2. Accordingly, Table 3.3 reports both absolute and relative metrics, where the accuracy of  $V_{REF}$  is evaluated by considering 3- $\sigma$  process variations, 0.3-V harvested voltage fluctuation (e.g., a solar cell from low to intense light), and 20-°C temperature deviation. Note that for the case of corner wafer measurements 3- $\sigma$  process variations were not considered. As a fair metric to compare references with different  $V_{REF}$  down to sub-100mV, absolute and relative metrics are also reported for the temperature coefficient, process sensitivity and line sensitivity.
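The accuracy metric described above can be sketched as a short function. It follows the footnote of Table 3.3 (process sensitivity, plus line sensitivity over a 0.3-V change, plus temperature coefficient over a 20-°C deviation), with the 3σ factor skipped for corner-wafer data as stated in the text. The function and argument names are illustrative, and the handling of the corner-wafer case is an interpretation of the text rather than a formula given explicitly by the source.

```python
def absolute_accuracy_mv(ps_mv: float, ls_uv_per_v: float, tc_uv_per_c: float,
                         corner_wafer_ps: bool = False,
                         delta_vdd_v: float = 0.3,
                         delta_temp_c: float = 20.0) -> float:
    """Absolute V_REF accuracy in mV from PS [mV], LS [uV/V] and TC [uV/degC]."""
    # Corner-wafer process spread is used as-is; otherwise a 3-sigma bound is taken.
    process_mv = ps_mv if corner_wafer_ps else 3.0 * ps_mv
    line_mv = ls_uv_per_v * 1e-3 * delta_vdd_v
    temp_mv = tc_uv_per_c * 1e-3 * delta_temp_c
    return process_mv + line_mv + temp_mv


# Values from Table 3.3: this work with replica selection, and reference [55].
this_work = absolute_accuracy_mv(0.68, 60.7, 34.9, corner_wafer_ps=True)
ref_55 = absolute_accuracy_mv(1.27, 60.0, 10.94)
print(f"this work: {this_work:.2f} mV, [55]: {ref_55:.2f} mV")
```

With the Table 3.3 entries, this reproduces approximately 1.4 mV for the proposed reference and 4.05 mV for [55], which suggests that the temperature term is dominated by the 20-°C deviation and the process term by the measurement type.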

The measured process sensitivity across corner chips of 0.68 mV is  $5.4-27.5 \times$  lower than other works reporting corner wafer measurements [68], [54]. The 34.9  $\mu$ V/°C absolute temperature coefficient is 2.4× lower than [54] (1.5× higher than [68]) and comparable to [63], [62], [68] and the voltage reference presented in Chapter 2. The achieved *TC* is expectedly worse than the one achieved by some bandgap references. The absolute line sensitivity is 2.3-65× better than [57], [63]-[64],[67]-[68] and the voltage reference presented in Chapter 2.

Table 3.3 also shows that the proposed reference exhibits a higher silicon area (i.e., 78,500 $\mu$m<sup>2</sup>) than CMOS and hybrid designs due to the presence of the process sensor. This can be mainly ascribed to the area occupied by the slow oscillator (about 60,000 $\mu$m<sup>2</sup>), whereas each VR replica and the fast oscillator occupy an area of only 6,000 $\mu$m<sup>2</sup> and 500 $\mu$m<sup>2</sup>, respectively. It is worth pointing out that by using an external real-time clock (RTC), i.e., using only the fast oscillator, the total occupied area can be reduced to 18,500 $\mu$m<sup>2</sup>. However, the extra area due to the process sensor can be justified when this block is also exploited for trimming all the other blocks in a whole SoC.
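The area figures above can be cross-checked with a simple budget. The per-block areas are the ones quoted in the paragraph; the constant and function names are illustrative.

```python
REPLICA_AREA_UM2 = 6_000     # each voltage reference replica
FAST_OSC_AREA_UM2 = 500      # fast oscillator
SLOW_OSC_AREA_UM2 = 60_000   # slow oscillator
N_REPLICAS = 3


def total_area_um2(include_slow_oscillator: bool = True) -> int:
    """Total active area; the slow oscillator is dropped when an external RTC is used."""
    area = N_REPLICAS * REPLICA_AREA_UM2 + FAST_OSC_AREA_UM2
    if include_slow_oscillator:
        area += SLOW_OSC_AREA_UM2
    return area


print(total_area_um2(), total_area_um2(include_slow_oscillator=False))
```

The two results match the 78,500 $\mu$m<sup>2</sup> and 18,500 $\mu$m<sup>2</sup> entries, with the slow oscillator accounting for roughly three quarters of the total.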

Finally, an overall  $V_{REF}$  absolute accuracy of 1.4 mV is achieved across PVT variations from Table 3.3, which outperforms prior art by 2-22.6×. Accordingly, the proposed reference achieves a favorable tradeoff among power, minimum operating voltage and accuracy, preserving accuracy in low-cost and pW-power directly-harvested systems, even better than the voltage reference of Chapter 2.

|                                  | This work<br>(w/ replica<br>selection) | This work<br>(TT<br>replica<br>only) | VLSI'16<br>[54] | JSSCC'19<br>[68]        | VR<br>Chapter 2      | JSSC'12<br>[55]    | JSSC'17<br>[57]    | ISSCC'15<br>[62] | ISSCC'17<br>[63] | ISSCC'15<br>[64] | VLSI'17<br>[66]           | VLSI'19<br>[67]            |
|----------------------------------|----------------------------------------|--------------------------------------|-----------------|-------------------------|----------------------|--------------------|--------------------|------------------|------------------|------------------|---------------------------|----------------------------|
| technology [nm]                  | 180                                    | 180                                  | 180             | 180                     | 180                  | 130                | 180                | 130              | 180              | 350              | 180                       | 180                        |
| type                             | CMOS                                   | CMOS                                 | CMOS            | hybrid                  | CMOS                 | CMOS               | CMOS               | BGR              | BGR              | BGR              | sub-BGR                   | hybrid                     |
| transistor flavors<br>required   | N-RVT                                  | N-RVT                                | P-RVT           | P-RVT,<br>N-native, BJT | N-RVT,<br>N-LVT      | N-HVT,<br>N-native | P-HVT,<br>N-native | N/P-RVT,<br>BJT  | N/P-RVT,<br>BJT  | N/P-RVT,<br>BJT  | N-HVT,<br>N/P-RVT,<br>BJT | N-RVT,<br>N-native,<br>BJT |
| active area [µm <sup>2</sup> ]   | 18,500 (w/<br>RTC)-<br>78,500          | 6000                                 | 4,880           | 4,500                   | 2,200                | 1,350              | 2,500              | 26,400           | 55,000           | 480,000          | 34,000                    | 7,600                      |
| V <sub>DD</sub> range [V]        | <b>0.2</b> -1.8                        | 0.2-1.8                              | 1.2-2.2         | 1.0-1.8                 | 0.25                 | 0.5-3.3            | 1.4-3.6            | 0.5-1.5          | 1.3-1.8          | 1.25-3.3         | 0.8-1.8                   | 0.9-3.3                    |
| min. power@25°C<br>[pW]          | 3.2                                    | 3.2                                  | 114             | 192                     | 5.4                  | 2.2                | 33.6               | 32,000           | 9,300            | 28,700           | 37,000                    | 31.4                       |
| V <sub>REF</sub> [mV]            | 42.7                                   | 42.6                                 | 986.2           | 692.6                   | 91.4                 | 176.4              | 1,250              | 498              | 1238             | 1176             | 240                       | 740                        |
| PS [mV]                          | 0.68                                   | 1.7                                  | 18.7            | 3.67                    | 0.51                 | 1.27               | 10                 | 3.3              | 5.32             | 2.35             | 2.04                      | 4.22                       |
| (%)                              | (1.6)                                  | (4.0)                                | (1.9)           | (0.53)                  | (0.56)               | (0.72)             | (0.8)              | (0.66)           | (0.43)           | (0.2)            | (0.85)                    | (0.57)                     |
| PS evaluated w/<br>corner wafers | YES                                    | YES                                  | YES             | YES                     | NO                   | NO                 | NO                 | NO               | NO               | NO               | NO                        | NO                         |
| TC [ $\mu V/^{\circ}C$ ]         | 34.9                                   | 41.7                                 | 84.81           | 22.86                   | 24.2                 | 10.94              | 28.75              | 37.35            | 32.19            | 14.99            | 10.10                     | 19.98                      |
| ([ppm/°C])                       | (832)                                  | (978)                                | (86)            | (33)                    | (265)                | (62)               | (23)               | (75)             | (26)             | (12.75)          | (42.1)                    | (27)                       |
| temperature range<br>tested (°C) | 0-70                                   | 0-70                                 | -40 - 85        | -20 - 100               | 0 - 120              | -20 - 80           | 0 - 100            | 0-80             | 0-110            | -10 - 110        | -20 - 120                 | 0 - 170                    |
| LS [µV/V]                        | 60.7                                   | 84.3                                 | 3,750           | 140                     | 144.5                | 60                 | 3,880              | 9,960            | 990              | 2,330            | 10                        | 2,000                      |
| ([%/V])                          | (0.14)                                 | (0.19)                               | (0.38)          | (0.02)                  | (0.16)               | (0.033)            | (0.31)             | (2)              | (0.08)           | (0.198)          | (0.0051)                  | (0.27)                     |
| accuracy (b)                     | 1.4                                    | 2.6                                  | 21.6            | 4.17                    | 2.55                 | 4.05               | 31.74              | 13.6             | 16.91            | 8.05             | 6.33                      | 13.65                      |
| [mV] ([%])                       | (3.3)                                  | (6.2)                                | (2.2)           | (0.6)                   | (3.0)                | (2.3)              | (2.5)              | (2.7)            | (1.4)            | (0.7)            | (2.6)                     | (1.8)                      |

<sup>(b)</sup> accuracy (mV) = process sensitivity (mV)·3 + line sensitivity (mV/V)·0.3 V + *TC* (mV/°C)·20 °C (i.e., 3σ within-lot variations, 0.3-V voltage change, and 20 °C temperature deviation)

Table 3.3: Performance summary and comparison with state-of-the-art voltage references (best performance in bold).



Fig. 3.18: Absolute  $V_{REF}$  accuracy vs.  $V_{min}$  in state-of-the-art voltage references fabricated in the same 180-nm technology generation. Star symbols denote results from corner wafer measurements.

# Chapter 4: Design of an Ultra-Low Voltage Level Shifter

State-of-the-art SoCs consist of several heterogeneous intellectual property (IP) blocks, each operating at a different supply voltage level depending on timing requirements. Time-critical blocks run at higher supply voltage ( $V_{DDH}$ ) to reach high performance, whereas noncritical blocks operate at lower supply voltage ( $V_{DDL}$ ), even in subthreshold regime, to save energy. Reliable level shifter circuits are hence required in such multiple- $V_{DD}$  systems for a proper interfacing between different voltage domains, while maintaining the overall robustness of the design. This chapter firstly provides an overview of prior art on level shifter designs. Then, a robust level shifter design able to convert input voltages from the deep subthreshold regime (about 100 mV) up to the nominal supply voltage (1.8 V) is presented. The proposed circuit is based on a self-biased low-voltage cascode current mirror (CM) topology that features diode-connected PMOS and NMOS transistors to drive the split-input inverting buffer used as output stage with high energy efficiency. Experimental results in 180-nm CMOS process across corner wafers are provided, demonstrating the effectiveness of the proposed level shifter as compared to prior art.

## **4.1 Prior Art on Level Shifter Designs**

In prior art, level shifter designs can be categorized into cross-coupled (CC) and current mirror (CM) based topologies, whose basic schematics are illustrated in Fig. 4.1(a) and (b), respectively. Due to the presence of complementary pull-up networks (PUNs) and pull-down networks (PDNs), CC-based level shifters typically exhibit very low standby power consumption. As a main drawback, they suffer from current contention between the PUNs and PDNs during switching, which affects both speed and energy. Such current contention is exacerbated when subthreshold voltages need to be up-converted, requiring an impractical increase in the size of the PDNs to make them more conductive. Conversely, CM-based architectures benefit from relaxed contention between PUNs and PDNs, which improves speed and energy when a wide-range up-conversion (i.e., from the deep subthreshold regime to a significantly higher voltage level) is required. Nevertheless, they typically suffer from large static power consumption due to the current provided by the current mirror structure.



Fig. 4.1: Schematic of conventional (a) cross-coupled (CC) and (b) current mirror (CM) based level shifter topologies.

Different solutions were recently proposed in literature to overcome the above limitations of CC-based [81]-[86] and CM-based [87]-[92] topologies. For instance, in [83]-[84] adaptive/regulated PUNs are proposed to reduce current contention in CC-based designs, thus improving the switching speed and energy for up-conversions from extremely low-voltage domains. In addition, a split-input inverting buffer is used as output stage to further improve energy efficiency. In order to address the voltage drop and non-optimal feedback limitations of conventional CM-based level shifters, a revised Wilson CM exploiting mixed-threshold voltage devices is presented in [87]. In [88] a reduced-swing output buffer design allows lowering standby power, while a pass transistor-based circuitry improves the switching speed. Instead, in [89] a self-controlled current limiter scheme is employed to achieve voltage shifting from deep subthreshold to above-threshold domains while improving speed, energy, and static power consumption. The split-input inverting buffer is also adopted in CM-based topologies, as shown in [90] and [91], with the aim of reducing the static current in the output stage.

# 4.2 Ultra-Low Voltage Level Shifter with High-Speed and Energy-Efficient Operation

Here, a ULV level shifter design based on a self-biased low-voltage cascode CM scheme and a split-input inverting output buffer is presented. Moreover, the proposed solution exploits an additional diode-connected NMOS device along with a PDN boosting device in the driving scheme of the split-input inverting output buffer to achieve high energy efficiency, while ensuring fast switching. The proposed circuit was designed in 180-nm CMOS technology and validated through measurements on fabricated samples coming from five different process-corner wafers (SS, SF, TT, FS, and FF). The experimental results demonstrate the robustness of the proposed design for extremely low-voltage inputs (from 100 mV up to the nominal voltage of 1.8 V). For a 0.4-V 100-kHz input pulse, an average delay of 7.6 ns was measured across 10 test chips, along with a mean switching energy per transition of about 69 fJ and an average standby power consumption of 1.33 nW.

#### 4.2.1 Proposed solution and operating principle

Fig. 4.2 shows the schematic of the proposed level shifter, which is based on a PMOS-based self-biased low-voltage cascode CM ( $M_{p1}$ - $M_{p4}$ ), where  $M_{p1}$  is biased through the diode-connected  $M_{p3}$  [93]. Similar to other solutions proposed in literature [83]-[84], [90]-[91], the circuit also exploits a split-input inverting buffer ( $M_{n5}$ - $M_{p5}$ ) as output stage to lower static power consumption. However, the driving scheme of the output stage differs from previous designs for the use of an additional diode-connected NMOS device ( $M_{n4}$ ) along with a PDN boosting device ( $M_{n2}$ ).



Fig. 4.2. Schematic of the proposed level shifter.

Indeed, the output buffer of the proposed design is driven by the NDP and NDN nodes, whose voltages differ by the voltage drop ($V_D$) across $M_{p3}$ and $M_{n4}$ ($V_D = |V_{GSp3}| + V_{GSn4}$). Such a voltage difference is larger than in previous designs that adopt a split-input inverting output buffer. As a consequence, in the proposed solution the short-circuit power consumption of the output stage during both low-to-high (L $\rightarrow$ H) and high-to-low (H $\rightarrow$ L) transitions is lowered, which improves energy efficiency. This can be observed in Figs. 4.3(a)-(f), which illustrate the simulated transient behavior of the proposed level shifter for the voltage up-conversion of an input pulse from 200 mV to 1.8 V for both output transitions. From Fig. 4.3(c) and (d) we can note that during the L $\rightarrow$ H (H $\rightarrow$ L) transition the NDN (NDP) node turns off the NMOS (PMOS) of the output stage well before its complementary device weakly turns on. In this way, the contention at the output node is alleviated and the short-circuit power is reduced.

As shown in Fig. 4.3, when the input signal A (AN) is low (high), $M_{n1}$ and $M_{n2}$ are in the OFF state, while $M_{n3}$ is maintained ON. Thus, the voltage at node NP is low (0 V), whereas the voltage at node NDP is high ($V_{DDH}$), $M_{p1}$ being in the ON state. At the same time, the voltages at the NN and NDN nodes are $V_{DDH} - |V_{GSp3}|$ and $V_{DDH} - |V_{GSp3}| - V_{GSn4}$, respectively. An L $\rightarrow$ H transition of the input signal A switches $M_{n1}$ and $M_{n2}$ ON, while $M_{n3}$ turns OFF, thus discharging the NN and NDN nodes. As the voltage of node NN turns $M_{p4}$ ON, the current $I_R$ flowing in the right branch of the circuit starts charging node NP. Consequently, $M_{p1}$ is weakened and the current contention at node NDP is strongly alleviated. This allows node NDP to be discharged to switch $M_{p5}$ ON, while the NN and NDN nodes are fully discharged down to 0 V to completely cut off $M_{n5}$. Fig. 4.3 also shows the signals of the proposed level shifter for the H $\rightarrow$ L input transition. As A (AN) falls (rises), $M_{n1}$ and $M_{n2}$ are turned OFF, while $M_{n3}$ is switched ON to pull down node NP. The discharging current $I_R$ is hence mirrored as $I_L$ on the left branch of the circuit to charge the NDP and NDN nodes to the $V_{DDH}$ and $V_{DDH} - |V_{GSp3}| - V_{GSn4}$ voltage levels, respectively. As a consequence, $M_{p5}$ is switched completely OFF, while $M_{n5}$ turns ON to discharge the output node.
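As a quick numerical illustration of the static node levels described above (input A low), the following minimal sketch evaluates the NDP, NN, and NDN voltages and the resulting drop $V_D$. The gate-source voltages used here (0.45 V and 0.40 V) are hypothetical placeholders chosen only for illustration; they are not values taken from the fabricated design.

```python
def static_node_voltages(vddh, vgs_p3_abs, vgs_n4):
    """Static node levels with input A low, following the text above.

    vgs_p3_abs and vgs_n4 are the drops across the diode-connected devices
    M_p3 and M_n4 (illustrative values, not measured ones).
    """
    v_ndp = vddh                  # NDP is held at V_DDH by M_p1
    v_nn = vddh - vgs_p3_abs      # NN  = V_DDH - |V_GSp3|
    v_ndn = v_nn - vgs_n4         # NDN = V_DDH - |V_GSp3| - V_GSn4
    v_d = v_ndp - v_ndn           # V_D = |V_GSp3| + V_GSn4
    return {"NDP": v_ndp, "NN": v_nn, "NDN": v_ndn, "V_D": v_d}


# Hypothetical gate-source drops, used only to exercise the expressions.
levels = static_node_voltages(vddh=1.8, vgs_p3_abs=0.45, vgs_n4=0.40)
for node, value in levels.items():
    print(f"{node}: {value:.2f} V")
```

The larger $V_D$ is, the wider the separation between the gate drives of the two output-stage devices, which is the mechanism the text credits for the reduced short-circuit current.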



Fig. 4.3. Simulated transient behavior of the proposed level shifter for a voltage up-conversion from 200 mV up to 1.8 V. (a) Input (A) and output (OUT) voltages. (b) Voltages at internal nodes (NN, NP, NDP, and NDN). (c)-(d) Details of NN, NP, NDP, and NDN signals during L $\rightarrow$ H and H $\rightarrow$ L transitions. (e) Left- and right-branch currents ( $I_L$  and  $I_R$ ). (f) Current of the output stage ( $I_{OUT}$ ).

#### 4.2.2 Measurement results in 180-nm process

The level shifter of Fig. 4.2 was designed in a commercial 1.8-V 180-nm CMOS technology with transistor sizing as reported in Table 4.1. Note that a dual-$V_{TH}$ design approach, using low-$V_{TH}$ (LVT) devices for $M_{n1}$-$M_{n2}$-$M_{n3}$ and regular-$V_{TH}$ (RVT) devices for the remaining transistors, was adopted to improve speed, while ensuring small area and minimum-$V_{DDL}$ robustness thanks to proper transistor sizing. It is also worth pointing out that the voltage conversion range of the proposed level shifter can be further extended upward by using thicker-oxide (e.g., 3.3-V) I/O PMOS devices, as proposed in [94].

| Transistor      | Type | W/L (µm)  | Transistor      | Type | W/L (µm)  |
|-----------------|------|-----------|-----------------|------|-----------|
| M<sub>n1</sub>  | LVT  | 0.4/0.3   | M<sub>p1</sub>  | RVT  | 0.3/0.22  |
| M<sub>n2</sub>  | LVT  | 0.22/0.3  | M<sub>p2</sub>  | RVT  | 0.3/0.22  |
| M<sub>n3</sub>  | LVT  | 0.4/0.3   | M<sub>p3</sub>  | RVT  | 0.22/0.22 |
| M<sub>n4</sub>  | RVT  | 0.22/0.18 | M<sub>p4</sub>  | RVT  | 0.22/0.22 |
| M<sub>n5</sub>  | RVT  | 0.22/0.18 | M<sub>p5</sub>  | RVT  | 0.22/0.18 |

Table 4.1: Transistor type and sizing for the proposed level shifter.



Fig. 4.4: Micrograph of the fabricated test chip and layout of the proposed level shifter.

Fig. 4.4 shows the micrograph of the fabricated test chip along with the layout of the proposed level shifter. The physical design was carried out following the double-cell-height strategy and exploiting only metal-1 and metal-2 wires. The resulting level shifter occupies a silicon area of about 82  $\mu$ m<sup>2</sup> (8.8 $\mu$ m×9.3 $\mu$ m). Static and dynamic measurements were performed on 10 samples coming from five different wafers of process corners (SS, SF, TT, FS, and FF), i.e., two samples for each process corner. Measurements on corner wafers allow demonstrating the robustness of the proposed solution against process variations.



Fig. 4.5: (a) Statistical distribution of measured minimum V<sub>DDL</sub> for successful up-conversion to the nominal supply voltage (1.8 V) at different temperatures (-25 °C, 25 °C, and 80 °C), and (b) measured input (A) and output (OUT) waveforms for 70 mV→1.8 V conversion at 25 °C.

Fig. 4.5(a) shows the distribution of the measured minimum $V_{DDL}$ ($V_{DDL,min}$) for successful up-conversion to 1.8 V at different temperatures (i.e., -25 °C, 25 °C, and 80 °C), considering an input signal frequency of 100 Hz. At 25 °C, the best sample can up-convert an input voltage pulse as low as 70 mV (see the measured input and output waveforms in Fig. 4.5(b)), while the worst case among the 10 characterized chips is 100 mV. The mean ($\mu$) and the standard deviation ($\sigma$) of $V_{DDL,min}$ at 25 °C are 85 mV and 9.7 mV, respectively, corresponding to a variability $\sigma/\mu \approx 11\%$. An increase (decrease) of $V_{DDL,min}$ was observed at lower (higher) temperature due to the lower (higher) leakage current flowing in the CM-based architecture. In particular, a 70% (60%) increase in the mean (worst-case) $V_{DDL,min}$ was obtained at -25 °C as compared to room temperature. Note also that $V_{DDL,min}$ tends to increase slightly with the frequency of the input signal. This is due to the low charge/discharge rate of the internal nodes. Therefore, operating at a higher input frequency requires a higher $V_{DDL,min}$ to guarantee correct operation and reduce the output delay.
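The variability figure quoted above follows directly from the reported mean and standard deviation. The short sketch below reproduces that arithmetic and also evaluates the mean $V_{DDL,min}$ implied by the stated 70% increase at -25 °C; the function name is illustrative, and only the numbers quoted in the text are used.

```python
def coefficient_of_variation(mean, std):
    """Relative variability sigma/mu of a measured quantity."""
    return std / mean


# Values reported in the text for V_DDL,min at 25 °C.
MEAN_25C = 85e-3   # [V]
STD_25C = 9.7e-3   # [V]

cv_25c = coefficient_of_variation(MEAN_25C, STD_25C)

# Mean implied by the stated 70% increase at -25 °C.
mean_minus25c = MEAN_25C * 1.70

print(f"sigma/mu at 25 °C: {cv_25c:.3f}")
print(f"implied mean V_DDL,min at -25 °C: {mean_minus25c * 1e3:.1f} mV")
```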



Fig. 4.6: Measured static power as a function of  $V_{DDL}$  for  $V_{DDH} = 1.8$  V at 25 °C. 1- $\sigma$  error bars and mean ( $\mu$ ) plot are also shown.

Measured static power for $V_{DDL}$ ranging from 0.1 to 0.5 V and $V_{DDH} = 1.8$ V is shown in Fig. 4.6. In this plot, 1-$\sigma$ error bars are also reported as a measure of the spread in standby power consumption. In the considered $V_{DDL}$ range, both $\mu$ and $\sigma/\mu$ of the static power consumption remain nearly constant at about 1.3 nW and 28%, respectively.

In order to correctly measure the delay of the proposed level shifter, its output was followed by a buffer placed close to the level shifter to drive the output pad and the test equipment. The output load of the level shifter is equivalent to two minimum-sized inverter gates. The output buffering was also replicated in a separate "test" path of the fabricated chip with the aim of evaluating its own contribution to the overall delay. The delay data provided in Fig. 4.7(a) thus refer only to the level shifter. Again, 1-$\sigma$ error bars are provided in this figure to give an insight into the variability. The mean measured delay is 2.87 µs when the input pulse voltage is 100 mV, while it decreases to about 6.2 ns for $V_{DDL} = 0.5$ V. For the target 0.4 V $\rightarrow$ 1.8 V up-conversion, the mean (maximum) delay is only 7.6 ns (8.5 ns) with a $\sigma/\mu$ of about 6%. The measured energy per transition as a function of $V_{DDL}$ for an input signal frequency of 100 kHz is reported in Fig. 4.7(b). As $V_{DDL}$ increases, the energy decreases, mainly due to the decrease of the short-circuit energy in the output buffer. For $V_{DDL} = 0.4$ V, the mean (maximum) energy consumption is only 68.9 fJ (77 fJ) with a $\sigma/\mu$ of about 7%. It is also worth highlighting that the proposed level shifter scales well with increasing operating frequency. More specifically, as the operating frequency increases, the total energy per operation tends to decrease owing to the reduced contribution of leakage.
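The frequency trend mentioned above can be illustrated with a simple first-order model in which the energy per transition is the sum of a frequency-independent switching term and the static power integrated over the time between transitions. This is only a sketch under stated assumptions: the 60-fJ switching term is a hypothetical placeholder, two transitions per input period are assumed, and the model is not the measurement procedure used for Fig. 4.7(b).

```python
def energy_per_transition(e_switch, p_static, freq, transitions_per_cycle=2):
    """First-order model: switching energy plus leakage energy per transition.

    The leakage share is the static power integrated over the average time
    between two consecutive output transitions.
    """
    leakage_energy = p_static / (freq * transitions_per_cycle)
    return e_switch + leakage_energy


P_STATIC = 1.33e-9   # [W] mean static power reported in the text
E_SWITCH = 60e-15    # [J] hypothetical frequency-independent switching term

freqs = (10e3, 100e3, 1e6)
totals = [energy_per_transition(E_SWITCH, P_STATIC, f) for f in freqs]

for f, e in zip(freqs, totals):
    print(f"{f / 1e3:7.0f} kHz -> {e * 1e15:6.2f} fJ per transition")
```

Under these assumptions the leakage share shrinks in inverse proportion to the operating frequency, which matches the qualitative trend described in the text.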



Fig. 4.7: Measured (a) delay and (b) energy per transition (100-kHz input signal) as a function of  $V_{DDL}$ for  $V_{DDH} = 1.8$  V at 25 °C. 1- $\sigma$  error bars and mean ( $\mu$ ) plot are also shown.

Fig. 4.8 shows the measured delay versus  $V_{DDL}$  for a typical sample considering  $V_{DDH} = 1.8$  V and three different temperatures (i.e. -25 °C, 25 °C, and 80 °C). Since the minimum upconvertible input pulse at -25 °C can be as low as 120 mV (from Fig. 4.5), reported data in this figure are confined to the 200-500 mV range for the  $V_{DDL}$ . In the deep subthreshold region (i.e., lower  $V_{DDL}$ ) delay measurements at 25 °C and 80 °C are very close, while a significantly lower speed was observed at -25 °C owing to the reduced current provided by the CM structure. As  $V_{DDL}$  increases, the performance spread due to the temperature variations shrinks (i.e., robustness against temperature improves).



Fig. 4.8: Measured delay as a function of  $V_{DDL}$  for a typical sample at  $V_{DDH} = 1.8$  V and three different temperatures (-25 °C, 25 °C, and 80 °C).

Finally, Fig. 4.9(a) and (b) show the measured delay and energy per transition (with a 100-kHz input signal), respectively, versus  $V_{DDH}$  ranging from 1.2 to 1.8 V for  $V_{DDL} = 0.3$  V at 25 °C. By increasing the  $V_{DDH}$ , the current flowing in the CM structure increases, thus reducing the delay while increasing the energy per transition. The increase in  $V_{DDH}$  also reduces the output buffer delay.



Fig. 4.9: Measured (a) delay and (b) energy per transition (100-kHz input signal) as a function of  $V_{DDH}$  for  $V_{DDL} = 0.3$  V at 25 °C. 1- $\sigma$  error bars and mean ( $\mu$ ) plot are also shown.

#### 4.2.3 Comparison with the state of the art

The overall experimental results in terms of delay and energy per transition suggest that the proposed level shifter is well suited for applications with moderate to high switching activity. A comparison with state-of-the-art measured level shifters is reported in Table 4.2, where the single asterisk (\*) denotes a mean value, the double asterisk (\*\*) denotes a minimum value, "#" denotes post-layout simulation results, and "+" denotes an estimated area that does not consider a standard-cell layout.

| Design | Type | Tech. [nm] | Conversion range | $V_{DDL,min}$ [σ/μ] | Delay [ns] | Energy per transition [fJ] | Static power [nW] | Area [µm²] |
|--------|------|------------|------------------|---------------------|------------|----------------------------|-------------------|------------|
| <u>This work</u> | CM | 180 | 85<sup>*</sup> mV→1.8 V | 0.114 | 7.6 (0.4 V→1.8 V) | 68.9 (0.4 V→1.8 V, 100 kHz) | 1.33 (0.4 V) | 81.8 |
| [87] | CM | 180 | 210<sup>**</sup> mV→1.8 V | N/A | 167 (0.3 V→1.8 V) | 39 (0.3 V→1.8 V, 100 kHz) | 0.16 (0.3 V) | 153.0 |
| [91] (1) | CM | 180 | 180<sup>**</sup> mV→1.8 V | N/A | 180 (0.4 V→1.8 V) | 46 (0.4 V→1.8 V, 5 MHz) | 1.50 (0.1 V) | 135.0 |
| [91] (2) | CM | 180 | 80<sup>**</sup> mV→1.8 V | N/A | 95 (0.4 V→1.8 V) | 118 (0.4 V→1.8 V, 5 MHz) | 1.80 (0.1 V) | 160.0 |
| [92] | CM | 180 | 330<sup>**</sup> mV→1.8 V | N/A | 29 (0.4 V→1.8 V) | 61.5 (0.4 V→1.8 V, 500 kHz) | 0.33 (0.4 V) | 229.5 |
| [88] | CM | 65 | 100<sup>**</sup> mV→1.2 V | N/A | 7.5 (0.3 V→1.2 V) | 123.8 (0.3 V→1.2 V, 1 MHz) | 2.64 (0.3 V) | 7.45<sup>+</sup> |
| [89] | CM | 65 | 100<sup>**</sup> mV→1.2 V | N/A | 13.7 (0.2 V→1.2 V) | 90.9 (0.2 V→1.2 V, 1 MHz) | 1.24 (0.2 V) | 31.3<sup>+</sup> |
| [83] | CC | 180 | 96<sup>*</sup> mV→1.8 V | 0.375 | 31.7 (0.4 V→1.8 V) | <sup>#</sup>173 (0.4 V→1.8 V, 100 kHz) | 0.06 (0.4 V) | 108.8 |
| [86] | CC | 180 | 68<sup>*</sup> mV→1.8 V | 0.128 | ~230 (0.4 V→1.8 V) | 350 (0.4 V→1.8 V, 10 kHz) | 0.12 (0.4 V) | 95.6<sup>+</sup> |
| [85] | CC | 130 | 31<sup>*</sup> mV→1.2 V | ~0.462 | 22 (0.3 V→1.2 V) | 25.9 (0.3 V→1.2 V, 1 MHz) | 9.87 (0.3 V) | 80.7 |
| [82] | CC | 65 | 101<sup>*</sup> mV→1.2 V | 0.197 | 25 (0.3 V→1.2 V) | 30.7 (0.3 V→1.2 V, 1 MHz) | 2.50 (0.3 V) | 17.6<sup>+</sup> |

Table 4.2: Comparison of the proposed level shifter with the state of the art.

The CC-based circuit presented in [83] exhibits the lowest standby power consumption of only 60 pW. Nevertheless, it is energy hungry (173 fJ per transition). On the other hand, the solution proposed in [85], designed in 130-nm CMOS technology, shows the lowest energy per transition (~26 fJ) and a minimum $V_{DDL}$ of only about 30 mV, at the expense of a larger static power consumption (~9.9 nW). Among the CM-based solutions, the one proposed in [87] exhibits low static power consumption (0.16 nW) but poor switching performance and a reduced conversion range.

From the comparison among level shifter designs fabricated in 180-nm CMOS (see Fig. 4.10, which illustrates this comparison in the energy-delay plane), the CM-based circuit proposed in [92] turns out to be very competitive, having a switching delay of 29 ns and an average energy per transition of 61.5 fJ. However, when compared to [92], the proposed level shifter improves speed (by about 74%), voltage conversion range, and area occupation (by about 64%), while consuming similar switching energy. This is achieved at the only cost of increased static power consumption. From the comparison in Table 4.2, it is also worth pointing out that the proposed circuit exhibits the highest robustness against process variations in terms of up-convertible $V_{DDL,min}$ (with a variability $\sigma/\mu$ of about 11%). This result is even more notable considering that the reported measurements were performed across corner wafers, unlike the other referenced works, which do not report corner-wafer analysis.
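The improvement figures quoted against [92] can be checked with a few lines of arithmetic on the Table 4.2 entries. The sketch below uses only the delay, area, energy, and static power values reported for this work and for [92]; the helper name is illustrative.

```python
def relative_reduction(new, ref):
    """Fractional reduction of `new` with respect to the reference `ref`."""
    return 1.0 - new / ref


# Table 4.2 entries: this work vs. [92] (both 0.4 V -> 1.8 V, 180 nm).
speed_gain = relative_reduction(7.6, 29.0)     # delay [ns]
area_gain = relative_reduction(81.8, 229.5)    # area [um^2]
energy_ratio = 68.9 / 61.5                     # energy per transition [fJ]
static_ratio = 1.33 / 0.33                     # static power [nW]

print(f"delay reduction:      {speed_gain:.1%}")
print(f"area reduction:       {area_gain:.1%}")
print(f"energy ratio:         {energy_ratio:.2f}x")
print(f"static power ratio:   {static_ratio:.2f}x")
```

Note that the two energy values were measured at different input frequencies (100 kHz and 500 kHz), so the energy ratio is only an indicative comparison.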



Fig. 4.10: Energy-delay comparison against state-of-the-art level shifters fabricated in 180-nm CMOS.

## Chapter 5: Design of Ultra-Low Power Dynamic Voltage Comparators

The world around us is "composed" of analog signals. Indeed, temperature, our voice, the intensity of sunlight, air pressure, humidity, and other physical quantities are represented by analog signals. As mentioned above, an SoC integrates sensors that use analog signals to provide information. These signals need to be digitized by analog-to-digital converters (ADCs), which can be among the most energy-hungry blocks in the SoC. The comparator is one of the most fundamental building blocks in ADCs. The need to realize systems that can operate on harvested energy alone is pushing towards the use of ULV/ULP CMOS comparators to maximize the power efficiency of the whole system. For ULP design, dynamic comparators are a better choice due to their almost zero standby current consumption. This chapter provides an overview of dynamic voltage comparators, while introducing the dynamic leakage suppression (DLS) logic family, which makes it possible to further reduce the power consumption of the comparators. A ULP single-stage DLS-based dynamic voltage comparator is first presented, reporting measurement results in a 180-nm technology. A second structure exploiting a dual-stage architecture is then proposed and validated by measurement results.

#### 5.1 Dynamic Voltage Comparator (DVC)

An analog comparator is a device that has two analog inputs (i.e., inverting and non-inverting) and one output assumed to have only two states. The output is set to high (positive) state when the non-inverting input is more positive than the inverting one, otherwise the output is low (negative). Operational amplifiers (op-amp) without feedback are often used as analog comparators.

Comparators are typically categorized into continuous-time and dynamic topologies. The output of the former depends only on the analog inputs (e.g., when using an op-amp). Conversely, the output of the latter also depends on a clock signal. Conventionally, dynamic comparators have two outputs, i.e., *Out* and $\overline{Out}$. The clock signal defines two different phases in the operation of dynamic comparators. The first one is the pre-charge (or pre-discharge) phase, in which the capacitances at the output nodes are charged (or discharged). The second one is the evaluation phase, in which the output nodes are discharged (or charged) depending on the inputs.

Dynamic voltage comparators (DVCs) typically achieve lower power consumption as compared to continuous-time designs thanks to duty-cycle operation mode and simpler architecture which does not require current mirror or reference circuits. The latter also allows reducing the supply voltage. In addition, DVCs absorb current only when the output logic state changes [95]. Taking all this into account, DVC topology represents the best candidate for ULP/ULV applications.

The schematic of a conventional DVC widely used in ADCs is shown in Fig. 5.1 [96]-[97]. The comparator employs two cross-coupled NOT gates at the top.



Fig. 5.1: Schematic of a conventional dynamic voltage comparator.

The conventional DVC operates as follows. When the clock signal $C_{LK} = 0$, the output nodes *Out* and $\overline{Out}$ are pre-charged by the pre-charge transistors, while the discharge path is disabled by turning off the bottom discharge transistor. When $C_{LK} = 1$, the NOT gate 1 (the NOT gate 2) discharges the node $\overline{Out}$ (*Out*) through M<sub>P</sub> (M<sub>M</sub>). Assuming that $V_{in+} > V_{in-}$, $\overline{Out}$ discharges faster than *Out*. As a consequence, when $\overline{Out}$ drops to $V_{DD} - |V_{TH,M_P}|$, the corresponding pMOSFET belonging to the NOT gate 2 turns on, enabling the latch regeneration provided by the cross-coupled structure. Thus, *Out* is pulled up to $V_{DD}$ and $\overline{Out}$ completely discharges to ground. Conversely, when $V_{in+} < V_{in-}$, the circuit operates in the opposite way: *Out* discharges faster than $\overline{Out}$. As a consequence, when *Out* drops to $V_{DD} - |V_{TH,M_P}|$, the corresponding pMOSFET belonging to the NOT gate 1 turns on, enabling the latch regeneration. Thus, $\overline{Out}$ is pulled up to $V_{DD}$ and *Out* completely discharges to ground.
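The two-phase operation described above can be summarized by an idealized behavioral model. The sketch below is a purely logical abstraction (no offset, no delay, no metastability), with hypothetical function and argument names; it is not a circuit-level model of Fig. 5.1.

```python
def conventional_dvc(clk, vin_p, vin_n, vdd=1.0):
    """Idealized pre-charge comparator: returns (Out, Out_bar) in volts."""
    if clk == 0:
        # Pre-charge phase: both output nodes are pulled up to VDD.
        return vdd, vdd
    # Evaluation phase: the node driven by the larger input discharges
    # faster, and the cross-coupled latch regenerates to full logic levels.
    if vin_p > vin_n:
        return vdd, 0.0
    # Equal inputs would be metastable in a real circuit; this ideal model
    # simply resolves them like the vin_p < vin_n case.
    return 0.0, vdd


print(conventional_dvc(clk=0, vin_p=0.30, vin_n=0.20))  # pre-charge
print(conventional_dvc(clk=1, vin_p=0.30, vin_n=0.20))  # Vin+ > Vin-
print(conventional_dvc(clk=1, vin_p=0.20, vin_n=0.30))  # Vin+ < Vin-
```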

One of the main figures of merit (FoMs) of a comparator is the offset. This parameter represents the minimum voltage difference between the two inputs $V_{in+}$ and $V_{in-}$ needed to make the output toggle. In conventional DVCs the offset mainly depends on the mismatch between the two branches (right and left) and in particular on the mismatch between M<sub>M</sub> and M<sub>P</sub>. In fact, a difference in the $V_{TH}$ of these two MOSFETs results in an offset contribution of the same value. Other main FoMs are the propagation delay, the power consumption, and the common-mode range (CMR). The propagation delay is defined as the time between the clock signal and the output signal crossing 50% of their swing. Power consumption consists of both static and dynamic contributions. The CMR represents the range of input voltages for which the comparator works properly. The ideal case is a "rail-to-rail" comparator that is able to compare all voltages between 0 and $V_{DD}$.
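The 50%-swing definition of the propagation delay can be applied to sampled waveforms as in the following minimal sketch. The crossing instants are found by linear interpolation between samples; the waveform data are hypothetical and serve only to exercise the functions.

```python
def crossing_time(t, v, level):
    """Time of the first crossing of `level`, by linear interpolation."""
    for i in range(1, len(t)):
        lo, hi = v[i - 1], v[i]
        if lo != hi and min(lo, hi) <= level <= max(lo, hi):
            return t[i - 1] + (level - lo) * (t[i] - t[i - 1]) / (hi - lo)
    return None  # the waveform never crosses the level


def propagation_delay(t, clk, out, vdd):
    """Delay between the 50% crossings of the clock and of the output."""
    t_clk = crossing_time(t, clk, vdd / 2)
    t_out = crossing_time(t, out, vdd / 2)
    if t_clk is None or t_out is None:
        return None
    return t_out - t_clk


# Hypothetical sampled waveforms (time in ms, voltages in V).
time_ms = [0.0, 1.0, 2.0, 3.0, 4.0]
clk_v = [0.0, 0.4, 0.4, 0.4, 0.4]
out_v = [0.0, 0.0, 0.0, 0.4, 0.4]

delay_ms = propagation_delay(time_ms, clk_v, out_v, vdd=0.4)
print(f"propagation delay: {delay_ms:.2f} ms")
```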

#### 5.2 Dynamic Leakage Suppression (DLS) Logic

DVCs are usually implemented using logic gates. In this regard, the dynamic leakage suppression (DLS) logic represents a feasible solution for designing a ULP DVC. This logic family was first introduced in [98] under the name "ULP CMOS logic" and then revisited in [99] as "DLS logic". Its main goal is to drastically reduce the standby power of digital standard cells, at the cost of substantially degraded speed. In particular, the standby current in DLS logic gates is typically two or three orders of magnitude lower than the transistor leakage (i.e., with $V_{GS} = 0$), while the typical delay is in the millisecond range.

The basic structure of DLS logic gates is illustrated in Fig. 5.2. It is based on a standard CMOS gate with two additional switches (i.e., header and footer transistors) which are fed by the output node. The header transistor is implemented with an nMOSFET, while the footer is a pMOSFET.



Fig. 5.2: Basic structure of DLS logic gates.

In [98] the body terminals of all nMOSFETs are connected to ground and the body terminals of all pMOSFETs are connected to $V_{DD}$, as in standard CMOS logic. On the contrary, in [99] the body of the footer pMOSFET is connected to ground to enhance its conductivity. The connection of the output node to the gates of the header and footer transistors acts as a feedback loop, thus further cutting off the pull-up or the pull-down network, as shown in Figs. 5.3(a)-(b).



Fig. 5.3: DLS NOT gate with (a) logic 'zero' and (b) logic 'one' at the input, highlighting the voltage across the transistors.

A low input turns off M3, turns on M2, and sets the output high, which in turn switches off the PMOS footer M4. Since the drain currents of M3 and M4 are equal, the voltage *V*2 of their common node settles to a value close to $V_{DD}/2$ (assuming M3 and M4 have the same conductivity). This translates into a negative $V_{GS}$ of about $-V_{DD}/2$ for M3, thus enabling super-cutoff operation (see Fig. 5.3(a)). Similarly, a high input leads to super-cutoff operation for M1 and M2 (see Fig. 5.3(b)). Operation in the super-cutoff region [100] allows reaching static currents in the order of fA.
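To give a feeling for why a gate-source voltage of about $-V_{DD}/2$ is so effective, the sketch below evaluates the standard exponential subthreshold dependence $I \propto \exp(V_{GS}/(n\,kT/q))$ relative to the $V_{GS} = 0$ leakage. The slope factor $n = 1.5$ is an assumed, technology-dependent value, and drain-voltage and body effects are ignored, so the result is only an order-of-magnitude estimate.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q_E = 1.602176634e-19   # elementary charge [C]


def supercutoff_reduction(vdd, n=1.5, temp_k=300.0):
    """Ratio I(V_GS = -VDD/2) / I(V_GS = 0) for a subthreshold MOSFET.

    Uses I ~ exp(V_GS / (n * kT/q)); n is an assumed slope factor, and
    drain-voltage and body effects are neglected.
    """
    v_thermal = K_B * temp_k / Q_E
    return math.exp((-vdd / 2.0) / (n * v_thermal))


ratio = supercutoff_reduction(vdd=0.4)
print(f"leakage scaled by {ratio:.2e} (about {1 / ratio:.0f}x lower)")
```

With these assumptions, a 0.4-V supply already yields a reduction on the order of a hundredfold, and the reduction grows exponentially with $V_{DD}$.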

As the input changes from 0 V to *V*<sub>DD</sub>, M3 switches from the super-cutoff region to weak inversion and starts to equalize the voltage *V*2 with the output. In turn, M4 switches from the super-cutoff region to a traditional cutoff bias point. M3 pulls *V*2 up, discharging the output node. Node *V*1 also discharges. This drives M1 and M2 towards the super-cutoff region, thus reducing the leakage from *V*<sub>DD</sub> to the output node. At the same time, the leakage through M4 pulls the output low, further suppressing the current of M1 and M2 and accelerating the overall discharge of the output. Due to this super-cutoff feedback effect, DLS logic has intrinsically different low-to-high and high-to-low switching points, resulting in hysteresis and increased noise margins, as shown in Fig. 5.4. Overall, DLS logic is well suited for low-cost ULP applications that require a low operating frequency.
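The hysteresis in the static transfer characteristic can be captured by a simple behavioral model of an inverter with two distinct switching points. This is a sketch only: the threshold values below are hypothetical and are not extracted from Fig. 5.4.

```python
class HystereticInverter:
    """Behavioral inverter with separate rising and falling input thresholds."""

    def __init__(self, vdd, v_up, v_down):
        if not 0.0 < v_down < v_up < vdd:
            raise ValueError("expected 0 < v_down < v_up < vdd")
        self.vdd, self.v_up, self.v_down = vdd, v_up, v_down
        self.out_high = True  # assume the input starts low

    def step(self, vin):
        # The output falls only once the rising input exceeds v_up and
        # rises again only once the falling input drops below v_down.
        if self.out_high and vin > self.v_up:
            self.out_high = False
        elif not self.out_high and vin < self.v_down:
            self.out_high = True
        return self.vdd if self.out_high else 0.0


# Hypothetical switching points for a 0.4-V supply.
gate = HystereticInverter(vdd=0.4, v_up=0.25, v_down=0.15)
sweep_up = [gate.step(v) for v in (0.0, 0.1, 0.2, 0.3, 0.4)]
sweep_down = [gate.step(v) for v in (0.3, 0.2, 0.1, 0.0)]
print("rising input :", sweep_up)
print("falling input:", sweep_down)
```

For the same input of 0.2 V the output is high on the rising sweep and low on the falling sweep, which is the memory effect behind the increased noise margins.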



Fig. 5.4: Static transfer characteristics of a DLS NOT gate.

#### 5.3 Ultra-Low Power Single-Stage DLS-Based DVC

A single-stage DVC design based on DLS logic is presented here to enable ULP operation. Fig. 5.5 shows the schematic of the proposed circuit. Unlike the conventional design of Fig. 5.1, it employs a pre-discharge structure. As a result, the pre-discharge phase occurs when the clock signal is high, whereas the evaluation phase requires a low clock signal. Note also that it uses DLS NOT logic gates in the cross-coupled latch and back-to-back MOSFETs (i.e., DLS switches) for the pre-discharge and charge transistors. More specifically, M5-M6 and M7-M8 discharge the output nodes $\overline{Out}$ and *Out*, respectively, during the pre-discharge phase, while the evaluation current path is controlled by M1-M2. Replacing the single-MOSFET pre-discharge and charge transistors with DLS switches reduces the leakage current when they are turned off.



Fig. 5.5: Schematic of the single-stage DLS-based DVC with pre-discharge phase.

#### 5.3.1 Measurement results in 180-nm process

The single-stage DVC of Fig. 5.5 was designed in a commercial 1.8-V 180-nm CMOS technology using a dual- $V_{TH}$  design approach with transistor sizing as reported in Table 5.1. The resulting test chip occupies a silicon area of about 1,020  $\mu$ m<sup>2</sup>. Fig. 5.6 shows the chip micrograph with the layout of the proposed comparator.

| Transistor | Type & W/L [µm]            | Transistor | Type & W/L [µm]            |
|------------|----------------------------|------------|----------------------------|
| M1         | LVT 0.9/1.25               | M9-M13     | LVT 0.9/(4x2.34)           |
| M2         | RVT 0.9/1.25               | M10-M14    | RVT 0.9/(4x2.43)           |
| M3-M4      | RVT 3.96/1.8               | M11-M15    | RVT 0.9/1.44               |
| M5-M7      | RVT 1.44/1.8               | M12-M16    | RVT 0.9/1.26               |
| M6-M8      | RVT 1.44/1.8               |            |                            |

Table 5.1: Transistor flavor and sizing of the single-stage ULP DVC.



Fig. 5.6: Layout and chip micrograph of the proposed ULP single-stage DLS-based DVC.

Fig. 5.7 shows the measured delay and offset voltage as a function of the common-mode input voltage ($V_{CM}$) at $V_{DD} = 0.4$ V (corresponding to the observed minimum operating voltage), 25 °C, and a clock frequency of 50 Hz for a typical sample. In this figure, $V_{CM}$ ranges from 25 mV up to 250 mV. From Fig. 5.7, we can observe a delay in the order of a few milliseconds, which increases notably with increasing $V_{CM}$. Such an increase of the delay can be ascribed to the reduced conductivity of M3-M4, i.e., the evaluation MOSFETs. An increase in $V_{CM}$ reduces the $|V_{GS}|$ of M3-M4, since their sources are connected to a common voltage. Reducing the conductivity of the evaluation MOSFETs leads to a longer time to charge the output nodes and trigger the latch. Fig. 5.7 also shows an offset voltage in the order of tens of millivolts, reaching a minimum value ($\approx 26$ mV) around $V_{CM} = 200$ mV. The high offset voltage is likely due to mismatch and to the unmodeled leakage currents of the two branches, which become significant when working with supply currents in the pA range.

The limited input voltage range ($V_{CM}$) is strongly related to the increase of the delay. In fact, a clock frequency of 50 Hz corresponds to a period of 20 ms and hence to an evaluation phase of 10 ms. Increasing $V_{CM}$ beyond 250 mV would increase the delay to a value higher than the duration of the evaluation phase, thus making a comparison impossible. By lowering the clock frequency, it is possible to extend the $V_{CM}$ range. However, given the lower conductivity of the evaluation MOSFETs, the offset becomes high when the $V_{CM}$ range is extended. This phenomenon is due to the greater influence of mismatch on circuits operating with sub-leakage (i.e., pA) currents. Indeed, the measured power consumption of the proposed circuit is about 80 pW.
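The timing constraint and the current level discussed above can be made explicit with a few lines of arithmetic. The sketch below assumes a 50% clock duty cycle (as implied by the 20-ms period and 10-ms evaluation phase in the text); the delay values passed to `comparison_possible` are hypothetical examples.

```python
def evaluation_window(f_clk, eval_fraction=0.5):
    """Duration of the evaluation phase for a given clock frequency."""
    return eval_fraction / f_clk


def comparison_possible(delay, f_clk):
    """A decision is resolved only if the delay fits in the evaluation phase."""
    return delay < evaluation_window(f_clk)


def supply_current(power, vdd):
    """Average supply current corresponding to a measured power."""
    return power / vdd


window_50hz = evaluation_window(50.0)        # 10-ms evaluation phase
i_supply = supply_current(80e-12, 0.4)       # 80 pW at V_DD = 0.4 V

print(f"evaluation window at 50 Hz: {window_50hz * 1e3:.1f} ms")
print(f"average supply current:     {i_supply * 1e12:.0f} pA")
print(comparison_possible(4e-3, 50.0))   # hypothetical 4-ms delay
print(comparison_possible(12e-3, 50.0))  # hypothetical 12-ms delay
print(comparison_possible(12e-3, 25.0))  # same delay with a slower clock
```

The last call illustrates the trade-off mentioned in the text: halving the clock frequency doubles the evaluation window and lets a slower decision complete.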



Fig. 5.7: Measured delay and offset voltage vs.  $V_{CM}$  for  $V_{DD} = 0.4$  V, 25 °C, and 50-Hz clock frequency for one sample.

#### 5.4 Ultra-Low Power Dual-Stage DLS-Based DVC

With the aim of increasing the stability of the delay as $V_{CM}$ changes and improving the offset, while maintaining ultra-low power consumption, a dual-stage DLS-based DVC design is presented here. The proposed solution is based on the combination of two properly coupled DVCs with opposite structures. In this regard, Fig. 5.8 shows the schematic of a single-stage DLS-based DVC implemented with two different architectures: (*i*) a pre-discharge based structure with PMOS evaluation transistors (corresponding to the circuit of Fig. 5.5) and (*ii*) a pre-charge based structure with NMOS evaluation transistors (as in the circuit of Fig. 5.1). The former implements the pre-discharge (evaluation) phase when the clock signal is high (low). Conversely, the latter implements the pre-charge (evaluation) phase when the clock signal is low (high). As shown in Fig. 5.8, both architectures use DLS NOT logic gates and back-to-back MOSFETs (i.e., DLS switches) for the pre-discharge (or pre-charge) and charge (or discharge) transistors. More specifically, the pre-discharge based structure employs M5-M6 and M7-M8 to discharge the output nodes $\overline{OutP}$ and OutP, respectively, during the pre-discharge phase, while the evaluation current path is controlled by M1-M2. In turn, the pre-charge based structure uses M17-M18 and M19-M20 to charge the output nodes OutN and $\overline{OutN}$, respectively, during the pre-charge phase, while the evaluation current path is controlled by M26-M27. Note that the pre-discharge based structure operates well with a common-mode input voltage $V_{CM}$ ranging from 0 to $V_{DD}/2$, whereas the pre-charge based architecture requires $V_{CM}$ ranging from $V_{DD}/2$ to $V_{DD}$. According to the schematic of Fig. 5.8, when Vin+ > Vin- during the evaluation phase, both OutP and $\overline{OutN}$ are high, while $\overline{OutP}$ and OutN are low.



Fig. 5.8: Schematic of single-stage DVCs: (left) pre-discharge based and (right) pre-charge based structures.
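The complementary input ranges of the two structures can be summarized in a short sketch. It assumes the nominal ranges stated in the text (0 to $V_{DD}/2$ for the PMOS-input, pre-discharge based structure and $V_{DD}/2$ to $V_{DD}$ for the NMOS-input, pre-charge based structure); the function name and the returned labels are hypothetical, and the ranges are idealized rather than measured.

```python
from typing import List

PRE_DISCHARGE = "pre-discharge (PMOS evaluation)"
PRE_CHARGE = "pre-charge (NMOS evaluation)"


def structures_in_range(v_cm: float, v_dd: float) -> List[str]:
    """Return the single-stage structure(s) whose nominal V_CM range contains v_cm.

    Assumption: idealized ranges taken from the text, with both structures
    considered usable exactly at V_DD / 2.
    """
    if not 0.0 <= v_cm <= v_dd:
        raise ValueError("V_CM must lie between 0 and V_DD")
    half = v_dd / 2.0
    usable = []
    if v_cm <= half:
        usable.append(PRE_DISCHARGE)
    if v_cm >= half:
        usable.append(PRE_CHARGE)
    return usable


if __name__ == "__main__":
    for v_cm in (0.1, 0.5, 0.9):
        print(v_cm, structures_in_range(v_cm, v_dd=1.0))
```

Since neither structure alone covers the whole supply range in this idealized picture, each one relies on the other outside its own range, which motivates coupling them.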

The architecture of the proposed dual-stage DVC is based on the combination of the two structures of Fig. 5.8, which requires a proper coupling between the two circuits. Short-circuiting the output nodes of the two structures that exhibit the same behavior (i.e., $\overline{OutP}$ with OutN and OutP with $\overline{OutN}$) is not a feasible solution, considering that, before the evaluation phase, those nodes should reach different voltage levels according to the pre-charge and pre-discharge operations. The implemented solution is illustrated in Fig. 5.9. It consists of interconnecting the output nodes of the two structures with opposite behavior by interposing a DLS-based cross-coupled latch. When the pre-discharge based structure does not work properly because the input voltage lies between $V_{DD}/2$ and $V_{DD}$, the value of OutP ($\overline{OutP}$) must be restored by reading the value on the opposite side, i.e., $\overline{OutN}$ (OutN). When the input voltage is between 0 V and $V_{DD}/2$, the pre-charge based structure does not work properly due to the lower conduction of its evaluation MOSFETs. As a consequence, the outputs of this structure need to be brought to the proper level by the outputs of the opposite structure. When Vin+ > Vin-, during the evaluation phase OutP ($\overline{OutP}$) goes high (low), while $\overline{OutN}$ (OutN) goes low (high). Therefore, in order to properly combine the different outputs, four logic gates are needed: the first one reads OutP and restores the value of $\overline{OutN}$, the second one does the opposite of the first one, the third one reads $\overline{OutP}$ and restores OutN, while the fourth one does the opposite of the third one. Accordingly, these four gates form two latches, as shown in Fig. 5.9.



Fig. 5.9: Interconnection circuit between the output nodes of the two single-stage structures.

The two additional latches introduced by the interconnection block of Fig. 5.9 are supplied between $V_{DD}$ and ground with the aim of making their delay independent of $V_{CM}$, thus speeding up the circuit. The four capacitors added at the output nodes (see Fig. 5.9) are implemented as MOS capacitors, i.e., PMOS transistors with the gate connected to the comparator output and the other terminals (drain, source, and body) connected to ground. Simulations show that increasing the size of these MOS capacitors further reduces the offset, at the cost of larger area, lower speed, and higher power consumption. This trade-off has to be taken into account when sizing the MOS capacitors.

#### 5.4.1 Measurement results in 180-nm process

The dual-stage DLS-based DVC described above was designed in a commercial 1.8-V 180-nm CMOS technology using a dual-$V_{TH}$ design approach, with transistor sizing as reported in Table 5.2. The layout of the proposed DVC and the die micrograph are shown in Fig. 5.10. The resulting test chip occupies a silicon area of about 3600 $\mu$m<sup>2</sup> (42 $\mu$m × 86 $\mu$m). Static and dynamic measurements were performed on three samples taken from wafers at different process corners (SS, TT, and FF), i.e., one sample per process corner.

| Transistor | Type & W/L [µm]  | Transistor      | Type & W/L [μm]  |
|------------|------------------|-----------------|------------------|
| M1         | RVT 1.8/2.52     | M9-M13-M21-M28  | LVT 1.8/2.43     |
| M2         | RVT 1.8/2.52     | M10-M14-M22-M29 | RVT 1.8/2.43     |
| M3-M4      | RVT 1.98/(2x3.6) | M11-M15-M23-M30 | RVT 1.8/(2x1.44) |
| M5-M7      | LVT 1.44/3.6     | M12-M16-M24-M31 | RVT 1.8/(2x2.52) |
| M6-M8      | LVT 1.8/1.26     | M17-M19         | LVT 1.44/3.6     |
| M18-M20    | LVT 1.8/1.26     | M25-M32         | RVT 1.98/3.6     |
| M26        | LVT 1.8/(2x2.52) | M27             | LVT 1.8/2.52     |
| M32        | LVT 5/(3x10)     |                 |                  |

Table 5.2: Transistor flavor and sizing for the dual-stage DLS-based DVC.



Fig. 5.10: Layout of the proposed DVC and die micrograph with area occupancy.

Fig. 5.11 shows the delay as a function of $V_{CM}$ for $V_{DD} = 0.6$ V, 25 °C, and a clock frequency of 10 Hz. In this figure, $V_{CM}$ ranges from a few mV up to 200 mV. Again, the limited input range is due to the loss of control of the output nodes by the analog input voltages. When Vin+ > 200 mV and Vin- > Vin+, the output (referring to the *OutN* node) reflects an invalid value, while it toggles to the high level. Since this problem appears when $V_{CM}$ increases, the structure that could create the problem is the pre-charge based one. As expected, the measured delay is higher than that of the single-stage structure, due to the MOS capacitors at the output nodes and to the delay contribution of the interconnection circuit. However, Fig. 5.11 shows a nearly constant delay over the considered $V_{CM}$ range, as targeted by the dual-stage design approach.



Fig. 5.11: Measured delay vs.  $V_{CM}$  for  $V_{DD} = 0.6$  V, 25 °C, and 10-Hz clock frequency in the three test chips (TT, SS and FF corner wafers).

Fig. 5.12(a) and (b) show the measured current consumption and offset voltage, respectively, as a function of $V_{CM}$ for $V_{DD} = 0.6$ V, 25 °C, and a clock frequency of 10 Hz. From Fig. 5.12(a), the current consumption exhibits a quasi-linear increase with increasing $V_{CM}$. We can also observe that the measured supply current is in the order of pA, thus resulting in picowatt power consumption, as targeted for ULP applications. From Fig. 5.12(b), we can see that in the typical case the offset is very low (lower than 1 mV) and nearly constant over the considered $V_{CM}$ range. The offset is considerably degraded in the SS and FF corner chips. However, even in the worst case (i.e., the SS corner), the dual-stage design exhibits a lower average offset than that achieved by the single-stage circuit for a typical sample (see Fig. 5.7).



Fig. 5.12: Measured (a) current consumption and (b) offset voltage as a function of $V_{CM}$ for $V_{DD} = 0.6$ V, 25 °C, and 10-Hz clock frequency in the three test chips (TT, SS and FF corner wafers).

Fig. 5.13 shows the measured input (i.e., Vin+, Vin-, and clock signal) and output (referring to the *OutN* node) waveforms of the proposed dual-stage DVC for Vin+ = 100 mV, Vin- sweeping from 99.5 mV up to 100.75 mV in 90 ms, $V_{DD} = 0.6$ V, 25 °C, and a clock frequency of 10 Hz in the TT test chip. From Fig. 5.13, we can observe a delay (calculated as the time interval between the 50% swing of the falling edge of the clock signal and the 50% swing of the rising edge of the output voltage) in the order of a few milliseconds (see Fig. 5.11) and an offset voltage (calculated as the difference between Vin+ and Vin- when the output signal reaches the 50% swing of its rising edge) of only 200 $\mu$V (see Fig. 5.12(b)).



Fig. 5.13: Measured input and output signals of the proposed dual-stage DVC for $V_{DD} = 0.6$ V, 25 °C, and 10-Hz clock frequency in the TT corner test chip.
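The delay and offset definitions used for Fig. 5.13 can be illustrated with a small post-processing sketch operating on sampled waveforms. This is not the measurement script used for the thesis: the function names are hypothetical, the 50% level is taken as half of an assumed rail-to-rail swing, and crossings are located by linear interpolation between adjacent samples.

```python
from typing import Optional, Sequence, Tuple


def crossing_time(t: Sequence[float], v: Sequence[float], level: float,
                  rising: bool, t_start: float = float("-inf")) -> Optional[float]:
    """First time at or after t_start at which v crosses `level` in the given direction."""
    for i in range(1, len(t)):
        v0, v1 = v[i - 1], v[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if not crossed:
            continue
        # Linear interpolation inside the sample interval containing the crossing.
        t_cross = t[i - 1] + (level - v0) / (v1 - v0) * (t[i] - t[i - 1])
        if t_cross >= t_start:
            return t_cross
    return None


def interpolate(t: Sequence[float], v: Sequence[float], t_query: float) -> float:
    """Linearly interpolate v at t_query (t must be strictly increasing)."""
    for i in range(1, len(t)):
        if t[i - 1] <= t_query <= t[i]:
            frac = (t_query - t[i - 1]) / (t[i] - t[i - 1])
            return v[i - 1] + frac * (v[i] - v[i - 1])
    raise ValueError("t_query lies outside the sampled interval")


def delay_and_offset(t: Sequence[float], clk: Sequence[float],
                     vin_p: Sequence[float], vin_n: Sequence[float],
                     out: Sequence[float], swing: float) -> Tuple[float, float]:
    """Delay: falling clock edge (50%) to rising output edge (50%).
    Offset: Vin+ minus Vin- at the instant the output reaches 50% of its swing."""
    half = swing / 2.0
    t_clk = crossing_time(t, clk, half, rising=False)
    if t_clk is None:
        raise ValueError("no falling clock edge found")
    t_out = crossing_time(t, out, half, rising=True, t_start=t_clk)
    if t_out is None:
        raise ValueError("no rising output edge found after the clock edge")
    offset = interpolate(t, vin_p, t_out) - interpolate(t, vin_n, t_out)
    return t_out - t_clk, offset


if __name__ == "__main__":
    # Synthetic waveforms for illustration only (time in ms, voltages in V).
    t = [float(i) for i in range(11)]
    clk = [0.6, 0.6] + [0.0] * 9
    out = [0.0] * 6 + [0.6] * 5
    vin_p = [0.100] * 11
    vin_n = [0.0995 + 0.0001 * i for i in range(11)]
    print(delay_and_offset(t, clk, vin_p, vin_n, out, swing=0.6))
```

The resolution of both quantities is limited by the sampling interval and, for the offset, by the slope of the Vin- ramp, so a slow sweep and dense sampling are preferable when applying this kind of extraction to measured data.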

## 5.5 Comparison with the state of the art

The comparison of the two proposed DVCs against state-of-the-art designs is summarized in Table 5.3. As mentioned before, DVCs are mainly used within ADCs, and for this reason it is not easy to find measurement data on DVCs in prior art. In Table 5.3, the data for both proposed solutions refer to measurement results (for the dual-stage structure, across the process corner test chips), while [101] reports only post-layout simulation results and [102]-[103] report pre-layout results.

The proposed single-stage design occupies an area of only 1,020 $\mu$m<sup>2</sup>, which is about 3 times smaller than that of the dual-stage design and is not directly comparable with [101], which uses standard cells in a different technology. The single-stage design shows a minimum supply voltage $V_{min}$ of 0.4 V, in line with [103], while the $V_{min}$ of the dual-stage structure is slightly higher (i.e., 0.5 V). Both proposed solutions show a higher $V_{min}$ than [101], which however uses a different technology. [101], [102], and [103] offer a full-range CMR, while both proposed solutions have a limited input range between 0 V and 200-250 mV.

As expected, the delay of the single-stage and dual-stage DVCs is not comparable with that of other works, due to the use of DLS logic. The single-stage structure shows a $2.8 \times$ shorter delay than the TT test chip of the dual-stage DVC.

The offset of the single-stage DVC is in line with [101] and $3-16.5 \times$ higher than that of the other designs, while its power consumption of 80 pW is $56 \times$ lower than that of [103], which has the lowest power consumption in prior art. Unfortunately, the narrow CMR combined with the high offset makes the single-stage DVC a poor candidate for use in SoCs. Conversely, the strength of the dual-stage DVC is its low offset (lower than 1 mV at the TT corner and 11.8 mV on average across corners) combined with a $4 \times$ lower power consumption than the single-stage solution. More specifically, in ULP applications the dual-stage DVC can be used to convert an input signal from a sensor varying between 0 V and 200 mV with an error of 0.5% ($V_{OFFSET}$/CMR [%]).

|                                        | [101] (sim) | [101] (sim) | [102] (sim) | [103] (sim) | Single-stage DVC (TT) | Dual-stage DVC (TT / FF / SS) |
|----------------------------------------|-------------|-------------|-------------|-------------|-----------------------|-------------------------------|
| Technology [nm]                        | 40          | 40          | 180         | 180         | 180                   | 180                           |
| Area [μm<sup>2</sup>]                  | 62          | 62          | N/A         | N/A         | 1020                  | 3100                          |
| Min. operating voltage V<sub>min</sub> [V] | 0.3     | 0.3         | N/A         | 0.4         | 0.4                   | 0.5                           |
| Supply voltage for FoMs [V]            | 0.6         | 0.3         | 1.8         | 0.4         | 0.4                   | 0.6                           |
| CMR [V] (min-max)                      | 0.1-0.6     | 0-0.3       | 0-1.8       | 0-0.5       | 0-0.25                | 0-0.2                         |
| Delay [µs]                             | 0.003       | 0.1         | 0.00027     | 0.59        | 3270                  | 9300 / 7600 / 13300           |
| Offset [mV]                            | 40          | 28          | 2.5         | 13.7        | 41.4                  | **0.95** / 6.8 / 27.6         |
| Power [W]                              | 1.5µ        | 15n         | 230µ        | 4.48n       | 80p                   | **20p** / 22p / 15p           |

Table 5.3: Comparison of the two proposed DLS-based DVCs against state-of-the-art designs.
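The ratios quoted in the comparison follow directly from the entries of Table 5.3. The sketch below recomputes them as a sanity check; the dictionary layout and function name are hypothetical, and the numerical values are simply transcribed from the table (dual-stage entries in TT, FF, SS order).

```python
# Values transcribed from Table 5.3.
SINGLE_STAGE = {"delay_us": 3270.0, "offset_mv": 41.4, "power_w": 80e-12}
DUAL_STAGE = {
    "TT": {"delay_us": 9300.0, "offset_mv": 0.95, "power_w": 20e-12},
    "FF": {"delay_us": 7600.0, "offset_mv": 6.8, "power_w": 22e-12},
    "SS": {"delay_us": 13300.0, "offset_mv": 27.6, "power_w": 15e-12},
}
PRIOR_ART = {
    "[102]": {"offset_mv": 2.5, "power_w": 230e-6},
    "[103]": {"offset_mv": 13.7, "power_w": 4.48e-9},
}
DUAL_STAGE_CMR_MV = 200.0  # upper limit of the dual-stage input range


def figures_of_merit() -> dict:
    """Recompute the ratios discussed in the comparison from the table entries."""
    tt = DUAL_STAGE["TT"]
    corner_offsets = [corner["offset_mv"] for corner in DUAL_STAGE.values()]
    return {
        # Dual-stage (TT) delay relative to the single-stage delay.
        "delay_ratio": tt["delay_us"] / SINGLE_STAGE["delay_us"],
        # Single-stage offset relative to [103] and [102].
        "offset_vs_103": SINGLE_STAGE["offset_mv"] / PRIOR_ART["[103]"]["offset_mv"],
        "offset_vs_102": SINGLE_STAGE["offset_mv"] / PRIOR_ART["[102]"]["offset_mv"],
        # Power of [103] relative to the single-stage DVC.
        "power_103_over_single": PRIOR_ART["[103]"]["power_w"] / SINGLE_STAGE["power_w"],
        # Single-stage power relative to the dual-stage (TT) power.
        "power_single_over_dual": SINGLE_STAGE["power_w"] / tt["power_w"],
        # Mean dual-stage offset across the three corners.
        "mean_offset_mv": sum(corner_offsets) / len(corner_offsets),
        # Offset at the TT corner as a percentage of the input range.
        "offset_over_cmr_percent": 100.0 * tt["offset_mv"] / DUAL_STAGE_CMR_MV,
    }


if __name__ == "__main__":
    for name, value in figures_of_merit().items():
        print(f"{name}: {value:.3g}")
```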

## Conclusions

IoT sensor nodes will enter our lives more and more pervasively in the coming years. It is not difficult to imagine a future in which we monitor our biomedical parameters using a subcutaneous chip and a mobile phone app, or in which the fruit and vegetables we eat are irrigated with quantities of water controlled by SoCs, with benefits for the environment. The applications are innumerable, and integrating the entire system within a chip, including the energy harvester, can be useful in various respects.

First of all, SoCs that also integrate the circuitry needed for power supply reduce the cost of setting up and manufacturing each sensor node. Since they require no additional parts, these SoCs are smaller, and therefore less invasive and easier to integrate into the environment. However, the most interesting aspect is the possibility of making the SoCs run almost perpetually. By improving the integrability and performance of energy harvesting, and at the same time reducing the power of both analog and digital circuits, it becomes possible to create IoT sensor nodes that only need to be placed once and will provide data whenever enough energy is available.

In this thesis, energy harvesting systems and energy storage solutions (e.g., supercapacitors) were first explored to show how power and storage systems can be integrated into a chip. However, it is well known that the small amount of energy typically available from harvesting (i.e., a reduced power/voltage budget, especially as a result of fluctuations in environmental conditions) can compromise the functionality and performance of both analog and digital circuits within an SoC. Therefore, to prolong the operation of SoCs under unfavorable environmental conditions, circuit blocks with ULP/ULV operation are particularly sought after.

In this regard, this thesis first presented the design of a voltage reference circuit operating down to a 250 mV supply voltage, while consuming 5.4 pW at room temperature with an area of 2,200 $\mu$m<sup>2</sup> in 180-nm technology [C1]. The proposed circuit exploits a body biasing scheme to deal with the effect of voltage/temperature fluctuations and hence to ensure good overall accuracy. A current reference circuit based on a voltage generator exploiting the structure of the above voltage reference was also presented and validated by means of measurements on a 180-nm test chip. The proposed current reference works properly down to 0.6 V to generate a current in the nA range with only 4,000 $\mu$m<sup>2</sup> of area, while reaching high power efficiency thanks to the pW-level power consumption of the voltage generator sub-block. Then, the design of an innovative voltage reference architecture based on an on-chip process sensor was proposed to reduce sensitivity to process variations and hence to achieve overall accuracy against PVT variations. The proposed architecture also enables ULP/ULV operation with a minimum supply voltage of only 200 mV and a power consumption of only 3.2 pW at room temperature, as demonstrated by measurements on a 180-nm test chip across corner wafers. A level shifter design able to convert input voltages from the subthreshold regime (around 100 mV) up to the nominal supply voltage (1.8 V) was also presented. The proposed level shifter relies on a self-biased low-voltage cascode current mirror topology with additional diode-connected PMOS and NMOS transistors to drive, with high energy efficiency, the split-input inverting buffer used as output stage. Measurement results in 180-nm CMOS technology and across corner wafers demonstrate the good robustness and performance of the proposed circuit when compared to prior art designs. Finally, the design of a ULP/ULV comparator based on the DLS logic family was proposed. Two different topologies, i.e., a single-stage structure and a dual-stage architecture based on the combination of two single-stage comparators, were implemented and validated through measurements on 180-nm prototypes, demonstrating a power consumption of only a few tens of pW.

## **Bibliography**

- 1. Arbet, Daniel & Nagy, Lukas & Stopjakova, V. (2020). Ultra-Low-Voltage IC Design Methods. 10.5772/intechopen.91958.
- 2. M. Alioto, "Ultra-Low Power VLSI Circuit Design Demystified and Explained: A Tutorial," in *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 59, no. 1, pp. 3-29, Jan. 2012, doi: 10.1109/TCSI.2011.2177004.
- 3. National University of Singapore "Engineers invent smart microchip that can self-start and operate when battery runs out: Game-changing technology maximises lifetime and enables smaller, cheaper IoT devices." ScienceDaily. ScienceDaily, 3 May 2018. www.sciencedaily.com/releases/2018/05/180503101739.htm
- 4. Statista Research Department. Internet of Things (IoT) Connected Devices Installed Base Worldwide from 2015 to 2025 (in Billions); Statista Research Department: Ss. Cyril and Methodius University in Skopje, Faculty of Computer Science and Engineering, Skopje, Republic of North Macedonia, 2020.
- 5. C. Yu and X. Chen, "Carbon nanotubes based on-chip supercapacitors with improved areal energy density," 2020 IEEE 8th Electronics System-Integration Technology Conference (ESTC), Tønsberg, Vestfold, Norway, 2020, pp. 1-4, doi: 10.1109/ESTC48849.2020.9229835.
- 6. Z. Chen, M. Law, P. Mak and R. P. Martins, "A Single-Chip Solar Energy Harvesting IC Using Integrated Photodiodes for Biomedical Implant Applications," in *IEEE Transactions on Biomedical Circuits and Systems*, vol. 11, no. 1, pp. 44-53, Feb. 2017, doi: 10.1109/TBCAS.2016.2553152.
- 7. H. Le, N. Fong and H. C. Luong, "RF energy harvesting circuit with on-chip antenna for biomedical applications," *International Conference on Communications and Electronics* 2010, Nha Trang, 2010, pp. 115-117, doi: 10.1109/ICCE.2010.5670693.
- 8. L. Lin, S. Jain and M. Alioto, "A 595pW 14pJ/Cycle microcontroller with dual-mode standard cells and self-startup for battery-indifferent distributed sensing," 2018 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, 2018, pp. 44-46, doi: 10.1109/ISSCC.2018.8310175.
- 9. M. Mohammadifar, J. Zhang, I. Yazgan, V. Kariuki, O. Sadik and S. Choi, "High performance paper-based microbial fuel cells using nanostructured polymers," 2016 IEEE SENSORS, Orlando, FL, 2016, pp. 1-3, doi: 10.1109/ICSENS.2016.7808982.

- 10. Vuran, Mehmet & Salam, Abdul & Wong, Rigoberto & Irmak, Suat. (2018). Internet of Underground Things in Precision Agriculture: Architecture and Technology Aspects. Ad Hoc Networks. 81. 10.1016/j.adhoc.2018.07.017.
- 11. W. Yang, H. Jiang, Z. Wang and W. Jia, "An Ultra-Low Power Temperature Sensor Based on Relaxation Oscillator in Standard CMOS," 2018 IEEE International Conference on Electron Devices and Solid State Circuits (EDSSC), Shenzhen, 2018, pp. 1-2, doi: 10.1109/EDSSC.2018.8487078.
- 12. O. Aiello, P. Crovetti, L. Lin and M. Alioto, "A pW-Power Hz-Range Oscillator Operating With a 0.3–1.8-V Unregulated Supply," in *IEEE Journal of Solid-State Circuits*, vol. 54, no. 5, pp. 1487-1496, May 2019, doi: 10.1109/JSSC.2018.2886336.
- 13. S. Chen, V. P. J. Chung, D. Yao and W. Fang, "Vertically integrated CMOS-MEMS capacitive humidity sensor and a resistive temperature detector for environment application," 2017 19th International Conference on Solid-State Sensors, Actuators and Microsystems (TRANSDUCERS), Kaohsiung, 2017, pp. 1453-1454, doi: 10.1109/TRANSDUCERS.2017.7994333.
- 14. D. Yamane, T. Matsushima, T. Konishi, H. Toshiyoshi, K. Machida and K. Masu, "A dual-axis MEMS inertial sensor using multi-layered high-density metal for an arrayed CMOS-MEMS accelerometer," 2014 Symposium on Design, Test, Integration and Packaging of MEMS/MOEMS (DTIP), Cannes, 2014, pp. 1-4, doi: 10.1109/DTIP.2014.7056643.
- 15. Y. Luo, K. Teng, Y. Li, W. Mao, C. Heng and Y. Lian, "A 93µW 11Mbps wireless vital signs monitoring SoC with 3-lead ECG, bio-impedance, and body temperature," 2017 *IEEE Asian Solid-State Circuits Conference (A-SSCC)*, Seoul, 2017, pp. 29-32, doi: 10.1109/ASSCC.2017.8240208.
- 16. D. Lee, G. Dulai, and V. Karanassios, "Survey of energy harvesting and energy scavenging approaches for on-site powering of wireless sensor- and microinstrument-networks", Proc. SPIE 8728, Energy Harvesting and Storage: Materials, Devices, and Applications IV, 87280S (28 May 2013); https://doi.org/10.1117/12.2016238.
- 17. M. Rauzek, J. Konecny, M. Borova, K. Janosova, J. Hlavica, P. Musilek. "Energy Harvesting Sources, Storage Devices and System Topologies for Environmental Wireless Sensor Networks: A Review". *Sensors* 2018, *18*, 2446.
- 18. E. Ferro, P. López, V. M. Brea and D. Cabello, "On-Chip Solar Cell and PMU on the Same Substrate with Cold Start-Up from nW and 80 dB of Input Power Range for Biomedical Applications," 2019 IEEE International Symposium on Circuits and Systems (ISCAS), Sapporo, Japan, 2019, pp. 1-5, doi: 10.1109/ISCAS.2019.8702742.

- 19. Nguyen, N.Q.; Pochiraju, K.V. Behavior of thermoelectric generators exposed to transient heat sources. Appl. Therm. Eng. 2013, 51, 1–9.
- 20. Pandya, S., Velarde, G., Zhang, L. *et al.* New approach to waste-heat energy harvesting: pyroelectric energy conversion. *NPG Asia Mater* 11, 26 (2019). https://doi.org/10.1038/s41427-019-0125-y.
- 21. X. Yang and X. He, "A Piezoelectric Wind Energy Harvester with Interaction Between Vortex-Induced Vibration and Galloping," 2019 IEEE SENSORS, Montreal, QC, Canada, 2019, pp. 1-4, doi: 10.1109/SENSORS43011.2019.8956809.
- 22. Hyun Jun Jung, Yooseob Song, Seong Kwang Hong, Chan Ho Yang, Sung Joo Hwang, Se Yeong Jeong, Tae Hyun Sung, Design and optimization of piezoelectric impact-based micro wind energy harvester for wireless sensor network, Sensors and Actuators Physical, Volume 222, 2015, Pages 314-321, ISSN 0924-4247, https://doi.org/10.1016/j.sna.2014.12.010.
- 23. Zeng, Z., Ziegler, A.D., Searchinger, T. *et al.* A reversal in global terrestrial stilling and its implications for wind energy production. *Nat. Clim. Chang.* 9, 979–985 (2019). https://doi.org/10.1038/s41558-019-0622-6
- 24. Wang, P., Tanaka, K., Sugiyama, S. *et al.* A micro electromagnetic low level vibration energy harvester based on MEMS technology. *Microsyst Technol* 15, 941–951 (2009). https://doi.org/10.1007/s00542-009-0827-0
- 25. W.A. Serdijn, A.L.R. Mansano, M. Stoopman, Chapter 4.2 Introduction to RF Energy Harvesting, Editor(s): Edward Sazonov, Michael R. Neuman, Wearable Sensors, Academic Press, 2014, Pages 299-322, ISBN 9780124186620, https://doi.org/10.1016/B978-0-12-418662-0.00019-2.
- 26. K. Ishibashi, J. Ida, L. Nguyen, R. Ishikawa, Y. Satoh and D. Luong, "RF Characteristics of Rectifier Devices for Ambient RF Energy Harvesting," 2019 International Symposium on Electronics and Smart Devices (ISESD), Badung-Bali, Indonesia, 2019, pp. 1-4, doi: 10.1109/ISESD.2019.8909660.
- 27. Visconti P., Primiceri P., Ferri R., Pucciarelli M. & Venere E. (2017). An Overview on State-of-art Energy Harvesting Techniques and Choice Criteria: a WSN Node for Goods Transport and Storage Powered by a Smart Solar-based EH System. International Journal of Renewable Energy Research. 7. 1281-1295.
- 28. Wang, Jinhui & Li, Fei & Zhu, Feng & Schmidt, Oliver. (2018). Recent Progress in Micro-Supercapacitor Design, Integration, and Functionalization. Small Methods. 1800367. 10.1002/smtd.201800367.

- 29. Zhang, H.; Cao, Y.; Chee, M.O.L.; Dong, P.; Ye, M.; Shen, J. Recent advances in micro-supercapacitors. *Nanoscale* 2019, *11*, 5807–5821.
- 30. Bu, F., Zhou, W., Xu, Y. et al. Recent developments of advanced micro-supercapacitors: design, fabrication and applications. npj Flex Electron 4, 31 (2020). https://doi.org/10.1038/s41528-020-00093-6
- 31. Maximilian Kaus, Julia Kowal, Dirk Uwe Sauer, Modelling the effects of charge redistribution during self-discharge of supercapacitors, Electrochimica Acta, Volume 55, Issue 25, 2010, Pages 7516-7523, ISSN 0013-4686.
- 32. Waqas Ali Haider, Muhammad Tahir, Liang He, Wei Yang, Aamir Minhas-khan, Kwadwo Asare Owusu, Yiming Chen, Xufeng Hong, Liqiang Mai, Integration of VS2 nanosheets into carbon for high energy density micro-supercapacitor, Journal of Alloys and Compounds, Volume 823, 2020, 151769, ISSN 0925-8388.
- 33. Lang Liu, Dong Ye, Yao Yu, Lin Liu, Yue Wu, Carbon-based flexible micro-supercapacitor fabrication via mask-free ambient micro-plasma-jet etching, Carbon, Volume 111, 2017, Pages 121-127, ISSN 0008-6223, https://doi.org/10.1016/j.carbon.2016.09.037.
- 34. Ankur Soam, Nitin Arya, Aniruddh Singh, Rajiv Dusane, Fabrication of silicon nanowires based on-chip micro-supercapacitor, Chemical Physics Letters, Volume 678, 2017, Pages 46-50, ISSN 0009-2614, https://doi.org/10.1016/j.cplett.2017.04.019.
- 35. K. Brousse, S. Nguyen, A. Gillet, S. Pinaud, R. Tan, A. Meffre, K. Soulantica, B. Chaudret, P.L. Taberna, M. Respaud, P. Simon, Laser-scribed Ru organometallic complex for the preparation of RuO2 micro-supercapacitor electrodes on flexible substrate, Electrochimica Acta, Volume 281, 2018, Pages 816-821, ISSN 0013-4686, https://doi.org/10.1016/j.electacta.2018.05.198.
- 36. Chien-Wei Wu, Binesh Unnikrishnan, I-Wen Peter Chen, Scott G. Harroun, Huan-Tsung Chang, Chih-Ching Huang, Excellent oxidation resistive MXene aqueous ink for micro-supercapacitor application, Energy Storage Materials, Volume 25, 2020, Pages 563-571, ISSN 2405-8297, https://doi.org/10.1016/j.ensm.2019.09.026.
- 37. Ankur Soam, Kaushik Parida, Rahul Kumar, Pravin Kavle, Rajiv O. Dusane, Silicon-MnO2 core-shell nanowires as electrodes for micro-supercapacitor application, Ceramics International, Volume 45, Issue 15, 2019, Pages 18914-18923, ISSN 0272-8842, https://doi.org/10.1016/j.ceramint.2019.06.127.

- 38. M. Alioto, Enabling the Internet of Things: From Integrated Circuits to Integrated Systems. Springer, 2017.
- 39. F. Crupi, R. De Rose, M. Paliy, M. Lanuzza, M. Perna, and G. Iannaccone, "A portable class of 3-transistor current references with low-power sub-0.5 V operation," *International Journal of Circuit Theory and Applications*, vol. 46, no. 4, pp. 779–795, Apr. 2018.
- 40. L. Lin, S. Jain, and M. Alioto, "Sub-nW Microcontroller With Dual-Mode Logic and Self-Startup for Battery-Indifferent Sensor Nodes," accepted to *IEEE Journal of Solid-State Circuits*.
- 41. M. Cho, et al., "A 142 nW Voice and Acoustic Activity Detection Chip for mm-Scale Sensor Nodes Using Time-Interleaved Mixer-Based Frequency Scanning," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 352–353.
- 42. L. Lin, S. Jain, and M. Alioto, "Multi-Sensor Platform with Five-Order-of-Magnitude System Power Adaptation down to 3.1nW and Sustained Operation under Moonlight Harvesting," in 2020 IEEE Symposium on VLSI Circuits, Honolulu, HI, USA, June 2020, pp. 1-2.
- 43. P. Nadeau, M. Mimee, S. Carim, T. K. Lu, and A. P. Chandrakasan, "Nanowatt Circuit Interface to Whole-Cell Bacterial Sensors," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 352–353.
- 44. M. Alioto, "Energy Harvesters for IoT: Applications and Key Aspects," *Short course at VLSI Symposium 2015*, Kyoto, June 15, 2015.
- 45. J. Goeppert and Y. Manoli, "Fully Integrated Startup at 70 mV of Boost Converters for Thermoelectric Energy Harvesting," *IEEE J. Solid-State Circuits*, vol. 51, no. 7, 2016.
- 46. S. Bandyopadhyay and A. P. Chandrakasan, "Platform Architecture for Solar, Thermal, and Vibration Energy Combining with MPPT and Single Inductor," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 2199–2215, 2012.
- 47. J. Li et al., "Body-Area Powering with Human Body-Coupled Power Transmission and Energy Harvesting ICs," *IEEE Transactions on Biomedical Circuits and Systems*, early access.
- 48. P. Toledo, P. Crovetti, O. Aiello, M. Alioto, "Fully Digital Rail-to-Rail OTA with Sub-1,000 μm2 Area, 250-mV Minimum Supply and nW Power at 150-pF Load in 180nm," *IEEE Solid-State Circuits Letters*, vol. 3, pp. 474-477, Sept. 2020.
- 49. L. Lin, S. Jain, and M. Alioto, "Integrated Power Management for Battery-Indifferent Systems With Ultra-Wide Adaptation Down to nW," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 4, pp. 967-976, April 2020.

- 50. S. K. Saha, "Modeling Process Variability in Scaled CMOS Technology," in *IEEE Design & Test of Computers*, vol. 27, no. 2, pp. 8-16, March-April 2010, doi: 10.1109/MDT.2010.50.
- 51. L. Pang and B. Nikolic, "Measurements and Analysis of Process Variability in 90 nm CMOS," in *IEEE Journal of Solid-State Circuits*, vol. 44, no. 5, pp. 1655-1663, May 2009, doi: 10.1109/JSSC.2009.2015789.
- 52. P. G. Drennan and C. C. McAndrew, "Understanding MOSFET mismatch for analog design," in *IEEE Journal of Solid-State Circuits*, vol. 38, no. 3, pp. 450-456, March 2003, doi: 10.1109/JSSC.2002.808305.
- 53. M. J. M. Pelgrom, A. C. J. Duinmaijer and A. P. G. Welbers, "Matching properties of MOS transistors," in *IEEE Journal of Solid-State Circuits*, vol. 24, no. 5, pp. 1433-1439, Oct. 1989, doi: 10.1109/JSSC.1989.572629.
- 54. Qing Dong, Kaiyuan Yang, D. Blaauw and D. Sylvester, "A 114-pW PMOS-only, trim-free voltage reference with 0.26% within-wafer inaccuracy for nW systems," 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), Honolulu, HI, 2016, pp. 1-2, doi: 10.1109/VLSIC.2016.7573494.
- 55. M. Seok, G. Kim, D. Blaauw, and D. Sylvester, "A Portable 2-Transistor Picowatt Temperature-Compensated Voltage Reference Operating at 0.5 V," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 10, pp. 2534–2545, Oct. 2012.
- 56. L. Magnelli, F. Crupi, P. Corsonello, C. Pace, and G. Iannaccone, "A 2.6 nW, 0.45 V Temperature-Compensated Subthreshold CMOS Voltage Reference," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 2, pp. 465–474, Feb. 2011.
- 57. I. Lee, D. Sylvester, and D. Blaauw, "A Subthreshold Voltage Reference With Scalable Output Voltage for Low-Power IoT Systems," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 5, pp. 1443–1449, May 2017.
- 58. C. de Oliveira, D. Cordova, H. Klimach, and S. Bampi, "Picowatt, 0.45–0.6 V Self-Biased Subthreshold CMOS Voltage Reference," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 12, pp. 3036–3046, Dec. 2017.
- 59. L. Wang and C. Zhan, "A 0.7-V 28-nW CMOS Subthreshold Voltage and Current Reference in One Simple Circuit," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 66, no. 9, pp. 3457–3466, Sep. 2019.

- 60. H. Wang, P.P. Mercier, "A 420fW Self-Regulated 3T Voltage Reference Generator Achieving 0.47%/V Line Regulation from 0.4-to-1.2V," in *Proc. IEEE European Solid-State Circuits Conference (ESSCIRC)*, Sep. 2017.
- 61. H. Wang, P.P. Mercier, "A 763 pW 230 pJ/Conversion Fully-Integrated CMOS Temperature-to-Digital Converter with +0.81/-0.75°C Inaccuracy," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 8, pp. 2281-2290, Aug. 2019.
- 62. A. Shrivastava, K. Craig, N. E. Roberts, D. D. Wentzloff, and B. H. Calhoun, "A 32nW Bandgap Reference Voltage Operational from 0.5V Supply for Ultra-Low Power Systems," in 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, Feb. 2015, pp. 94–95.
- 63. Y. Ji, C. Jeon, H. Son, B. Kim, H.-J. Park, and J.-Y. Sim, "A 9.3nW All-in-One Bandgap Voltage and Current Reference Circuit," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2017, pp. 100–101.
- 64. J. M. Lee, Y. Ji, S. Choi, Y.-C. Cho, S.-J. Jang, J. S. Choi, B. Kim, H.-J. Park, and J.-Y. Sim, "A 29nW Bandgap Reference Circuit," in 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, Feb. 2015, pp. 100–101.
- 65. Y. Osaki, T. Hirose, N. Kuroki, and M. Numa, "1.2-V Supply, 100-nW, 1.09-V Bandgap and 0.7-V Supply, 52.5-nW, 0.55-V Sub-bandgap Reference Circuits for Nanowatt CMOS LSIs," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 6, pp. 1530–1538, June 2013.
- 66. M. Kim and S. Cho, "0.8V, 37nW, 42ppm/°C Sub-Bandgap Voltage Reference with PSRR of -81dB and Line Sensitivity of 51ppm/V in 0.18um CMOS," in 2017 Symposium on VLSI Circuits, June 2017, pp. 144–145.
- 67. I. Lee and D. Blaauw, "A 31 pW-to-113 nW Hybrid BJT and CMOS Voltage Reference with 3.6% ±3σ-Inaccuracy from 0 °C to 170 °C for Low-Power High-Temperature IoT Systems," in 2019 Symposium on VLSI Circuits, June 2019, pp. 142–143.
- 68. Y. Ji, J. Lee, B. Kim, H.-J. Park, and J.-Y. Sim, "A 192-pW Voltage Reference Generating Bandgap–Vth With Process and Temperature Dependence Compensation," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 12, pp. 3281–3291, Dec. 2019.
- 69. R. Mohamed, M. Chen and G. Wang, "Untrimmed CMOS Nano-Ampere Current Reference with Curvature-Compensation Scheme," 2019 IEEE International Symposium on Circuits and Systems (ISCAS), Sapporo, Japan, 2019, pp. 1-4, doi: 10.1109/ISCAS.2019.8702293.
- 70. M. S. Eslampanah Sendi, S. Kananian, M. Sharifkhani and A. M. Sodagar, "Temperature Compensation in CMOS Peaking Current References," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 65, no. 9, pp. 1139-1143, Sept. 2018, doi: 10.1109/TCSII.2018.2805832.

- 71. S. S. Chouhan and K. Halonen, "A 0.67-µW 177-ppm/°C All-MOS Current Reference Circuit in a 0.18µm CMOS Technology," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 63, no. 8, pp. 723-727, Aug. 2016, doi: 10.1109/TCSII.2016.2531158.
- 72. E. M. Camacho-Galeano, C. Galup-Montoro and M. C. Schneider, "A 2-nW 1.1-V self-biased current reference in CMOS technology," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 52, no. 2, pp. 61-65, Feb. 2005, doi: 10.1109/TCSII.2004.842059.
- 73. Q. Huang, C. Zhan, L. Wang, Z. Li and Q. Pan, "A -40 °C to 120 °C, 169 ppm/°C Nano-Ampere CMOS Current Reference," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 67, no. 9, pp. 1494-1498, Sept. 2020, doi: 10.1109/TCSII.2020.3009838.
- 74. M. Choi, I. Lee, T. -K. Jang, D. Blaauw and D. Sylvester, "A 23pW, 780ppm/°C resistorless current reference using subthreshold MOSFETs," *ESSCIRC 2014 - 40th European Solid State Circuits Conference (ESSCIRC)*, Venice Lido, 2014, pp. 119-122, doi: 10.1109/ESSCIRC.2014.6942036.
- 75. L. Wang and C. Zhan, "A 0.7-V 28-nW CMOS Subthreshold Voltage and Current Reference in One Simple Circuit," in *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 66, no. 9, pp. 3457-3466, Sept. 2019, doi: 10.1109/TCSI.2019.2927240.
- 76. H. Wang and P. P. Mercier, "A 3.4-pW 0.4-V 469.3 ppm/°C Five-Transistor Current Reference Generator," in *IEEE Solid-State Circuits Letters*, vol. 1, no. 5, pp. 122-125, May 2018, doi: 10.1109/LSSC.2018.2875825.
- 77. S. Lee, S. Heinrich-Barna, K. Noh, K. Kunz and E. Sánchez-Sinencio, "A 1-nA 4.5-nW 289-ppm/°C Current Reference Using Automatic Calibration," in *IEEE Journal of Solid-State Circuits*, vol. 55, no. 9, pp. 2498-2512, Sept. 2020, doi: 10.1109/JSSC.2020.2995038.
- 78. R. Paul and A. Patra, "Trimming process and temperature variation in second-order bandgap voltage reference circuits," *Microelectronics Journal*, vol. 42, no. 2, pp. 271-276, 2011, doi: 10.1016/j.mejo.2010.10.006.
- 79. S. Thirunakkarasu and B. Bakkaloglu, "Built-in Self-Calibration and Digital-Trim Technique for 14-Bit SAR ADCs Achieving ±1 LSB INL," in *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 23, no. 5, pp. 916-925, May 2015, doi: 10.1109/TVLSI.2014.2321761.

- 80. M. Yoshioka et al., "A 10-b 50-MS/s 820-µW SAR ADC with on-chip digital calibration," *Proc. IEEE Int. Solid-State Circuits Conf.*, pp. 384-385, Feb. 2010.
- 81. M. Lanuzza, P. Corsonello, and S. Perri, "Fast and wide range voltage conversion in multisupply voltage designs," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 23, no. 2, pp. 388–391, Feb. 2015.
- 82. W. Zhao, A.B. Alvarez, and Y. Ha, "A 65-nm 25.1-ns 30.7-fJ Robust Subthreshold Level Shifter with Wide Conversion Range," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 62, no. 7, pp. 671–675, Jul. 2015.
- 83. M. Lanuzza, F. Crupi, S. Rao, R. De Rose, S. Strangio, and G. Iannaccone, "An Ultralow-Voltage Energy-Efficient Level Shifter," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 64, no. 1, pp. 61–65, Jan. 2017.
- 84. S. Kabirpour and M. Jalali, "A Low-Power and High-Speed Voltage Level Shifter Based on a Regulated Cross-Coupled Pull-Up Network," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 66, no. 6, pp. 909–913, Jun. 2019.
- 85. E. Låte, T. Ytterdal, and S. Aunet, "An Energy Efficient Level Shifter Capable of Logic Conversion From Sub-15 mV to 1.2 V," *IEEE Transactions on Circuits and Systems II: Express Briefs*, in press.
- 86. R. Matsuzuka, T. Hirose, Y. Shizuku, K. Shinonaga, N. Kuroki, and M. Numa, "An 80-mV-to-1.8-V Conversion-Range Low-Energy Level Shifter for Extremely Low-Voltage VLSIs," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 8, pp. 2026–2035, Aug. 2017.
- 87. J. Zhou, C. Wang, X. Liu, X. Zhang, and M. Je, "An Ultra-Low Voltage Level Shifter Using Revised Wilson Current Mirror for Fast and Energy-Efficient Wide-Range Voltage Conversion from Sub-Threshold to I/O Voltage," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 62, no. 3, pp. 697–706, Mar. 2015.
- 88. V. L. Le and T.T.-H. Kim, "An Area and Energy Efficient Ultra-Low Voltage Level Shifter With Pass Transistor and Reduced-Swing Output Buffer in 65-nm CMOS," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 65, no. 5, pp. 607–611, May 2018.
- 89. L. Wen, X. Cheng, S. Tian, H. Wen, and X. Zeng, "Subthreshold Level Shifter with Self-Controlled Current Limiter by Detecting Output Error," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 63, no. 4, pp. 346–350, Apr. 2016.

- 90. S. Kabirpour and M. Jalali, "A Power-Delay and Area Efficient Voltage Level Shifter Based on a Reflected-Output Wilson Current Mirror Level Shifter," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 67, no. 2, pp. 250–254, Feb. 2020.
- 91. R. Lotfi, M. Saberi, S.R. Hosseini, A.R. Ahmadi-Mehr, and R.B. Staszewski, "Energy-Efficient Wide-Range Voltage Level Shifters Reaching 4.2 fJ/Transition," *IEEE Solid-State Circuits Letters*, vol. 1, no. 2, pp. 34–37, Feb. 2018.
- 92. E. Maghsoudloo, M. Rezaei, M. Sawan, and B. Gosselin, "A High-Speed and Ultra Low-Power Subthreshold Signal Level Shifter," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 5, pp. 1164–1172, May 2017.
- 93. R. Luzzi, M. Bucci, and A. Trifiletti, "Self-biased cascode current mirror," U.S. Patent 0160557 A1, Jun. 25, 2009.
- 94. F. Ishihara, F. Sheikh, and B. Nikolic, "Level conversion for dual-supply systems," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 12, no. 2, pp. 185–195, Feb. 2004.
- 95. W. Jendernalik, "An Ultra-Low-Energy Analog Comparator for A/D Converters in CMOS Image Sensors," *Circuits, Systems, and Signal Processing*, vol. 36, pp. 4829–4843, 2017, doi: 10.1007/s00034-017-0630-6.
- 96. S. Hussain, R. Kumar and G. Trivedi, "Comparison and Design of Dynamic Comparator in 180nm SCL Technology for Low Power and High Speed Flash ADC," 2017 IEEE International Symposium on Nanoelectronic and Information Systems (iNIS), Bhopal, 2017, pp. 139-144, doi: 10.1109/iNIS.2017.37.
- 97. S. Babayan-Mashhadi and R. Lotfi, "Analysis and Design of a Low-Voltage Low-Power Double-Tail Comparator," in *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 22, no. 2, pp. 343-352, Feb. 2014, doi: 10.1109/TVLSI.2013.2241799.
- 98. D. Bol, R. Ambroise, D. Flandre and J. Legat, "Building Ultra-Low-Power Low-Frequency Digital Circuits with High-Speed Devices," 2007 14th IEEE International Conference on Electronics, Circuits and Systems, Marrakech, 2007, pp. 1404-1407, doi: 10.1109/ICECS.2007.4511262.
- 99. W. Lim, I. Lee, D. Sylvester and D. Blaauw, "8.2 Batteryless Sub-nW Cortex-M0+ processor with dynamic leakage-suppression logic," 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, San Francisco, CA, 2015, pp. 1-3, doi: 10.1109/ISSCC.2015.7062968.

- 100. H. Kawaguchi, K. Nose, T. Sakurai, "A super cut-off CMOS (SCCMOS) scheme for 0.5-V supply voltage with picoampere stand-by current," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 10, pp. 1498-1501, Oct. 2000.
- 101. O. Aiello, P. Crovetti and M. Alioto, "Fully Synthesizable, Rail-to-Rail Dynamic Voltage Comparator for Operation down to 0.3 V," 2018 IEEE International Symposium on Circuits and Systems (ISCAS), Florence, 2018, pp. 1-5, doi: 10.1109/ISCAS.2018.8351106.
- 102. A. Khorami, M. B. Dastjerdi and A. F. Ahmadi, "A low-power high-speed comparator for analog to digital converters," 2016 IEEE International Symposium on Circuits and Systems (ISCAS), Montreal, QC, 2016, pp. 2010-2013, doi: 10.1109/ISCAS.2016.7538971.
- 103. Y. Hwang and D. Jeong, "Ultra-low-voltage low-power dynamic comparator with forward body bias scheme for SAR ADC," in *Electronics Letters*, vol. 54, no. 24, pp. 1370-1372, Nov. 2018, doi: 10.1049/el.2018.6340.

## **List of Publications**

- [C1] L. Fassio, L. Lin, R. De Rose, M. Lanuzza, F. Crupi, and M. Alioto, "A 0.25-V, 5.3-pW Voltage Reference with 25-μV/°C Temperature Coefficient, 140-μV/V Line Sensitivity and 2,200-μm² Area in 180nm," in 2020 IEEE Symposium on VLSI Circuits, Honolulu, HI, USA, June 2020, pp. 1–2.
- [J1] L. Fassio et al., "A Robust, High-Speed and Energy-Efficient Ultralow-Voltage Level Shifter," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, doi: 10.1109/TCSII.2020.3033253.