XII Front-End Electronics Workshop 12<sup>th</sup> – 16<sup>th</sup> June 2023

# The CMS Outer-Tracker ASICs



Alessandro Caratelli

on behalf of the CMS OT ASICs working-group

### CMS Outer Tracker ASICs



### CIC ASIC

DESIGNED AND TESTED BY:
L. Caponetto • G. Galbit,
B. Nodari • S. Viret • S. Scarfi (IP2I Lyon university)

With contributions from A. Caratelli, D. Ceresa



## SSA ASIC

#### DESIGNED AND TESTED BY:

A. Caratelli, G. Bergamin, D. Ceresa, J. Kaplon, K. Kloukinas, S. Scarfi



## MPA ASIC

#### DESIGNED AND TESTED BY:

D. Ceresa, G. Bergamin, A. Caratelli, J. Kaplon, K. Kloukinas, A. Nookala, S. Scarfi



Total of: ~ 185 000 chips

### A few years back: requirements for the High-Luminosity tracker upgrade [1]

#### Phase-2 upgrade tracker requirements:

- Higher luminosity
- Increased pileup events per BX
- Increased radiation tolerance
- Reduced material budget
- Participate in the L1 trigger
- Improve trigger performance

#### Requirements for the tracker electronics:

- Increase granularity
- Introduction of a pixelated sensor in OT
- Radiation tolerance up to 100 Mrad or more
- Quick and on-chip particle discrimination
- Higher trigger rate (1MHz) and longer latency (12.5 μs)
- Power density < 100 mW/cm<sup>2</sup>
- Add tracking information to the Level-1 trigger decision

[1] CMS collaboration. "The phase-2 upgrade of the CMS tracker." CMS-TDR-014 (2017).

### A few years back: a novel particle detector electronic system

For every event, the outer tracker can provide additional information for the trigger decision, which significantly improves the particle recognition efficiency.

### The complete real time tracker readout is not feasible

How? The readout electronics in the detector send pre-selected information for the Level-1 event reconstruction.

**Intelligent pixel particle detector** capable of locally self-selecting the particle signatures that are interesting for physics, without relying on an external trigger system.

This approach allows a significant improvement in data reduction efficiency.

The detector provides particle transverse momentum information in addition to simple geometrical position and energy measurements: an intelligent particle tracking system based on $p_T$ discrimination.



### The CMS Outer Tracker modules [1, 2]



[1] CMS collaboration. "The phase-2 upgrade of the CMS tracker." CMS-TDR-014 (2017).

[2] Abbaneo, Duccio. "Upgrade of the CMS Tracker with tracking trigger." Journal of Instrumentation 6.12 (2011): C12065.

### The Pixel-Strip module



[3] Caratelli, Alessandro, et al. Characterization of the first prototype of the Silicon-Strip readout ASIC (SSA). No. CMS-CR-2018-286. 2018.

[4] Ceresa, Davide, et al. Characterization of the MPA prototype, a 65 nm pixel readout ASIC with on-chip quick transverse momentum discrimination capabilities. No. CMS-CR-2018-279. 2018.

[5] Moreira, Paulo. "The LpGBT project status and overview." ACES. 2016.

[6] Nodari, Benedetta, et al. A 65 nm data concentration ASIC for the CMS outer tracker detector upgrade at HL-LHC. No. CMS-CR-2018-278. 2018.

### MPA and SSA ASICs: system level architecture choices

Initially, several system design choices had to be made:

- Define which functionalities are implemented in the SSA and which in the MPA
- How to minimize system power requirements
- How to minimize bandwidth requirements
- Maximize the particle recognition efficiency
- Optimize bandwidth among ASICs
- Data encoding
- Transmission FIFOs depth
- Data compression
- Particle hit clustering at SSA level
- Several others

A purely analytic approach is not sufficient: the functionality and the efficiency depend on physics statistics, particle rates and hit occupancy (no simple inputs).

A simulation framework therefore becomes necessary, capable of providing:

**System studies and performance evaluation**

- Study and compare different system implementations
- Evaluate the tradeoff between performance and power optimization
- Report efficiency parameters by comparison with a system reference model
- Evaluate the efficiency of the particle recognition and of the data readout
- **Realistic stimuli generation** from Monte-Carlo simulations of complex interactions in high-energy particle collisions

**Design Verification** 

- Verify the RTL implementation and the chip-set functionalities
- Generation of **realistic activity information** for precise power analysis
- Verify post-layout netlist
- Verify, with clock-cycle precision, the subsystem integration and the communication among modules and ASICs

### System level simulation framework [7]

#### Implemented in: SystemVerilog HDL / UVM + Python



[7] Caratelli, Alessandro, et al. "System Level simulation framework for the ASICs development of a novel particle physics detector." 2018 14th Conference on Ph. D. Research in Microelectronics and Electronics (PRIME). IEEE, 2018.

### System architecture definition [8]



[8] A. Caratelli, D. Ceresa, S. Kloukinas, S. Scarfi et al. Readout architecture for the Pixel-Strip module of the CMS Outer Tracker Phase-2 upgrade. No. CMS-CR-2016-405.

ALESSANDRO CARATELLI | CERN EP-ESE | 11





![](_page_12_Figure_3.jpeg)

- Fast combinatorial clustering at event rate to limit the cross-talk effect
- Wide clusters represent uninteresting events and are **filtered** to optimize bandwidth and processing power
- Correction of the parallax error introduced by approximating the cylindrical geometry with planar pixel-strip sensors
- Up to 8 cluster centroid coordinates are transmitted per event to the MPA coincidence logic for the **stub generation** and the **transverse momentum discrimination**
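The clustering, wide-cluster filtering and centroid extraction listed above can be sketched behaviourally as follows. This is only an illustrative Python model, not the SSA logic itself: the function name, the maximum cluster width of 4 strips and the strip-unit centroid format are assumptions, and the parallax correction is not modelled. Only the limit of 8 centroids per event comes from the slide.

```python
from typing import List

MAX_CLUSTER_WIDTH = 4   # assumed cut value: wider clusters are dropped
MAX_CENTROIDS = 8       # from the slide: up to 8 centroids per event


def find_centroids(hits: List[int],
                   max_width: int = MAX_CLUSTER_WIDTH,
                   max_centroids: int = MAX_CENTROIDS) -> List[float]:
    """Return centroid positions (in strip units) of the accepted clusters.

    hits holds one 0/1 value per strip for a single event.
    """
    centroids: List[float] = []
    start = None
    # A trailing 0 closes a cluster that touches the last strip.
    for i, hit in enumerate(list(hits) + [0]):
        if hit and start is None:
            start = i                          # a cluster begins here
        elif not hit and start is not None:
            width = i - start                  # cluster covered start..i-1
            if width <= max_width:             # filter out wide clusters
                centroids.append(start + (width - 1) / 2)
            start = None
    return centroids[:max_centroids]           # keep at most 8 per event


print(find_centroids([0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0]))
```

In this example the two-strip and one-strip clusters are kept, while the six-strip cluster is rejected as too wide.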

![](_page_13_Figure_2.jpeg)

![](_page_14_Figure_2.jpeg)

[9] D. Ceresa, A. Caratelli, G. Bergamin, J. Kaplon, K. Kloukinas, S. Scarfi "MPA-SSA, design and test of a 65nm ASIC-based system for particle tracking at HL-LHC featuring on-chip particle discrimination." 2019 IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC). IEEE, 2019.

### System architecture definition

![](_page_15_Figure_2.jpeg)


![](_page_16_Figure_2.jpeg)

#### Coordinate encoding

#### MPA

**Mephisto Encoder** Up to 2 coordinates per cycle. Significantly lower power consumption.

![](_page_16_Figure_6.jpeg)

#### SSA

**Priority Encoder** Encodes up to 8 centroids over the 132-bit vector in the periphery logic.

Less power efficient, but minimizing the latency in the SSA reduces the consumption in the MPA by a much larger amount.
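A priority encoder of this kind can be modelled in a few lines. The sketch below is a behavioural Python model under stated assumptions: the function name is hypothetical, the lowest-index-first priority order is assumed, and the hardware resolves all outputs combinatorially rather than in a loop. The 132-bit width and the limit of 8 outputs are taken from the slide.

```python
from typing import List


def priority_encode(vector: int, width: int = 132, max_out: int = 8) -> List[int]:
    """Return the positions of the first max_out set bits, lowest index first."""
    positions: List[int] = []
    v = vector & ((1 << width) - 1)            # keep only the valid bits
    while v and len(positions) < max_out:
        lowest = v & -v                        # isolate the lowest set bit
        positions.append(lowest.bit_length() - 1)
        v ^= lowest                            # clear it and look for the next
    return positions


print(priority_encode(0b1010010))
```

Any set bit beyond the eighth is simply not reported in this model, which mirrors the fixed output budget of the encoder.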

#### What counts is the overall power!

![](_page_16_Figure_11.jpeg)

[Figure: MPA and SSA block diagram. Labels recoverable from the image: pixel rows 0 to 15, pixel encoder, strip encoder, coincidence matrix, stub finding logic, stub sorter, stub FIFO with BX averaging, trigger interface, L1 data interface, serializers and de-serializers. MPA input: 76 Gb/s pixel data and 2.56 Gb/s strip data (8 strip centroids per BX, 8 SLVS at 320 Mbps). MPA output: 1.6 Gb/s stub data (5 stubs / 2 BX = 80 bits, 5 SLVS at 320 Mbps) and L1 data (1 SLVS at 320 Mbps). Hit L1 = 1920 bits; 4 pixel centroids per row = 512 bits with zero compression. Clocks: 40 MHz and 320 MHz.]
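The data rates quoted in the block diagram can be cross-checked with simple arithmetic. The sketch below is an illustration only: it assumes that the 1920-bit hit vector is sampled once per 40 MHz bunch crossing and that each SLVS line carries 320 Mbps. The function and key names are hypothetical.

```python
BX_RATE_HZ = 40e6        # 40 MHz bunch-crossing clock (from the diagram)
LINE_RATE_BPS = 320e6    # 320 Mbps per SLVS line (from the diagram)


def ps_module_rates() -> dict:
    """Recompute the bandwidth figures from the diagram labels."""
    return {
        # 1920 hit bits per bunch crossing (assumed one bit per pixel)
        "pixel_input_bps": 1920 * BX_RATE_HZ,
        # 8 SLVS lines from the SSA to the MPA
        "strip_input_bps": 8 * LINE_RATE_BPS,
        # 5 SLVS lines carrying stub data out of the MPA
        "stub_output_bps": 5 * LINE_RATE_BPS,
        # 5 stubs per 2 BX = 80 bits, i.e. 40 bits per bunch crossing
        "stub_payload_bps": 80 / 2 * BX_RATE_HZ,
    }


for name, rate in ps_module_rates().items():
    print(f"{name}: {rate / 1e9:.2f} Gb/s")
```

Under these assumptions the pixel input comes out at 76.8 Gb/s, close to the 76 Gb/s quoted, and the 80-bit stub payload exactly fills the five 320 Mbps output lines.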

![](_page_18_Figure_1.jpeg)

![](_page_19_Figure_2.jpeg)


### Design for Testability

#### Memory Built-In-Self-Test

- Test full memory functionality in <10 ms
- Results saved in internal registers accessible via slow-control
- Little additional hardware: self-contained hierarchical block
- Clock gating during normal operation (only leakage power)
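The slides do not say which test algorithm the memory BIST runs. Purely as an illustration of how such a self-test detects faulty cells, the sketch below applies the well-known March C- sequence to a behavioural one-bit-wide memory model; the class, the function and the injected stuck-at fault are all hypothetical.

```python
from typing import List, Optional


class BitMemory:
    """Behavioural 1-bit-wide memory with an optional stuck-at cell."""

    def __init__(self, size: int, stuck_addr: Optional[int] = None,
                 stuck_val: int = 0) -> None:
        self.cells = [0] * size
        self.stuck_addr = stuck_addr
        self.stuck_val = stuck_val

    def write(self, addr: int, val: int) -> None:
        self.cells[addr] = val

    def read(self, addr: int) -> int:
        if addr == self.stuck_addr:
            return self.stuck_val          # the faulty cell ignores writes
        return self.cells[addr]


def march_c_minus(mem: BitMemory, size: int) -> List[int]:
    """Run March C- and return the sorted list of failing addresses."""
    failing = set()
    up = range(size)
    down = range(size - 1, -1, -1)

    def element(order, expect, write):
        for addr in order:
            if expect is not None and mem.read(addr) != expect:
                failing.add(addr)
            if write is not None:
                mem.write(addr, write)

    element(up, None, 0)      # write 0 everywhere
    element(up, 0, 1)         # ascending: read 0, write 1
    element(up, 1, 0)         # ascending: read 1, write 0
    element(down, 0, 1)       # descending: read 0, write 1
    element(down, 1, 0)       # descending: read 1, write 0
    element(up, 0, None)      # final read 0
    return sorted(failing)


print(march_c_minus(BitMemory(16, stuck_addr=5, stuck_val=1), 16))
```

In hardware the equivalent of the returned list would be a pass/fail status stored in registers and read back via slow-control.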

#### Periphery Scan Chain

- FSM Easy to access standardized approach with TRL
- 92% of fault coverage in SSA (300ms)
- ~95% of fault coverage and 25k ff in MPA (750ms)

![](_page_20_Figure_11.jpeg)

### Logic Built-In-Self-Test for pixel array

- FSM embedded in Pixel Array logic and vectors from configuration
- Requires compression / decompression logic
- ~90% coverage

### **Power Reduction Methodology**

#### Power optimization

- Clock gating in all configuration registers and logic
- Architecture studies to minimize power consumption
- Use Multi-VT standard cells
- Use gated SRAM blocks
- Multi-supply voltage (1.0 V and 1.2 V)
- Find power hungry and low activity blocks and optimize their implementation

![](_page_21_Picture_9.jpeg)

### Power study:

- Static and Dynamic power analysis
- Voltage drop analysis in different scenarios
- Power-Grid-View (PGV) for macros

![](_page_21_Figure_14.jpeg)

### Total Ionizing Dose effects hardening

Digital domain:

- 9-tracks library selected as compromise between power consumption and radiation tolerance considering the operating range -40°C / 0°C
- Characterization of the digital cells parameters (prop. delay, transition time, setup/hold, etc. ) for radiation corner
- Increased margins for TID degradation (setup uncertainty: jitter + additional 8% of clock period; reduced max transition; derate factors)
- Due to narrow channel effects → Removed minimum-width cells (D0, D1) and delay elements
- Only thin-oxide devices  $\rightarrow$  1.2V max (CMOS IO and SLVS)
- Custom latch-up-resistant ESD structures

#### Memories:

- A custom memory compiler made it possible to generate an SRAM whose cell transistors feature nMOS W > 200 nm and pMOS W > 500 nm
- Protection against latch-up is achieved by placing p<sup>+</sup> guard bands between n<sup>-</sup> regions

Usage of ELT devices in input stage:

- To prevent the radiation induced drain-to-source leakage current increase due to the charge trapped in the shallow trench isolations (STI).
- To mitigate the 1/f noise increase on irradiated devices due to side-effects of the STI region in nMOS operated at low drain current.

### Digital library choice and delay corner comparison

- Supply voltage scaling
- 9 tracks library chosen as compromise between power consumption and radiation tolerance
- The temperature inversion effect prevents the SSA from using high-Vt library cells at 0.9 V
- Mix of standard-Vt and low-Vt digital cells at 1.0V+10% as compromise of power consumption, memory operation and propagation delay at -40°C

![](_page_23_Figure_6.jpeg)

### Single Event Effects tolerance

#### State machines

- Triple modular redundancy (full)
- Triplicated Clock-trees
- Triplicated Reset distribution
- FF minimum distance 15um

#### Latch FIFOs

- Control and header fields triplicated
- Data latches not protected

#### Data pipeline

- No SEU protection applied due to limited power budget

#### Clock tree

- Clock tree triplicated
- The non-triplicated logic uses the voted clock in critical areas
- The non-triplicated logic uses one of the branches in non-critical areas:
  - Simplify scan-chain insertion
  - Helps in reducing buffering for hold fix (power)
  - Allow for CPPR on the 3 branches

#### Triplicated pads for

- Clock
- Control
- Reset
- Scan-chain IOs

#### Configuration registers

- Triple modular redundancy with error detection and self-correction
- Clock enabled only during:
  - asynchronous readout operations
  - configuration operations
  - self-correction
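The behaviour of a triplicated configuration register with error detection and self-correction can be sketched as follows. This is a behavioural Python model with hypothetical names, not the ASIC implementation: a bitwise majority voter provides the output, and a scrub step rewrites all three copies and increments a correction counter whenever a mismatch is detected.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote of three register copies."""
    return (a & b) | (b & c) | (a & c)


class TmrRegister:
    """Triplicated register with voted output and self-correction."""

    def __init__(self, value: int = 0, width: int = 8) -> None:
        self.mask = (1 << width) - 1
        self.copies = [value & self.mask] * 3
        self.corrections = 0               # counts self-correction events

    def write(self, value: int) -> None:
        self.copies = [value & self.mask] * 3

    def read(self) -> int:
        return majority(*self.copies)

    def upset(self, copy: int, bit: int) -> None:
        """Flip one bit in one copy to emulate an SEU (bit < width assumed)."""
        self.copies[copy] ^= 1 << bit

    def scrub(self) -> bool:
        """Detect a mismatch and rewrite all copies with the voted value."""
        voted = self.read()
        error = any(c != voted for c in self.copies)
        if error:
            self.copies = [voted] * 3
            self.corrections += 1
        return error


reg = TmrRegister(0xA5)
reg.upset(copy=1, bit=0)
print(hex(reg.read()), reg.scrub(), reg.corrections)
```

A single upset never changes the voted value, and the scrub step removes it before a second upset in another copy could accumulate.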

#### Glitch filters

- Reset inputs
- TEST-MODE signal
- Scan-chain TEST POINTS control (on the control of the system clock / test clock selection multiplexers)

### Single Event Effects tolerance

#### Physical implementation

- Use of instance space groups among triplicated registers
- Avoid logic simplification by synthesis and P&R flow
- Spacing for clock and reset buffers in all periphery logic

![](_page_25_Figure_6.jpeg)

#### Functional simulation

- SystemVerilog UVC to randomize the injection (constrained by the specific test case)
- The randomization is constrained by parameters such as error probability, average SEE rate and minimum time separation
- Injection of single event effects in multiple ASICs at the same time, to evaluate the consequences that an SEE in one ASIC has on the other ASICs of the chipset
- Possibility to focus the SEU injection on a particular module or subsystem and evaluate the effect at system level
- Possibility to inject SEUs in hundreds of cells per clock cycle (registers grouped in non-interacting categories)
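The constrained randomization described above can be illustrated with a small Python model. It is not the SystemVerilog UVC used in the framework; the function name, the parameters and the target paths are hypothetical, and only two of the listed constraints (per-cycle error probability and minimum time separation) are modelled.

```python
import random
from typing import List, Optional, Tuple


def seu_schedule(n_cycles: int, targets: List[str], p_inject: float,
                 min_gap: int, seed: Optional[int] = None) -> List[Tuple[int, str]]:
    """Return (cycle, target) pairs for SEU injection.

    p_inject is the per-cycle injection probability and min_gap the
    minimum number of cycles between two consecutive injections.
    """
    rng = random.Random(seed)              # seeded for reproducible tests
    events: List[Tuple[int, str]] = []
    last = -min_gap                        # allow an injection at cycle 0
    for cycle in range(n_cycles):
        if cycle - last >= min_gap and rng.random() < p_inject:
            events.append((cycle, rng.choice(targets)))
            last = cycle
    return events


# Hypothetical hierarchical paths of the cells to upset.
cells = ["ssa.periphery.fsm_state", "mpa.cfg.reg_12", "cic.fifo.header"]
for cycle, cell in seu_schedule(2000, cells, p_inject=0.01, min_gap=50, seed=1):
    print(cycle, cell)
```

Restricting the target list to the cells of one module gives the focused injection mentioned above, while a fixed seed makes a failing test case repeatable.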

#### Additional checks

- Script to verify that no triplicated instance is optimized out
- Script to verify placement constraints after chip assembly
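A check that no triplicated instance has been optimized out could look like the sketch below. It assumes, purely for illustration, that the three copies of a triplicated cell carry the suffixes `_A`, `_B` and `_C` in the netlist; the actual naming convention and scripts are not given in the slides.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

SUFFIXES = ("_A", "_B", "_C")   # assumed naming convention for TMR copies


def missing_tmr_copies(instances: Iterable[str]) -> Dict[str, List[str]]:
    """Map each incompletely triplicated base name to its missing suffixes."""
    found = defaultdict(set)
    for name in instances:
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                found[name[:-len(suffix)]].add(suffix)
                break
    return {base: [s for s in SUFFIXES if s not in present]
            for base, present in found.items()
            if len(present) < len(SUFFIXES)}


# Hypothetical instance names extracted from a netlist.
netlist = ["fsm/state_reg_A", "fsm/state_reg_B", "fsm/state_reg_C",
           "cfg/reg0_A", "cfg/reg0_C", "pipe/data_reg"]
print(missing_tmr_copies(netlist))
```

Here the non-triplicated pipeline register is ignored and the configuration register missing its `_B` copy is reported.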

### Physical Implementation flow

![](_page_26_Figure_2.jpeg)

- Digital-on-top design flow
- Hierarchical implementation
- Multi supply voltage: 1.0 V ± 10% and 1.2 V ± 10%
- 3 independent power and ground domains to reduce noise coupling with guard-ring isolation
- Multi-Vt design (Low-Vt used only in critical timing arcs)
- C4 bump floorplan + wirebond for wafer probing
- Complex CTS and timing closure due to triple clock tree balancing and SEU hardening
- Constraints for TMR and digital cells placement
- Skew balancing among triplicated and voted clock trees
- Strip cell sampling clock guarantees <200ps skew in all corners
- Non-default CTS rules to mitigate cross-coupling
- QRC extracted information already at the optimization stage due to design size

### The ASICs

![](_page_27_Picture_2.jpeg)

![](_page_27_Picture_3.jpeg)

#### CIC2 ASIC

Data concentrator ASIC for the CMS OT

DESIGN AND TESTING TEAM: L. Caponetto, G. Galbit, S. Scarfi, B. Nodari, S. Viret

With contributions from: A. Caratelli, D. Ceresa
### MPA-SSA-CIC Timeline

![](_page_28_Figure_2.jpeg)

### ASICs testing

- The SSA, the MPA and the CIC were produced in a full mask-set engineering run
- The first 6 wafers have been tested at CERN by the ASICs designers
- Test routine includes:
  - Scan-chain test for production defects
  - Functional test of digital circuits
  - Analog bias parameter characterization
  - Front-end characterization
  - Noise analysis
  - Serial ID and trimming in e-fuses
- The wafers have been diced and the chips bonded on carrier boards for radiation tests and detailed characterization

![](_page_29_Picture_12.jpeg)

### SSA test results

![](_page_30_Figure_2.jpeg)

#### SSA Threshold trimming procedure

![](_page_31_Figure_3.jpeg)

Trimming performed at 2.0 fC. Threshold spread evaluated at 2.0 fC and 1.25 fC.

#### SSA Threshold distribution after trimming

![](_page_31_Figure_6.jpeg)

#### SSA Analog Front-End Noise vs Temperature

Plot based on 3 tested chips in climatic chamber

#### SSA Front-end noise measurements

![](_page_32_Figure_5.jpeg)

Distribution based on 10 000 tested chips

Channel input noise evaluated as the standard deviation of the error function fitted to the S-curves (1.25 fC and 2.0 fC)
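The relation between an S-curve and the noise can be illustrated with a short sketch. Instead of the error-function fit used for the measurements, the code below takes a simpler moment-based shortcut: the derivative of an ideal S-curve is a Gaussian, so its mean and standard deviation estimate the threshold and the noise. The function names and the synthetic input values are assumptions made for the example.

```python
import math
from typing import List, Tuple


def s_curve(threshold: float, mu: float, sigma: float) -> float:
    """Ideal hit occupancy versus threshold for Gaussian noise."""
    return 0.5 * (1.0 - math.erf((threshold - mu) / (math.sqrt(2.0) * sigma)))


def noise_from_s_curve(thresholds: List[float],
                       occupancy: List[float]) -> Tuple[float, float]:
    """Estimate (mean, sigma) from the discrete derivative of an S-curve."""
    mids = []
    weights = []
    for i in range(len(thresholds) - 1):
        mids.append(0.5 * (thresholds[i] + thresholds[i + 1]))
        weights.append(occupancy[i] - occupancy[i + 1])   # occupancy drop per step
    total = sum(weights)
    mean = sum(w * m for w, m in zip(weights, mids)) / total
    var = sum(w * (m - mean) ** 2 for w, m in zip(weights, mids)) / total
    return mean, math.sqrt(var)


# Synthetic scan in threshold DAC units (illustrative values only).
scan = list(range(0, 101))
occ = [s_curve(t, mu=50.0, sigma=3.0) for t in scan]
print(noise_from_s_curve(scan, occ))
```

On real data with statistical fluctuations a least-squares error-function fit, as quoted above, is more robust than this shortcut, since noisy points can make individual weights negative.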

### Temperature Characterization summary

[Figure residue: temperature axis from -40 °C to 40 °C in 10 °C steps; 1.2 V supply label]

- No errors or timing issues observed on digital logic

- No errors or issues observed in analog FE
- Bias structures variation within compensating range
- FE noise change within expectation

- Full set of digital functionalities tests
- Tests of memories (with BIST) and configuration
- Characterization of all bias parameters
- S-Curve for FE Gain, Noise and Trimming
- ADC, e-fuses, voltage sweeps and several others

![](_page_33_Figure_12.jpeg)

#### Memory Built-In-Self-Test

- Test full memory functionality in <1 ms
- Results saved in internal registers accessible via slow-control
- Clock gating during normal operation (only leakage power)

![](_page_34_Figure_6.jpeg)

#### 10<sup>5</sup> write/read operations per point

#### Scan Chain

- 92% of fault coverage in SSA ASIC
- Custom approach for triplicated design
- SHIFT, RESET and CAPTURE tests
- A total of ~950 test vectors required
- Full test duration < 300 ms
- Scan-chain in SSA operates correctly up to 20MHz

![](_page_34_Figure_15.jpeg)

#### SSA $\rightarrow$ MPA Communication

- No phase aligner at MPA input due to power restrictions
- The communication relies on precise timing design
- SSA-MPA communication timing was verified in static timing analysis and simulated post-layout in all cross-corner combinations (UVM verification environment)

![](_page_35_Picture_6.jpeg)

![](_page_35_Figure_7.jpeg)

### Total Ionizing Dose characterization

#### X-ray TID Characterization summary

- 8 chips have been irradiated up to 200 Mrad
- No errors or timing issues observed on digital logic

![](_page_36_Figure_5.jpeg)

- Bias structures variation within compensating range
- FE noise change within expectation
- ADC reference voltage variation larger than expected:
  - The target reference voltage had to be changed to keep the DAC output stable up to 200 Mrad.

#### **TID Test routine:**

- Full set of digital functionalities tests
- Tests of memories (with BIST) and configuration
- Characterization of all bias parameters
- S-Curve for FE Gain, Noise and Trimming
- ADC, e-fuses, voltage sweeps and several others

![](_page_36_Figure_16.jpeg)

SSA Front-End equivalent noise evolution with TID and temperature

SSA 2.1 average FE noise\* vs TID at -10°C

SSA 2.1 average FE noise\* vs TID at +20°C

![](_page_37_Figure_4.jpeg)

\* FE noise evaluated on the S-Curves – 2 fC internal charge injection – Sensor inputs floating

### ADC reference voltage variation with TID (5-bit DAC for corner compensation)

SSA2.1

![](_page_38_Figure_3.jpeg)

SSA2 MPW prototype

![](_page_38_Figure_4.jpeg)

ADC REF voltage (DAC out)

![](_page_38_Figure_5.jpeg)

#### Calibration value to compensate

**Ring oscillators** 

![](_page_38_Figure_8.jpeg)

### SSA Single-Event Effect tests with heavy ions

SEE testing carried out at UCL in Louvain-la-Neuve, Belgium

- No hard errors observed
  - No loss of control observed
  - No loss of synchronisation observed
  - No chip locks or control errors in general

#### Configuration system error-free

- Verified by readout and comparison of full chip configuration at each test iteration (30 seconds)
- SEU correction counter monitoring

![](_page_39_Picture_10.jpeg)

![](_page_39_Picture_11.jpeg)

### SSA Single-Event Effect tests with heavy ions

#### Stub and L1 data SEE cross-section:

![](_page_40_Figure_3.jpeg)

#### Bit error rate estimation

#### (based on OT fluxes from FLUKA simulation)

![](_page_40_Figure_6.jpeg)
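As a first-order illustration of how a measured cross-section and a simulated flux combine into an error rate (a generic estimate, not necessarily the exact procedure behind the plot; the symbols are introduced here for illustration only):

$$
R_{\mathrm{err}} \approx \sigma_{\mathrm{SEE}} \cdot \Phi
$$

where $\sigma_{\mathrm{SEE}}$ is the measured SEE cross-section of the chip in cm$^2$, $\Phi$ is the particle flux at the module location in cm$^{-2}$ s$^{-1}$, and $R_{\mathrm{err}}$ is the resulting number of errors per second and per chip.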

### SSA Wafer Probing - process and analog performance

#### **Ring oscillators Frequency**

![](_page_41_Figure_3.jpeg)

 The SSA includes different types of ring oscillator to monitor variations in: Process – Temperature – Total Ionizing Dose

#### FE Noise Performance Tests

#### FE Threshold Trimming

![](_page_41_Figure_7.jpeg)

- Map of the average FE noise
- Cut criteria noise < 1.7 LSB

![](_page_41_Figure_10.jpeg)

- Map of the threshold spread after the trimming procedure
- Cut criteria std(Th) < 0.5 LSB
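The two analog cut criteria above translate directly into a pass/fail classification. The sketch below is a minimal illustration: the cut values come from the slide, while the function names and the per-chip input values are hypothetical.

```python
from typing import Iterable, Tuple

NOISE_CUT_LSB = 1.7              # average FE noise must be below this value
THRESHOLD_SPREAD_CUT_LSB = 0.5   # std(Th) after trimming must be below this value


def passes_analog_cuts(avg_noise_lsb: float, threshold_std_lsb: float) -> bool:
    """Apply both wafer-probing cuts to one chip."""
    return (avg_noise_lsb < NOISE_CUT_LSB
            and threshold_std_lsb < THRESHOLD_SPREAD_CUT_LSB)


def analog_yield(chips: Iterable[Tuple[float, float]]) -> float:
    """Fraction of chips passing both cuts; chips holds (noise, std) pairs."""
    results = [passes_analog_cuts(noise, spread) for noise, spread in chips]
    return sum(results) / len(results)


# Hypothetical measurements for four chips, in LSB.
print(analog_yield([(1.2, 0.3), (1.8, 0.3), (1.5, 0.6), (1.0, 0.4)]))
```

The overall yield additionally folds in the digital tests, which are not modelled here.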

### SSA Wafer Probing - yield

Digital Tests summary map

![](_page_42_Figure_3.jpeg)

- Stub data [0.9V, 1.0V, 1.1 V]
- L1 data [0.9V, 1.0V, 1.1 V]
- Memory BIST [0.8V, 1.0V, 1.2 V]
- Configuration and all other digital functionalities

![](_page_42_Figure_8.jpeg)

Total yield map

![](_page_42_Figure_10.jpeg)

- Analog bias calibration
- FE functionality
- FE Threshold trimming
- Noise analysis

#### Overall yield (all tests) > 97%

### MPA Wafer Probing - yield

![](_page_43_Figure_2.jpeg)

#### Summary

- System-level studies made it possible to define the architecture of the PS-module ASICs
- After testing the prototypes, the final versions of the ASICs (MPA2, SSA2 and CIC2) were submitted for production in a full mask-set engineering run
- The tests on the final version of the chips show results in agreement with expectations:
  - Front-end performance fulfils specifications
  - X-ray TID tests confirm radiation hardness up to 200 Mrad
  - Heavy-ion tests confirm the functionality of the chosen hardening strategy
  - Climatic chamber tests show parameter variations within the calibration range
- Wafer-level testing shows a high yield, which allowed us to move on to ordering the production wafers and defining the automated production test procedure

### **Production plans**

- Testing for preproduction (1st lot 25 MPA wafers + 25 SSA-CIC wafers):
  - First 11 MPA tested by the ASICs designers for MAPSA preproduction
  - First 8 SSA-CIC wafers tested by the ASICs designers for module preproduction
- Testing for production: ~900 MPA wafers (300 already delivered) + ~200 SSA-CIC wafers (100 already delivered)
  - Wafer testing for production will be performed at Rood Microtec
  - Test procedure already defined by the ASICs designers team
  - Probe card currently being designed by Rood Microtec (production: 4-6 weeks estimated TAT)
  - Bumping will be performed at Winstek
  - Aiming for test system debugging in July 2023 and first 25 wafers tested by August 2023

![](_page_46_Picture_0.jpeg)

### **CIC ASIC**

DESIGNED AND TESTED BY: L. Caponetto, G. Galbit, B. Nodari, S. Viret (IP2I Lyon university)

With contributions from A. Caratelli, D. Ceresa, S. Scarfì

### SSA ASIC

#### DESIGNED AND TESTED BY:

A. Caratelli, G. Bergamin, D. Ceresa, J. Kaplon, K. Kloukinas, S. Scarfì

![](_page_46_Picture_7.jpeg)

### **MPA ASIC**

#### **DESIGNED AND TESTED BY:**

D. Ceresa, G. Bergamin, A. Caratelli, J. Kaplon, K. Kloukinas, A. Nookala, S. Scarfì (CERN EP-ESE)

![](_page_46_Picture_11.jpeg)

![](_page_46_Picture_12.jpeg)