#### Part II

## Intelligent Trackers for Triggers

INFN Pisa

III Seminario Nazionale Rivelatori Innovativi Florence, 4-8 June 2012

### **Correlation electronics**

### **CBC2** Architecture



#### blocks associated with Pt stub generation

- channel mask: block noisy channels (but not from pipeline)
- cluster width discrimination: exclude wide clusters
- offset correction and correlation: correct for phi offset across module and correlate between layers
- stub shift register: test feature, shifts out the result of the correlation operation at 40 MHz
- fast OR at comparator O/P and correlation O/P: either can be selected to transmit off-chip; for normal operation choose correlation O/P
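The chain of blocks listed above (mask, cluster width discrimination, offset correction, correlation) can be sketched in software as follows. This is only an illustrative model, not the CBC2 logic itself: the function names, the default cluster width limit of 3 strips, the coincidence window and the use of half-strip cluster centres are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple

def clusters(hits: Iterable[int]) -> List[Tuple[int, int]]:
    """Group strip numbers into (first_strip, width) clusters of adjacent strips."""
    out: List[Tuple[int, int]] = []
    for s in sorted(set(hits)):
        if out and s == out[-1][0] + out[-1][1]:
            out[-1] = (out[-1][0], out[-1][1] + 1)   # extend the current cluster
        else:
            out.append((s, 1))                       # start a new cluster
    return out

def find_stubs(inner_hits, outer_hits, masked=frozenset(),
               max_width=3, offset=0.0, window=2.0):
    """Return (inner_centre, outer_centre) pairs that pass the correlation.

    All parameter values are hypothetical defaults chosen for illustration.
    """
    def centres(hits):
        good = [h for h in hits if h not in masked]          # channel mask
        return [first + (width - 1) / 2.0                    # cluster centre
                for first, width in clusters(good)
                if width <= max_width]                       # cluster width discrimination
    stubs = []
    for ci in centres(inner_hits):
        for co in centres(outer_hits):
            # offset correction, then coincidence window between the two layers
            if abs(co - (ci + offset)) <= window:
                stubs.append((ci, co))
    return stubs

if __name__ == "__main__":
    print(find_stubs([10, 11, 20, 21, 22, 23, 24, 40], [12, 22, 41], masked={40}))
```

In this toy example the five-strip cluster is dropped by the width cut and strip 40 by the mask, so only the narrow cluster near strip 10 can form a stub.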

### neighbour chip signals - CWD O/Ps



### CBC2 Flip Chip ASIC



Thursday, June 7, 2012

## Macro pixel ASIC floorplan



#### MPixel ASIC size:

- ~ 12 x 24 mm<sup>2</sup>
- Pixel size: ~ 100 μm x 1500 μm
- # pixels: 128 x 16 = 2048
- Readout on one edge only

Periphery (all functionalities described below have to be packed here)

### Coincidence and data handling in the pixel ASIC



### 2S-pT Module: Hybrid Topology



## Rigid substrates

Build-up substrates are commonly used for chip packaging.

Core layer provides:

- Power/Ground planes
- Rigid core material.
- Mid density routing and through hole vias.

Build-up layers are laminated on top and bottom of core:

- Very high density interconnections on constrained areas.
- Microvias to connect build up layers to core external layers.
- No through hole vias.



6 layers







### Build-up substrates applied to CMS Tracker modules





Rigid, organic build up substrates offer a standard baseline construction for the 2S and PS modules.

The routability has been confirmed using a 1-4-1 build up structure.

Non negligible mass, but power distribution is adequate to feed the ASICs.

Mechanical integration to be studied: glueing on cooling structure, interconnection with the service board, flatness for wirebonding and bump bonding, wirebonding through groove for bottom sensor.

### Flexible substrates

#### • Flexible polyimide is a quickly emerging technology.

Thin film flex technology made of spin-coated liquid polyimide on square panels.

- Very high density layouts: Tracks w/s = 20  $\mu$ m, microvias = 30  $\mu$ m.
- Silicon matching CTE = 3 ppm/K.
- Very low mass: Cu thickness < 7  $\mu$ m, film thickness ≈ 10  $\mu$ m.
- However: 4 layers maximum, no copper on base layer, limited power delivery capabilities.



- Fabrication of multilayer structure on rigid carrier substrate
- Assembling, bonding, protection, test
- Separation of multilayer from rigid substrate; reuse of carrier






### Flexible substrates

#### • Packaging industry is adopting this technology for large volume and integration.

Several suppliers are today available for panelized flex films.

- They all provide very high density, small microvias, thin foils on limited number of layers.
- Flip chip compatible, wirebonding compatibility to be evaluated.

Trend is now to use this technique for:

- Roll to roll lamination of flex circuits for very large volume productions.
- Embedding of dies into multilayer system in package overmolded structures.



IMAPS MINAPAD Forum Grenoble, April 2012.





Body Thickness = 240µm (4 metal layers)



### Flexible substrates for the CMS tracker modules






### Multi Chip Module-Deposited (MCM-D) to build FE directly on the silicon sensor

- Traditional silicon module build (electrical parts)
  - Sensors, flex circuit, substrate, pitch adaptor, wire bonds, FE-chips, passive components



- MCM-D
  - Deposit dielectric and metal layers directly on the silicon sensor
  - Layout concepts similar to PCBs
  - All-in-one: sensor, hybrid, pitch-adaptor and strip connections

### Module read out architectures

### Option 1

### Option 2



Separate inputs for "trigger" and "readout"

### Some simulation results

## CMS - traditional geom.

The "Barrel-EndCap" design comprises 6 barrel layers and 7 endcap disks composed of rings.



The inner part (1) is populated by Pixel-Strip stacked (PS) modules. The outer part (2) is populated by Strip stacked (2S) modules. The number of endcap disks is optimized for tracking performance.

Different spacings between the two sensors of the Pt modules:

- 0.8 mm in the outer barrel (2S)
- 1.6 and 2.6 mm in the inner barrel (PS)
- 4.0, 2.6 and 1.2 mm in the outer end-cap (2S)
- 4.0 mm in the inner end-cap (PS)

L1 tracking precision potential:

- pT resolution 4% @ 10 GeV in forward

Tracking precision:

- pT resolution 1.4% @ 10 GeV
- pT resolution 3% @ 100 GeV
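Why the sensor spacing changes with position can be illustrated with the standard helix geometry. For a track coming from the beam line, the bend angle at radius R satisfies sin α = R / (2ρ), with ρ [m] = pT [GeV] / (0.3 B [T]), so the hit displacement between two sensors a distance d apart is about d·tan α. The sketch below assumes a barrel module and a 3.8 T field; the function name and default values are illustrative and not taken from the slides.

```python
import math

def stub_displacement_mm(pt_gev: float, radius_cm: float,
                         spacing_mm: float, b_tesla: float = 3.8) -> float:
    """Transverse displacement between the two sensor hits of a barrel pT module.

    Assumes a track from the beam line in a uniform solenoid field
    (3.8 T is an assumed value, not given in the slides).
    """
    rho_m = pt_gev / (0.3 * b_tesla)                 # radius of curvature in metres
    sin_alpha = (radius_cm / 100.0) / (2.0 * rho_m)  # bend angle at radius R
    if sin_alpha >= 1.0:
        raise ValueError("track does not reach this radius")
    return spacing_mm * math.tan(math.asin(sin_alpha))

if __name__ == "__main__":
    for pt in (2.0, 5.0, 10.0):
        print(pt, "GeV ->", round(stub_displacement_mm(pt, 102.0, 0.8), 3), "mm")
```

The displacement scales roughly as d·R/pT, which is why a larger spacing is useful at small radius to obtain a comparable pT threshold for a given strip pitch.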

### Data reduction



CW<3 stri

### Stub pt Measurement



- R 51 cm - mean $\sqrt{(1/p_T)}$: ~0.076
- R 82 cm - mean $\sqrt{(1/p_T)}$: ~0.073
- R 102 cm - mean $\sqrt{(1/p_T)}$: ~0.069

## Long Barrel layout (CMS)



### Geometry Comparison



### Example: match ECAL+stub info

Matching between stub and projected electron trajectory:



### ATLAS

#### **Pixel + Strip Sensor Layers**

Long Strips (Δz=10cm)

Short Strips (Δz=2.5cm)

Pixel (not used)



Layer combinations studied for track trigger:

- #0, #1, #2 (only short strips)
- #3, #4 (only long strips)
- #2, #3, #4 (mixed, outer layers)

### ATLAS layout



## Fast clustering in ATLAS

#### Communication between the two sides



#### **Rejection as Function of p<sub>T</sub> Threshold**



- most tracks (at low $p_T$) are rejected already with a low $p_T$ threshold
- rejection power higher if cluster size and offset cut are combined
- rejection power <u>affected by high pileup</u>

#### e, $\mu$ , $\pi^{\pm}$ Rejection (single particle)



### Performance in ATLAS



Figure: ATLAS Simulation panels; x-axis: pt [GeV]

#### offset cut only

| Layer | # hits (layer) | # hits (SS 3 accept., 27-153 degrees) | # hits (LS 2 accept., 40-140 degrees) |
|-------|----------------|---------------------------------------|---------------------------------------|
| SS 1  | 6.4%           | 4.3%                                  | 2.8%                                  |
| SS 2  | 5.5%           | 4.7%                                  | 2.9%                                  |
| SS 3  | 5.1%           | 5.1%                                  | 3.4%                                  |
| LS 1  | 8.0%           | 8.0%                                  | 6.2%                                  |
| LS 2  | 6.5%           | 6.5%                                  | 6.5%                                  |

<u>Hardware implementation:</u> number of patterns O(10<sup>10</sup>) → talk S. Schmitt (WIT2010)

Reduction factors of:

- 15-30 on short strip layers
- ~15 on long strip layers
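As a cross-check of the quoted reduction factors, one can take the reciprocal of the hit fractions in the table above. The sketch assumes that each percentage is the fraction of hits surviving the offset cut, which is an interpretation of the table and not stated explicitly on the slide.

```python
# Surviving hit fractions per layer, copied from the table above
# (columns: full layer, SS 3 acceptance, LS 2 acceptance).
surviving = {
    "SS 1": (0.064, 0.043, 0.028),
    "SS 2": (0.055, 0.047, 0.029),
    "SS 3": (0.051, 0.051, 0.034),
    "LS 1": (0.080, 0.080, 0.062),
    "LS 2": (0.065, 0.065, 0.065),
}

def reduction_factors(fractions):
    """Reduction factor = 1 / surviving fraction (assumed interpretation)."""
    return tuple(1.0 / f for f in fractions)

if __name__ == "__main__":
    for layer, fr in surviving.items():
        print(layer, [round(x, 1) for x in reduction_factors(fr)])
```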

### Pattern recognition

#### Local FPGA vs AM



### Long Barrel option in CMS

#### The "long-barrel" double-stack layout







Self-contained  $\phi$  sectors. Each sector needs to be combined with the two neighbouring sectors (left and right) to "contain" ~2.5 GeV tracks.



## Off detector processing


- The local design minimizes data transfer and interconnection complexity.
- Input FPGA finds tracklets within rod N by comparing stub phis within a window defined by the beam spot
- Check that DZ is consistent with IP
- Project to other layers using a table look-up
- Move tracklet N information to destination rod M
- Compare tracklets N projection to tracklets and stubs in rod M to form track candidates
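The tracklet steps listed above can be sketched as follows. This is a simplified software model under stated assumptions: the `Stub` class, the window sizes and the linear extrapolation in r (used here in place of the table look-up mentioned on the slide) are all hypothetical choices for the example.

```python
import math
from dataclasses import dataclass

@dataclass
class Stub:
    r: float    # radius in cm
    phi: float  # azimuth in rad
    z: float    # position along the beam in cm

def find_tracklets(inner, outer, phi_window=0.01, z0_max=15.0):
    """Pair stubs from the two layers of a rod into tracklets.

    phi_window and z0_max are illustrative values, not taken from the slides.
    """
    tracklets = []
    for a in inner:
        for b in outer:
            # wrap the phi difference into [-pi, pi)
            dphi = (b.phi - a.phi + math.pi) % (2.0 * math.pi) - math.pi
            if abs(dphi) > phi_window:
                continue
            dzdr = (b.z - a.z) / (b.r - a.r)
            z0 = a.z - dzdr * a.r           # straight-line extrapolation to r = 0
            if abs(z0) > z0_max:            # must be consistent with the IP
                continue
            tracklets.append((a, b, z0, dzdr, dphi / (b.r - a.r)))
    return tracklets

def project(tracklet, r_target):
    """Linear projection of a tracklet to another radius (approximation)."""
    a, _b, z0, dzdr, dphidr = tracklet
    return a.phi + dphidr * (r_target - a.r), z0 + dzdr * r_target

if __name__ == "__main__":
    t = find_tracklets([Stub(100.0, 0.100, 20.0)], [Stub(104.0, 0.102, 20.8)])
    print(project(t[0], 110.0))
```

The projected (phi, z) would then be compared with the tracklets and stubs of the destination rod to form track candidates.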

Thursday, May 24, 12






| p<sub>T</sub> (GeV) | σ(p<sub>T</sub>)/p<sub>T</sub> | σ(z)   |
|---------------------|--------------------------------|--------|
| 3                   | ~1%                            | 1.5 mm |
| 10                  | ~1.5%                          | 0.6 mm |
| 30                  | ~2.5%                          | 0.5 mm |

### First latency estimate ~2µs

## Pattern matching with AM

The pattern bank is a set of pre-calculated patterns

Second accommodate for alignment

- Section of the sector conditions of the sector
- Seam displacements

The Associative Memory holds different pattern banks and compares them with the current event pattern
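The comparison of a pattern bank with the current event can be modelled in a few lines. In the hardware all patterns are compared in parallel as the hits arrive; the loop below is only a sequential software stand-in, and the function name and the optional majority threshold `min_layers` are assumptions made for the example.

```python
def match_patterns(bank, event_hits, min_layers=None):
    """Return the indices of bank patterns matched by the event.

    bank:        list of tuples, one coarse hit address per detector layer
    event_hits:  list of sets, the addresses fired in each layer
    min_layers:  hypothetical majority threshold; defaults to all layers
    """
    need = len(event_hits) if min_layers is None else min_layers
    matched = []
    for index, pattern in enumerate(bank):
        n = sum(1 for layer, address in enumerate(pattern)
                if address in event_hits[layer])
        if n >= need:
            matched.append(index)
    return matched

if __name__ == "__main__":
    bank = [(1, 2, 3), (1, 5, 3), (7, 8, 9)]
    hits = [{1, 7}, {2, 8}, {3}]
    print(match_patterns(bank, hits))
    print(match_patterns(bank, hits, min_layers=2))
```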





### Too large AM? 2 step approach

 Find low resolution track candidates called "roads".
 Solve most of the pattern recognition



Super Bin (SB)

Then fit tracks inside roads.
 Thanks to 1<sup>st</sup> step it is much easier



If finer resolution is wanted (probably not for the trigger), other functions are needed: Hit Buffer + Track Fitter + Hit Finder
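A toy version of the two-step approach: hits are first reduced to Super Bin addresses to find roads, then the full-resolution hits inside a road are fitted. The straight-line least-squares fit, the bin width and all names are assumptions for illustration; a real track fit would use the helix parameters.

```python
def to_super_bin(x, bin_width):
    """Coarse address of a hit: index of the Super Bin containing x."""
    return int(x // bin_width)

def find_roads(bank, hits_per_layer, bin_width):
    """Step 1: low-resolution matching of Super Bin patterns (roads)."""
    fired = [{to_super_bin(x, bin_width) for x in layer} for layer in hits_per_layer]
    return [p for p in bank if all(sb in fired[i] for i, sb in enumerate(p))]

def fit_in_road(road, hits_per_layer, layer_pos, bin_width):
    """Step 2: straight-line fit x = a + b * pos using the hits inside the road.

    For simplicity the first hit found in each Super Bin is used.
    """
    pts = []
    for i, sb in enumerate(road):
        inside = [x for x in hits_per_layer[i] if to_super_bin(x, bin_width) == sb]
        pts.append((layer_pos[i], inside[0]))
    n = len(pts)
    sx = sum(p for p, _ in pts)
    sy = sum(x for _, x in pts)
    sxx = sum(p * p for p, _ in pts)
    sxy = sum(p * x for p, x in pts)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

if __name__ == "__main__":
    hits = [[1.2, 7.9], [2.4], [3.6]]
    roads = find_roads([(1, 2, 3), (7, 8, 9)], hits, bin_width=1.0)
    print(roads, fit_in_road(roads[0], hits, [1.0, 2.0, 3.0], 1.0))
```

Because the road already selects one small region per layer, the fit only has to deal with the few hit combinations inside it.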

### Associative Memory for pattern matching



### Anatomy of a PRAM (Pattern Recognition Associative Memory)



Trace length → capacitance → power consumption or reduced speed.

More detector layers, or more bits involved: design more spread out in 2D → lower pattern density, higher power consumption ...

### Generating the pattern bank



### Increasing the pattern density

### AMCHIP04: VARIABLE RESOLUTION



### Associative memories evolution

- Long history
  - 1990: Full custom VLSI chip 0.7 μm (INFN-Pisa), 128 patterns/chip: high pattern density, not easy design
  - FPGA approach 1998: easier design but lower density
  - A good compromise is the standard cell approach used for the SVT CDF upgrade: J. Adelman et al., Nuclear Science Symposium, 2005 IEEE, vol. 1, 2005, p. 603.
    - 0.18µm (INFN-Pisa), 5000 patterns/chip, 6 buses input lines, 50 MHz/bus, 18 bits/bus
      - produced by UMC (Taiwan) design time ~8 months + 2 months production
  - All in all: allow to reach ~30K patterns/chip with 200 MHz/bus speed

## AM03 chip details



Fig. 2. Micrograph of the AMchip03 device. Four manually optimized columns of 1280 patterns each are visible. One on the left, one on the right and two in the middle. The two columns of lower density logic correspond to the interconnection and readout logic that was automatically placed. (Color version available online at http://ieeexplore.ieee.org.)




## AM04 chip

|                | AMchip03             | AMchip04             | effect                                               |
|----------------|----------------------|----------------------|------------------------------------------------------|
| Technology     | 180nm                | 65nm LP              | X8 pattern density                                   |
| Clock freq.    | 50MHz                | 100MHz               | Faster, higher power cons.                           |
| Die size       | 10x10mm<sup>2</sup>  | 12x12mm<sup>2</sup>  | X1.5 patterns (prototype 3.5x4mm<sup>2</sup>)        |
| Core voltage   | 1.8V                 | 1.2V                 | Lower power cons.                                    |
| Core power     | 1.3W                 | 2W                   | At 40MHz and 100MHz respec.                          |
| Full custom    | No                   | Yes                  | X2 pattern density                                   |
| Layers         | 6                    | 8                    | ¾ pattern density                                    |
| Patterns/chip  | 5k                   | 80k                  | 8k in prototype                                      |
| Ternary layers | N/A                  | 3 to 6               | Better S/N with variable resolution                  |
| Bits/layer     | 18                   | 15                   |                                                      |
| Input hit b/w  | 4.3                  | 12                   | Gbit/s                                               |
| R              | redking              | 2 event buf.         | readout 1<sup>st</sup>, load 2<sup>nd</sup> event    |

### Evolution of the AM

- **65 nm** technology provides a factor 8 → 20000 patterns/chip
- Full custom cell provides at least a factor 2 → 40000 patterns/chip
- 8 layers instead of 12 provides a factor 1.5 → 60000 patterns/chip
- 1.2 x 1.2 cm<sup>2</sup> 2D chip → 80000 patterns/chip

With a **2D chip** we gain a factor **30**!

- 1 AMboard: 128 chips → ~10 Mpatterns per board
- 1 Crate: 16 AMboards → ~160 Mpatterns per crate
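The scaling chain and the board and crate capacities can be checked with simple arithmetic. The baseline of 2500 patterns/chip is an assumption inferred from the first step (20000 / 8), and the die-area factor is taken as (1.2 cm)² / (1.0 cm)² from the AMchip03 and AMchip04 die sizes; neither number is stated explicitly on the slide.

```python
BASELINE_PATTERNS = 2500   # assumed: implied by 20000 patterns after the factor 8

SCALING_STEPS = [
    ("65 nm technology", 8.0),
    ("full custom cell", 2.0),
    ("8 layers instead of 12", 1.5),
    ("1.2 x 1.2 cm2 die", (1.2 * 1.2) / (1.0 * 1.0)),   # assumed area ratio
]

def scale_patterns(baseline=BASELINE_PATTERNS, steps=SCALING_STEPS):
    """Return the running number of patterns per chip after each step."""
    running, history = float(baseline), []
    for name, factor in steps:
        running *= factor
        history.append((name, running))
    return history

def capacity(patterns_per_chip=80000, chips_per_board=128, boards_per_crate=16):
    """Patterns per AM board and per crate for the rounded chip capacity."""
    per_board = patterns_per_chip * chips_per_board
    return per_board, per_board * boards_per_crate

if __name__ == "__main__":
    for name, n in scale_patterns():
        print(f"{name}: {n:.0f} patterns/chip")
    print(capacity())
```

With these inputs the last step gives about 86000 patterns/chip, which the slide rounds to 80000; the overall gain is then of order 30.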

Current prototype under design:

- 65nm TSMC, 12mm<sup>2</sup> MPW run, 100 MHz running clock
- 8000 patterns/chip, 8 layers each
- Layer words of 12 bits + 3 ternary bits → variable resolution patterns
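The idea behind ternary bits can be illustrated with a masked comparison: a stored bit is either 0, 1 or "don't care", and don't-care bits make a pattern wider in that layer. The encoding below (a value word plus a don't-care mask) and the 15-bit width are assumptions for illustration; the actual AMchip04 cell encoding is not described on the slide.

```python
def ternary_match(hit: int, value: int, dont_care_mask: int, width: int = 15) -> bool:
    """True if hit equals value on every bit that is not marked don't-care."""
    full = (1 << width) - 1
    care = full & ~dont_care_mask
    return ((hit ^ value) & care) == 0

def pattern_width(dont_care_mask: int) -> int:
    """Number of distinct hit addresses accepted by one ternary layer word."""
    return 1 << bin(dont_care_mask).count("1")

if __name__ == "__main__":
    # The two lowest bits are don't-care: the pattern accepts 4 adjacent addresses.
    print(ternary_match(0b1011, 0b1000, 0b0011))
    print(ternary_match(0b1111, 0b1000, 0b0011))
    print(pattern_width(0b0011))
```

Setting don't-care bits only where the track density allows it lets a single stored pattern cover a variable-size region, which reduces the bank size for a given efficiency.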

## Usage in ATLAS @L1.5



### ATLAS FTK



- Designed as a highly parallel system to deal with the data flow
  - 8 'core crates', each with its own pattern recognition and track fitter
  - Detector subdivided in 64 trigger towers
- PIX (3 layers) & SCT (4 double layers)
- The fit poses a combinatorics problem, so it is executed in two sequential steps:
  - Use 8 layers for pattern recognition and an 8-layer fit
  - Refit the tracks found using all 11 layers





### FTK working principle 8 Layer Tracking Step



- Find low resolution track candidates (roads)
- PIX 3 + SCT 4 axial + (SCT 1 stereo || IBL) layers
- Use parallelism in Associative Memory chips

- Use full resolution hits in 8 layers
  - Obtain high resolution helix parameters from road

## System sketch



## FTK system needs

- Predictions for 40-pileup events (~2015)
  - 1000-2000 clusters per core crate per layer
  - 20k roads per core crate using don't-care bits
  - 100k fit combinations per core crate

### FTK performance



Figure: RECO z - TRUTH z (cm)

 ~80-90% efficiency with almost offline resolution

## CMS L1 Track Trigger

#### Stub/ladder data rate (layers R>50 cm)

- Outer layers with 2S pT-modules: (stubs p<sub>T</sub>>2 GeV) ~100-200 kHz/cm<sup>2</sup>
- Module area: 86.64 cm<sup>2</sup>: ~8.7-17.3 MHz / 40 MHz ≈ 0.22-0.45 stubs/module/bx on average
- 12 modules per ladder: layer data rate ~100-200 MHz (2.5-5 stubs/ladder/bx)
- stub info used for pattern recognition: 14 bits (6 bits rφ (1 mm resolution), 1 bit r-z, 4 module #, 3 time stamp). Eventually full granularity is sent to the AM board, but it should be used smartly (see next transparencies).
  - If 20 bits are sent for full resolution, the expected data rate per layer is up to ~4 Gbps
  - Given that an AM chip takes input at 16 bits x 100 MHz per layer and has intrinsic latency, a single AM board cannot process a single event in time. Hence a switch is needed that distributes events to several AM boards in parallel.
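The rate estimates above follow from multiplying the stub rate per unit area by the module area and dividing by the bunch-crossing rate. The short sketch below reproduces the arithmetic; the function and key names are illustrative, while the numerical inputs are those quoted in the list.

```python
BX_RATE_HZ = 40e6   # LHC bunch-crossing rate used in the estimate above

def stub_rates(rate_per_cm2_hz, module_area_cm2=86.64,
               modules_per_ladder=12, bits_per_stub=20):
    """Average stub rates and bandwidth for one ladder of 2S pT-modules."""
    module_rate = rate_per_cm2_hz * module_area_cm2
    ladder_rate = module_rate * modules_per_ladder
    return {
        "module_rate_hz": module_rate,
        "stubs_per_module_per_bx": module_rate / BX_RATE_HZ,
        "ladder_rate_hz": ladder_rate,
        "stubs_per_ladder_per_bx": ladder_rate / BX_RATE_HZ,
        "bandwidth_bps": ladder_rate * bits_per_stub,
    }

if __name__ == "__main__":
    for rate in (100e3, 200e3):
        r = stub_rates(rate)
        print(f"{rate / 1e3:.0f} kHz/cm2: "
              f"{r['stubs_per_module_per_bx']:.2f} stubs/module/bx, "
              f"{r['stubs_per_ladder_per_bx']:.1f} stubs/ladder/bx, "
              f"{r['bandwidth_bps'] / 1e9:.1f} Gbps")
```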

#### 2S p<sub>T</sub>-modules Stub production rate per cm<sup>2</sup>



## Trigger sectors dimensioning

#### **Full GEANT4** simulation


- 39 azimuthal trigger sectors allow ~2 GeV p<sub>T</sub> cutoff
  - +z and -z sides lumped together
  - adjacent (half) sectors sent to the same AM

#### Super-Strip

- contains the information of a stub in a sector
- 15 bits
  - 5 (z module position) + 1 (phi module position) + 6 (cluster position with a precision of ~1 mm) + 1 (strip segment) + 2 (p<sub>T</sub> range (2, 5, 10, infty) from the stub)
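The 15-bit Super-Strip word can be illustrated with a small pack/unpack pair. The field widths are those listed above; the field names and the bit ordering (z module in the most significant bits) are assumptions, since the slide does not specify the layout.

```python
# (name, width in bits); order from most to least significant is an assumption.
FIELDS = (
    ("z_module", 5),
    ("phi_module", 1),
    ("cluster_pos", 6),
    ("segment", 1),
    ("pt_range", 2),
)

def pack_super_strip(**values) -> int:
    """Pack the five fields into a single 15-bit integer."""
    word = 0
    for name, width in FIELDS:
        v = values[name]
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name}={v} does not fit in {width} bits")
        word = (word << width) | v
    return word

def unpack_super_strip(word: int) -> dict:
    """Inverse of pack_super_strip."""
    out = {}
    for name, width in reversed(FIELDS):
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out

if __name__ == "__main__":
    w = pack_super_strip(z_module=3, phi_module=1, cluster_pos=42, segment=0, pt_range=2)
    print(f"{w:015b}", unpack_super_strip(w))
```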






## Encoding

- Pattern
  - for an N-layer layout, a word of N Super Strip entries
  - SS<sub>k</sub>: Super Strip information for layer k. Pattern = <SS<sub>1</sub>,SS<sub>2</sub>,SS<sub>3</sub>>
- Bank Coverage
  - Number of stored patterns for which the bank reaches 90% efficiency of reconstructed patterns
  - Generate single muons with 2 < p<sub>T</sub> < 60 GeV in the barrel acceptance
    - studied with different pitch resolutions (250 μm, 1 mm and 8 mm) ↔ (17, 15, 12 bits SS<sub>k</sub>)
    - Almost linear relationship with pitch size: 1.5 M patterns/sector for 1 mm resolution, 120K patterns for 8 mm pitch.
    - Needs a compromise between the number of stored patterns and fake rate.





### Advanced AM

- Main limitations of AM approach for L1 track trigger
  - pattern bank density
  - latency limitations

### VIPRAM (Vertically Integrated Pattern Recognition Memory)

VIPRAM concept (developed at Fermilab):

http://hep.uchicago.edu/~thliu/projects/VIPRAM/TIPP2011\_VIPRAM\_Paper.V11.preprint.pdf







### Pattern recognition for tracking is naturally a task in 3D





## Further evolution of AM

- ~500K patterns/cm<sup>2</sup>
- Running with > 100 MHz input rate
- N CAM tiers + Control tier
- integrated with FPGA/RAM

(general purpose pattern recognition)



## Feasible by 2020?



Original SVT system had 384K patterns total.

Aim to reach ~500K per cm<sup>2</sup> for VIPRAM ...



- Where to find this (and more) material
  - WIT2010 and WIT2012 Workshops
  - ACES Workshops
  - TIPP Conference series
  - TWEPP Conference series