

### Filippo Maria Giorgi - INFN Bologna



**XV SuperB General Meeting** Caltech, CA – USA, Dec. 14-17 2010



- New Matrix Scan Logic (with respect to SuperPX0)
- Triggered Architecture
- Integration achievements
- Optimizations
- Simulations
- 2011 Submissions
- Conclusions

### **Summary**

### With respect to the previous submission, SuperPX0:

- Dense in-pixel digital logic (time labeling, arbitrary TS comparator for time-ordered readout)
- NO Macro-Pixels → no FREEZING required → much less dead area
- NO Scan Buffer (saving readout area), but matrix scans are still time-ordered
- Shorter BC periods allowed → better time resolution
- Versatile architecture: both triggered and data-push operation

### **New Matrix Scan Logic**

A data-push architecture for pixels on Layer0 requires a large output bandwidth: $100\ \text{MHz/cm}^2 \times {\sim}20\ \text{bit} \approx 2\ \text{Gbps/cm}^2$.
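The bandwidth estimate above is a plain product of hit rate and hit word size. A minimal sketch, assuming exactly 20 bits per hit (the slide quotes about 20) and using hypothetical names:

```python
# Data-push bandwidth per unit area = hit rate x bits per hit.
# The 20-bit word size is taken as exact here; the slide quotes ~20 bit.
HIT_RATE_HZ_PER_CM2 = 100e6   # 100 MHz/cm^2, from the slide
BITS_PER_HIT = 20             # assumed exact for this sketch


def data_push_bandwidth(hit_rate_hz_per_cm2: float, bits_per_hit: int) -> float:
    """Return the required output bandwidth in bit/s per cm^2."""
    return hit_rate_hz_per_cm2 * bits_per_hit


bandwidth = data_push_bandwidth(HIT_RATE_HZ_PER_CM2, BITS_PER_HIT)
print(f"{bandwidth / 1e9:.1f} Gbps/cm^2")
```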

Some modifications, involving the sweeper architecture only, make it possible to exploit the **matrix itself as a hit buffer** for a triggered architecture.

We are evaluating whether this solution is viable, taking into account the expected maximum trigger latency ($\sim 6\ \mu\text{s}$) of SuperB.





Only triggered time stamps are requested and swept out.






After the latency, a TS-dependent reset is asserted: only the pixels labeled with the corresponding time stamp are reset.
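The idea of using the matrix as a hit buffer can be illustrated with a toy model. This is only a sketch under stated assumptions, not the VHDL implementation: the class and method names are hypothetical, the latency is expressed in BC periods, time-stamp wrap-around is ignored, and a sweep is assumed to leave the pixels latched until the TS-dependent reset.

```python
from dataclasses import dataclass, field


@dataclass
class PixelMatrix:
    """Toy model: each fired pixel holds the time stamp (TS) of its hit."""
    latency_bc: int                            # trigger latency in BC periods (assumed)
    hits: dict = field(default_factory=dict)   # (col, row) -> TS

    def fire(self, col: int, row: int, ts: int) -> None:
        # A pixel already holding a hit keeps its original TS.
        self.hits.setdefault((col, row), ts)

    def sweep(self, triggered_ts: int) -> list:
        """Return only the pixels labeled with the triggered TS."""
        return sorted(p for p, ts in self.hits.items() if ts == triggered_ts)

    def reset_expired(self, current_ts: int) -> None:
        """TS-dependent reset: clear only pixels whose TS is one latency old."""
        expired = current_ts - self.latency_bc
        self.hits = {p: ts for p, ts in self.hits.items() if ts != expired}


m = PixelMatrix(latency_bc=3)
m.fire(0, 5, ts=1)
m.fire(2, 7, ts=1)
m.fire(1, 1, ts=2)
swept = m.sweep(triggered_ts=1)   # only TS = 1 hits are swept out
m.reset_expired(current_ts=4)     # TS = 1 has now aged past the latency
print(swept, sorted(m.hits))
```

In this model, hits with a TS that was never triggered are simply dropped by the reset, which is what keeps the output bandwidth low.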




- The whole readout architecture is coded in synthesizable VHDL, now including the triggered extraction feature of the sweeper.
   Efficiency evaluations were conducted with Monte Carlo hit generation.
- Barrels and Sweeper code optimized for higher speed and shorter synthesis time.
  - Full architecture (with optimized components) entirely re-integrated and re-simulated (final matrix dimensions 192x256).
  - Full architecture synthesized in the Synopsys environment; the benefits are presented.

## **Integration achievements**



### **Simulations overview**



MC tuned to obtain a hit rate of 100 MHz/cm<sup>2</sup> over the area

### **Simulation Results**



NOT taken into account:

- sensor efficiency (assumed to be 100%)
- pixel reset dead time (assumed to be a few ns)

## **Simulation results DATA PUSH**

#### **Expected efficiency** combinatorial evaluations




### **Analytic expectation DATA PUSH**




#### Readout de-queuing efficiency 100% (no barrel overflows)

- Hit check results: 100% match.
- Fast clock: **4** × RDclk (output bus frequency)

## SuperPX0 comparison

Efficiency results from similar simulations of the *SuperPX0* readout.

[Plot: efficiency versus BC period (µs)]



- Efficiency decreases smoothly as a function of trigger latency.
- Almost no dependency of efficiency on BC period (in this region).
- Linear fit slope: -0.3 %/µs.
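The fitted slope can be turned into a small extrapolation helper. This is a sketch under assumptions: it anchors the line at the 98.2% efficiency at 6 µs latency quoted in the conclusions, treats the trend as exactly linear, and uses hypothetical names. Values outside the simulated latency range are extrapolations, not results.

```python
# Linear efficiency model anchored at one quoted point (assumption).
SLOPE_PCT_PER_US = -0.3    # linear fit slope from the slide
REF_LATENCY_US = 6.0       # reference latency quoted in the conclusions
REF_EFFICIENCY_PCT = 98.2  # efficiency quoted at the reference latency


def efficiency_pct(latency_us: float) -> float:
    """Readout efficiency (%) predicted by the linear fit at a given latency."""
    return REF_EFFICIENCY_PCT + SLOPE_PCT_PER_US * (latency_us - REF_LATENCY_US)


for latency in (0.0, 2.0, 4.0, 6.0):
    print(f"latency {latency:.0f} us -> {efficiency_pct(latency):.1f} %")
```

With these numbers the line extrapolates to about 100% at zero latency, so the quoted slope and the quoted 6 µs efficiency are consistent with each other.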

## **Simulation results, TRIGGERED**




Bandwidth usage estimated by simulations. Data bus: 20 bit @ 200 MHz $\rightarrow$ 4 Gbps max throughput.

### Data-push mode

- BC = 100 ns (10 MHz)
- Rate = 100 MHz/cm<sup>2</sup>
- Mean bandwidth usage: 2.6 Gbps

**~22% bandwidth saving** thanks to the zone clusterization algorithm and the time bundling of hits (with respect to the standard APSEL 4D *xyt* hit word encoding).
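The data-push figures can be cross-checked with a few lines of arithmetic. A minimal sketch with hypothetical names; the baseline estimate assumes the ~22% saving is defined relative to the bandwidth that the *xyt* encoding would need, which the slide does not state explicitly.

```python
# Bus capacity, utilisation and implied baseline in data-push mode.
BUS_WIDTH_BITS = 20
BUS_CLOCK_HZ = 200e6
MEAN_USAGE_BPS = 2.6e9   # mean bandwidth usage from simulation
SAVING = 0.22            # ~22% saving, assumed relative to the xyt baseline

capacity_bps = BUS_WIDTH_BITS * BUS_CLOCK_HZ           # maximum throughput
utilisation = MEAN_USAGE_BPS / capacity_bps            # fraction of the bus in use
baseline_bps = MEAN_USAGE_BPS / (1.0 - SAVING)         # implied xyt bandwidth

print(f"capacity    : {capacity_bps / 1e9:.1f} Gbps")
print(f"utilisation : {utilisation:.0%}")
print(f"xyt baseline: {baseline_bps / 1e9:.2f} Gbps (under the stated assumption)")
```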

### Triggered mode

- BC = 100 ns (10 MHz)
- Rate = 100 MHz/cm<sup>2</sup>
- Trigger rate = 2.5 MHz (largely overestimated: 1 trigger every 4 BC)
- Mean bandwidth usage: 660 Mbps (corresponding to ~40 Mbps for a standard 150 kHz trigger rate)
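The ~40 Mbps figure follows from rescaling the simulated bandwidth to the nominal trigger rate. A minimal sketch, assuming the bandwidth scales linearly with trigger rate (function and constant names are hypothetical):

```python
# Rescale the simulated triggered-mode bandwidth to another trigger rate,
# assuming bandwidth is proportional to trigger rate.
BC_PERIOD_S = 100e-9
BC_PER_TRIGGER = 4
SIM_BANDWIDTH_BPS = 660e6
NOMINAL_TRIGGER_HZ = 150e3

sim_trigger_hz = 1.0 / (BC_PER_TRIGGER * BC_PERIOD_S)  # 1 trigger every 4 BC


def scale_bandwidth(bandwidth_bps: float, from_rate_hz: float, to_rate_hz: float) -> float:
    """Linearly rescale a bandwidth from one trigger rate to another."""
    return bandwidth_bps * to_rate_hz / from_rate_hz


nominal_bps = scale_bandwidth(SIM_BANDWIDTH_BPS, sim_trigger_hz, NOMINAL_TRIGGER_HZ)
print(f"simulated trigger rate: {sim_trigger_hz / 1e6:.1f} MHz")
print(f"bandwidth at 150 kHz  : {nominal_bps / 1e6:.1f} Mbps")
```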

## **Simulation results: BANDWIDTH**

Barrels and Sweeper were originally described in high-level VHDL code:

- Synthesis was slow.
- The generated netlist was not optimized $\rightarrow$ speed performance could be improved.
- A thesis on code optimization is to be discussed this week in Bologna.
  - Barrels and Sweeper rewritten almost at hardware level.
  - Clear performance improvements are reported by the Synopsys Design Compiler tool.

### **Barrels speed optimization**

Worst register-to-register signal propagation time



### **Sweeper speed optimization**

Worst register-to-register signal propagation time



### **Full chip synthesis time optimization**



### Full chip cells area



### **Total cells area comparison**

### SuperPX1 hybrid 3D

- Matrix 32x128
- 2 sub-matrices 16x128
- 4 sparsifiers
- 8 zones for each sparsifier
- zone width: 4 pixels

### APSEL-VI MAPS 3D

- Matrix 96x128 (96x96)
- 2 sub-matrices 48x128 (48x96)
- 4 (3) sparsifiers
- 8 zones for each sparsifier
- zone width: 4 pixels
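The listed geometries can be checked for internal consistency. The sketch below assumes, without confirmation from the slides, that the sparsifier zones tile the second matrix dimension and that the sub-matrices split the first one; `MatrixConfig` and its fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixConfig:
    name: str
    dim_a: int                     # first quoted dimension (split into sub-matrices)
    dim_b: int                     # second quoted dimension (assumed tiled by zones)
    sub_matrices: int
    sparsifiers: int
    zones_per_sparsifier: int = 8
    zone_width: int = 4            # pixels

    def sub_matrix_shape(self) -> tuple:
        return (self.dim_a // self.sub_matrices, self.dim_b)

    def zone_coverage(self) -> int:
        """Pixels covered by all zones of all sparsifiers."""
        return self.sparsifiers * self.zones_per_sparsifier * self.zone_width


CONFIGS = [
    MatrixConfig("SuperPX1 32x128", 32, 128, sub_matrices=2, sparsifiers=4),
    MatrixConfig("APSEL-VI 96x128", 96, 128, sub_matrices=2, sparsifiers=4),
    MatrixConfig("APSEL-VI 96x96", 96, 96, sub_matrices=2, sparsifiers=3),
]

for cfg in CONFIGS:
    print(cfg.name, cfg.sub_matrix_shape(), cfg.zone_coverage())
```

Under this reading, the zone coverage equals the second dimension in all three cases (128, 128 and 96 pixels), and the sub-matrix shapes match the ones quoted.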

## **Submissions 2011**

- Triggered architecture successfully implemented and simulated.
  - 98.2% readout efficiency at 6 µs trigger latency.
  - BC period down to 60 ns.
  - No dependency of efficiency on BC period in the foreseen triggered working conditions.
- Optimizations lead to faster readout circuits and shorter synthesis time.
- Total area after new features and optimization is only 7% larger.
- Next step: tailoring the architecture for SuperPX1 and APSEL-VI.

### Conclusions



- DAQ boards responsible for trigger handling
- Pre-processed trigger sent to Front-end electronics.
  - Simpler on-chip trigger logic
  - Re-configurable logic on DAQ boards
- One-wire trigger to FE chips.
- Trigger latency configured on FE chips at start-up.
- Chip trigger signal synchronous to BC clock.





### **Sweeper speed optimization**



### **Full chip speed optimization**



### **Barrels speed optimization**





#### EXAMPLE

During Time Window 2:

- Some pixels fire and are labeled with Time Stamp (TS) = 2.
- The readout queries the columns containing hits labeled with TS = 1 (**Reading Time Window** $\rightarrow$ FastOr activation).
- The readout moves the Active Column over the columns with an active FastOr.
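The scan steps above can be sketched in software. This is an illustrative model only, not the sweeper implementation: the hit map layout, `fast_or` and `scan` are hypothetical, and a column's FastOr is assumed to be the logical OR of its pixels that match the reading time stamp.

```python
def fast_or(hits: dict, n_cols: int, reading_ts: int) -> list:
    """Per-column FastOr: True if any pixel in the column holds the reading TS."""
    active = [False] * n_cols
    for (col, _row), ts in hits.items():
        if ts == reading_ts:
            active[col] = True
    return active


def scan(hits: dict, n_cols: int, reading_ts: int) -> list:
    """Move the Active Column over columns with an active FastOr only."""
    readout = []
    for col, is_active in enumerate(fast_or(hits, n_cols, reading_ts)):
        if not is_active:
            continue  # columns without matching hits are skipped
        rows = sorted(r for (c, r), ts in hits.items() if c == col and ts == reading_ts)
        readout.append((col, rows))
    return readout


# (col, row) -> TS; pixels with TS = 2 fire while TS = 1 is being read.
hits = {(0, 3): 1, (0, 9): 2, (2, 4): 1, (3, 1): 2}
flags = fast_or(hits, n_cols=4, reading_ts=1)
result = scan(hits, n_cols=4, reading_ts=1)
print(flags, result)
```

Pixels labeled with TS = 2 are left untouched by the scan of TS = 1, which is what keeps the readout time-ordered without a separate scan buffer.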



### Matrix scan Logic example