Time-resolved Studies of Single-Event-Upset effects in Optical Data Receiver for the First LHC Upgrade Phase of the ATLAS Pixel Detector

> K.K. Gan, H. Kagan, R. Kass, S. Smith Ohio State University

> P. Buchholz, A. Wiese, <u>M. Ziolkowski</u> **Universität Siegen**

RD11 Conference • 6-8 July, 2011 • Florence

# <u>Plan</u>

- Introduction: SEU effects
- Optical transmission of clock, control and trigger data
- Expected SEU rates
- SEU test setup
- Experimental results
- Mitigation of SEU effects
- Summary and outlook

### Single Event Upset

• A **Single Event Upset (SEU)** is a radiation-induced error in microelectronic circuits, including semiconductor light detectors. It occurs when charged particles lose energy by ionizing the medium through which they pass, leaving behind electron-hole pairs.

• Minimum-ionizing particles cannot cause an SEU directly. Through collisions with atoms they produce strongly ionizing ions, which in turn create enough electron-hole pairs to induce an SEU.

• The part of the opto-link most sensitive to SEU is the **PiN** light detector, owing to its "large" active region. The **trans-impedance amplifier** is also expected to be SEU-sensitive because of the low-current signal at its input. The SEU-induced charge can cause a data bit-flip or shift the timing of the signal edges.



# **Bi-Phase-Mark encoding scheme for the ATLAS pixel optical receiver**



 $\rightarrow$  only half of the clock edges are transmitted  $\rightarrow$  clock recovery needed
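A minimal sketch of bi-phase-mark (BPM) coding, assuming the textbook convention: the line level toggles at every bit boundary, and toggles once more in mid-bit when the data bit is 1. The function names and the starting level are illustrative assumptions, not taken from the DORIC design. The sketch shows why a receiver must recover the bit boundaries before it can decode, and why a run of 1s looks like a plain square wave.

```python
def bpm_encode(bits, level=0):
    """Return two half-bit line levels per input bit (textbook BPM)."""
    out = []
    for bit in bits:
        level ^= 1            # transition at every bit boundary (carries the clock)
        out.append(level)
        if bit:
            level ^= 1        # extra mid-bit transition marks a '1'
        out.append(level)
    return out


def bpm_decode(halves):
    """Decode, assuming the bit boundaries are already known (recovered clock)."""
    return [int(first != second) for first, second in zip(halves[0::2], halves[1::2])]


print(bpm_encode([0, 1, 0]))   # levels for a short data pattern
print(bpm_encode([1, 1, 1]))   # a run of 1s toggles every half bit
```

In a run of 1s the boundary transitions and the mid-bit transitions are indistinguishable, so a clock recovery circuit has nothing to tell them apart.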

# **Clock recovery** and **data decoding** scheme in the DORIC receiver ASIC



### Single Event Upset in ATLAS Pixel Detector

### What is known:

• The **SEU cross section** (number of errors / particle flux) as a function of the PiN photocurrent (optical power), measured up to 500  $\mu$ A

K.E. Arms et al, ATLAS pixel opto-electronics, Nucl. Instrum. Methods, A 554, 458 (2005)

- 1° SEU cross-section of $4 \times 10^{-10}$ cm<sup>2</sup> at the average PiN-diode photocurrent of 300 µA.
- 2° The expected particle flux at the optical receiver location: $2 \times 10^{6}$ cm<sup>-2</sup> s<sup>-1</sup>.
- → The expected bit error rate (BER) induced by SEU is estimated to be $2 \times 10^{-11}$, which corresponds to 1 bit error in 20 minutes. The worst case is 1 error in 80 s at the end of the detector lifetime.
- Since the BER of the DORIC ASIC is a factor of 30 lower (< 10<sup>-11</sup>), the opto-link BER is limited by SEU.
- A much higher particle flux after the LHC upgrade → increase of the SEU error rate by an order of magnitude.
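The estimate above can be reproduced with a few lines of arithmetic. The cross-section and flux are the quoted values; the link bit rate of 40 Mb/s is an assumption not stated on the slides, chosen because it reproduces the quoted BER.

```python
SIGMA_CM2 = 4e-10      # SEU cross-section at 300 uA photocurrent, in cm^2
FLUX = 2e6             # expected particle flux, in cm^-2 s^-1
BIT_RATE = 40e6        # assumption: 40 Mb/s link, not given in the slides

errors_per_second = SIGMA_CM2 * FLUX          # upsets per second
seconds_per_error = 1.0 / errors_per_second   # mean time between upsets
ber = errors_per_second / BIT_RATE            # upsets per transmitted bit

print(f"{errors_per_second:.1e} errors/s")
print(f"one error every {seconds_per_error / 60:.1f} minutes")
print(f"BER = {ber:.1e}")
```

Under this assumption the mean time between errors comes out at roughly 20 minutes, in line with the figure quoted above.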

### Our motivation for time-resolved SEU studies:

- To gain insight into the SEU event structure by recording the sequences of data-bit and clock states around each SEU occurrence for off-line analysis.
- Useful for the future development of optical receivers and the implementation of mitigation techniques.
- We were inspired by similar earlier studies by CERN and the SCT group, e.g. J. Troska et al., "Single-Event Upsets in Photodiodes for Multi-Gb/s Data Transmission"

• The time-resolved SEU measurements were performed at the CERN PS-T7 24 GeV/c proton irradiation facility in August 2009 and September 2010.

• In 2009: The data were taken independently on two PiN-array-receiver-chip channels, with an Optowell GaAs PiN array and a receiver-decoding ASIC in 250 nm technology. The input signal was optically attenuated to eight values between 10 µA (just above the input-current threshold of the receiver chip) and 110 µA (the limit of the commercially available transmitter).

• <u>In 2010:</u> The data were taken on five channels with a U-L-M GaAs PiN array connected to a receiver-decoding prototype ASIC in 130 nm technology. The optical power was attenuated to between 100 and 600 µA.

• For each optical power setting the beam exposure time was on average 90 proton-bursts, each 400 ms long and separated from each other by a 40 s beam-cycle period.

• In between the proton bursts, the error monitoring remained active in order to provide an SEU-free reference measurement.

# Block diagram of the experimental setup for time-resolved SEU studies in 2010



### **<u>Classification of Single-Event-Upset incidents</u>**

For the purpose of data analysis three categories of SEU events were defined:

- type-D (Data) with only data-bit errors observed but no clock deficiency,
- type-C (Clock) with clock deficiency but no data-bit errors,
- type-B (Both) with both data-bit errors and clock deficiency.

A total of **111055** events were collected in the 2010 study, among them

- 94050 events (84.5%) of type-D,
- 13135 events (12%) of type-C and
- 3870 events (3.5%) of type-B.

# **Typical events of type D, C and B**

Examples from 2009 run.

| #  | Event | Type   | Optical power | Data bits         | Clock L           | Clock H           |
|----|-------|--------|---------------|-------------------|-------------------|-------------------|
| 1) | #9942 | type-D | 22 µA         | 001101+0 11010001 | 00000000 00000000 | 11111111 11111111 |
| 2) | #5689 | type-C | 55 µA         | 01101100 01000101 | 00000000 00000000 | 1111110 1111111   |
| 3) | #5203 | type-B | 34 µA         | 11101101 -0001101 | 00000000 00000000 | 11111110 11111111 |
| 4) | #9067 | type-B | 10 µA         | 1011101+ 1111+101 | 00000000 11111000 | 1111111 00001111  |

'+' and '-' indicate  $0 \rightarrow 1$  and  $1 \rightarrow 0$  bit-flip errors, respectively.
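The three categories can be applied mechanically to records like the ones above. The sketch below is an illustration under stated assumptions, not the analysis code used in the study: it assumes that a data error is marked by '+' or '-' in the data string, and that a clock deficiency shows up as a '1' in the Clock L string or a '0' in the Clock H string. The function name `classify` is hypothetical.

```python
def classify(data_bits, clock_l, clock_h):
    """Return 'D', 'C', 'B', or None for a recorded bit/clock sequence."""
    data_error = any(mark in data_bits for mark in "+-")   # flagged bit-flips
    clock_error = "1" in clock_l or "0" in clock_h         # assumed deficiency rule
    if data_error and clock_error:
        return "B"
    if data_error:
        return "D"
    if clock_error:
        return "C"
    return None   # no SEU signature in this record


events = {
    9942: ("001101+0 11010001", "00000000 00000000", "11111111 11111111"),
    5689: ("01101100 01000101", "00000000 00000000", "1111110 1111111"),
    5203: ("11101101 -0001101", "00000000 00000000", "11111110 11111111"),
    9067: ("1011101+ 1111+101", "00000000 11111000", "1111111 00001111"),
}
for number, record in events.items():
    print(number, classify(*record))
```

With these rules the four example events fall into the same categories as labelled on the slide.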

# SEU frequency of occurrence for various conditions of recovered clock and transmitted data in 2010 run

| Type D                                        | # Bit-flips | Bit-flip type | Case |
|-----------------------------------------------|-------------|---------------|------|
| 84.5%<br/>only data affected<br/>18810 events | one 97.5%   | 0→1 95%       | 1.   |
|                                               |             | 1→0 5%        | 2.   |
|                                               | two 2.5%    | both          | 3.   |

| Type C                                     | Clock deficiency | Case |
|--------------------------------------------|------------------|------|
| 12%<br/>only clock affected<br/>2627 events | H→L 99.8%        | 4.   |
|                                            | L→H 0.2%         | 5.   |

| Type B                                              | # Clock states   | Clock deficiency      | # Bit-flips            | Bit-flip type         | Case |
|-----------------------------------------------------|------------------|-----------------------|------------------------|-----------------------|------|
| 3.5%<br/>both clock and data affected<br/>774 events | one 75%          | H→L 100%              | one 97%                | 1→0 60%               | 6.   |
|                                                     |                  |                       |                        | 0→1 40%               | 7.   |
|                                                     |                  |                       | two 3%                 | both                  | 8.   |
|                                                     | two and more 25% | inverted clock 17%    | two 82%<br/>three 18%  | 0→1 for last bit-flip | 9.   |
|                                                     |                  | interrupted clock 83% | one 60%<br/>two 40%    | both                  | 10.  |

(Event numbers are given per single receiver assembly channel)
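As a cross-check, the per-channel counts and the nested percentages in the tables can be combined numerically. The sketch assumes that the all-channel totals are the per-channel counts multiplied by the five channels read out in 2010, and that the nested percentages multiply along each branch of the type-B table. Both are interpretations of the slides rather than stated facts.

```python
CHANNELS = 5                                   # assumption: five equal channels
per_channel = {"D": 18810, "C": 2627, "B": 774}

totals = {kind: count * CHANNELS for kind, count in per_channel.items()}
grand_total = sum(totals.values())
shares = {kind: 100.0 * count / grand_total for kind, count in totals.items()}

# Branching inside type-B: 25% of events affect two or more clock states,
# split into 83% "interrupted clock" and 17% "inverted clock".
TYPE_B_SHARE = 3.5                             # percent of all events
interrupted = TYPE_B_SHARE * 0.25 * 0.83       # percent of all events
inverted = TYPE_B_SHARE * 0.25 * 0.17          # percent of all events

print(totals, grand_total)
print({kind: round(value, 1) for kind, value in shares.items()})
print(f"interrupted clock: {interrupted:.2f}%  inverted clock: {inverted:.2f}%")
```

The shares obtained this way lie close to the rounded 84.5%, 12% and 3.5% quoted for the three event types.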

### **Example: Typical type-C event**

• Waveforms were recorded on-line with an oscilloscope (2010)



### **Example: Type-B event with inverted clock**



Clock recovery circuit locks to data signal transitions instead of clock transitions

### **Example: Type-B event with interrupted clock**



high states of clock are missing



### SEU rate measured during 2009 run



 $\rightarrow$  Exponential decrease of SEU rate with increasing amplitude of input signal

### SEU rate measured during 2010 run



 $\rightarrow$  rather weak decrease of SEU rate with increasing amplitude of input signal

### SEU rate for type-D events



### SEU rate for type-C events



### SEU rate for type-B events





→ Magnitude of $10^{-10}$ cm<sup>2</sup> at the reference photocurrent of 300 µA (higher than $4 \times 10^{-10}$ cm<sup>2</sup> from the past measurement).

# Possible solutions for mitigation of SEU effects

- **Type-D events**: highest occurrence rate, decreasing exponentially with increasing optical power:
  - → use the highest optical power for transmitting the input signal,
  - → reduce the remaining failure rate by introducing data redundancy, e.g. forward error correction (FEC) or 8-in-10 bit encoding; its effectiveness has recently been demonstrated at CERN by J. Troska and F. Vasey (not yet confirmed for the BPM scheme, however; we are going to investigate it in the 2011 irradiation).
- **Type-C events**: only one high clock state missing, lower occurrence of ~12%; no mitigation technique is currently proposed, and further analysis is needed.
- **Type-B events**: corruption of both clock and data, much lower occurrence of ~3.5%, with two long-term clock corruption effects:
  - → for "inverted clock": avoid long sequences of data bits set to "1",
  - → "interrupted clock" is a new error mode, not seen before, probably a side effect of the change in the architecture of the delay-locked loop (DLL) for clock recovery in the prototype receiver chip; it will be carefully monitored during the next irradiation in 2011.
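To illustrate how data redundancy removes single bit-flips, here is a sketch using the textbook Hamming(7,4) code. This code is chosen only for illustration: it is not the FEC scheme demonstrated at CERN, nor one proposed for this receiver, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4      # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4      # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3   # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
received = hamming74_encode(word)
received[4] ^= 1                            # emulate one SEU bit-flip
print(hamming74_decode(received) == word)
```

A code of this kind corrects one flipped bit per codeword but not two, so it would not cover events with two bit-flips inside the same codeword.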

### **Summary**

- SEU effects in the optical receiver are dominated by single data bit-flips, with 84.5% of occurrences:
  - relative rates for bit-flip types "0-to-1" and "1-to-0" are 95% and 5%, respectively.
- The SEU impact on clock recovery is significantly lower, 15.5% of incidents:
  - mostly with one high state of the clock missing,
  - low rate (0.7%) of bursts with several consecutive high states of the clock missing → interrupted clock,
  - low rate (0.2%) of inverted-clock-cycle bursts for data bits transmitted as '1' in a sequence.
- SEU cross-section: weak dependence on photocurrent above 100 µA; the value at 300 µA is about 2-3x higher than the one measured in the past, likely attributable to different optical receiver assembly components.

# <u>Outlook</u>

• Additional SEU data taken with an irradiated PIN and ASIC assembly are available for analysis.

• We plan to continue the time-resolved SEU measurements in September 2011, with the new 130 nm receiver ASIC with 8 regular + 4 spare channels.

Emphasis on:

- Mitigation of errors by Forward Error Correction,
- Further investigation of clock recovery errors under various conditions,
- Coverage of the full range of optical input power (from 25 to 600 µA),
- Measurements with two different best-score PIN array products.