

#### High Performance Computing Activities funded by the Future and Emerging Technologies European Unit



Pier Stanislao PAOLUCCI – EURETILE FET Project Coordinator

and INFN APE Parallel Computing Group member



FET Grant# 247846



www.euretile.eu



### 7<sup>th</sup> Framework Programme

- The 7<sup>th</sup> Framework Programme is the European Union's main instrument for funding research in Europe
- Duration: 2007-2013
- 50 billion euro total funding
- Grants are awarded through calls for proposals and highly competitive peer review
- 10 thematic areas, including Information and Communication Technologies (ICT)
- 9 billion euro for ICT research projects
- ICT calls/projects are managed by "Units", including FET, the Future and Emerging Technologies Unit

## **Future and Emerging Technologies ICT unit mission**



FET: the unit acting as pathfinder for the ICT (Information and Communication Technologies) programme.


- FET fosters novel/non-conventional approaches, foundational research
- FET funds long-term, visionary, collaborative and multidisciplinary ICT research. The research must be foundational in nature and high-risk / high-payoff
- FET aims at a breakthrough, a paradigm shift, or at the proof of a novel scientific principle
- FET typically operates through calls for project proposals

## Summary

- FET Unit (Future and Emerging Technologies) mission
  - ICT pathfinder for foundational long-term research
- FET Teracomp call (2010-2013) objectives and projects
  - EURETILE project (coordinated by INFN Roma), TERAFLUX, SooS, TRAMS
- FET Next calls (2011-2012) and beyond
- Coming FET Flagship projects! (2013...)
  - Leverage the EURETILE project to position the INFN crew onboard FET flagship projects
- The EURETILE project
  - Coordinated by INFN Roma
  - Builds on the APE and SHAPES projects
  - Applications:
    - Lattice Quantum Chromo Dynamics (LQCD)
    - Brain simulations: Polychronous Spiking Neural Networks (PSNN)

#### **Paolucci's Recent Experience on European ICT projects**

- Coordinator of the DIAM project: 2002-2005
  - Topic: Advanced Digital Signal Processing based on Custom VLIW
  - 5 MEuro funding
- Coordinator of the SHAPES FET project: 2006-2009
  - Scalable SW HW Architecture for Numerical and Embedded Systems
  - 7 MEuro funding
  - Based on Custom VLIW + Custom Network Processor
- Coordinator of the EURETILE FET project: 2010-2013
  - European Reference Many-Tile Experiment
  - 5 MEuro eligible costs and funding 2 MEuro for INFN Roma
- Chairman of the CASTNESS'07, '08, '11 European Workshops/Schools on Computing Architectures and Software Tools for Numerical and Embedded Scalable Systems

## FET TERACOMP Call: Concurrent Tera-Device Computing



2010-2013 Target: "Radically new methods and tools for system architecture design and programming of chips and systems beyond 2020"

- **Design of many-core systems:** Radically new concepts, design paradigms and methods to address many-cores (>100 cores): hardware, software and reconfigurable hardware.
- **Design of dependable systems with faulty components:** Methodologies for the design and construction of dependable systems tolerating critical levels of hardware or software faults and increasing component variability.
- **Breakthrough programming paradigms:** Radically new programming paradigms for many-core systems, in terms of scalability, portability and dependability. Enable high data throughput applications and the management of massive data sets.

## FET TERACOMP – 2010-2013 Projects

#### **EURETILE: European Reference Tiled Experiment**

- Our project (see next slides). Coordinator: INFN
- Brain-Inspired Hierarchical Many-tiles HW
- Applications: LQCD, Brain Polychronous Spiking Neural Net
- Holistic SW Tool Chain: Dynamic Hierarchical Network of Processes, Automated Mapping, Distributed OS/HdS

#### **TERAFLUX: Exploiting Dataflow Parallelism**

- Multi-core Systems-on-chip with 3D Stacking
- Pushing the data to where it will be needed



#### **SooS: Service oriented Operating Systems**

- Distributed Operating Systems for many-cores
- Based on microkernels

#### **TRAMS: TeraDevice RAMs**

- Innovative Architectures for sub-22nm Memories

### FET last calls, closed in Jan 2011



- Exa-scale computing, software and simulation, 25 M€. Objectives:
  - To develop optimised application codes driven by the computational needs of science and engineering and of today's grand challenges (e.g. climate change, energy, ...)
  - Addressing major challenges of extreme parallelism with millions of cores (programming models, compilers, performance analysis, algorithms, power consumption, ...)
- Computing Systems, 45 M€

### Next "standard" FET Calls 2011-2012-2013...



- International cooperation on FET research (FET Open 9.4)
- Call 8 (7/11 - 1/12)
  - Unconventional Computation
  - Dynamics of Multi-Level Complex Systems
  - Minimising Energy Consumption of Computing to the Limit
  - new FET topics, coordination of communities ... (9.12)
- Call 9 (1/12 - 4/12)
  - Quantum ICT (including ERA-NET-Plus)
  - Fundamentals of Collective Adaptive Systems
  - Neuro-Bio-Inspired Systems
  - new FET topics, coordination of communities ... (9.12)
- WP 2013 (to be defined)
- FP8 European Framework Program (to be defined)





http://cordis.europa.eu/fp7/ict/programme/fet/flagship/home\_en.html




# **FET Flagship Initiatives are**

Large • Science-driven • Multi-disciplinary • Long-term Foundational • Collaborative research initiatives • Targeting a <u>visionary S&T goal</u> • Seeded from FET ICT but going beyond ICT

- Able to generate waves of technological innovation, with significant impacts on economy and society
- Due to size and level of ambition, possible only via federated effort of the European Framework and National research programs
- FET Flagships are a response to the fragmentation and the leveraging of investment in basic research seeded from ICT leading towards innovation in technology and societal impact.
- Partnerships for Scientific Leadership



#### Further info: FET11 conference and CASTNESS'11 Workshop



#### www.euretile.eu

Lab. Naz. Legnaro, 17 Jan 2011 – Pier Stanislao Paolucci – EURETILE Coordinator - INFN (part-time) Researcher

www.fet11.eu



#### Key Objectives of the European Reference Tiled Architecture Experiment

- Create Brain-Inspired Many-Tile Experimental Platforms
  - Scalable Many-Tile HW Prototypes
    - Explore platform convergence between Many-Tile Scientific (HPC) and High-End Embedded (Streams)
  - Scalable Many-Tile Simulation Platform
- Brain Inspired Hierarchical SW Environment
  - Applications represented by Dynamic Hierarchical Networks of Processes (Distributed Application Layer/Distributed Operation Layer)
  - Removing accidental complexity with automatic synthesis of low-level software and (semi) automatic mapping
- Application Benchmarks:
  - Lattice Quantum Chromo Dynamics
  - Polychronous Spiking Neural Networks (Izhikevich model)
  - DSP / Linear Algebra kernels (common to Embedded and Scientific)
- Many-Tile HW Improvements
  - Brain Inspired 3-level Hierarchy/Communication/Fault Tolerance
  - DNP (Distributed Network Processor) (e.g. support Polychronous Multicast)
  - HW/SW Co-generation of ASIPs (Application Specific Instruction-set Processors)
  - Convergence happening between Scientific and Embedded (see later)
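
The "(semi) automatic mapping" objective can be read as an optimization problem: place communicating processes on tiles so that heavy traffic travels few hops. The sketch below is only an illustration of that idea, not the EURETILE mapper. The cost model (traffic volume times Manhattan distance on a 2D grid), the one-process-per-tile restriction, the swap mutation and all names are assumptions made for this example.

```python
import random

def mapping_cost(mapping, traffic, tile_xy):
    """Sum over channels of traffic volume times Manhattan hop distance."""
    cost = 0
    for (src, dst), volume in traffic.items():
        x1, y1 = tile_xy[mapping[src]]
        x2, y2 = tile_xy[mapping[dst]]
        cost += volume * (abs(x1 - x2) + abs(y1 - y2))
    return cost

def evolve_mapping(processes, tile_xy, traffic, generations=2000, seed=0):
    """Mutation-only search: swap two placements, keep the swap if cost does not rise.

    Assumes at most one process per tile (hypothetical capacity rule).
    """
    if len(processes) > len(tile_xy):
        raise ValueError("this sketch needs at least one tile per process")
    rng = random.Random(seed)
    tiles = list(tile_xy)
    rng.shuffle(tiles)
    best = dict(zip(processes, tiles))      # random initial placement
    best_cost = mapping_cost(best, traffic, tile_xy)
    for _ in range(generations):
        p, q = rng.sample(processes, 2)     # pick two distinct processes
        candidate = dict(best)
        candidate[p], candidate[q] = best[q], best[p]
        cost = mapping_cost(candidate, traffic, tile_xy)
        if cost <= best_cost:               # accept equal cost to keep exploring
            best, best_cost = candidate, cost
    return best, best_cost

if __name__ == "__main__":
    # Hypothetical 2x2 tile grid and a four-stage pipeline of processes.
    grid = {"t0": (0, 0), "t1": (1, 0), "t2": (0, 1), "t3": (1, 1)}
    procs = ["p0", "p1", "p2", "p3"]
    links = {("p0", "p1"): 10, ("p1", "p2"): 10, ("p2", "p3"): 10}
    placement, cost = evolve_mapping(procs, grid, links)
    print(placement, cost)
```

A real mapper would also account for tile load, memory, and the hierarchical interconnect; the single scalar cost here is deliberately simplistic.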



#### 3 levels of brain inspired HW / (and SW) hierarchy/interconnect...



## **EURETILE** Izhikevich Polychronous Spiking Neural Network

#### Intuition

- A- The first 60 years of evolution of human-designed computers produced a monolithic, high-clock, single-processor/scalar programming model (... a stream of read-after-write dependencies on a shared memory)
- B- Billions of years of natural evolution produced massively distributed architectures with NO shared memory, extremely LOW clock frequency, NO synchronization barriers and massive multicast
- A sweet spot could exist somewhere between A and B
- Methodology
  - Identify models of brain activities satisfying a few criteria
    - Computational efficiency, faithful reproduction of neural types/activity patterns, conceptual simplicity (Occam's razor, if possible)
  - ... to look for key architectural features to be implemented for efficient scaling
- Identified Model
  - Best model (personal opinion): the Izhikevich model (2003, 2006, 2009) of polychronous spiking neural networks. It requires only 12 flops/(ms \* neuron), and the same model simulates all known types of neurons by changing a few parameters. The "geometry of phase portraits" representing the dynamic behaviour of single neurons is well understood.
- First Results
  - Network support for polychronous multicast could be a key necessary feature for the intermediate and upper levels of the hierarchy
  - Design of the first (bottom) level of the hierarchy near the sweet spot could follow different paths
    - SHAPES-like or NVIDIA-like: multi-tile HW around a multi-layer bus, an appropriate mix of shared and local memories at the bottom level, moderate clock frequency, explicit parallelism. This path suits general-purpose numerical/DSP applications. The mAgic DSP was designed using TARGET tools.
    - Add ASIPs using TARGET tools, e.g. specific to LQCD and neural nets. BRINSAR (BRain INSpired ARchitectures of specialized ASIPs) targets efficient implementation of brain-like applications at the bottom level of the hierarchy (e.g. robotics, DSP recognition tasks, ...)
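
To give a feel for how small the per-neuron update of the Izhikevich model is, the sketch below integrates a single neuron with the published two-variable equations: `v' = 0.04 v^2 + 5 v + 140 - u + I` and `u' = a (b v - u)`, with the reset `v = c`, `u = u + d` once `v` reaches 30 mV. The default parameters are the standard "regular spiking" set from Izhikevich's 2003 paper. The function name, the 0.5 ms forward Euler step and the constant input current are assumptions made for this example; this is not EURETILE code and it does not attempt to reproduce the flop count quoted above.

```python
def simulate_izhikevich(current, duration_ms=1000.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Integrate one Izhikevich neuron with forward Euler; return spike times in ms."""
    v = c            # membrane potential (mV), started at the reset value
    u = b * v        # recovery variable
    spike_times = []
    steps = int(duration_ms / dt)
    for step in range(steps):
        if v >= 30.0:                       # spike detected: record and reset
            spike_times.append(step * dt)
            v = c
            u += d
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
    return spike_times

if __name__ == "__main__":
    spikes = simulate_izhikevich(current=10.0)
    print(f"{len(spikes)} spikes in 1000 ms of simulated time")
```

In a network simulation the same update would run for every neuron at every time step, with `current` replaced by the sum of incoming synaptic contributions, which is where multicast support in the interconnect becomes relevant.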

# **EURETILE** Holistic and Scalable Software Tool-Chain

- DOL/DAL Programming model
  - specification of each process in C
  - specification of (dynamic) applications as sets of (hierarchical) Kahn process networks
- System-level fault-tolerance strategies
- High level (functional) simulation
  - ... for debugging and early profiling
- ... then, mapping SW processes onto HW tiles
  - automatic optimization, with analytic and bioinspired methods
  - fine-grain parallelism exploited by optimized intra-tile compilation techniques
- Automatically generated HdS (Hardware dependent Software)
  - ... providing highly efficient services
- Finally, many-tile scalable simulator
  - validates the entire design
  - ... and collects profiling data
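
The Kahn process network model named in the list above can be illustrated with a few lines of code: processes communicate only through FIFO channels, reads block until a token is available, and the output is therefore independent of scheduling. The sketch below uses Python threads and `queue.Queue` purely for illustration. In DOL/DAL each process is specified in C, and the process names, the end-of-stream sentinel and the three-stage pipeline here are assumptions, not the DOL/DAL API.

```python
import queue
import threading

_END = object()  # hypothetical end-of-stream marker, not part of the KPN model itself

def producer(out_ch, n):
    """Emit the integers 0..n-1, then signal end of stream."""
    for i in range(n):
        out_ch.put(i)
    out_ch.put(_END)

def square(in_ch, out_ch):
    """Read tokens one by one, forward their squares."""
    while True:
        token = in_ch.get()          # blocking read, as in a Kahn network
        if token is _END:
            out_ch.put(_END)
            return
        out_ch.put(token * token)

def consumer(in_ch, results):
    """Collect tokens until end of stream."""
    while True:
        token = in_ch.get()
        if token is _END:
            return
        results.append(token)

def run_network(n=5):
    """Wire producer -> square -> consumer with unbounded FIFO channels."""
    a, b = queue.Queue(), queue.Queue()
    results = []
    processes = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=square, args=(a, b)),
        threading.Thread(target=consumer, args=(b, results)),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return results

if __name__ == "__main__":
    print(run_network(5))
```

Because no process ever tests a channel for emptiness or shares state outside the channels, the three stages could run on three different tiles without changing the result, which is the property that makes the model attractive for automated mapping.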




## HW Background: SHAPES project (2006-2009)

- Layout of single-tile and multi-tile dies. Each tile includes:
  - mAgicV multi Gigaflops VLIW DSP (for numerical computations)
  - ARM9 system (for control / sequential computations)

  - DNP, Distributed Network Processor (for inter-tile communication)



P.S. Paolucci US Patent 6,766,439 mAgicV VLIW arch

P.S. Paolucci et al. US Pat 7,437,540 - mAgic/Arm combo

P. Vicini et al. – DNP architecture





## 2010 Status: Key HW Players

- Heterogeneous Embedded multi-processor
  - multi-ARM +
  - multi-DSP +
  - multi ASIPs (sometimes including GPGPU)
- Single-chip GPGPU performance:
  - TERAFLOPS
- Key HW players:
  - ARM
  - NVIDIA
  - ATI
  - INTEL
  - AMD
- Convergence between Embedded and HPC architectures
  - NVIDIA + ARM netbooks
  - INTEL + NVIDIA supercomputers (integrated ?!)
  - NVIDIA + ARM supercomputers (!! Denver)
- EURETILE HW research focusing on
  - Many-core interconnect (DNP)
  - ASIPs Design tools/Appl Spec. HW IPs
  - Brain-inspired hierarchy/system solution



- Example of NVIDIA TEGRA 250 MPSoC:
  - GeForce Video Proc + Audio Proc +
  - Image Proc + HW Codecs +
  - dual ARM Cortex-A9 + ARM7
  - Complete Multi-media PC on single MPSoC
- See a more complete review of leading 2010 HW architectures in Piero Vicini's CASTNESS'11 presentation



- ETH Zurich
  - DAL/DOL Many-Tile Dynamic/Hierarchical High-Level Programming Env.
- UJF-TIMA Grenoble
  - Automatic Synthesis of Platform Dependent System SW/Communication
- RWTH-Aachen
  - Scalable Simulation Platform
- INFN Roma / TARGET
  - Many-tile HW Platform Prototype
  - DNP (Distributed Network Processor) HW IP



- Cogeneration of ASIPs (Applic. Specific. Instr-set Proc) HW IP /SW Tools
- Application Benchmarks
  - LQCD, Izhikevich Spiking Neural Net, DSP/Linear Algebra


