HAC SPECIS: High-performance Application and Computers, Studying PErformance and Correctness In Simulation

The goal of the HAC SPECIS (High-performance Application and Computers: Studying PErformance and Correctness In Simulation) project is to answer the methodological needs of HPC application and runtime developers, and to make it possible to study real HPC systems from both the correctness and the performance points of view. To this end, we bring together experts from the HPC, formal verification, and performance evaluation communities.

Context

Over the last decades, both the hardware and the software of modern computers have become increasingly complex. Multi-core architectures comprising several accelerators (GPUs or the Intel Xeon Phi) and interconnected by high-speed networks have become mainstream in the field of High-Performance Computing. Obtaining the maximum performance from such heterogeneous machines requires breaking with the traditional uniform programming paradigm. To scale, application developers have to make their code as adaptive as possible and relax synchronizations as much as possible. They also have to resort to sophisticated and dynamic data management, load balancing, and scheduling strategies. This evolution has several consequences:

  • First, this increasing complexity and the relaxation of synchronizations make development even more error-prone than before. The resulting bugs may almost never occur at small scale yet occur systematically at large scale, and in a non-deterministic way, which makes them particularly difficult to identify and eliminate.
  • Second, the dozens of software stacks and their interactions have become so complex that predicting the performance of the system as a whole (in terms of time, resource usage, and energy) is extremely difficult. Understanding and configuring such systems has therefore become a key challenge.

We believe these two challenges, related to correctness and performance, can be addressed by combining the skills of experts in formal verification, performance evaluation, and high-performance computing. The goal of the HAC SPECIS Inria Project Laboratory is to answer the methodological needs raised by the recent evolution of HPC architectures by allowing application and runtime developers to study such systems from both the correctness and the performance points of view.

All the resulting research developments will be integrated into the open source SimGrid framework so that they can benefit as many users as possible, as quickly as possible.

Members and Inria Teams

Arnaud Legrand (POLARIS) is the leader of the HAC SPECIS project.

Related Links

Meetings

  • Kickoff: 23-24 June 2016 @ Rennes

Created: 2016-05-20 Fri. 14:01
