RP1 (Principal Developer: IICT-BAS):

ADVANCED COMPUTING AND BIG DATA: ALGORITHMS, TOOLS, SERVICES

The use of extreme-scale parallel computer systems for scientific computing is becoming increasingly important worldwide (see the annual TOP500 reports). The aggregate performance of all TOP500 systems has risen to 300 PFLOPS, with about 35% of that performance delivered by systems with accelerators/co-processors. In the last five years the number of researchers using extreme-scale parallel computing to run large computational workloads and to process large amounts of data has grown steadily. Many popular software packages for scientific computing now offer support for co-processors. Nevertheless, programming systems with accelerators remains complicated, and the available libraries are often insufficiently optimized and exhibit a number of shortcomings. It is therefore necessary to develop new algorithms that are effective in this type of heterogeneous environment and to test their scalability and efficiency.

The avalanche-like growth in the volume of scientific data, collected by large networks of sensors and high-resolution devices or produced by large-scale simulations, imposes the need for new methods and protocols for storage and indexing, as well as an integrated approach to ensuring their efficient processing. As a standard, the industry uses cloud technologies and tools for distributed processing (Grid/Cloud), but these are primarily oriented towards the divide-and-conquer paradigm. The recent trend is to move beyond MapReduce/Hadoop, especially for real-time processing. The idea of integrating methods and technologies from high-performance computing (HPC) and distributed processing (Grid/Cloud/Big Data) is becoming increasingly widespread, appearing first in the supercomputing centers of Europe's leading research institutions.
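For reference, the divide-and-conquer pattern behind MapReduce can be summarized in a few lines. The toy word count below (a minimal sketch with illustrative names of our own, not project code) shows why the model suits batch workloads: each chunk is processed independently and results are merged by key, which is exactly what makes low-latency, real-time processing a poor fit for the basic model.

    from collections import defaultdict
    from multiprocessing import Pool

    def mapper(chunk):
        """Map step: emit (word, 1) pairs for one chunk of the input."""
        return [(w.lower(), 1) for w in chunk.split()]

    def word_count(documents, workers=4):
        """MapReduce-style word count: split the input into independent
        chunks, map them in parallel, then reduce the pairs by key."""
        with Pool(workers) as pool:
            mapped = pool.map(mapper, documents)   # divide + map
        counts = defaultdict(int)
        for pairs in mapped:                       # shuffle + reduce
            for word, n in pairs:
                counts[word] += n
        return dict(counts)

    if __name__ == "__main__":
        docs = ["big data needs big systems", "grid and cloud systems"]
        print(word_count(docs))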

This research project involves the development of new approaches in the field of high-performance, grid and cloud computing, and of new methods for processing large volumes of data, with a focus on improving performance, scalability and energy efficiency. These studies would be impossible without the planned project infrastructure, designed specifically to provide researchers with a variety of computing and information services. Building on the proven experience and preliminary results of the project team, as well as on existing partnerships with leading European research groups, the project aims to develop new methods and algorithms that enable efficient use of current and future equipment and solve important fundamental and applied problems.

1. Development and study of new hybrid and fault-tolerant multilevel methods with applications in bioinformatics, physics and engineering

We aim to develop new hybrid algorithms (including combinations of stochastic and deterministic approaches) that have a multilevel nature. The stochastic part is the multilevel Monte Carlo (MLMC) method, a highly efficient variance reduction technique that exploits hierarchical sampling. The deterministic counterpart integrates advanced multilevel deterministic solvers. These algorithms will be designed to fully exploit the multiple levels of concurrency, hierarchical memory structures and heterogeneous processing units available in extreme-scale computational platforms. The new stochastic and hybrid algorithms developed to tackle physical phenomena are intended to be highly fault-tolerant and resilient. They will be applied to problems in biomechanical engineering, transport modelling and financial mathematics.
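To illustrate the telescoping structure that gives MLMC its variance reduction, the Python sketch below estimates a discounted option payoff under geometric Brownian motion, coupling each fine-level path with a coarse-level path driven by the same random increments. The model, function names and parameter choices are illustrative assumptions of ours, not project deliverables.

    import numpy as np

    def mlmc_level_estimator(l, n_samples, T=1.0, s0=1.0, r=0.05,
                             sigma=0.2, K=1.0, m=2, rng=None):
        """Sample E[P_l - P_{l-1}] for a European call under GBM, using
        Euler-Maruyama with m**l steps and coupled coarse/fine paths."""
        rng = rng or np.random.default_rng()
        nf = m**l                       # fine-level time steps
        hf = T / nf
        dw = rng.normal(0.0, np.sqrt(hf), size=(n_samples, nf))
        sf = np.full(n_samples, s0)     # fine path
        for i in range(nf):
            sf += r*sf*hf + sigma*sf*dw[:, i]
        pf = np.exp(-r*T) * np.maximum(sf - K, 0.0)
        if l == 0:
            return pf
        # Coarse path driven by summed fine increments: the MLMC coupling
        # that makes Var[P_l - P_{l-1}] decay with the level.
        nc = m**(l - 1)
        hc = T / nc
        dwc = dw.reshape(n_samples, nc, m).sum(axis=2)
        sc = np.full(n_samples, s0)
        for i in range(nc):
            sc += r*sc*hc + sigma*sc*dwc[:, i]
        pc = np.exp(-r*T) * np.maximum(sc - K, 0.0)
        return pf - pc

    def mlmc_estimate(L=4, n0=100_000):
        """Telescoping sum: E[P_L] = E[P_0] + sum_l E[P_l - P_{l-1}]."""
        rng = np.random.default_rng(42)
        total = 0.0
        for l in range(L + 1):
            n = max(n0 // 2**l, 1000)   # crude geometric decay of samples
            total += mlmc_level_estimator(l, n, rng=rng).mean()
        return total

    if __name__ == "__main__":
        print("MLMC price estimate:", mlmc_estimate())

Because most samples are drawn on the cheap coarse levels while the expensive fine levels need only a few corrective samples, the cost for a given accuracy is far below that of plain Monte Carlo at the finest level; a production variant would choose the per-level sample counts adaptively from estimated variances rather than by the fixed decay used here.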

2. Algorithms and tools for ensuring high scalability, parallel efficiency and energy efficiency on systems with computational accelerators

The top high-performance systems increasingly depend on computational accelerators based on technologies such as GPGPUs and Intel MIC as the main source of their computational power, creating a challenge for established methods, algorithms and legacy software tailored to homogeneous CPU-based systems. We aim to develop new algorithms and tools specifically optimized for heterogeneous HPC systems with computational accelerators. This will facilitate the development of applications with high scalability and parallel efficiency, while taking into account energy use and overall running costs. The resulting tools will be usable in distributed environments such as HPC Grids and Clouds, providing resilience and fault tolerance.
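As a point of reference for how such algorithms and tools are typically evaluated, the sketch below computes the standard strong-scaling and energy-to-solution metrics from timing and power measurements. The record fields and example numbers are hypothetical, chosen only to make the sketch self-contained.

    from dataclasses import dataclass

    @dataclass
    class RunRecord:
        """One benchmark run (illustrative fields, not a real tool's API)."""
        workers: int          # number of processes, GPUs, ...
        wall_time_s: float    # time to solution
        avg_power_w: float    # mean power draw during the run

    def strong_scaling_metrics(baseline: RunRecord, run: RunRecord):
        """Speedup S(p) = T(1)/T(p), parallel efficiency E(p) = S(p)/p,
        and energy to solution in joules."""
        speedup = baseline.wall_time_s / run.wall_time_s
        efficiency = speedup / (run.workers / baseline.workers)
        energy_j = run.wall_time_s * run.avg_power_w
        return speedup, efficiency, energy_j

    if __name__ == "__main__":
        base = RunRecord(workers=1, wall_time_s=1200.0, avg_power_w=250.0)
        gpu8 = RunRecord(workers=8, wall_time_s=190.0, avg_power_w=1600.0)
        s, e, j = strong_scaling_metrics(base, gpu8)
        print(f"speedup={s:.2f}  efficiency={e:.2%}  energy={j/1e3:.1f} kJ")

Tracking energy to solution alongside parallel efficiency matters on accelerated systems, since an accelerator can win on runtime yet lose on running costs if its power draw grows faster than its speedup.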

3. Methods and services for efficient processing of large amounts of data on high-performance computing clusters

The goal of this task is to enable researchers to develop novel methods, tools and applications that make use of "Big Data". 3D digitization methods such as tomography and laser scanning are truly developing into a "big data science", where terabytes of data are routinely recorded. It is increasingly recognized that analyzing the shape, inner structure and dynamics of both living and manufactured complex systems must account for the intricate relations between processes occurring at different scales in space and time. In this regard, in-situ 3D imaging of dynamic objects is becoming more and more influential.

New methods and scalable algorithms for image reconstruction, denoising and segmentation will be developed and implemented. Special emphasis in this task will also be put on methods and software tools for processing dense point clouds and high-resolution meshes and textures. Typical application areas are materials science, biomedicine and industry. Particular focus will also be placed on facilitating the application of digital methods and techniques in the sciences and the arts. Cultural heritage has been chosen as the case study for the new developments under this task. In this context, active interaction with the laboratory for 3D digitization is foreseen. We expect that advances in 4D digitization, visualization, data analytics and e-services will greatly accelerate progress in this field, which is of enormous importance for the study and preservation of the country's rich cultural heritage. The work under this task will provide researchers in this field with a dedicated platform that integrates domain-specific research software and services.
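As a concrete baseline for the denoising component, the following sketch implements classical total-variation denoising in the Rudin-Osher-Fatemi style by gradient descent on a synthetic image. It is our own minimal illustration of the technique family named above, not the project's algorithm; parameter values are indicative only.

    import numpy as np

    def tv_denoise(img, lam=0.1, n_iter=200, tau=0.2, eps=1e-8):
        """Gradient descent on E(u) = 0.5||u - f||^2 + lam * TV(u),
        with the TV term smoothed by eps to keep it differentiable."""
        f = img.astype(np.float64)
        u = f.copy()
        for _ in range(n_iter):
            # forward differences with replicated boundary
            ux = np.diff(u, axis=1, append=u[:, -1:])
            uy = np.diff(u, axis=0, append=u[-1:, :])
            mag = np.sqrt(ux**2 + uy**2 + eps)
            px, py = ux / mag, uy / mag
            # divergence via backward differences (adjoint of forward diff)
            div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
            u -= tau * ((u - f) - lam * div)
        return u

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        clean = np.zeros((64, 64)); clean[16:48, 16:48] = 1.0  # square phantom
        noisy = clean + 0.2 * rng.standard_normal(clean.shape)
        restored = tv_denoise(noisy)
        print("noisy RMSE   :", np.sqrt(np.mean((noisy - clean)**2)))
        print("denoised RMSE:", np.sqrt(np.mean((restored - clean)**2)))

The scalable variants targeted by this task would replace the dense per-pixel loops with distributed, accelerator-friendly operators, but the energy being minimized and the edge-preserving behaviour of the TV prior remain as in this baseline.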