Chair for Computer Science 12 – Hardware-Software-Co-Design


Our research centers on the systematic design (CAD) of hardware/software systems, ranging from embedded systems to HPC platforms. One principal research direction is domain-specific computing, which tackles the complex programming and design challenges of parallel heterogeneous computer architectures. Domain-specific computing strictly separates the concerns of algorithm development and target architecture implementation, including parallelization and low-level implementation details.

The key idea is to exploit, in a targeted manner, the knowledge inherent in a particular problem area or field of application, i.e., a particular domain, and thus to master the complexity of heterogeneous systems. Such domain knowledge can be captured by suitable abstractions, augmentations, and notations, e.g., libraries, domain-specific programming languages (DSLs), or combinations of both (e.g., embedded DSLs implemented via template metaprogramming). On this basis, patterns can be used to transform and optimize the input description in a goal-oriented way during compilation and, finally, to generate code for a specific target architecture. Thus, DSLs provide high productivity and typically also high performance.
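To make the idea of an embedded DSL implemented via template metaprogramming more concrete, the following is a minimal sketch using C++ expression templates. The namespace `mini_dsl` and the types `Expr`, `Add`, and `Vec` are hypothetical names chosen for this illustration and do not belong to any of the frameworks described on this page. The point is only that the compiler sees the whole expression `a + b + c` as a type and can evaluate it in a single fused loop without temporary vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace mini_dsl {  // hypothetical name, for illustration only

// CRTP base: marks a type as a DSL expression.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy element-wise sum. Nodes hold references, so an expression must be
// consumed within the statement that builds it.
template <typename L, typename R>
struct Add : Expr<Add<L, R>> {
    const L& lhs;
    const R& rhs;
    Add(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

template <typename L, typename R>
Add<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Add<L, R>(l.self(), r.self());
}

class Vec : public Expr<Vec> {
public:
    explicit Vec(std::size_t n, double v = 0.0) : data_(n, v) {}
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

    // Assigning an expression evaluates the whole tree in one loop.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        assert(expr.size() == size());
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = expr[i];
        return *this;
    }

private:
    std::vector<double> data_;
};

}  // namespace mini_dsl
```

With these definitions, a statement such as `x = a + b + c;` on four `Vec` objects of equal length compiles to one loop over the elements. A real DSL compiler goes further and applies domain-specific transformations before generating target code.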

We develop DSLs and target platform languages to capture both domain and architecture knowledge, which is used during the different phases of compilation, parallelization, mapping, and code generation for a wide variety of architectures, e.g., multi-core processors, GPUs, MPSoCs, and FPGAs. All these steps usually involve optimizing and exploring the vast space of design options and trading off multiple objectives, such as performance, cost, energy, or reliability.

Our considered application domains include multigrid methods based on stencil computations, iterative algorithms on unstructured grids, image processing and computer vision tasks (e.g., for medical and automotive applications), and high-performance processing of Big Data.


Research topics

ExaStencils — Advanced Stencil-Code Engineering

Project ExaStencils investigates and provides a unique, tool-assisted, domain-specific codesign approach for the important class of stencil codes, which play a central role in high-performance simulation on structured or block-structured grids. Stencils are regular access patterns on (usually multidimensional) data grids. Multigrid methods involve a hierarchy of grids, from very fine to successively coarser ones. The challenge at exascale is that the coarser grids require less processing power, so communication dominates. From the computational algorithm perspective, domain-specific investigations include the extraction and development of suitable stencils, the analysis of performance-relevant algorithmic tradeoffs (e.g., the number of grid levels), and the analysis and reduction of synchronization requirements guided by a template model of the targeted cluster architecture. Based on this analysis, sophisticated programming and software tool support is developed by capturing the relevant data structures and program segments for stencil computations in a domain-specific language and applying a generator-based product-line technology to automatically generate and optimize stencil codes tailored to each application–platform pair. A central distinguishing mark of ExaStencils is that domain knowledge is pursued in a coordinated manner across all abstraction levels, from the formulation of the application scenario down to the generation of highly optimized stencil code.
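As a hand-written illustration of what a stencil and a grid hierarchy look like, the sketch below shows one Jacobi sweep with the classic 5-point stencil for the 2D Poisson problem and a helper that lists the grid sizes of a multigrid hierarchy. This is not code produced by ExaStencils. The names `Grid`, `jacobi_sweep`, and `level_sizes` are hypothetical, and the coarsening rule n -> (n - 1) / 2 is an assumption (standard vertex-centered coarsening).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: square grid with n interior points per dimension
// plus a one-cell boundary ring, stored row-major.
struct Grid {
    std::size_t n;
    std::vector<double> v;
    explicit Grid(std::size_t n_, double init = 0.0)
        : n(n_), v((n_ + 2) * (n_ + 2), init) {}
    double& at(std::size_t i, std::size_t j) { return v[i * (n + 2) + j]; }
    double at(std::size_t i, std::size_t j) const { return v[i * (n + 2) + j]; }
};

// One Jacobi sweep for -Laplace(u) = f with mesh width h. Each interior
// point reads its four direct neighbors: the 5-point stencil.
void jacobi_sweep(const Grid& u, const Grid& f, Grid& u_new, double h) {
    assert(u.n == f.n && u.n == u_new.n);
    for (std::size_t i = 1; i <= u.n; ++i) {
        for (std::size_t j = 1; j <= u.n; ++j) {
            u_new.at(i, j) = 0.25 * (u.at(i - 1, j) + u.at(i + 1, j) +
                                     u.at(i, j - 1) + u.at(i, j + 1) +
                                     h * h * f.at(i, j));
        }
    }
}

// Interior sizes of a multigrid hierarchy, finest first, assuming
// vertex-centered coarsening n -> (n - 1) / 2.
std::vector<std::size_t> level_sizes(std::size_t n_finest) {
    std::vector<std::size_t> sizes;
    for (std::size_t n = n_finest; n >= 1; n = (n - 1) / 2) {
        sizes.push_back(n);
        if (n == 1) break;
    }
    return sizes;
}
```

For a finest grid with 7 interior points per dimension, `level_sizes(7)` yields the levels 7, 3, and 1. The rapidly shrinking amount of work per level is the reason communication becomes dominant on coarse grids.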

Further information: ExaStencils Project Website
Software: ExaStencils Code Generation Framework

HighPerMeshes — Domain-Specific Programming and Target-Platform-Aware Compiler Infrastructure for Algorithms on Unstructured Grids

The goal of HighPerMeshes is to develop a practical domain-specific framework for the efficient, parallel, and scalable implementation of iterative algorithms on unstructured grids. Time-domain simulation software that falls into this category (e.g., TD-FEM, TD-DG, network simulations) has increasingly been used in scientific and industrial domains in recent years and complements comparable methods on structured grids. With the results of this project, developers can, with moderate effort, extend existing source codes in high-level languages with domain-specific library and language elements. The compiler infrastructure uses domain knowledge to enable performance-optimized, highly parallel execution on all relevant modern hardware architectures (multicore, manycore, GPU, FPGA), including heterogeneous systems. Thus, the project offers many HPC developers in science and industry a convenient and sustainable path towards scalable use of the most efficient current and future target architectures.
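The following sketch illustrates what distinguishes an iterative algorithm on an unstructured grid from a stencil code: neighbors are found through explicit connectivity (here an edge list) rather than index arithmetic. It does not use the HighPerMeshes API. The type `Mesh` and the function `smooth_step` are hypothetical, and neighbor averaging is chosen only as a simple stand-in for a real solver iteration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal mesh: vertices connected by an explicit edge list.
struct Mesh {
    std::size_t num_vertices;
    std::vector<std::array<std::size_t, 2>> edges;
};

// One iteration step: replace each vertex value by the mean of its
// neighbors. Isolated vertices keep their value.
std::vector<double> smooth_step(const Mesh& m, const std::vector<double>& u) {
    assert(u.size() == m.num_vertices);
    std::vector<double> sum(m.num_vertices, 0.0);
    std::vector<std::size_t> degree(m.num_vertices, 0);

    // Indirect addressing: each edge contributes to both of its endpoints.
    for (const auto& e : m.edges) {
        sum[e[0]] += u[e[1]];
        sum[e[1]] += u[e[0]];
        ++degree[e[0]];
        ++degree[e[1]];
    }

    std::vector<double> result(m.num_vertices);
    for (std::size_t v = 0; v < m.num_vertices; ++v) {
        result[v] = degree[v] > 0 ? sum[v] / static_cast<double>(degree[v])
                                  : u[v];
    }
    return result;
}
```

The scattered writes in the edge loop are what makes parallelization on such grids harder than for stencils. A domain-specific framework can use knowledge of the mesh connectivity to partition and schedule this loop safely.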

Further information: HighPerMeshes Project Page

Hipacc — The Heterogeneous Image Processing Acceleration Framework

Hipacc is a DSL embedded in C++ and a compiler framework for the domain of image processing. It captures domain knowledge in a compact and intuitive language and employs source-to-source translation combined with various optimizations to achieve excellent productivity paired with performance portability. The Hipacc approach has been applied and evaluated for a wide variety of parallel accelerator architectures, including manycore processors, such as NVIDIA and AMD GPUs and Intel Xeon Phi, embedded GPUs, Xilinx and Intel FPGAs, as well as vector units.
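For readers unfamiliar with the domain, the sketch below shows in plain C++ the kind of computation an image processing DSL typically describes: a local operator (here a 3x3 box filter) together with a boundary handling rule (here clamping to the nearest valid pixel). It deliberately does not use the Hipacc API. `Image`, `clamped`, and `box_filter_3x3` are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal grayscale image, stored row-major.
struct Image {
    int width;
    int height;
    std::vector<float> pixels;
    Image(int w, int h, float init = 0.0f)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, init) {}
    float& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
    float at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Boundary handling: out-of-range coordinates are clamped to the border.
float clamped(const Image& img, int x, int y) {
    x = std::min(std::max(x, 0), img.width - 1);
    y = std::min(std::max(y, 0), img.height - 1);
    return img.at(x, y);
}

// Local operator: each output pixel is the mean of its 3x3 neighborhood.
Image box_filter_3x3(const Image& in) {
    assert(in.width > 0 && in.height > 0);
    Image out(in.width, in.height);
    for (int y = 0; y < in.height; ++y) {
        for (int x = 0; x < in.width; ++x) {
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    sum += clamped(in, x + dx, y + dy);
            out.at(x, y) = sum / 9.0f;
        }
    }
    return out;
}
```

In a hand-written version like this, the loop structure, memory layout, and boundary checks are fixed in the source. A DSL keeps only the per-pixel computation and the boundary rule in the description and leaves the rest to the compiler for each target architecture.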

Software: Hipacc DSL and Compilation Framework

ReProVide — Query Optimisation and Near-Data Processing on Reconfigurable SoCs for Big Data Analysis

The goal of this project is to provide novel hardware and optimization techniques for scalable, high-performance processing of Big Data. We particularly target huge data sets with flexible schemata (row-oriented, column-oriented, document-oriented, irregular, or non-indexed) and data streams, such as those found in click-stream enterprise analytics, software logs, and discussion-forum archives, or produced by sensors in IoT and Industrie 4.0. In this realm, the project investigates the potential of hardware-reconfigurable, FPGA-based SoCs for near-data processing, where computations are pushed towards such heterogeneous data sources. Based on FPGA technology, and in particular on dynamic reconfiguration, we propose a generic architecture called ReProVide for low-cost processing of database queries.
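The principle of near-data processing can be sketched in software as follows: instead of shipping every row to the host and filtering there, the filter is evaluated next to the data so that only matching rows are transferred. This is a purely conceptual model and says nothing about the actual ReProVide architecture or interface. `Row`, `DataSource`, `scan_all`, and `scan_filtered` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical record type for the illustration.
struct Row {
    int sensor_id;
    double value;
};

// Hypothetical stand-in for a storage node that can evaluate filters locally.
class DataSource {
public:
    explicit DataSource(std::vector<Row> rows) : rows_(std::move(rows)) {}

    // Conventional scan: every row is transferred to the host.
    std::vector<Row> scan_all() const { return rows_; }

    // Pushed-down scan: the predicate runs at the data source, so only
    // matching rows are transferred.
    std::vector<Row> scan_filtered(
        const std::function<bool(const Row&)>& pred) const {
        assert(static_cast<bool>(pred) && "predicate must be callable");
        std::vector<Row> result;
        for (const Row& r : rows_) {
            if (pred(r)) result.push_back(r);
        }
        return result;
    }

private:
    std::vector<Row> rows_;
};
```

Comparing the sizes of the results of `scan_all()` and `scan_filtered(...)` for a selective predicate gives a rough feeling for the transfer volume that pushing the filter to the data source can save.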

Further information: ReProVide Project Website


Selected publications

  • Trends in Data Locality Abstractions for HPC Systems
    In: IEEE Transactions on Parallel and Distributed Systems
    ISSN: 1045-9219
    DOI: 10.1109/TPDS.2017.2703149
  • SYCL Code Generation for Multigrid Methods
    In: 22nd International Workshop on Software and Compilers for Embedded Systems (SCOPES '19) (Sankt Goar, Germany, 27 May 2019 - 29 May 2019)
    DOI: 10.1145/3323439.3323984
