Chair of Computer Science 2 – Programming Systems


We develop scientific solutions for software engineers in industry who work on parallel software for multicores and for distributed or embedded systems made thereof. We take a code-centric approach, construct operational prototypes, and evaluate them both quantitatively and qualitatively.

Corner stones of our field of research:

  • We work on programming models for heterogeneous parallelism, from which we then generate portable and efficient code for multicores, GPUs, accelerators, mobile devices, FPGAs, etc.
  • We help parallelize software for multicores. Our tools analyze code repositories and help developers in migrating and refactoring projects.
  • We analyze code. Our code analysis tools are fast, interactive, incremental and sometimes work in parallel themselves. They not only detect race conditions, conflicting accesses to resources, etc. The resulting suggestions on how to improve the code also show up in the IDE where they matter.
  • We test parallel code and diagnose the root causes of problems. Our tools generate test data, track down erratic runtime behavior, and prevent authenticity attacks.

Some of our research projects strive to ramp up the efficiency of HPC software in general and focuses on GPU and FPGA boards that are components of HPC nodes, see the first two research topics listed below. Other topics further down the list use clusters to do all the symbolic computation that is needed for our programming systems research, e.g., to optimize FPGA designs, to detect bugs in compilers, to find race conditions, to harvest information from code archives, to find semantically equivalent codes, etc.


Prof. Dr. Michael Philippsen

Michael Baer

Thorsten Blass

Marius Kamp

Patrick Kreutzer

Florian Mayer

Research topics

ORKA-HPC – OpenMP for reconfigurable heterogeneous architectures

Detailed project description is found on the ORKA-HPC page.

ParCAn – Parallel code analysis on a GPU

In compiler construction there are analyses that propagate information along the edges of a graph and modify it, until a fix point is reached and the information no longer changes. In this project we build the ParCAn framework to accelerate such analyses by exploiting the massive parallelism of graphic cards.

In 2018 we completed our comparative study on the efficiency of graph data structures on GPUs. We analyzed the combination of 11 graph data structures, 10 static graph algorithms, 19 input graphs, and six graph exchange formats. The measurements were performed on four NVIDA GPUs with different architectures. We analyzed 752k data points for the study. We derived a decision tree that allows developer to choose the best possible graph data structure for their  algorithm. We showed that our decision tree can speed up (static) graph algorithms by up to 45%. Results from this study have influenced the ParCAn framework.

Not all instructions are relevant to an analysis. Irrelevant ones do not contribute to the analysis but only increase the runtime as analysis knowledge (stored in predicates) has to be propagated through those instructions, without any modifications. We implemented a parallel algorithm on the GPU that marks and sweeps such irrelevant instructions. Since removing redundant instructions reduces the length of the propagation paths, fewer iterations are needed before a fixpoint is reached.

We also implemented an algorithm that ensures that all threads of a warp have a homogeneous amount of work. Without this algorithm, predicates can accumulate along control flow paths so that some threads have a higher workload than others. In this case threads with fewer work idle while others still process predicates since the execution time is dominated by the longest runtime of a thread. To avoid such imbalances, threads that finish the computation on their predicates now support other threads that still have work to do. We use characteristics of the GPU to avoid the use of slow synchronization mechanisms.

To show the effectiveness of our framework we integrated it into the LLVM compiler framework. We picked four LLVM analyses and parallelized them with ParCAn. Ample measurements show that our framework can accelerate LLVM’s compilation process by up to 40%. A publication was accepted at the “28th International Conference on Compiler Construction” and received the Best-Paper-Award.


AuDeRace – Automatic Detection of Race-Conditions

Large software projects with hundreds of developers are difficult to review and often contain many bugs. Automatic tests are a well established technique to test sequential and deterministic software. They test the whole project (system test) or each module by itself (unit test). However, recent software contains more and more parallelism. This introduces several new bug patterns, like deadlocks and concurrent memory accesses, that are harder or even impossible to be detected reliably using conventional test methods. Whether the faulty behavior arises depends on the concrete scheduling of the threads which is in-deterministic and varies between individual executions depending on the underlying system. Due to this unpredictable behavior such bugs do not necessarily manifest in an arbitrary test run or may never arise in the testing environment at all. As a result, conventional tests are not well suited for modern, concurrent software.
With the project AuDeRace, we develop methods to efficiently and reliably detect concurrent bugs while keeping the additional effort for developers as low as possible. In an initial approach we define a testing framework that allows the specification of a scheduling plan to regain deterministic execution. However, a major problem still remains: The developer has to identify and implement well suited test cases that cover the potential fault in the program and execute them in a special deterministic way in order to trigger the failure. Especially in the context of concurrency, it is difficult to visualize the behavior of a program and identify the problematic parts. To overcome this, the critical parts shall automatically be narrowed down before even writing dedicated test cases. Existing approaches and tools for this purpose generate too many false positives or the analysis is very time consuming, making their application to real world code prohibitive. The goal of this project is to generate less false positives and increase the analysis speed by combining existing static and dynamic analysis. This allows for the efficient use in not only small example codes but also large and complex software projects.

In 2016 existing approaches were studied regarding their usability as a starting basis for our project. The most promising method uses model checking and predefined assertions to construct thread schedules that trigger the faulty behavior. However, the approach is currently infeasible for larger projects because only very small codes could be analyzed in reasonable time. Therefore, we focused on automatically detecting and removing statements that are unrelated to the parallelism respectively to the potentially faulty code parts in order to decrease the execution time of the preliminary static analysis.

In 2017 the work on automatically reducing programs to speed up further analysis was continued. Furthermore, we evaluated whether the concept of mutation testing can be applied to parallel software as well. The results indicate that this extension is indeed possible to rate tests qualitatively. However, to complete the analysis for larger programs in reasonable time, a few heuristics need to be applied during the process.

In 2018 the focus moved to a deterministic execution of test cases. A concept to reproduce results during the execution was developed: In addition to the test case, a schedule specifies the dynamic behavior of the threads. Instrumenting the code at previously marked positions and other relevant byte code instructions allows a separate control thread to enforce the schedule. When modifying the source code, the marked positions in the code need to be updated as well to keep them consistent with the test cases. A merging technique similar to the ones used in version control systems shall be used to automatically update the positions.

This project is a contribution of the Chair of Computer Science 2 (Programming Systems) to the IZ ESI (Embedded Systems Initiative, ).


AutoCompTest – Automatic Testing of Compilers

Compilers for programming languages are very complex applications and their correctness is crucial: If a compiler is erroneous (i.e., if its behavior deviates from that defined by the language specification), it may generate wrong code or crash with an error message. Often, such errors are hard to detect or circumvent. Thus, users typically demand a bug-free compiler implementation.

Unfortunately, research studies and online bug databases suggest that probably no real compiler is bug-free. Several research works therefore aim to improve the quality of compilers. Since the formal verification (i.e., a proof of a compiler’s correctness) is often prohibited in practice, most of the recent works focus on techniques for extensively testing compilers in an automated way. For this purpose, the compiler under test is usually fed with a test program and its behavior (or that of the generated program) is checked: If the actual behavior does not match the expectation (e.g., if the compiler crashes when fed with a valid test program), a compiler bug has been found. If this testing process is to be carried out in a fully automated way, two main challenges arise:

  • Where do the test programs come from that are fed into the compiler?
  • What is the expected behavior of the compiler or its output program? How can one determine if the compiler worked correctly?

While the scientific literature proposes several approaches for dealing with the second challenge (which are also already established in practice), the automatic generation of random test programs still remains a challenge. If all parts of a compiler should be tested, the test programs have to conform to all rules of the respective programming language, i.e., they have to be syntactically and semantically correct (and thus compilable). Due to the large number of rules of “real” programming languages, the generation of such compilable programs is a non-trivial task. This is further complicated by the fact that the program generation has to be as efficient as possible: Research suggests that the efficiency of such an approach significantly impacts its effectivity — in a practical scenario, a tool can only be used for detecting compiler bugs if it can generate many (and large) programs in short time.

The lack of an appropriate test program generator and the high costs associated with the development of such a tool often prevent the automatic testing of compilers in practice. Our research project therefore aims to reduce the effort for users to implement efficient program generators.

In 2018, we started the development of such a tool. As input, it requires a specification of a programming language’s syntactic and semantic rules by means of an abstract attribute grammar. Such a grammar allows for a short notation of the rules on a high level of abstraction. Our newly devised algorithm then generates test programs that conform to all of the specified rules. It uses several novel technical ideas to reduce its expected runtime. This way, it can generate large sets of test programs in acceptable time, even when executed on a standard desktop computer. A first evaluation of our approach did not only show that it is efficient and effective, but also that it is versatile. Our approach detected several bugs in the C compilers gcc and clang (and achieved a bug detection rate which is comparable to that of a state-of-the-art C program generator from the literature) as well as multiple bugs in different SMT solvers. Some of the bugs that we detected were previously unknown to the respective developers.


AnaCoRe – Analysis of Code Repositories

Software developers often modify their projects in a similar or repetitive way. The reasons for these changes include the adoption of a changed interface to a library, the correction of mistakes in functionally similar components, or the parallelization of sequential parts of a program. If developers have to perform the necessary changes on their own, the modifications can easily introduce errors, for example due to a missed change location. Therefore, an automatic technique is desirable that identifies similar changes and uses this knowledge to support developers with further modifications.

Refer to the AnaCoRe CRIS project page for a detailed description.


Selected publications

  • Blaß, T.; Philippsen, M.: GPU-Accelerated Fixpoint Algorithms for Faster Compiler Analyses (Best Paper Award). In: Proceedings of the 28th International ACM Conference on Compiler Construction, CC’19, Washington, D.C. 2019, S. 122-134. – ISBN 978-1-4503-6277-1
  • Mayer, F.; Knaust, M.; Philippsen, M.: OpenMP on FPGAs – A Survey. In: Fan, X.; de Supinski, B.; Sinnen, O.; Giacaman, N. (Hrsg.): OpenMP: Conquering the Full Hardware Spectrum – Proceedings of the 15th International Workshop on OpenMP (IWOMP 2019), Auckland, New Zealand, Springer LNCS Bd. 11718, 2019, S. 94-108. – ISBN 978-3-030-28595-1
  • Kreutzer, P.; Kraus, S.; Philippsen, M.: Language-Agnostic Generation of Compilable Test Programs. Proceedings of the International Conference on Software Testing, Verification and Validation (ICST 2020), Porto, Portugal, 2020
  • Baer, M.; Oster, N.; Philippsen, M.: MutantDistiller: Using Symbolic Execution for Automatic Detection of Equivalent Mutants and Generation of Mutant Killing Tests. Proceedings International Workshop on Mutation Analysis (Mutation 2020) at International Conference on Software Testing, Verification and Validation (ICST 2020), Porto, Portugal, 2020
  • Kamp, M.; Kreutzer, P.; Philippsen, M.: SeSaMe: A Data Set of Semantically Similar Java Methods. In: Storey, M.; Adams, B.; Haiduc, S. (Hrsg.): Proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019), Montreal, QC, 2019, S. 529-533. – ISBN 978-1-7281-3412-3 – ISSN 2574-3864
  • Dotzler, G.; Kamp, M.; Kreutzer, P.; Philippsen, M.: More Accurate Recommendations for Method-Level Changes. In: Proceedings of 2017 11th ACM Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE2017), Paderborn, Germany, 2017, S. 798-808. – ISBN 978-1-4503-5105-8

Further information