SC is the International Conference for High Performance Computing, Networking, Storage and Analysis

SCHEDULE: NOV 13-19, 2010

FlowChecker: Detecting Bugs in MPI Libraries via Message Flow Checking

SESSION: Parallel Analysis Tools

EVENT TYPE: Paper, Best Student Paper (BSP) Finalist

TIME: 4:00PM - 4:30PM

SESSION CHAIR: Martin Schulz

AUTHOR(S): Zhezhe Chen, Qi Gao, Wenbin Zhang, Feng Qin

ROOM: 391-392

ABSTRACT:
Many MPI libraries have suffered from software bugs, which severely impact the productivity of a large number of users. This paper presents a new method called FlowChecker for detecting communication-related bugs in MPI libraries. The main idea is to extract program intentions of message passing (MP-intentions), and to check whether these MP-intentions are fulfilled correctly by the underlying MPI libraries, i.e., whether messages are delivered correctly from specified sources to specified destinations. If not, FlowChecker reports the bugs and provides diagnostic information. We have built a FlowChecker prototype on Linux and evaluated it with five real-world bug cases in three widely-used MPI libraries, including Open MPI, MPICH2, and MVAPICH2. Our experimental results show that FlowChecker effectively detects all five evaluated bug cases and provides useful diagnostic information. Additionally, our experiments with HPL and NPB show that FlowChecker incurs low runtime overhead (0.9-9.7% on three MPI libraries).
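The checking idea in the abstract can be illustrated with a toy sketch. This is not the FlowChecker implementation, which the abstract does not describe in detail; it only assumes that each intended message and each observed delivery can be recorded as a (source, destination, tag, payload) tuple. The names `Message` and `check_flows` are hypothetical and chosen for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Message(NamedTuple):
    """Hypothetical record of one message (not a FlowChecker data structure)."""
    src: int        # rank of the sending process
    dst: int        # rank of the receiving process
    tag: int        # message tag
    payload: bytes  # message contents


def check_flows(intended, delivered):
    """Compare intended messages with observed deliveries.

    Returns (missing, unexpected): intended messages that were never
    delivered as specified, and deliveries that match no intention.
    """
    want = Counter(intended)
    got = Counter(delivered)
    missing = list((want - got).elements())
    unexpected = list((got - want).elements())
    return missing, unexpected


# Assumed example data: the second message arrives with a corrupted payload.
intended = [Message(0, 1, 7, b"abc"), Message(1, 0, 7, b"xyz")]
delivered = [Message(0, 1, 7, b"abc"), Message(1, 0, 7, b"xyZ")]

missing, unexpected = check_flows(intended, delivered)
for m in missing:
    print("not delivered as intended:", m)
for u in unexpected:
    print("delivery without matching intention:", u)
```

A multiset comparison is used here so that repeated identical messages are counted rather than collapsed; a real checker would also need to account for ordering and for nonblocking operations, which this sketch ignores.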

Chair/Author Details:

Martin Schulz (Chair) - Lawrence Livermore National Laboratory

Zhezhe Chen - Ohio State University

Qi Gao - Ohio State University

Wenbin Zhang - Ohio State University

Feng Qin - Ohio State University

The full paper can be found in the ACM Digital Library and IEEE Computer Society

Sponsors: IEEE, ACM