Motivation (for current research):

The trend of decreasing feature sizes and increasing demand for high performance has had (and will continue to have) the combined effect of reducing the reliability of supercomputers. It is therefore almost a requirement that software systems and applications behave reliably even under unreliable conditions. Tools and techniques that enable the development of such "resilient" systems are thus the need of the day. The criticality of this problem and its underlying challenge (i.e., guaranteeing correct computation without recomputing) motivate me.

Tools:

  • KULFI: An LLVM-based instruction-level fault injector
    • This tool was co-developed with Vishal Chandra Sharma, PhD student, University of Utah