Automatic runtime error repair and containment via recovery shepherding
Author(s): Long, Fan; Sidiroglou-Douskos, Stelios; Rinard, Martin
We present a system, RCV, for enabling software applications to survive divide-by-zero and null-dereference errors. RCV operates directly on off-the-shelf, production, stripped x86 binary executables. RCV implements recovery shepherding, which attaches to the application process when an error occurs, repairs the execution, tracks the repair effects as the execution continues, contains the repair effects within the application process, and detaches from the process after all repair effects are flushed from the process state. RCV therefore incurs negligible overhead during the normal execution of the application. We evaluate RCV on all divide-by-zero and null-dereference errors available in the CVE database from January 2011 to March 2013 that 1) provide publicly-available inputs that trigger the error which 2) we were able to use to trigger the reported error in our experimental environment. We collected a total of 18 errors in seven real-world applications: Wireshark, the FreeType library, Claws Mail, LibreOffice, GIMP, the PHP interpreter, and Chromium. For 17 of the 18 errors, RCV enables the application to continue to execute to provide acceptable output and service to its users on the error-triggering inputs. For 13 of the 18 errors, the continued RCV execution eventually flushes all of the repair effects and RCV detaches to restore the application to full clean functionality. We perform a manual analysis of the source code relevant to our benchmark errors, which indicates that for 11 of the 18 errors the RCV and later patched versions produce identical or equivalent results on all inputs.
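The attach, repair, track, contain, and detach cycle described in the abstract can be illustrated with a small toy model. The sketch below is not RCV: it interprets a made-up three-address instruction list instead of x86 binaries, and every name in it (`run`, `DEMO`, the `set`/`add`/`div`/`out` opcodes) is hypothetical. It also assumes, purely for illustration, that a divide-by-zero is repaired by substituting the value 0 and that containment means withholding any output that depends on a repaired value.

```python
def run(program):
    """Interpret a toy program while modelling recovery shepherding.

    Instructions (all hypothetical):
      ("set", dst, value)   dst = value
      ("add", dst, a, b)    dst = a + b
      ("div", dst, a, b)    dst = a // b
      ("out", src)          emit src as externally visible output
    """
    env = {}          # variable name -> value
    tainted = set()   # variables that currently carry repair effects
    outputs = []      # values that actually left the "process"
    log = []          # shepherding events, in order
    attached = False

    for instr in program:
        op = instr[0]
        if op == "set":
            _, dst, value = instr
            env[dst] = value
            tainted.discard(dst)          # overwriting flushes the effect
        elif op in ("add", "div"):
            _, dst, a, b = instr
            if op == "div" and env[b] == 0:
                # Error: attach (if needed) and repair the execution.
                if not attached:
                    attached = True
                    log.append("attach")
                env[dst] = 0              # assumed repair value
                tainted.add(dst)
                log.append(f"repair {dst}")
            else:
                env[dst] = env[a] + env[b] if op == "add" else env[a] // env[b]
                # Track: the result is tainted iff an operand is tainted.
                if a in tainted or b in tainted:
                    tainted.add(dst)
                else:
                    tainted.discard(dst)
        elif op == "out":
            _, src = instr
            if src in tainted:
                log.append(f"block {src}")   # contain: withhold the output
            else:
                outputs.append(env[src])

        # Detach once no repair effects remain in the state.
        if attached and not tainted:
            attached = False
            log.append("detach")

    return outputs, log


DEMO = [
    ("set", "a", 10),
    ("set", "b", 0),
    ("div", "q", "a", "b"),   # divide-by-zero: attach and repair
    ("add", "r", "q", "a"),   # repair effect propagates to r
    ("out", "r"),             # blocked while r is tainted
    ("set", "q", 1),          # q overwritten with clean data
    ("set", "r", 2),          # r overwritten: all effects flushed, detach
    ("out", "r"),
    ("out", "a"),
]

outputs, log = run(DEMO)
print(outputs)
print(log)
```

In this model no tracking work is done on behalf of a program that never hits an error, since the taint set stays empty and the shepherd never attaches; that loosely mirrors the abstract's point that overhead is confined to the period between attach and detach.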
Department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
PLDI '14: Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation
Long, Fan, Stelios Sidiroglou-Douskos, and Martin Rinard. "Automatic runtime error repair and containment via recovery shepherding." 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2014, Edinburgh, United Kingdom, Association for Computing Machinery, 2014.
Author's final manuscript