The goal of library and function identification is to find the original library and function for a given machine-code snippet. Such snippets commonly arise from penetration tests attacking a remote executable, from static malware analysis, or from IP infringement investigations. While several tools are designed for this task, all of them seem to rely on some variant of signature-based identification. In this paper, we argue that this approach is insufficient in many cases and propose the design and implementation of a multitool called KISS. KISS uses lossless compression and highly optimized pattern-matching algorithms to create a very compact but substantial database of library versions. In practice, KISS achieves compression rates below 30 percent of the original database size while still allowing for extremely fast (sublinear) snippet identification.
We use statistical tests to show that its code snippet recognition is highly successful while its false-positive rate stays close to the theoretical lower bound. Finally, we demonstrate how our approach improves on the security of existing techniques: because our design relies entirely on complete function body verification, it prevents analysis-resilient malware from disguising itself as external, trusted library code. Such disguises have recently become a problem for malware analysts using existing identification solutions.
Maximilian von Tschirschnitz works as a prototype engineer and researcher for the Intel Corporation in Germany.
In parallel, he is studying Informatics at TU Munich.
His current research topics cover IT security and high-precision positioning methods.
His further professional interests include theoretical informatics, image feature recognition and computer graphics.