
L3LeTrigger-F/VulDeeLocator

 
 


VulDeeLocator: A Deep Learning-Based Fine-Grained Vulnerability Detector

We propose Vulnerability Deep learning-based Locator (VulDeeLocator), a deep learning-based fine-grained vulnerability detector for C/C++ programs whose source code is available. VulDeeLocator advances the state of the art by simultaneously achieving high detection capability and high locating precision. Its two core innovations are (i) the use of intermediate code to capture semantic information that source code-based representations cannot convey, and (ii) granularity refinement, which pins down the locations of vulnerabilities precisely.

We first extract pieces of source code according to syntax information. Each piece is a source code- and Syntax-based Vulnerability Candidate (sSyVC for short), and there are four kinds of sSyVCs: library/API function call (FC), array definition (AD), pointer definition (PD), and arithmetic expression (AE). We then extend these pieces of code to accommodate semantic information from the intermediate code, producing intermediate code- and Semantics-based Vulnerability Candidates (iSeVCs for short).
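To make the four sSyVC kinds concrete, here is a minimal Python sketch that tags individual C source lines with the kinds they appear to contain. It is an illustration only: the regular expressions, the `LIBRARY_FUNCTIONS` list, and the names `SSyVCKind` and `find_ssyvc_kinds` are assumptions made for this example, not part of VulDeeLocator. A real extractor would rely on syntax information from a parser rather than on line-level pattern matching.

```python
import re
from enum import Enum


class SSyVCKind(Enum):
    """The four sSyVC kinds named in this README."""
    FC = "library/API function call"
    AD = "array definition"
    PD = "pointer definition"
    AE = "arithmetic expression"


# Hypothetical, deliberately short list of library/API functions of interest.
LIBRARY_FUNCTIONS = ("strcpy", "memcpy", "strcat", "sprintf", "gets")

# Simplified set of C base types; real declarations are far more varied.
_TYPE = r"(?:unsigned\s+)?(?:char|short|int|long|float|double)"

_PATTERNS = [
    # A call to one of the listed functions, e.g. strcpy(
    (SSyVCKind.FC, re.compile(r"\b(?:" + "|".join(LIBRARY_FUNCTIONS) + r")\s*\(")),
    # A base type, a name, then square brackets, e.g. char buf[10]
    (SSyVCKind.AD, re.compile(r"\b" + _TYPE + r"\s+\w+\s*\[[^\]]*\]")),
    # A base type, one or more asterisks, then a name, e.g. int *p
    (SSyVCKind.PD, re.compile(r"\b" + _TYPE + r"\s*\*+\s*\w+")),
    # An assignment whose right-hand side starts with a binary operation
    (SSyVCKind.AE, re.compile(r"=\s*[\w\]\[]+\s*[-+*/%]\s*[\w\]\[]+")),
]


def find_ssyvc_kinds(line):
    """Return the sSyVC kinds whose toy pattern matches the given C line."""
    return [kind for kind, pattern in _PATTERNS if pattern.search(line)]


if __name__ == "__main__":
    for source_line in ("strcpy(buf, src);", "char buf[10];",
                        "int *p = NULL;", "len = a + b;"):
        kinds = [kind.name for kind in find_ssyvc_kinds(source_line)]
        print(source_line, "->", kinds)
```

Each matched line would only be the starting point of a candidate. The extension to an iSeVC, which adds semantic information from the intermediate code, is not modeled in this sketch.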

We prepare a dataset of Low Level Virtual Machine (LLVM) intermediate code with accompanying program source code from two data sources: the National Vulnerability Database (NVD) and the Software Assurance Reference Dataset (SARD). The dataset contains 119,782 vulnerability candidates in intermediate code (i.e., iSeVCs), of which 30,201 are vulnerable and 89,581 are not vulnerable. For vulnerable iSeVCs, the line numbers of the vulnerabilities are available.
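The dataset counts above imply a clear class imbalance. The short Python sketch below checks that the two class counts add up to the stated total and derives the vulnerable fraction and the imbalance ratio. The `balanced_class_weights` helper is a hypothetical name, and inverse-frequency weighting is a common general heuristic shown only for illustration; this README does not say whether VulDeeLocator uses class weights.

```python
# Counts taken from the dataset description above.
TOTAL_ISEVCS = 119_782
VULNERABLE = 30_201
NOT_VULNERABLE = 89_581

# Sanity check: the two classes account for every candidate.
assert VULNERABLE + NOT_VULNERABLE == TOTAL_ISEVCS

vulnerable_fraction = VULNERABLE / TOTAL_ISEVCS   # share of vulnerable iSeVCs
imbalance_ratio = NOT_VULNERABLE / VULNERABLE     # non-vulnerable per vulnerable


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (number_of_classes * class_count).

    Assumption for illustration only, not a documented part of VulDeeLocator.
    """
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


if __name__ == "__main__":
    print(f"vulnerable fraction: {vulnerable_fraction:.1%}")
    print(f"imbalance ratio: {imbalance_ratio:.2f} to 1")
    weights = balanced_class_weights(
        {"vulnerable": VULNERABLE, "not_vulnerable": NOT_VULNERABLE}
    )
    print(weights)
```

Roughly one candidate in four is vulnerable, so accuracy alone can be misleading on this data; per-class metrics are usually more informative for datasets with this kind of split.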
