
ECE 117: Computer Security. "Attacks and Defenses on PDF Malware Detection". Adversarial examples for vision-based malware detection, and robust classification via fine-tuning on limited data.


rathull/robust-pdf-malware


Robust PDF Malware

Project Goals

PDFs are one of the most popular document formats, so it is crucial that their security keeps pace with their usage. This project takes two approaches to analyzing the security of PDF documents: first, creating adversarial examples of malicious PDFs (through noise injection) to spoof standard PDF malware detectors such as VirusTotal; second, fine-tuning ResNet18 to identify malicious PDFs based on their Markov plots.

Malicious PDFs that can bypass VirusTotal scanning (32/60 tools did not correctly classify the PDF as malicious)

To create a malicious PDF, use ./attacks/find_perturbable.ipynb. The notebook provides methods for finding the regions of a PDF where noise can be injected, as well as methods for injecting JavaScript attacks. The example in the notebook opens the Calculator app when the user opens the PDF.

ResNet18 Fine-Tuning

Use the code in ./data/generate_dataset.py to create a .pkl file containing the Markov plot data; this is demonstrated in cells 1-53 of ./finetune.ipynb. Once the datasets have been generated, use the code in ./resnet18_classifier.ipynb to fine-tune ResNet18 to classify PDFs as malicious or safe based on their Markov plots.
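
For readers unfamiliar with the representation, the sketch below shows one common way to build a Markov plot from raw file bytes and save it to a .pkl file. It assumes a Markov plot is the 256x256 matrix of byte-to-byte transition probabilities, as is typical in vision-based malware detection; the actual logic in ./data/generate_dataset.py may differ. The names `markov_plot` and `save_dataset` are hypothetical and exist only for this illustration.

```python
import pickle


def markov_plot(data):
    """Return a 256x256 row-normalized byte-transition matrix (illustrative).

    Entry [a][b] is the fraction of occurrences of byte value `a`
    (as the first byte of an adjacent pair) that are followed by `b`.
    """
    counts = [[0] * 256 for _ in range(256)]
    # Count every adjacent byte pair in the file.
    for a, b in zip(data, data[1:]):
        counts[a][b] += 1
    plot = []
    for row in counts:
        total = sum(row)
        # Rows for byte values that never start a pair stay all zeros.
        plot.append([c / total if total else 0.0 for c in row])
    return plot


def save_dataset(samples, path):
    """Pickle a list of (markov_plot, label) pairs.

    `samples` is an iterable of (pdf_bytes, label) tuples; this layout
    is an assumption, not the repository's actual .pkl format.
    """
    records = [(markov_plot(raw), label) for raw, label in samples]
    with open(path, "wb") as f:
        pickle.dump(records, f)
```

Each matrix can then be treated as a single-channel image and fed to an image classifier such as ResNet18, typically after replicating it across three channels to match the network's expected input.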
