By Sarah Scheffler and Hien Nguyen
- `detect_xss.py` - given a CFG for a single file, call `block_parser` to detect sources and sinks, then call `path_finder` to detect paths between them.
- `scan_all.sh` - calls `detect_xss.py` on many CFG files
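The source-to-sink check described above can be sketched roughly as follows. This is a minimal illustration, not the code in `detect_xss.py`: it swaps in a plain breadth-first search for the `dijkstra` call, and the names `reachable` and `find_flows`, along with the adjacency-dictionary CFG, are assumptions made for the example.

```python
from collections import deque
from itertools import product


def reachable(graph, start, end):
    """Return True if `end` can be reached from `start` via successor edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def find_flows(graph, sources, sinks):
    """List every (source, sink) pair of blocks joined by a path in the CFG."""
    return [(s, t) for s, t in product(sources, sinks) if reachable(graph, s, t)]


# Hypothetical CFG: block number -> successor block numbers.
cfg = {0: [1, 2], 1: [3], 2: [], 3: []}
flows = find_flows(cfg, sources=[0, 2], sinks=[3])
print(flows)
```

Each reported pair is only a potential vulnerability: a path through the control flow graph does not by itself show that tainted data actually travels from the source to the sink.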
This is a git clone of https://github.com/joyrexus/dijkstra with one added file:
- `path_finder.py` - contains two main functions:
  - `cfg_to_graph(cfg_file)` - parses the CFG file into a list of nodes using functionality defined in `spoon-master/test/block/block_parser.py`, then parses that list into a graph that can be input to `dijkstra`
  - `get_path_if_exists(graph, start, end)` takes in the output of `cfg_to_graph`, a `start` node, and an `end` node. It returns a tuple `(start, end, pred)` where `start` and `end` are the unaltered inputs (this was useful to have as an output later), and `pred` is the output of `dijkstra`. It is a path, albeit one that is difficult to read. `pred` is an empty dictionary `{}` if there is no path.
All other files in the dijkstra folder are unaltered from the source.
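Since `pred` is described as hard to read, a small helper can turn it into an ordered list of nodes. The sketch below is not part of `path_finder.py`. It assumes `pred` maps each node to the node visited immediately before it, which is a common shape for Dijkstra output but has not been checked against this library; `readable_path` is a hypothetical name.

```python
def readable_path(start, end, pred):
    """Turn a predecessor map into an ordered list of nodes from start to end.

    Assumes pred[node] is the node visited immediately before `node`.
    Returns [] when pred is empty (no path) or does not connect the two nodes.
    """
    if not pred:
        return []
    path = [end]
    node = end
    while node != start:
        # Stop on a broken chain, or on a cycle that never reaches start.
        if node not in pred or len(path) > len(pred) + 1:
            return []
        node = pred[node]
        path.append(node)
    path.reverse()
    return path


# Hypothetical output of get_path_if_exists for blocks B0 -> B1 -> B3.
start, end, pred = "B0", "B3", {"B3": "B1", "B1": "B0"}
print(readable_path(start, end, pred))
```

Because `get_path_if_exists` already returns `start` and `end` alongside `pred`, its tuple can be unpacked straight into a helper of this kind.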
This contains the code for a dummy extension that contains a very basic XSS vulnerability. The basis of this code was the developer tutorial for Chrome extensions, and the extension itself was written by us.
This contains the code for crawling the Chrome Web Store.
- `selenium_crawler.py` was the code that we eventually used to crawl the Store.
- `how_to_setup.txt` explains the process of setting up the crawler (largely written to make moving our code over to the MOC server easier)
- `unzip_all_expansions.sh` unzips extensions. (`.crx` files can be unzipped using the normal `unzip` utility)
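For readers who prefer to stay in Python, the same unzipping step might look like the sketch below. It is not a translation of `unzip_all_expansions.sh`: the function names and directory layout are assumptions. It relies on the standard `zipfile` module locating the archive from its end-of-file directory, which should let it skip the extra header at the start of a `.crx` file.

```python
import zipfile
from pathlib import Path


def unzip_extension(crx_path, dest_root):
    """Extract one .crx archive into dest_root/<file name without .crx>."""
    crx_path = Path(crx_path)
    target = Path(dest_root) / crx_path.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(crx_path) as archive:
        archive.extractall(target)
    return target


def unzip_all(crx_dir, dest_root):
    """Extract every .crx file in crx_dir, skipping unreadable archives."""
    extracted = []
    for crx_path in sorted(Path(crx_dir).glob("*.crx")):
        try:
            extracted.append(unzip_extension(crx_path, dest_root))
        except zipfile.BadZipFile:
            print(f"skipping {crx_path}: not a readable archive")
    return extracted
```

A call such as `unzip_all("downloads", "unzipped")` would leave one folder per extension under `unzipped/`; both directory names here are placeholders.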
This contains the text file outputs of potential vulnerabilities, sorted by extension ID and then filename.
This is a git clone of https://github.com/indutny/spoon with several added files:
- `test/block/block_parser.py` - Main file that finds sources and sinks within a CFG. Uses regexes to check for a list of sinks and sources. It has three main functions:
  - `parseCFG(filename)` - parses the CFG from the output of `spoon` to a list of blocks, where each block is a tuple with the relevant information (block number, predecessors, successors, instructions, etc) extracted using a regex
  - `get_sink_blocks(filename)` - uses `parseCFG` and then uses the sink regexes to detect which blocks are sinks
  - `get_source_block(filename)` - uses `parseCFG` and then uses the source regexes to detect which blocks are sources
- `test/block/tester.py` - simple file to test sink and source capturing
- `test/cfg_collector.py` - script to calculate the CFG for all downloaded extensions using `file_to_cfg.js`
- `file_to_cfg.js` - call Esprima to construct an AST and then spoon to construct a CFG for a single `.js` file
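The regex-based source and sink detection in `block_parser.py` can be illustrated with a sketch like the one below. Everything specific in it is an assumption made for illustration: the pattern lists, the four-field block tuple, and the instruction strings are invented and do not reproduce the real regexes or the actual `spoon` output format.

```python
import re

# Hypothetical patterns; the real lists live in block_parser.py and may differ.
SOURCE_PATTERNS = [r"location\.(hash|search|href)", r"document\.(URL|referrer|cookie)"]
SINK_PATTERNS = [r"\binnerHTML\b", r"document\.write(ln)?\b", r"\beval\b"]

SOURCE_RE = re.compile("|".join(SOURCE_PATTERNS))
SINK_RE = re.compile("|".join(SINK_PATTERNS))


def matching_blocks(blocks, pattern):
    """Return the numbers of blocks whose instructions match `pattern`.

    Each block is assumed to be (number, predecessors, successors, instructions),
    with instructions given as a list of strings.
    """
    return [number for number, _preds, _succs, instructions in blocks
            if any(pattern.search(line) for line in instructions)]


# Made-up blocks standing in for the output of parseCFG.
blocks = [
    (0, [], [1], ["t0 = location.hash"]),
    (1, [0], [2], ["t1 = t0 + 'suffix'"]),
    (2, [1], [], ["el.innerHTML = t1"]),
]
source_blocks = matching_blocks(blocks, SOURCE_RE)
sink_blocks = matching_blocks(blocks, SINK_RE)
print(source_blocks, sink_blocks)
```

Matching on instruction text keeps the detector simple, at the cost of false positives whenever a sink name appears in a context that user input never reaches.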
Contains a number of old or abandoned parts of this project.
The following are dependencies that we copied to this repository for ease of use:
The following are dependencies that can be obtained from apt-get or a similar package manager:
- nodejs-legacy
- nodejs
- npm
- xvfb
- chromium-chromedriver
The following are dependencies that must be obtained from npm:
- esprima
- json
- fs
- estraverse
- escodegen
- assert
The following are dependencies that must be obtained from pip3:
- selenium
- pyvirtualdisplay