[GSoC 2024] Integrating KinFin Proteome Cluster analyses into Genome Browsing environments #9

Merged on Aug 27, 2024 (148 commits)
7a8d842
chore: Remove build artifacts from version control
rohan-b-84 May 31, 2024
fa15b22
chore: update requirements.txt file
rohan-b-84 May 31, 2024
62e9149
update: switch 'iteritems()' to 'items()' for Python 3 compatibility
rohan-b-84 May 31, 2024
83890e5
refactor: Remove file download logic from kinfin script
rohan-b-84 May 31, 2024
65de92f
refactor: modularize the install script
rohan-b-84 May 31, 2024
3a3d423
feat: replace --outprefix with --output_path for clarity and flexibility
rohan-b-84 May 31, 2024
0d3399f
chore: Add type hint ignore comment for PyQt4 import
rohan-b-84 Jun 3, 2024
1682917
refactor: Remove unused assignments from DataFactory
rohan-b-84 Jun 3, 2024
71f427c
fix: get_major_ticks() is now get_majorticklabels() to access labels
rohan-b-84 Jun 8, 2024
30a0523
feat: create new entrypoint for kinfin
rohan-b-84 Jun 14, 2024
6fe36de
feat: add util to check whether file exists if filepath provided
rohan-b-84 Jun 16, 2024
682b314
feat: added check for whether required data files exists
rohan-b-84 Jun 16, 2024
41146f7
feat: added classes to store input data
rohan-b-84 Jun 16, 2024
16697b5
feat: added constants to config.py file
rohan-b-84 Jun 16, 2024
1e2f404
feat: added argument parsing using argparse
rohan-b-84 Jun 16, 2024
561df01
feat: added function to validate CLI args
rohan-b-84 Jun 16, 2024
2b2138e
feat: map arguments to input class
rohan-b-84 Jun 16, 2024
fd195b1
feat: defined class to store attribute level data
rohan-b-84 Jun 16, 2024
837d700
feat: defined class to store ALO collections
rohan-b-84 Jun 16, 2024
b1573df
feat: added method to compute proteomes grouped by levels for each at…
rohan-b-84 Jun 16, 2024
7cda86f
feat: added method to create ALO objects for each attribute and level
rohan-b-84 Jun 16, 2024
d87a75a
feat: added util function to read file line by line
rohan-b-84 Jun 16, 2024
3e689e5
feat: added logic to parse attributes from config file
rohan-b-84 Jun 16, 2024
366a97b
feat: added logic to get lineage of taxonomic identifier
rohan-b-84 Jun 16, 2024
ff83eeb
feat: added util function to show progress
rohan-b-84 Jun 16, 2024
a073ce5
feat: added logic to parse nodesdb file
rohan-b-84 Jun 16, 2024
d1cf285
feat: added logic to add taxonomic attributes from nodesdb file
rohan-b-84 Jun 16, 2024
1382fec
feat: added logic to parse tree from nwk file
rohan-b-84 Jun 16, 2024
5f275c2
feat: added function to build AloCollection from CLI input
rohan-b-84 Jun 16, 2024
749b653
feat: added logic to parse attributes from json list and taxon-idx ma…
rohan-b-84 Jun 16, 2024
c60daac
feat: added function to build AloCollection from API input
rohan-b-84 Jun 16, 2024
84d4032
feat: initialize AloCollection inside data store
rohan-b-84 Jun 16, 2024
a231462
feat: added class to store individual protein data
rohan-b-84 Jun 16, 2024
8cdb19b
feat: added class to store collection of proteins
rohan-b-84 Jun 16, 2024
4adf670
fix: spelling correction ALoCollection -> AloCollection
rohan-b-84 Jun 16, 2024
086f825
feat: added docstring to 'build_AloCollection_from_json' function
rohan-b-84 Jun 16, 2024
9aebfde
feat: added util function to read fasta file
rohan-b-84 Jun 16, 2024
687f800
feat: added logic to parse fasta directory and get lengths by protein id
rohan-b-84 Jun 16, 2024
a3788e2
feat: added logic to parse pFam mapping file
rohan-b-84 Jun 16, 2024
2a570af
feat: added logic to parse ipr mapping file
rohan-b-84 Jun 16, 2024
f1c5912
feat: added logic to parse go mapping file
rohan-b-84 Jun 16, 2024
15d3c11
feat: added method to add annotations to protein
rohan-b-84 Jun 16, 2024
6a47645
feat: added function to parse domains from functional annotations file
rohan-b-84 Jun 16, 2024
b4f9146
feat: initialize ProteinCollection inside data store
rohan-b-84 Jun 16, 2024
18c2d30
feat: added class to store Cluster data
rohan-b-84 Jun 16, 2024
8da9aec
feat: added class to store Cluster Collection data
rohan-b-84 Jun 16, 2024
f8dcc56
feat: added function to identify singleton clusters
rohan-b-84 Jun 16, 2024
7969684
fix: add functional annotations to protein collections
rohan-b-84 Jun 16, 2024
91629a2
feat: added function to parse cluster file to create cluster objects
rohan-b-84 Jun 16, 2024
b5fbe3e
feat: initialize ClusterCollection inside data store
rohan-b-84 Jun 16, 2024
ebebba5
feat: add method to setup directories to store output
rohan-b-84 Jun 16, 2024
07f8f6b
feat: added util functions to calculate mean, median, sd and perform …
rohan-b-84 Jun 16, 2024
1e4b69f
feat: added logic to compute protein IDs grouped by proteome IDs
rohan-b-84 Jun 16, 2024
469f4c3
feat: added method to calculate stats based on protein lengths
rohan-b-84 Jun 16, 2024
54e2e43
fix: added return type to mean, median, sd utils
rohan-b-84 Jun 16, 2024
eda88e3
feat: added logic to determine cluster type: singleton, shared, spec…
rohan-b-84 Jun 16, 2024
e7090b8
feat: added logic to determine cardinality type
rohan-b-84 Jun 16, 2024
642b915
feat: added more properties to cluster class
rohan-b-84 Jun 16, 2024
c3feabe
feat: added 'add_cluster' method to AttributeLevel class
rohan-b-84 Jun 16, 2024
c98dd85
feat: added 'analyse_cluster' method to DataFactory class
rohan-b-84 Jun 16, 2024
163c249
feat: added 'analyse_clusters' method to DataFactory class
rohan-b-84 Jun 16, 2024
b1d140e
feat: added method to generate header for a node
rohan-b-84 Jun 16, 2024
213cd21
feat: added method to generate histogram chart for a node
rohan-b-84 Jun 16, 2024
2e58026
feat: added method to plot and save textual representation of nwk tree
rohan-b-84 Jun 16, 2024
62bc26a
feat: added method to plot and save graphical representation of nwk tree
rohan-b-84 Jun 16, 2024
f5ca4bf
feat: added method to write tree data to files
rohan-b-84 Jun 16, 2024
a318904
feat: added method to compute rarefaction data
rohan-b-84 Jun 16, 2024
d50e2b8
feat: added method to plot cluster sizes
rohan-b-84 Jun 16, 2024
1c4efb0
feat: added method to generate a header line based on the specified f…
rohan-b-84 Jun 16, 2024
0265f3f
feat: added method to conduct pairwise representation test
rohan-b-84 Jun 16, 2024
3a0f3f3
feat: added method to retrieve ALO metrics
rohan-b-84 Jun 16, 2024
6dc829b
feat: added method to compute protein length stats
rohan-b-84 Jun 16, 2024
b0c97c3
feat: added method to compute secreted cluster coverage
rohan-b-84 Jun 16, 2024
a02fba9
feat: added method to compute domain aggregated domain counts by doma…
rohan-b-84 Jun 16, 2024
ee5bdbe
feat: added method to compute entropy of domain groups
rohan-b-84 Jun 17, 2024
2aae805
feat: added method to generate volcano plots
rohan-b-84 Jun 17, 2024
c88e1da
feat: added method to generate cluster metrics and write them into files
rohan-b-84 Jun 17, 2024
57f40bb
fix: remove unused ALO property: self.rarefaction_data = {}
rohan-b-84 Jun 17, 2024
bc332b1
feat: added method to generate output from the input data
rohan-b-84 Jun 17, 2024
ac4b698
feat: added method to count proteins for a cluster type
rohan-b-84 Jun 17, 2024
c8b21a0
feat: added method to count clusters of specific status and type
rohan-b-84 Jun 17, 2024
4b0f434
feat: added method to get span of proteins for particular cluster type
rohan-b-84 Jun 17, 2024
cabe8ac
feat: added method to get count of clusters by cardinality and type
rohan-b-84 Jun 17, 2024
019b8a4
feat: added method to get sorted string of proteomes
rohan-b-84 Jun 17, 2024
ce91ed5
fix: add type to datastore methods
rohan-b-84 Jun 18, 2024
4ad81c9
feat: added option to perform analysis in CLI mode
rohan-b-84 Jun 18, 2024
92efdcd
feat: add a session manager class for api
rohan-b-84 Jun 18, 2024
3b14291
feat: define endpoint to init API session
rohan-b-84 Jun 18, 2024
2cafbbd
feat: added option to perform analysis in API mode
rohan-b-84 Jun 18, 2024
2b24642
fix: remove unnecessary print statement
rohan-b-84 Jun 18, 2024
a085906
feat: basic script to compare outputs
rohan-b-84 Jun 18, 2024
fd5ae17
Update .gitignore
DRL Jun 24, 2024
53436ee
fix: use log base of 2
rohan-b-84 Jun 25, 2024
7508973
make domain counts in sorted order for easier testing
rohan-b-84 Jun 25, 2024
108d178
feat: add tests to ensure consistent output
rohan-b-84 Jun 25, 2024
d07061b
Merge branch 'master' into rohan/basic-endpoint
rohan-b-84 Jun 25, 2024
9c80617
chore: update requirements.txt
rohan-b-84 Jun 25, 2024
47ea337
chore: add dependencies for developing features
rohan-b-84 Jun 25, 2024
62a6db6
chore: add fastapi as dependency
rohan-b-84 Jun 25, 2024
d51babd
fix: replace 'id' with '_id' to avoid python keyword conflict
rohan-b-84 Jun 26, 2024
000adf9
fix: update request method from GET to POST
rohan-b-84 Jun 26, 2024
1b93d6b
feat: update API request to return cluster_size_distribution.png file
rohan-b-84 Jun 26, 2024
2ca1c8a
rename: 'session_path' variable to 'result_dir' variable
rohan-b-84 Jun 26, 2024
b5f2b3e
feat: add environment variables for server
rohan-b-84 Jun 26, 2024
e5c099e
feat: add .env.example file
rohan-b-84 Jun 26, 2024
bf77492
feat: added endpoint to retrieve plots from results
rohan-b-84 Jun 26, 2024
353b8ab
feat: use 'logging' module for logs
rohan-b-84 Jun 26, 2024
f136d4c
fix: add '-r requirements.txt' at start of requirements-dev.txt file
rohan-b-84 Jun 28, 2024
16c3471
chore: Add docstrings to all the functions
rohan-b-84 Jun 28, 2024
94ec730
feat: update testing logic and move them to tests directory
rohan-b-84 Jun 28, 2024
d6d7f0f
feat: made analyse function run asynchronously in background; added d…
rohan-b-84 Jun 28, 2024
a5315da
refactor: initialize output array as header line
rohan-b-84 Jun 28, 2024
1c5f265
refactor: output line to be concatenated instead of appended and join…
rohan-b-84 Jun 28, 2024
2370edf
fix: implement sourcery changes
rohan-b-84 Jun 30, 2024
7ed51ed
fix: more sourcery manual fixes; breaking functions into smaller comp…
rohan-b-84 Jun 30, 2024
6898d51
fix: sourcery error for analyse_cluster method
rohan-b-84 Jul 2, 2024
932fb69
fix: sourcery error for plot_count_comparisons_volcano method
rohan-b-84 Jul 2, 2024
f7a01fe
fix: most of the flake8 errors (except E501-line length)
rohan-b-84 Jul 2, 2024
cd00be9
fix: use logging module
rohan-b-84 Jul 2, 2024
4d9a264
fix: all sourcery errors and warnings
rohan-b-84 Jul 7, 2024
5fe575a
feat: added docstrings :)
rohan-b-84 Jul 7, 2024
7989e1a
feat: tests show missing files if any
rohan-b-84 Jul 7, 2024
db7c80f
feat: more list comprehensions
rohan-b-84 Jul 7, 2024
91b8240
fix: comment 'Offending SequenceID' error message
rohan-b-84 Jul 8, 2024
be80a37
Merge branch 'master' into rohan/basic-endpoint
rohan-b-84 Jul 8, 2024
6250396
fix: avoid escaping quotes
rohan-b-84 Jul 8, 2024
fa4d2af
fix: unterminated string literal warning
rohan-b-84 Jul 8, 2024
1754203
fix: linting errors
rohan-b-84 Jul 8, 2024
bafc0c7
fix: update function return types for API handlers
rohan-b-84 Jul 9, 2024
3974afc
feat: update setup.py
rohan-b-84 Jul 9, 2024
c82bc41
feat: save logs in file
rohan-b-84 Jul 27, 2024
9148e3c
feat: use asyncio and subprocesses instead of fastapi's background tasks
rohan-b-84 Jul 27, 2024
8cf0137
feat: use query sessions instead of user sessions
rohan-b-84 Aug 1, 2024
72ce9f8
fix: linting and sourcery
rohan-b-84 Aug 1, 2024
bcb09f4
refactor: use results_base_dir env directly only for API
rohan-b-84 Aug 6, 2024
d8ef014
feat: added example taxon_idx_mapping file
rohan-b-84 Aug 6, 2024
711fec7
fix: set plot format to png for API
rohan-b-84 Aug 6, 2024
1cf6522
fix: set content-disposition to inline instead of attachment for plots
rohan-b-84 Aug 6, 2024
fa9901e
feat: added endpoint to get run summary in JSON format
rohan-b-84 Aug 6, 2024
1006a37
feat: prefix kinfin endpoints with /kinfin
rohan-b-84 Aug 6, 2024
59667f4
feat: added counts-by-taxon endpoint
rohan-b-84 Aug 6, 2024
0f226f8
feat: make output deterministic; rename attribute as
rohan-b-84 Aug 19, 2024
bd03663
feat: refactor endpoints.py; added wrapper; endpoints to get availab…
rohan-b-84 Aug 19, 2024
54704d3
feat: some curl examples for KaaS
rohan-b-84 Aug 19, 2024
b756dac
fix: sourcery magic
rohan-b-84 Aug 19, 2024
95d97dc
fix: isort
rohan-b-84 Aug 19, 2024
004e209
fix: remove unnecessary print statements
rohan-b-84 Aug 19, 2024
63c444a
chore: remove testing data from version control
rohan-b-84 Aug 19, 2024
6 changes: 0 additions & 6 deletions .eggs/README.txt

This file was deleted.

5 changes: 5 additions & 0 deletions .env.example
@@ -0,0 +1,5 @@
CLUSTER_FILE_PATH=/absolute/path/to/Orthogroups.txt
SEQUENCE_IDS_FILE_PATH=/absolute/path/to/SequenceIDs.txt
TAXON_IDX_MAPPING_FILE_PATH=/absolute/path/to/taxon_idx_mapping.json
RESULTS_BASE_DIR=/absolute/path/where/all/results/should/be/stored/
SESSION_INACTIVITY_THRESHOLD=24
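The five variables above could be read at server start-up roughly as follows. This is a minimal sketch, not the loader used in this PR: the `Settings` dataclass and `load_settings` function are hypothetical names, and because the diff does not state the unit of `SESSION_INACTIVITY_THRESHOLD`, the value is kept as a bare integer.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names are taken from .env.example; everything else is illustrative.
REQUIRED_VARS = (
    "CLUSTER_FILE_PATH",
    "SEQUENCE_IDS_FILE_PATH",
    "TAXON_IDX_MAPPING_FILE_PATH",
    "RESULTS_BASE_DIR",
    "SESSION_INACTIVITY_THRESHOLD",
)


@dataclass(frozen=True)
class Settings:  # hypothetical container, not part of the PR
    cluster_file_path: str
    sequence_ids_file_path: str
    taxon_idx_mapping_file_path: str
    results_base_dir: str
    session_inactivity_threshold: int  # unit is not specified in the diff


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the server settings, failing early if any variable is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        cluster_file_path=env["CLUSTER_FILE_PATH"],
        sequence_ids_file_path=env["SEQUENCE_IDS_FILE_PATH"],
        taxon_idx_mapping_file_path=env["TAXON_IDX_MAPPING_FILE_PATH"],
        results_base_dir=env["RESULTS_BASE_DIR"],
        session_inactivity_threshold=int(env["SESSION_INACTIVITY_THRESHOLD"]),
    )
```

Failing at start-up with the full list of missing names tends to be easier to debug than a `KeyError` raised on the first request.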
6 changes: 6 additions & 0 deletions .gitignore
@@ -9,3 +9,9 @@ example/test.*
 build/
 dist/

+venv
+.test_data
+result
+.DS_Store
+.env
+data
2,195 changes: 0 additions & 2,195 deletions build/lib/kinfin/kinfin.py

This file was deleted.

Binary file removed dist/kinfin-0.9-py2.7.egg
Binary file not shown.
70 changes: 70 additions & 0 deletions example/curl_examples.md
@@ -0,0 +1,70 @@
### 1. Initialize the Analysis Process

```bash
curl -X POST "http://127.0.0.1:8000/kinfin/init" \
-H "Content-Type: application/json" \
-d '{"config": [{ "taxon": "BGLAB", "label1": "red" },{ "taxon": "CVIRG", "label1": "red" },{ "taxon": "DPOLY", "label1": "red" },{ "taxon": "GAEGI", "label1": "red" },{ "taxon": "LJAPO", "label1": "red" },{ "taxon": "LSAXA", "label1": "red" },{ "taxon": "MANGU", "label1": "red" },{ "taxon": "MAREN", "label1": "red" },{ "taxon": "MGIGA", "label1": "red" },{ "taxon": "MMERC", "label1": "red" },{ "taxon": "MTROS", "label1": "blue" },{ "taxon": "OBIMA", "label1": "blue" },{ "taxon": "OEDUL", "label1": "blue" },{ "taxon": "OSINE", "label1": "blue" },{ "taxon": "OVULG", "label1": "blue" },{ "taxon": "PCANA", "label1": "blue" },{ "taxon": "PMAXI", "label1": "blue" },{ "taxon": "PVULG", "label1": "blue" },{ "taxon": "TGRAN", "label1": "blue" }]}' | jq
```

### 2. Get Run Status

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/status" \
-H "x-session-id: <session_id>" | jq
```

### 3. Get Run Summary

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/run-summary" \
-H "x-session-id: <session_id>" | jq
```

### 4. Get Available Attributes and Taxon Sets

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/available-attributes-taxonsets" \
-H "x-session-id: <session_id>" | jq
```

### 5. Get Counts by Taxon

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/counts-by-taxon" \
-H "x-session-id: <session_id>" | jq
```

### 6. Get Cluster Summary

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/cluster-summary/label1" \
-H "x-session-id: <session_id>" | jq
```

### 7. Get Attribute Summary

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/attribute-summary/label1" \
-H "x-session-id: <session_id>" | jq
```

### 8. Get Cluster Metrics

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/cluster-metrics/label1/red" \
-H "x-session-id: <session_id>" | jq
```

### 9. Get Pairwise Analysis

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/pairwise-analysis/label1" \
-H "x-session-id: <session_id>" | jq
```

### 10. Get Plot

```bash
curl -X GET "http://127.0.0.1:8000/kinfin/plot/<plot_type>" \
-H "x-session-id: <session_id>" -o "<filename>.png"
```
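For callers that prefer Python over curl, the same requests can be built with the standard library. This is a sketch under assumptions: the helper names (`build_init_request`, `build_get_request`, `send`) are hypothetical, the base URL is the one used in the curl examples, and the shape of the JSON responses (including how the session ID is returned by `/kinfin/init`) is not shown in this diff, so the code does not parse it.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000/kinfin"  # host and port as in the curl examples


def build_init_request(config: list) -> urllib.request.Request:
    """Build the POST /kinfin/init request with a JSON body."""
    body = json.dumps({"config": config}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/init",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_request(path: str, session_id: str) -> urllib.request.Request:
    """Build a GET request that carries the session in the x-session-id header."""
    return urllib.request.Request(
        f"{BASE_URL}/{path.lstrip('/')}",
        headers={"x-session-id": session_id},
        method="GET",
    )


def send(request: urllib.request.Request) -> bytes:
    """Send a request and return the raw body (requires a running server)."""
    with urllib.request.urlopen(request) as response:
        return response.read()


if __name__ == "__main__":
    # Two taxa from the curl example; building requests needs no network access.
    config = [{"taxon": "BGLAB", "label1": "red"}, {"taxon": "MTROS", "label1": "blue"}]
    init_request = build_init_request(config)
    print(init_request.get_method(), init_request.full_url)
    status_request = build_get_request("status", "<session_id>")
    print(status_request.get_method(), status_request.full_url)
```

Keeping request construction separate from `send` lets the URLs and headers be checked without a server; for the plot endpoint, the bytes returned by `send` would be written to a `.png` file, mirroring the `-o` flag above.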
8 changes: 8 additions & 0 deletions example/taxon_idx_mapping.json
@@ -0,0 +1,8 @@
{
"A": "0",
"B": "1",
"C": "2",
"D": "3",
"E": "4",
"F": "5"
}
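A mapping file of this shape can be loaded and inverted in a few lines. The sketch below is illustrative only: `load_idx_to_taxon` is a hypothetical helper, and the idea that the server needs an index-to-taxon lookup is an assumption, since the diff shows the file but not how it is consumed.

```python
import json

# Contents of example/taxon_idx_mapping.json, inlined so the sketch is self-contained.
EXAMPLE_MAPPING = '{"A": "0", "B": "1", "C": "2", "D": "3", "E": "4", "F": "5"}'


def load_idx_to_taxon(text: str) -> dict:
    """Parse a taxon -> index JSON object and return the reverse lookup."""
    taxon_to_idx = json.loads(text)
    idx_to_taxon = {}
    for taxon, idx in taxon_to_idx.items():
        if idx in idx_to_taxon:
            # Two taxa sharing one index would make the reverse lookup ambiguous.
            raise ValueError(f"Index {idx!r} is assigned to more than one taxon")
        idx_to_taxon[idx] = taxon
    return idx_to_taxon


idx_to_taxon = load_idx_to_taxon(EXAMPLE_MAPPING)
print(idx_to_taxon["3"])  # prints D
```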
84 changes: 0 additions & 84 deletions install

This file was deleted.

143 changes: 143 additions & 0 deletions install.sh
@@ -0,0 +1,143 @@
#!/usr/bin/env bash

# logging function
log() {
    local GREEN='\033[0;32m'
    local YELLOW='\033[0;33m'
    local RED='\033[0;31m'
    local NO_COLOR='\033[0m'

    local level=$1
    local message=$2

    case $level in
        INFO)
            echo -e "[${NO_COLOR}INFO${NO_COLOR}] - $message"
            ;;
        SUCCESS)
            echo -e "[${GREEN}SUCCESS${NO_COLOR}] - $message"
            ;;
        ERROR)
            echo -e "[${RED}ERROR${NO_COLOR}] - $message" >&2
            ;;
        *)
            echo "Invalid log level: $level"
            ;;
    esac
}

# Check dependencies exist
check_dependencies() {
    log INFO "Checking dependencies..."

    local dependencies=("wget" "gunzip")
    local missing_dependencies=()

    for dependency in "${dependencies[@]}"; do
        local item=$(command -v "$dependency")
        if [ ! -x "$item" ]; then
            missing_dependencies+=("$dependency")
        fi
    done

    if [ ${#missing_dependencies[@]} -gt 0 ]; then
        log ERROR "Missing dependencies: ${missing_dependencies[*]}. Please install them."
        exit 1
    else
        for dependency in "${dependencies[@]}"; do
            log SUCCESS "$dependency is installed."
        done
        log SUCCESS "All dependencies are installed."
        return 0
    fi
}

# Function to download a file
download_file() {
    local url=$1
    local filename=$2

    log INFO "Downloading $filename from $url"
    $(which wget) -np -nd -qN --show-progress "$url" -P "$DIR/data/"

    if [ $? -eq 0 ]; then
        log SUCCESS "Downloaded $filename"
    else
        log ERROR "Failed to download $filename from $url"
        exit 1
    fi
}

# Extract .gz files
extract_gzip() {
    local gz_file=$1
    local dest=$2

    log INFO "Extracting $gz_file..."

    $(which gunzip) -c "$gz_file" > "$dest"

    if [ $? -eq 0 ]; then
        log SUCCESS "Extracted $gz_file at $dest"
    else
        log ERROR "Failed to extract $gz_file. Please download kinfin again."
        exit 1
    fi
}



main() {
    # Set working directory
    DIR="$(cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

    check_dependencies

    log INFO "Checking input data files..."

    local pfam_dest="$DIR/data/Pfam-A.clans.tsv.gz"
    local ipr_dest="$DIR/data/entry.list"
    local go_dest="$DIR/data/interpro2go"
    local nodesdbgz="$DIR/data/nodesdb.gz"
    local nodesdb="$DIR/data/nodesdb.txt"

    if [ ! -f "$nodesdb" ]; then
        if [ -f "$nodesdbgz" ]; then
            extract_gzip "$nodesdbgz" "$nodesdb"
        else
            log ERROR "$nodesdbgz not found. Please download kinfin again."
            exit 1
        fi
    else
        log SUCCESS "$nodesdb is already present."
    fi

    if [ ! -f "$pfam_dest" ]; then
        download_file "ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.clans.tsv.gz" "Pfam-A.clans.tsv.gz"
    else
        log SUCCESS "Pfam-A.clans.tsv.gz is already present."
    fi

    if [ ! -f "$ipr_dest" ]; then
        download_file "ftp.ebi.ac.uk/pub/databases/interpro/current_release/entry.list" "entry.list"
    else
        log SUCCESS "entry.list is already present."
    fi

    if [ ! -f "$go_dest" ]; then
        download_file "ftp.ebi.ac.uk/pub/databases/interpro/current_release/interpro2go" "interpro2go"
    else
        log SUCCESS "interpro2go is already present."
    fi

    log SUCCESS "All required files downloaded."

    # Create executable (paths are quoted so a checkout directory containing spaces still works)
    log INFO "Creating executable..."
    echo -e '#!/usr/bin/env bash\nDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"\n"$DIR/src/kinfin.py" "$@"' > "$DIR/kinfin" && chmod +x "$DIR/kinfin"

    # Done
    log SUCCESS "Kinfin was installed. Please run ./kinfin --help"
}

main
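The dependency check at the top of `install.sh` has a direct counterpart in Python's standard library, which may be useful if the same check is ever needed from the Python side. This is a sketch only: `find_missing_dependencies` is a hypothetical helper and is not part of this PR.

```python
import shutil
from typing import Iterable, List


def find_missing_dependencies(commands: Iterable[str]) -> List[str]:
    """Return the commands that cannot be found as executables on PATH."""
    return [command for command in commands if shutil.which(command) is None]


if __name__ == "__main__":
    # Same tools that install.sh requires.
    missing = find_missing_dependencies(["wget", "gunzip"])
    if missing:
        print(f"Missing dependencies: {' '.join(missing)}. Please install them.")
    else:
        print("All dependencies are installed.")
```

`shutil.which` follows the same PATH lookup idea as `command -v` for external programs, although unlike `command -v` it does not report shell builtins or functions.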
3 changes: 3 additions & 0 deletions requirements-dev.txt
@@ -0,0 +1,3 @@
-r requirements.txt
fastapi==0.111.0
pytest==8.2.2
11 changes: 6 additions & 5 deletions requirements.txt
@@ -1,6 +1,7 @@
-scipy==1.11.1
-matplotlib==2.0.2
+scipy==1.13.1
+matplotlib==3.9.0
 docopt==0.6.2
-networkx==1.11
-powerlaw==1.4.1
-ete3==3.0.0b35
+networkx==3.3
+powerlaw==1.5
+ete3==3.1.3
+fastapi==0.111.0
4 changes: 2 additions & 2 deletions scripts/get_protein_ids_from_cluster.py
@@ -71,8 +71,8 @@ def parse_groups(group_f):


 def write_output(output, outprefix):
-    headers_found = set([k for k, v in headers.iteritems() if v])
-    clusters_found = set([k for k, v in clusters.iteritems() if v])
+    headers_found = set([k for k, v in headers.items() if v])
+    clusters_found = set([k for k, v in clusters.items() if v])
     if headers:
         print("[+] Found %s of headers ..." % "{:.0%}".format(len(headers_found) / len(headers)))
     if clusters:
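The effect of the `iteritems()` to `items()` change can be checked in isolation. The snippet below is a minimal sketch with made-up data (the `headers` dict here is hypothetical, not taken from the script): in Python 3, `dict.iteritems()` no longer exists, `items()` returns a view that can be iterated the same way, and `/` is true division, so the percentage is computed from a float ratio.

```python
# Hypothetical stand-in for the script's `headers` dict: ID -> whether it was found.
headers = {"seq1": True, "seq2": False, "seq3": True, "seq4": True}

# Python 3: dict.items() replaces the removed dict.iteritems().
headers_found = {k for k, v in headers.items() if v}

# True division in Python 3 yields 0.75 here rather than a floored integer.
fraction = len(headers_found) / len(headers)
summary = "[+] Found %s of headers ..." % "{:.0%}".format(fraction)
print(summary)  # prints: [+] Found 75% of headers ...
```

Under Python 2 without `from __future__ import division`, the same expression would floor to 0 for any partial match, so the port also changes what this message reports.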