add new features
lamcodeofpwnosec committed Oct 21, 2024
1 parent 5ad9a22 commit 8af9b08
Showing 6 changed files with 77 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -15,6 +15,9 @@ The script retrieves a list of URLs that have been archived over time, offering
* `curl`, `sed`, and `tee` should be installed on the system (most Unix-based systems come with these tools by default).

## Steps to Install and Run
Run `./wayback.sh -help` to display the help menu with a description of each feature. Each feature now lives in its own script under `features/`, which makes it easier to maintain or extend.


1. Clone or Download the Script
```
git clone https://github.com/lamcodeofpwnosec/Waybash.git
12 changes: 12 additions & 0 deletions features/download_archived.sh
@@ -0,0 +1,12 @@
#!/bin/bash
# Download Archived Pages feature

echo "Enter your Domain Name:"
read -r domain

mkdir -p "$domain-archived-pages"
while IFS= read -r url; do  # read the list line by line; -r keeps backslashes literal
  wget "http://web.archive.org/web/$url" -P "$domain-archived-pages/"
done < "$domain.txt"

echo "Download complete. Archived pages are saved in the $domain-archived-pages/ folder."
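
Review note: the download loop above can be exercised without touching the network by separating URL construction from the download step. The following is a minimal sketch under that assumption; `archive_url` and `list_archive_urls` are hypothetical helper names that do not exist in the repository.

```shell
#!/bin/bash
# Sketch only: archive_url and list_archive_urls are hypothetical helpers.

# Build the Wayback Machine URL for one original URL.
archive_url() {
  printf 'http://web.archive.org/web/%s\n' "$1"
}

# Read URLs from a list file, skip blank lines, and print the archive
# URL for each one. Fails early if the list file cannot be read.
list_archive_urls() {
  local list_file=$1 url
  if [ ! -r "$list_file" ]; then
    echo "Cannot read $list_file" >&2
    return 1
  fi
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    archive_url "$url"
  done < "$list_file"
}
```

Piping the output of `list_archive_urls "$domain.txt"` into `wget -i - -P "$domain-archived-pages/"` would then perform the actual downloads.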
19 changes: 19 additions & 0 deletions features/filter_by_date.sh
@@ -0,0 +1,19 @@
#!/bin/bash
# Filter by Date Range feature
# Description: Filters Wayback Machine URLs by a specific date range.

echo "Enter the start date (YYYYMMDD):"
read -r start_date
echo "Enter the end date (YYYYMMDD):"
read -r end_date

echo "Enter your Domain Name:"
read -r domain

curl "http://web.archive.org/cdx/search/cdx?url=*.$domain/*&output=json&fl=original&collapse=urlkey&from=$start_date&to=$end_date" -s -k --path-as-is \
| sed 's/\["//g' \
| sed 's/"\],//g' \
| sort -u \
| tee -a "$domain-$start_date-to-$end_date.txt"

echo "Crawling complete. The results are saved in $domain-$start_date-to-$end_date.txt"
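
Review note: the script passes the two dates to the API exactly as typed. A small check before the `curl` call could catch malformed input. This is a sketch, not part of the commit; `is_yyyymmdd` and `valid_range` are hypothetical names, and the check covers only the eight-digit shape and the ordering, not calendar validity.

```shell
#!/bin/bash
# Sketch only: is_yyyymmdd and valid_range are hypothetical helpers.

# Succeeds only when the argument is exactly eight digits.
# This checks the YYYYMMDD shape, not whether the date exists.
is_yyyymmdd() {
  [[ $1 =~ ^[0-9]{8}$ ]]
}

# Succeeds when both dates are well-formed and start <= end.
# Equal-length digit strings sort the same way as the numbers they
# represent, so a string comparison avoids octal parsing of leading zeros.
valid_range() {
  is_yyyymmdd "$1" && is_yyyymmdd "$2" && [[ ! "$1" > "$2" ]]
}
```

In the script, something like `valid_range "$start_date" "$end_date" || { echo "Invalid date range" >&2; exit 1; }` could sit just before the request.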
15 changes: 15 additions & 0 deletions features/filter_by_mimetype.sh
@@ -0,0 +1,15 @@
#!/bin/bash
# Filter by MIME Type feature

echo "Enter your Domain Name:"
read -r domain
echo "Enter the MIME type to filter (e.g., text/html, image/jpeg):"
read -r mime_type
# A leading "!" before the field name would invert the filter and exclude this MIME type instead.
curl "http://web.archive.org/cdx/search/cdx?url=*.$domain/*&output=json&fl=original,mimetype&filter=mimetype:$mime_type&collapse=urlkey" -s -k --path-as-is \
| sed 's/\["//g' \
| sed 's/"\],//g' \
| sort -u \
| tee -a "${domain}_filtered_by_mime_type.txt"

echo "Crawling complete. The filtered results (MIME type: $mime_type) are saved in ${domain}_filtered_by_mime_type.txt"
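
Review note: in the CDX server API, a `!` placed before the field name inverts a filter, so `filter=mimetype:text/html` keeps matching captures while `filter=!mimetype:text/html` drops them. A sketch that makes the choice explicit follows; `mime_filter` and its `include`/`exclude` modes are hypothetical and not part of the commit.

```shell
#!/bin/bash
# Sketch only: mime_filter is a hypothetical helper.

# Print the CDX filter parameter for a MIME type.
#   include (default): keep captures with this MIME type
#   exclude:           drop captures with this MIME type (leading "!")
mime_filter() {
  local mime=$1 mode=${2:-include}
  case $mode in
    include) printf 'filter=mimetype:%s\n' "$mime" ;;
    exclude) printf 'filter=!mimetype:%s\n' "$mime" ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}
```

The result can be spliced into the query string, for example `...&fl=original,mimetype&$(mime_filter "$mime_type")&collapse=urlkey`.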
15 changes: 15 additions & 0 deletions features/track_changes.sh
@@ -0,0 +1,15 @@
#!/bin/bash
# Track URL Changes Over Time feature

echo "Enter your Domain Name:"
read -r domain
echo "Enter the URL path to track (e.g., /path/to/page):"
read -r url_path

curl "http://web.archive.org/cdx/search/cdx?url=$domain$url_path&output=json&fl=timestamp,original" -s -k --path-as-is \
| sed 's/\["//g' \
| sed 's/"\],//g' \
| sort -u \
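| tee -a "${domain}${url_path//\//_}_change_log.txt"
# Braces delimit the variable names; "/" in the path becomes "_" so the log is a single file.
echo "Tracking complete. Changes to $url_path are saved in ${domain}${url_path//\//_}_change_log.txt"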
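
Review note: two things matter when building the log file name here. Without braces, bash reads `$url_path_change_log` as a single, unset variable, and a path such as `/path/to/page` contains slashes that would point into directories that may not exist. The sketch below shows one way to derive a flat file name; `change_log_name` is a hypothetical helper, not something the commit defines.

```shell
#!/bin/bash
# Sketch only: change_log_name is a hypothetical helper.

# Derive a flat log file name from a domain and a URL path.
change_log_name() {
  local domain=$1 url_path=$2
  local safe_path=${url_path//\//_}   # replace every "/" with "_"
  printf '%s%s_change_log.txt\n' "$domain" "$safe_path"
}
```

For example, `change_log_name example.com /path/to/page` yields `example.com_path_to_page_change_log.txt`.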
13 changes: 13 additions & 0 deletions features/verbose_mode.sh
@@ -0,0 +1,13 @@
#!/bin/bash
# Verbose Mode feature

echo "Enter your Domain Name:"
read -r domain
# -v prints request and response details to stderr, so they reach the terminal but not the results file.
curl "http://web.archive.org/cdx/search/cdx?url=*.$domain/*&output=json&fl=original&collapse=urlkey" -v -k --path-as-is \
| sed 's/\["//g' \
| sed 's/"\],//g' \
| sort -u \
| tee -a "${domain}_verbose.txt"

echo "Verbose crawling complete. The results are saved in ${domain}_verbose.txt"
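
Review note: the two `sed` substitutions used in this and the other feature scripts assume every row looks like `["value"],`. With `output=json`, the response appears to be one JSON array of rows whose first row holds the field names, so the header row and the final row (which ends in `"]]`) may keep stray characters. A sketch of a stricter cleanup follows, assuming one row per line; `clean_cdx_rows` is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: clean_cdx_rows is a hypothetical helper.
# Assumes CDX output=json arrives one row per line, header row first.

# Read rows on stdin and print plain values, one row per line.
# Fields of multi-column rows are separated by a single space.
clean_cdx_rows() {
  sed -e '1d' \
      -e 's/^\[*"//' \
      -e 's/"]*,*$//' \
      -e 's/","/ /g'
}
```

It would slot into the pipeline in place of the two `sed` calls, for example `curl ... | clean_cdx_rows | sort -u | tee -a "${domain}_verbose.txt"`.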
