Website footprinting
Hackers can map the entire website of the target without being noticed
Gives information about:
Software
Operating system
Subdirectories
Contact information
Scripting platform
Query details
Programs designed to help in website footprinting
Methodically browse a website in search of specific information.
Information collected this way can help attackers perform social engineering attacks.
Reveals what software is running on the server and how it behaves
Possible to identify the scripting platforms.
Examining website headers
By examining the website headers, it is possible to obtain information about:
Content-Type
Accept-Ranges
Connection Status
Last-Modified information
X-Powered-By information, e.g. ZendServer 8.5.0, ASP.NET
Web server information: the Server header can give you e.g. Apache Server on CentOS
You can also analyze what the website pulls
In the developer tools of most browsers (Ctrl+Shift+C), check the Network section
For each request you can see the remote IP address and response headers for further analysis.
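You can also grab the headers from the command line; a minimal sketch using curl (testwebpage.com is just a placeholder):

```sh
# Send a HEAD request and print only the response headers
curl -I https://testwebpage.com
# Interesting lines to look for in the output:
#   Server: e.g. Apache/2.4.41 (CentOS)
#   X-Powered-By: e.g. PHP/7.4.3
#   Last-Modified, Content-Type, Accept-Ranges ...
```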
Comment analysis
Possible to extract information from the comments
In most browsers you can right-click and show the source
Walkthrough
In almost any browser: Right click => Show source
Check for HTML comments (<!-- comment -->) or JavaScript comments (// comment)
They are skipped by interpreters and compilers, only for human eyes
They can be instructions for other developers, or notes to themselves
E.g. this library won't work as this element is not supported
Gives you clues about what technology (frameworks, languages) they use in the background
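A rough way to pull HTML comments out of a page from the command line (only catches single-line comments; testwebpage.com is a placeholder):

```sh
# Download the page silently and print any single-line HTML comments
curl -s https://testwebpage.com | grep -o '<!--.*-->'
```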
Observing link and image tags
HTML links: href=cloudarchitecture.io
Gain insight into the file system structure
You can find e.g. a caching server and check for vulnerabilities in that caching server.
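A quick sketch for listing link and image targets from a page you already downloaded (page.html is a hypothetical file name):

```sh
# List unique href/src targets to get an idea of the file system structure
grep -oE '(href|src)="[^"]+"' page.html | sort -u
```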
Cloning websites
Also called website mirroring
Helps in
browsing the site offline
searching the website for vulnerabilities
discovering valuable information and metadata.
Can be protected against with detections based on e.g. page pull speed, behavior, known scrapers, AI.
💡 Mirroring is also a good technique for setting up fake websites.
E.g. manually recreate login pages
If you control the DNS you can do a redirect.
You can also save social media pages this way; however, most are protected, and cloning them is illegal.
Website monitoring tools can send notifications on detected changes.
💡 Protection against fake websites
Always check the domain name for misspellings
Make sure it's HTTPS; if it's not, the data can easily be sniffed
Protects against someone taking over DNS
If the other party does not have the certificate, the browser does not accept the communication
Check the SSL certificate authority; if it changes, that should raise questions.
Certificates usually expire in about a year.
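One way to check the certificate authority and expiry dates yourself is with openssl (testwebpage.com is a placeholder):

```sh
# Print the certificate issuer and validity period of a site
echo | openssl s_client -connect testwebpage.com:443 -servername testwebpage.com 2>/dev/null \
  | openssl x509 -noout -issuer -dates
```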
httrack
httrack https://testwebpage.com to copy the site
📝 wget
Basic utility that can be used for mirroring a website
Or one could manually copy-paste the HTML + CSS source code
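A common way to mirror a site with wget looks like this (standard wget flags; testwebpage.com is a placeholder):

```sh
# --mirror: recursive download with timestamping
# --convert-links: rewrite links so the copy browses offline
# --page-requisites: also fetch CSS, images and scripts
wget --mirror --convert-links --page-requisites https://testwebpage.com
```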
Extracting metadata
You can extract metadata of files (e.g. images) from a webpage
Metadata can include
Owner of the file
GPS coordinates (images)
File type metadata
🤗 Linux does not rely on file extensions (e.g. .pdf) but checks the metadata instead.
Helpful, as you will not be fooled by the extension
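The file utility illustrates this: it identifies files by their content rather than their extension (the file name below is just an example):

```sh
# Identify the real file type regardless of the extension
file TEST_DOCUMENT.docx
# => something like: TEST_DOCUMENT.docx: Microsoft Word 2007+
```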
Tools for extracting metadata
hexdump
Dump a file as hex + ASCII and inspect it manually
E.g. hexdump -C TEST_DOCUMENT.docx
❗ Not recommended as it's pretty hard to extract information from binary.
ExifTool
Reads + writes metadata of audio, video, PDF, docs etc.
E.g. exiftool TEST_DOCUMENT.docx
would return something like Microsoft Office Word, Version: 16.0
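For images, any embedded GPS coordinates can be pulled out with specific tags; a sketch where photo.jpg is a hypothetical file:

```sh
# Print GPS coordinates embedded in an image, in numeric form
exiftool -gpslatitude -gpslongitude -n photo.jpg
```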
📝 Metagoofil | Google hacking tool
Search for files that may have metadata for a website using Google and dump their metadata.
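A typical invocation might look like the following sketch (exact flags differ between Metagoofil versions; the domain and output directory are placeholders):

```sh
# Search Google for documents on the domain, download up to 25 of them,
# then dump their metadata with exiftool
metagoofil -d testwebpage.com -t pdf,doc,docx -l 100 -n 25 -o ./files
exiftool ./files/*
```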