Mad Crawler

About

Mad Crawler is a multithreaded robot that parses predefined sites and finds all <a href="" /> links on them. Each target site is processed recursively, to a depth of one hundred links and with one-second timeouts. The resulting links are saved, without duplicates, to the output file result-[date&time].txt. This file is written to the folder the application is run from.
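
The link-finding step described above can be illustrated with a rough sketch. This is not taken from the Mad Crawler sources: the class name LinkExtractor, the method extractLinks and the regular expression are assumptions made for illustration only, and a real crawler would more likely use a proper HTML parser. The sketch collects href values from <a> tags and keeps each one only once, mirroring the "saved uniquely" behaviour.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Mad Crawler code base.
final class LinkExtractor {

    // Rough pattern (an assumption): an <a> tag with a quoted href attribute.
    // Group 1 is the quote character, group 2 is the href value.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\shref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the unique, non-empty href values in order of first appearance.
    static Set<String> extractLinks(String html) {
        Set<String> links = new LinkedHashSet<String>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            String href = matcher.group(2).trim();
            if (!href.isEmpty()) {
                links.add(href); // a Set silently drops duplicates
            }
        }
        return links;
    }
}
```

A set is used here so that a link appearing on several pages ends up in the result only once; unquoted href values are deliberately ignored to keep the sketch short.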

Build

Building the project requires JDK 7+ and an internet connection. The following bash command tests and builds Mad Crawler:

sh gradlew clean build

You can find the resulting jar file here:

project-root/build/libs/Mad-Crawler-x.x.jar

All necessary dependencies are included in it.

Run

To start the program you need JRE 7+ and a file listing the targets to crawl. The run command looks like this:

java -jar Mad-Crawler-x.x.jar /path/to/file-with-targets.txt

Here file-with-targets.txt should list one target per line, in the following format:

google.com
apple.com
yandex.com
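
As a minimal sketch of how such a file could be read, the snippet below turns its content into a list of targets. The class name TargetsFile and the method parse are hypothetical, and trimming whitespace and skipping blank lines are assumptions; the actual Mad Crawler parsing may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Mad Crawler code base.
final class TargetsFile {

    // Splits the file content into targets, one per line.
    // Assumption: surrounding whitespace and blank lines are ignored.
    static List<String> parse(String content) {
        List<String> targets = new ArrayList<String>();
        for (String line : content.split("\\r?\\n")) {
            String target = line.trim();
            if (!target.isEmpty()) {
                targets.add(target);
            }
        }
        return targets;
    }
}
```

On Java 7 the content could be loaded with new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8) before being passed to parse.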

anatoly-chichikov/mad-crawler

Folders and files

Latest commit

History

Repository files navigation

Mad Crawler

About

Build

Run

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages