Crawler
- The crawler crawls "random" elements from the DDB
- Random elements are chosen by searching for "*" and applying a random result offset and limit
- The public domain status is calculated for these random items, and statistics about outcomes and errors are collected
- All exceptions and results are logged to the console by the crawler
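The random selection described above can be sketched as follows. This is a minimal illustration, not the pdc implementation: it assumes the DDB search for "*" reports a total result count, from which a random offset for a window of `limit` items is drawn. All names here are hypothetical.

```java
import java.util.Random;

// Sketch of drawing "random" items from a "*" search: pick a random
// offset into the total result count so that a window of `limit`
// items still fits inside the result set. Illustrative only.
public class RandomOffsetSketch {
    public static int randomOffset(int totalResults, int limit, Random rng) {
        // Keep the fetched window fully inside the result set.
        int maxOffset = Math.max(0, totalResults - limit);
        return rng.nextInt(maxOffset + 1);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int offset = randomOffset(1_000_000, 10, rng);
        System.out.println("search=* offset=" + offset + " limit=10");
    }
}
```

With the offset and limit chosen this way, repeated requests sample different slices of the full "*" result set.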
To run the crawler, start the server application with the command line parameter -c or --crawler:
java -jar pdc-0.1-SNAPSHOT.jar -c
or java -jar pdc-0.1-SNAPSHOT.jar --crawler
The crawler will start a search for the search term * to fetch items from the DDB-API.
Additional parameters:
--maxDepth 50
will run the crawler for 50 DDB items and then stop. Defaults to 1000 items. Alternative: --depth
--fetchSize 10
will fetch 10 results in a single request to the DDB-API and calculate their public domain status, then fetch the next 10 results, and so on, until maxDepth is reached. Larger values speed up the crawler but put more load on the DDB-API web service. Defaults to 100
--timeout 1000
the pause in milliseconds between the calculations of individual DDB items
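Taken together, the three parameters imply a simple batching loop: requests of fetchSize items are issued until maxDepth items have been processed, with a timeout pause between items. The sketch below only illustrates that arithmetic; the method name is hypothetical and not part of pdc.

```java
// Sketch of the crawl budget implied by --maxDepth and --fetchSize:
// how many requests are needed to cover maxDepth items in batches
// of fetchSize (ceiling division). Illustrative only.
public class CrawlLoopSketch {
    public static int plannedBatches(int maxDepth, int fetchSize) {
        return (maxDepth + fetchSize - 1) / fetchSize;
    }

    public static void main(String[] args) {
        int maxDepth = 50, fetchSize = 10;
        int batches = plannedBatches(maxDepth, fetchSize);
        System.out.println(batches + " requests of " + fetchSize + " items each");
        // Between individual item calculations the crawler would pause
        // for the configured --timeout, e.g. Thread.sleep(1000).
    }
}
```

For example, a run with all flags set might look like: java -jar pdc-0.1-SNAPSHOT.jar -c --maxDepth 50 --fetchSize 10 --timeout 1000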