CourtDrive Kroll Web Scraper
Output raw text:
perl kroll_parser.pl --method=GET --url=https://cases.ra.kroll.com/seadrillpartners/Home-ClaimInfo --recursive=100 --debug=3
perl kroll_parser.pl --method=POST --url=https://cases.ra.kroll.com/seadrillpartners/Home-LoadClaimData --recursive=100 --debug=3
Output a JSON file:
perl kroll_parser.pl --url=https://cases.ra.kroll.com/seadrillpartners/Home-LoadClaimData --recursive=100 --format=json > 2023-07-25.json
Optionally re-render a saved JSON file in other formats:
perl kroll_parser.pl --file=2023-07-25.json --format=pdf > 2023-07-25.pdf
perl kroll_parser.pl --file=2023-07-25.json --format=xlsx > 2023-07-25.xlsx
This script combines:
- a web client that performs HTTP requests on behalf of a user
- a parser that finds all claims and relevant metadata in the scraped HTML and builds a data tree
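The fetch-then-parse loop, with depth-limited recursion into linked pages (see --recursive below), can be sketched as follows. This is an illustrative Python sketch only; the real script is Perl, and the page format, claim markers, and helper names here are assumptions, not the script's actual internals.

```python
# Hypothetical sketch of the scraper's two halves: an HTTP client and a
# parser that builds a tree of claims. The "site" is an in-memory dict
# standing in for real HTTP responses to Kroll's pages.

def crawl(url, fetch, parse_links, parse_claims, depth):
    """Fetch `url`, collect its claims, and recurse into linked pages
    up to `depth` levels (mirroring --recursive=N, where 0 means no
    recursion)."""
    page = fetch(url)
    node = {"url": url, "claims": parse_claims(page), "children": []}
    if depth > 0:
        for link in parse_links(page):
            node["children"].append(
                crawl(link, fetch, parse_links, parse_claims, depth - 1))
    return node

# Tiny stand-in site: each entry is (claim text, outgoing links).
PAGES = {
    "/claims": ("claim:100 claim:101", ["/claims/100", "/claims/101"]),
    "/claims/100": ("claim:100-detail", []),
    "/claims/101": ("claim:101-detail", []),
}

tree = crawl(
    "/claims",
    fetch=lambda u: PAGES[u],
    parse_links=lambda page: page[1],
    parse_claims=lambda page: page[0].split(),
    depth=1,
)
print(tree["claims"])         # claims parsed from the root page
print(len(tree["children"]))  # linked pages followed at depth 1
```

With depth=0 the sketch stops at the root page, which matches the documented default of --recursive (zero/false).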
- --debug=N
  - 0 = no debugging output, non-verbose
  - 1 = basic operational output
  - 2 = technical debugging info
  - 3 = in-depth debugging info
- --noverbose
  equivalent to --debug=0
- --verbose
  equivalent to --debug=1
*note* place the --debug or --verbose option first on the command line so that debugging output covers the processing of the remaining options
- --url=http://domain/path/to/scrape/
  the URL to scrape for data
- --recursive=N
  follow links to nested HTML pages to depth N (defaults to zero, i.e., no recursion)
- --format=html
  the output format; options include txt, json, html, pdf, and xlsx
- --file=report.json
  import a data tree from a JSON-formatted file (i.e., the output of perl kroll_parser.pl --format=json), enabling multiple calculation and viewing options on the same data
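The point of --file is that one scrape can feed many later runs. A consumer of the saved JSON might look like the following Python sketch; the actual schema emitted by kroll_parser.pl --format=json is not documented here, so the "claims"/"children" keys are assumptions used purely for illustration.

```python
import io
import json

def count_claims(node):
    """Recursively total claims across the whole tree, as a --file
    consumer might do before re-rendering to pdf or xlsx.
    Assumes a hypothetical schema with "claims" and "children" keys."""
    return len(node.get("claims", [])) + sum(
        count_claims(child) for child in node.get("children", []))

# Stand-in for a file such as 2023-07-25.json:
saved = io.StringIO(json.dumps({
    "claims": ["c1", "c2"],
    "children": [{"claims": ["c3"], "children": []}],
}))
tree = json.load(saved)
print(count_claims(tree))
```

Reloading the tree this way avoids re-hitting the Kroll site for every report format.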
Thomas Anderson
[email protected]
Copyright 2023