orderamidchaos/CourtDrive


NAME

CourtDrive Kroll Web Scraper

SYNOPSIS

Output raw text:

perl kroll_parser.pl --method=GET --url=https://cases.ra.kroll.com/seadrillpartners/Home-ClaimInfo --recursive=100 --debug=3
perl kroll_parser.pl --method=POST --url=https://cases.ra.kroll.com/seadrillpartners/Home-LoadClaimData --recursive=100 --debug=3

Output a JSON file:

perl kroll_parser.pl --url=https://cases.ra.kroll.com/seadrillpartners/Home-LoadClaimData --recursive=100 --format=json > 2023-07-25.json

Optionally convert a saved JSON file to another format:

perl kroll_parser.pl --file=2023-07-25.json --format=pdf > 2023-07-25.pdf
perl kroll_parser.pl --file=2023-07-25.json --format=xlsx > 2023-07-25.xlsx

DESCRIPTION

This script combines:

  • a web client which performs HTTP requests on behalf of a user

  • a parser that finds all claims in the scraped HTML, along with their relevant metadata, and builds a data tree
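
The parsing half of that description can be sketched roughly as follows. This is an illustration in Python, not the Perl code in kroll_parser.pl. The table layout, the column names, and the names ClaimTableParser and parse_claims are all assumptions made for the example; the real Kroll pages and the script's actual tree structure may differ.

```python
from html.parser import HTMLParser


class ClaimTableParser(HTMLParser):
    """Collect each table row into a dict keyed by the header row (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.headers = []        # column names taken from <th> cells
        self.claims = []         # one dict per data row
        self._row = None         # cells of the row currently being read
        self._cell = None        # text fragments of the cell currently being read
        self._is_header = False  # True once the current row contains a <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._is_header = [], False
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            self._is_header = self._is_header or tag == "th"

    def handle_data(self, data):
        # Text outside a cell (whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._is_header:
                self.headers = self._row
            elif self._row:
                self.claims.append(dict(zip(self.headers, self._row)))
            self._row = None


def parse_claims(html):
    """Return a small tree: a dict holding the list of parsed claim records."""
    parser = ClaimTableParser()
    parser.feed(html)
    parser.close()
    return {"claims": parser.claims}


# Hypothetical sample input; not taken from a real case page.
SAMPLE_HTML = """
<table>
  <tr><th>Claim</th><th>Creditor</th><th>Amount</th></tr>
  <tr><td>1</td><td>Example Creditor A</td><td>100.00</td></tr>
  <tr><td>2</td><td>Example Creditor B</td><td>250.50</td></tr>
</table>
"""

tree = parse_claims(SAMPLE_HTML)
```

Keeping the result as plain dicts and lists means the same tree can be serialized to JSON or rendered in another format without re-parsing the HTML.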

OPTIONS

--debug=N

0 = no debugging output, non-verbose
1 = basic operational output
2 = technical debugging info
3 = in-depth debugging info

--noverbose

equivalent to --debug=0

--verbose

equivalent to --debug=1

Note: put the --debug or --verbose option first on the command line to get debugging output for the options that follow it.
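
The ordering note suggests that options are handled one at a time, from left to right. A minimal Python sketch of that behaviour, assuming sequential processing, is shown here. It is not the script's Perl code, and process_options and its log format are hypothetical.

```python
def process_options(argv):
    """Handle options left to right, logging each one once debugging is on."""
    debug = 0   # assumed starting level: no debugging output
    log = []    # stands in for debugging output printed to the terminal
    for arg in argv:
        name, _, value = arg.lstrip("-").partition("=")
        if name == "debug":
            debug = int(value)
        elif name == "verbose":
            debug = 1   # --verbose is equivalent to --debug=1
        elif name == "noverbose":
            debug = 0   # --noverbose is equivalent to --debug=0
        if debug >= 1:
            log.append(f"option {name}={value!r}")
    return debug, log


# Debug option first: all three options are logged.
early = process_options(["--debug=3", "--url=http://example.test/", "--recursive=2"])

# Debug option last: only the debug option itself is logged.
late = process_options(["--url=http://example.test/", "--recursive=2", "--debug=3"])
```

Both calls end at debug level 3, but only the first records the --url and --recursive options, which is why the debug or verbose option should come first.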

--url=http://domain/path/to/scrape/

this URL will be scraped for data

--method=GET

HTTP method used for the request; the SYNOPSIS examples use GET and POST

--recursive=N

follow links to nested HTML pages up to this depth (defaults to 0, meaning links are not followed)
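
A depth limit of this kind can be sketched as a breadth-first crawl that stops following links once a page is N hops from the start URL. The Python below is only an illustration under that assumption. The in-memory PAGES site, the HREF pattern, and the crawl function are hypothetical, and the Perl script may traverse pages differently.

```python
import re
from collections import deque

# Hypothetical in-memory "site" (path -> HTML); a real run would fetch over HTTP.
PAGES = {
    "/home": '<a href="/claims">claims</a> <a href="/about">about</a>',
    "/claims": '<a href="/claims/1">claim 1</a> <a href="/home">home</a>',
    "/claims/1": '<a href="/claims/1/docs">documents</a>',
    "/claims/1/docs": "no further links",
    "/about": "no further links",
}

# Simplified link extraction, good enough for the sample pages above.
HREF = re.compile(r'href="([^"]+)"')


def crawl(start, max_depth, fetch=PAGES.get):
    """Return {url: depth} for every page reached within max_depth link hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth >= max_depth:
            continue  # depth limit reached: record the page, do not follow its links
        html = fetch(url) or ""
        for link in HREF.findall(html):
            if link not in seen:  # skip pages already visited (avoids loops)
                seen[link] = depth + 1
                queue.append(link)
    return seen


no_recursion = crawl("/home", 0)   # only the start page
one_level = crawl("/home", 1)      # start page plus the pages it links to
everything = crawl("/home", 100)   # a large limit reaches the whole sample site
```

The seen dictionary also prevents revisiting pages, so a link back to the start page does not cause an endless loop, even with a large limit such as 100.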

--format=html

format of the output; one of txt, json, html, pdf, or xlsx

--file=report.json

import a data tree from a JSON-formatted file (i.e. the output of perl kroll_parser.pl --format=json), so that the same data can be viewed and used for calculations in several ways
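
The idea of reusing one saved tree for several views can be sketched as follows. This is Python for illustration only. The tree schema, the field names, and the functions render_text and total_amount are assumptions, since the JSON layout written by --format=json is not described here.

```python
import json

# Hypothetical saved tree; the real schema produced by --format=json may differ.
SAVED = json.dumps({
    "case": "example-case",
    "claims": [
        {"number": "1", "creditor": "Example Creditor A", "amount": 100.0},
        {"number": "2", "creditor": "Example Creditor B", "amount": 250.5},
    ],
})


def render_text(tree):
    """One possible view: a plain-text listing of the claims."""
    lines = [f"Case: {tree['case']}"]
    for claim in tree["claims"]:
        lines.append(f"  #{claim['number']} {claim['creditor']}: {claim['amount']:.2f}")
    return "\n".join(lines)


def total_amount(tree):
    """One possible calculation: the sum of all claim amounts."""
    return sum(claim["amount"] for claim in tree["claims"])


tree = json.loads(SAVED)    # stands in for reading the file passed to --file
report = render_text(tree)  # a view of the data
total = total_amount(tree)  # a calculation on the same data
```

Because both functions work on the loaded tree, no further HTTP requests are needed to produce another view or calculation.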

AUTHOR

Thomas Anderson
[email protected]

COPYRIGHT

Copyright 2023
