Website Crawler

A tool that crawls a site and logs any resources that return a 404. Results are presented in a searchable, to-do-style checklist.

Setup

  1. Install Node.
  2. Clone the repo: git clone git@github.com:hudakdidit/site_crawler.git
  3. Install dependencies: npm install
  4. Set up the config file: run mv config-example.json config.json, then update the site and port properties as necessary.
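
As a rough illustration of step 4, config.json might look something like the sketch below. Only the site and port property names come from this README; the values are placeholders, and the real config-example.json may contain other keys or expect a different format, so treat the file shipped with the repo as the reference.

```json
{
  "site": "https://example.com",
  "port": 3000
}
```

Here site is assumed to be the URL the crawler starts from and port the port the web server listens on.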

Tasks

Start webpack and the Express web server

npm start

Start webpack, the Express web server, and the web crawler

npm run dev-crawl

Start the Express web server

npm run server

Start the crawler script

npm run crawl