# clj-robots-parser

## What

A Clojure(Script) library that parses robots.txt files as specified by The Great Goog themselves. Because robots.txt is woefully underspecified in the "official" docs, this library tolerates anything it doesn't understand and extracts the data it does.

It can then use the extracted data to answer whether a given user agent is allowed to crawl a given URL.
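As a rough sketch of that parse-then-query flow (the namespace and function names below are illustrative assumptions, not the library's actual API; consult the real docs):

```clojure
;; Hypothetical usage sketch -- `parse` and `crawlable?` are assumed
;; names standing in for the library's real parse/query functions.
(require '[clj-robots-parser.core :as robots])

(let [parsed (robots/parse (slurp "https://example.com/robots.txt"))]
  ;; Ask whether Googlebot may fetch a given path on that host.
  (robots/crawlable? parsed "/private/page.html" "googlebot"))
```

The key point is the two-step shape: parse once, then run many cheap allow/disallow queries against the parsed data.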

## Why

Why follow Google's (much more stringent) documentation for handling robots.txt? Because in terms of SEO, Googlebot is the crawler you ought to care about most.