# clj-robots-parser

## What

A Clojure(Script) library that parses robots.txt files as specified by The Great Goog themselves. Because robots.txt is woefully underspecified in the "official" docs, this library tolerates anything it doesn't understand and extracts the data it does.

It can then use the extracted data to answer whether a given user agent is allowed to crawl a given URL.
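As a rough sketch of that parse-then-query flow (the namespace and function names below are illustrative assumptions, not the library's actual API; consult the real docs):

```clojure
;; Hypothetical usage sketch -- `parse` and `crawlable?` are assumed
;; names standing in for the library's real parse/query functions.
(require '[clj-robots-parser.core :as robots])

(let [parsed (robots/parse (slurp "https://example.com/robots.txt"))]
  ;; Ask whether Googlebot may fetch a given path on that host.
  (robots/crawlable? parsed "/private/page.html" "googlebot"))
```

The key point is the two-step shape: parse once, then run many cheap allow/disallow queries against the parsed data.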

## Why

Why follow Google's (much more stringent) documentation for handling robots.txt? Because in terms of SEO, Googlebot is the crawler you ought to care about most.