Skip to content

Commit

Permalink
fix: resolve conflict
Browse files Browse the repository at this point in the history
  • Loading branch information
cdransf committed Jun 15, 2024
2 parents 3f65a93 + ca730a8 commit 429336d
Show file tree
Hide file tree
Showing 2 changed files with 8 additions and 2 deletions.
8 changes: 7 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,13 @@
# AI robots.txt
# ai.robots.txt

<img src="/assets/images/noai-logo.png" width="100" />

**[Subscribe to updates via RSS/Atom by clicking on this link.](https://github.com/ai-robots-txt/ai.robots.txt/releases.atom)**

_(Or paste the link into your preferred feed reader.)_

---

This is an open list of web crawlers associated with AI companies and the training of LLMs to block. We encourage you to contribute to and implement this list on your own site.

A number of these crawlers have been sourced from [Dark Visitors](https://darkvisitors.com) and we appreciate the ongoing effort they put in to track these crawlers.
Expand Down
2 changes: 1 addition & 1 deletion table-of-bot-metrics.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
| AdsBot-Google | Google | Yes (Exceptions for Dynamic Search Ads) | Analyzes website content for ad relevancy, improves ad serving for Google Ads. Data anonymized according to [Google's Privacy Policy](https://policies.google.com/privacy). Unclear on data retention or use by other products. | Varies depending on campaign activity and website updates. Crawls optimized to minimize impact, specific frequency not public. | Web crawler by Google Ads to analyze websites for ad effectiveness and ensure ad relevancy to webpage content. |
|Amazonbot | Amazon | Yes | Service improvement and enabling answers for Alexa users. | No information provided. | Includes references to crawled website when surfacing answers via Alexa; does not clearly outline other uses. |
|anthropic-ai | [Anthropic](https://www.anthropic.com) | Unclear at this time. | Scrapes data to train Anthropic's AI products. | No information provided. | Scrapes data to train LLMs and AI products offered by Anthropic. |
|Applebot-Extended | | | | | |
|Applebot-Extended | [Apple](https://support.apple.com/en-us/119829#datausage) | Yes | | | Apple has a secondary user agent, Applebot-Extended ... [that is] used to train Apple’s foundation models powering generative AI features across Apple products, including Apple Intelligence, Services, and Developer Tools. |
|AwarioRssBot | | | | | |
|AwarioSmartBot | | | | | |
|Bytespider | ByteDance | No | LLM training. | Unclear at this time. | Downloads data to train LLMS, including ChatGPT competitors. |
Expand Down

0 comments on commit 429336d

Please sign in to comment.