Mirror of https://git.qwik.space/left4code/robots.txt.git (synced 2025-08-14 21:09:31 +05:30)
fixed broken README (horrible at markdown)
25
README.md
@@ -4,35 +4,38 @@ A collection of robots.txt files for sites where it's the only option to defend

 ## Use Instructions:

-###1: Pick a file you want to use.
+**1:** Pick a file you want to use.

 - **blacklist-robots.txt** - a large robots.txt file around **24Kb** in size that attempts to block everything bad, this includes search engines, AI crawlers, scrapers, and SEO bots.

 - **whitelist-robots.txt** - a small robots.txt file around **1.3Kb** (you can shrink it). Currently only allows [Wiby](https://wiby.org/) and [Marginalia Search](https://marginalia-search.com/).

-###2: Change your desired file's name to 'robots.txt'
+**2:** Change your desired file's name to 'robots.txt'

-###2.5: If you have a sitemap, add it inside file where instructed, if you don't, delete the line.
+**2.5:** If you have a sitemap, add it inside file where instructed, if you don't, delete the line.

-###3: Upload it to your site!
+**3:** Upload it to your site!

-## Request to Modify Repository
+## Request to Modify Repository 🛠️

 Would you like a bot added or removed from either the whitelist or blacklist file? My email is [Here](https://left4code.neocities.org/left4code_gpg.txt), Specify the following in your email:

-1: File to modify
-2: Bot to add or remove
-3: name of User-agent for crawler and website if possible
-4: why the bot should be added or removed.
+**1:** File to modify
+
+**2:** Bot to add or remove
+
+**3:** name of User-agent for crawler and website if possible
+
+**4:** why the bot should be added or removed.

 Currently this gitea instance does not allow for new sign-ups and therefore new PR's, email is currently the only way and will change if this instance ever opens up again. This also allows for anyone who already has an email address to make requests and hopefully should be easier to manage.

 ## Where did you get the User-agents from?

-These user agents were manually obtained from [Baccyflap's No AI webring](https://baccyflap.com/noai/), I went through the list, found the disallowed user agents, put them all into a list, and ran:
+These User-agents were manually obtained from the members of [Baccyflap's No AI webring](https://baccyflap.com/noai/), I went through the list, found the disallowed User-agents, put them all into a list, and ran:

 `sort <File> | uniq -u > final_output.txt`

 if you wanted to create a list of your own.

-## I do not guarantee this will make you impervious to bots, this is what I use. Help would be appreciated in keeping the list updated, managed, and hopefully in the future, documented.
+## I do not guarantee this will make you impervious to bots, this is what I use. Help would be appreciated in keeping the list updated, managed, and hopefully in the future, documented. ⚠️
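
Steps 1 to 3 of the README's use instructions can be sketched in shell. This is a minimal sketch under assumptions, not the author's procedure: the temporary directory stands in for a site's document root, the two-line file stands in for the real `whitelist-robots.txt`, and the sitemap URL is a placeholder. The repository's files mark where the sitemap line belongs; the sketch only appends the standard `Sitemap:` directive to the stand-in file.

```shell
#!/bin/sh
# Sketch of steps 1-3 with stand-in files; every path and URL is a placeholder.
set -eu

site_root=$(mktemp -d)                      # stand-in for your document root
src="$site_root/whitelist-robots.txt"

# Step 1: stand-in for the file you picked (the real file has more rules).
printf 'User-agent: *\nDisallow: /\n' > "$src"

# Step 2: crawlers only look for the exact name robots.txt.
cp "$src" "$site_root/robots.txt"

# Step 2.5: add a Sitemap line only if you actually have a sitemap.
sitemap_url="https://example.com/sitemap.xml"   # placeholder URL
printf 'Sitemap: %s\n' "$sitemap_url" >> "$site_root/robots.txt"

# Step 3 would be uploading robots.txt to the top level of your site.
cat "$site_root/robots.txt"
```

Crawlers request the file at `/robots.txt` on the host, so it needs to sit at the top level of the site rather than in a subdirectory.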
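
A note on the `sort <File> | uniq -u` command quoted in the README: `uniq -u` prints only lines that occur exactly once, so a User-agent that appears more than once in the combined list is dropped entirely instead of being collapsed to a single copy. If the goal is one entry per distinct User-agent, `sort -u` may be closer to what is wanted. A small sketch with sample bot names (illustrative input, not the author's list):

```shell
#!/bin/sh
set -eu
export LC_ALL=C   # fixed sort order so the output is reproducible

list=$(mktemp)
# Sample input: GPTBot is listed twice, as if two sites both disallowed it.
printf 'GPTBot\nCCBot\nGPTBot\nBytespider\n' > "$list"

# uniq -u keeps only lines that occur exactly once: GPTBot disappears.
only_once=$(sort "$list" | uniq -u)

# sort -u keeps one copy of every distinct line: GPTBot survives.
deduped=$(sort -u "$list")

printf 'uniq -u:\n%s\n\nsort -u:\n%s\n' "$only_once" "$deduped"
```

With this input the first list contains Bytespider and CCBot only, while the second also contains GPTBot.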