Spider.Plugin
Assembly: Spider.Plugin
Description
The Spider plugin is designed to manage robots (bots) that visit a site. It accomplishes this by performing three tasks:
- Create a list of routes that should not be visited by a bot, based on attributes placed on action methods.
- Monitor bots navigating through a website and return a 403 error if a bot enters a route that it should not visit.
- Serve /robots.txt, which is built from the routes denied via DenySpiderAttribute.
This is achieved by adding DenySpiderAttribute to a controller class or action method. The following code sample demonstrates two action methods that are denied to bots:
```csharp
[DenySpider]
[Breadcrumb(nameof(Languages.LanguageStrings.Privacy))]
public IActionResult Privacy()
{
    return View(new BaseModel(GetBreadcrumbs(), GetCartSummary()));
}

[DenySpider("*")]
[ResponseCache(Duration = 0, Location = ResponseCacheLocation.None, NoStore = true)]
public IActionResult Error()
{
    return View(new ErrorViewModel
    {
        RequestId = Activity.Current?.Id ?? HttpContext.TraceIdentifier
    });
}
```
The DenySpider attribute works by allowing users to specify which user agents are denied. Each bot should have a unique user agent; when a bot navigates a site, its first task is to read robots.txt to see whether it is disallowed from any specific routes. A well behaved bot will obey the robots.txt file and not enter a route from which it has been denied.
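For the two action methods above, the generated robots.txt would contain entries along these lines. This is an illustrative sketch: it assumes the methods live in a HomeController with default routing, and the exact formatting of the plugin's output may differ.

```
User-agent: *
Disallow: /Home/Privacy
Disallow: /Home/Error
```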
The DenySpider attribute has three constructors: the default denies all user agents (*), while the other two constructors let you name an individual agent that should be denied. You can apply multiple attributes to deny multiple agents. The third constructor also accepts a comment that will appear in the robots.txt file. The DenySpider attributes are also used to create the denied routes, with their user agents, in robots.txt.
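A minimal sketch of the three constructors follows. The parameter order is an assumption based on the description above (user agent first, comment second), and the action names are hypothetical:

```csharp
[DenySpider]                              // default constructor: denies all user agents (*)
public IActionResult Checkout() => View();

[DenySpider("BadBot")]                    // denies the named user agent only
[DenySpider("OtherBot")]                  // multiple attributes deny multiple agents
public IActionResult Cart() => View();

[DenySpider("*", "Members only area")]    // the comment also appears in robots.txt
public IActionResult Members() => View();
```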
If the deny spider middleware identifies a bot attempting to load a route from which it has been denied, a 403 Forbidden response is returned.
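The following is an illustrative sketch of that general pattern, not Spider.Plugin's actual implementation: the real middleware builds its denied route list from DenySpiderAttribute and uses more robust bot detection, whereas this sketch hard-codes two routes and matches on the User-Agent header.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Sketch only: hard-coded denied routes stand in for the list the
// plugin derives from DenySpiderAttribute.
public class DenySpiderSketchMiddleware
{
    private static readonly string[] DeniedRoutes = { "/Home/Privacy", "/Home/Error" };
    private readonly RequestDelegate _next;

    public DenySpiderSketchMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        // Crude bot detection for the sketch; real detection is more involved.
        string userAgent = context.Request.Headers["User-Agent"].ToString();
        bool looksLikeBot = userAgent.Contains("bot", StringComparison.OrdinalIgnoreCase);

        if (looksLikeBot &&
            DeniedRoutes.Contains(context.Request.Path.Value, StringComparer.OrdinalIgnoreCase))
        {
            // A bot entered a denied route: return 403 Forbidden, as the plugin does.
            context.Response.StatusCode = StatusCodes.Status403Forbidden;
            return;
        }

        await _next(context);
    }
}
```

In a standard ASP.NET Core pipeline a middleware like this would be registered with app.UseMiddleware&lt;DenySpiderSketchMiddleware&gt;(); the plugin registers its own middleware for you.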