Difference between revisions of "Robots exclusion standard (robots.txt)"

From wikieduonline
 
* https://en.wikipedia.org/robots.txt
 
 
* https://dubai.dubizzle.com/robots.txt
 
* <code>[[wget -e]] robots=off --mirror https://www.mywebsite.org</code>
  
 
== See also ==
 
 
* {{robots.txt}}
 

Revision as of 12:51, 18 January 2024

wikipedia:robots.txt

Elastic App Search web crawler

Failed to fetch robots.txt: SSL certificate chain is invalid [unable to find valid certification path to requested target]. Make sure your SSL certificate chain is correct. For self-signed certificates or certificates signed with unknown certificate authorities, you can add your signing certificate to Enterprise Search Crawler configuration. Alternatively, you can disable SSL certificate validation (non-production environments only).


User-agent:
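
In a robots.txt file, each <code>User-agent:</code> line opens a group of rules for the named crawler, and <code>*</code> matches any crawler not named elsewhere. The sketch below shows one way to check such rules with Python's standard <code>urllib.robotparser</code> module. The rules, the crawler names <code>anybot</code> and <code>examplebot</code>, and the helper <code>build_parser</code> are hypothetical and invented for illustration; the rules are parsed from a string, so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: examplebot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Placeholder site taken from the wget example on this page.
base = "https://www.mywebsite.org"

# A crawler with no group of its own falls back to the "*" group.
print(parser.can_fetch("anybot", base + "/index.html"))
print(parser.can_fetch("anybot", base + "/private/data.html"))

# A crawler with its own group is judged by that group only.
print(parser.can_fetch("examplebot", base + "/index.html"))
```

Under these assumed rules, <code>anybot</code> may fetch everything outside <code>/private/</code>, while <code>examplebot</code> is excluded from the whole site.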

