Xenu Link Sleuth is a classic Windows desktop application for finding broken links on websites. Despite its age and spartan interface, it remains popular for its speed, reliability, and freeware status. The tool crawls a site thoroughly and reports broken links, redirects, and other issues in a simple list. While it lacks the advanced features of modern SEO suites, its efficiency at its core task keeps it relevant, and many SEO professionals still reach for it for quick broken-link checks.
User Agent String
Xenu Link Sleuth/1.3.9
How to Control Xenu Link Sleuth
Block Completely
To prevent Xenu Link Sleuth from accessing your entire website, add this to your robots.txt file:
# Block Xenu Link Sleuth
User-agent: Xenu Link Sleuth
Disallow: /
Block Specific Directories
To restrict access to certain parts of your site while allowing others:
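For example (the directory names below are placeholders for your own protected paths):
# Allow Xenu Link Sleuth everywhere except selected directories
User-agent: Xenu Link Sleuth
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/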
Detection Patterns
There are multiple ways to detect Xenu Link Sleuth in your application:
Basic Pattern
/Xenu Link Sleuth/i
Strict Pattern
/^Xenu Link Sleuth\/1\.3\.9$/
Flexible Pattern
/Xenu Link Sleuth[\s\/]?[\d.]*/i
Vendor Match
/Xenu/i
Implementation Examples
// PHP Detection for Xenu Link Sleuth
function detect_xenu_link_sleuth() {
    $user_agent = $_SERVER['HTTP_USER_AGENT'] ?? '';

    if (preg_match('/Xenu Link Sleuth/i', $user_agent)) {
        // Log the detection
        error_log('Xenu Link Sleuth detected from IP: ' . $_SERVER['REMOTE_ADDR']);

        // Set cache headers
        header('Cache-Control: public, max-age=3600');
        header('X-Robots-Tag: noarchive');

        // Optional: serve a cached version if one exists
        $cache_file = 'cache/' . md5($_SERVER['REQUEST_URI']) . '.html';
        if (file_exists($cache_file)) {
            readfile($cache_file);
            exit;
        }
        return true;
    }
    return false;
}
# Python/Flask Detection for Xenu Link Sleuth
import re

from flask import request

def detect_xenu_link_sleuth():
    user_agent = request.headers.get('User-Agent', '')
    return bool(re.search(r'Xenu Link Sleuth', user_agent, re.IGNORECASE))

# Set cache headers on the actual response served to the crawler
# ('app' is the Flask application instance)
@app.after_request
def xenu_cache_headers(response):
    if detect_xenu_link_sleuth():
        response.headers['Cache-Control'] = 'public, max-age=3600'
        response.headers['X-Robots-Tag'] = 'noarchive'
    return response
# Django Middleware
import re

class XenuLinkSleuthMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def detect_bot(self, request):
        user_agent = request.META.get('HTTP_USER_AGENT', '')
        return bool(re.search(r'Xenu Link Sleuth', user_agent, re.IGNORECASE))

    def __call__(self, request):
        if self.detect_bot(request):
            # Mark the request so views can handle crawler traffic
            request.is_bot = True
        return self.get_response(request)
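To activate the middleware, register it in your Django settings; the module path below is a placeholder for wherever the class actually lives:
# settings.py ('yourapp.middleware' is a hypothetical module path)
MIDDLEWARE = [
    # ... existing middleware ...
    'yourapp.middleware.XenuLinkSleuthMiddleware',
]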
// JavaScript/Node.js Detection for Xenu Link Sleuth
const express = require('express');
const app = express();

// Middleware to detect Xenu Link Sleuth
function detectXenuLinkSleuth(req, res, next) {
    const userAgent = req.headers['user-agent'] || '';
    const pattern = /Xenu Link Sleuth/i;

    if (pattern.test(userAgent)) {
        // Log bot detection
        console.log('Xenu Link Sleuth detected from IP:', req.ip);

        // Set cache headers
        res.set({
            'Cache-Control': 'public, max-age=3600',
            'X-Robots-Tag': 'noarchive'
        });

        // Mark request as bot
        req.isBot = true;
        req.botName = 'Xenu Link Sleuth';
    }
    next();
}

app.use(detectXenuLinkSleuth);
# Apache .htaccess rules for Xenu Link Sleuth

# Block completely (the pattern contains spaces, so it must be quoted)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "Xenu Link Sleuth" [NC]
RewriteRule .* - [F,L]

# Or redirect to a static version
RewriteCond %{HTTP_USER_AGENT} "Xenu Link Sleuth" [NC]
RewriteCond %{REQUEST_URI} !^/static/
RewriteRule ^(.*)$ /static/$1 [L]

# Or set an environment variable for PHP
SetEnvIfNoCase User-Agent "Xenu Link Sleuth" is_bot=1

# Add cache headers for this bot (Apache 2.4+)
<If "%{HTTP_USER_AGENT} =~ /Xenu Link Sleuth/i">
    Header set Cache-Control "public, max-age=3600"
    Header set X-Robots-Tag "noarchive"
</If>
# Nginx configuration for Xenu Link Sleuth

# Map the user agent to a variable (the pattern contains spaces, so it must be quoted)
map $http_user_agent $is_xenu_link_sleuth {
    default 0;
    "~*Xenu Link Sleuth" 1;
}

server {
    # Block the bot completely
    if ($is_xenu_link_sleuth) {
        return 403;
    }

    # Or serve cached content; try_files is not allowed inside "if",
    # so bot traffic is rewritten into a dedicated location instead
    location / {
        if ($is_xenu_link_sleuth) {
            rewrite ^ /cached$uri last;
        }
        try_files $uri @backend;
    }

    location /cached/ {
        internal;
        root /var/www;
        try_files $uri $uri.html $uri/index.html @backend;
    }

    # Add headers for bot requests
    location @backend {
        if ($is_xenu_link_sleuth) {
            add_header Cache-Control "public, max-age=3600";
            add_header X-Robots-Tag "noarchive";
        }
        proxy_pass http://backend;
    }
}
Should You Block This Bot?
Recommendations based on your website type:
| Site Type | Recommendation | Reasoning |
|---|---|---|
| E-commerce | Optional | Evaluate based on bandwidth usage vs. benefits |
| Blog/News | Allow | Link checks are lightweight and can surface broken links worth fixing |
| SaaS Application | Block | No benefit for application interfaces; preserve resources |
| Documentation | Selective | Allow for public docs, block for internal docs |
| Corporate Site | Limit | Allow for public pages, block sensitive areas like intranets |
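For the Selective and Limit rows, the per-directory robots.txt pattern shown earlier is the simplest control. To enforce it at the server instead, the Nginx map from above can be scoped to protected prefixes; a minimal sketch, assuming the $is_xenu_link_sleuth variable defined earlier and placeholder path names:
# Deny the crawler only under protected prefixes (paths are examples)
location ~ ^/(intranet|internal-docs)/ {
    if ($is_xenu_link_sleuth) {
        return 403;
    }
    try_files $uri $uri/ =404;
}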