facebookexternalhit User Agent - Meta Bot Details | CL SEO

facebookexternalhit

Meta Since 2010
Respects robots.txt
#social #facebook #preview #opengraph

What is facebookexternalhit?

facebookexternalhit is Meta's crawler for generating link previews when URLs are shared on Facebook, Instagram, and other Meta platforms. It reads a page's Open Graph meta tags to build a preview with a title, description, and image. These previews affect how often shared links are clicked, so publishers have a direct interest in them. The crawler also refreshes previews periodically and whenever a refresh is requested through Facebook's Sharing Debugger. Sites that depend on social media traffic should make sure facebookexternalhit can reach and parse their pages.
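To make the Open Graph step concrete, here is a minimal sketch of how a preview generator might collect `og:` properties from a page. It uses only Python's standard library; the class name `OpenGraphParser`, the helper `extract_open_graph`, and the sample page are illustrative assumptions and say nothing about how Meta's crawler is actually implemented.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags from an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first occurrence of each property.
            self.tags.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in the HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="Example Article">
<meta property="og:description" content="A short summary.">
<meta property="og:image" content="https://example.com/cover.png">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""

preview = extract_open_graph(SAMPLE_PAGE)
for name, value in preview.items():
    print(name, "=", value)
```

Running the same function against your own pages is a quick way to spot a missing `og:title`, `og:description`, or `og:image` before a link is shared.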

User Agent String

facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

How to Control facebookexternalhit

Block Completely

To prevent facebookexternalhit from accessing your entire website, add this to your robots.txt file:

# Block facebookexternalhit
User-agent: facebookexternalhit
Disallow: /

Block Specific Directories

To restrict access to certain parts of your site while allowing others:

User-agent: facebookexternalhit
Disallow: /admin/
Disallow: /private/
Disallow: /wp-admin/
Allow: /public/
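Before deploying rules like these, it can help to check how a parser reads them. The sketch below feeds the directory rules to Python's standard `urllib.robotparser` and asks which paths are allowed. The helper `is_allowed` and the test paths are illustrative assumptions, and Python's parser may not interpret every rule exactly as Meta's crawler does.

```python
from urllib.robotparser import RobotFileParser

# The directory rules from this section, one directive per line.
ROBOTS_TXT = """\
User-agent: facebookexternalhit
Disallow: /admin/
Disallow: /private/
Disallow: /wp-admin/
Allow: /public/
"""


def is_allowed(path, user_agent="facebookexternalhit"):
    """Return True if the rules above let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


for path in ("/admin/users", "/public/about", "/blog/post-1"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Paths that match no rule, such as `/blog/post-1` here, are allowed by default, so the `Allow: /public/` line is only needed when a broader `Disallow` would otherwise cover that directory.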

Set Crawl Delay

To slow down the crawl rate (note: not all bots respect this directive):

User-agent: facebookexternalhit
Crawl-delay: 10
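As a quick sanity check that the directive is attached to the intended user agent, the following sketch reads the value back with Python's standard `urllib.robotparser`. The helper name `crawl_delay_for` is an assumption for illustration. The check only shows how the file parses, not whether any particular crawler honors the delay.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: facebookexternalhit
Crawl-delay: 10
"""


def crawl_delay_for(user_agent):
    """Return the Crawl-delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.crawl_delay(user_agent)


print(crawl_delay_for("facebookexternalhit"))
print(crawl_delay_for("SomeOtherBot"))
```

The second call returns `None` because the file contains no rule group for other bots and no `User-agent: *` fallback.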

How to Verify facebookexternalhit

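Verification Method:
The User-Agent header can be spoofed, so confirm that the request's source IP belongs to Meta's published IP ranges, or trigger a fetch with Facebook's Sharing Debugger and check that it appears in your logs.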

Learn more in the official documentation.
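The IP-range check can be sketched with Python's standard `ipaddress` module. The ranges below are placeholders taken from the reserved documentation blocks, not Meta's real ranges; the names `EXAMPLE_CRAWLER_RANGES`, `ip_in_ranges`, and `looks_like_meta_crawler` are assumptions for illustration. In practice the list would be filled from the ranges Meta publishes.

```python
import ipaddress

# PLACEHOLDER ranges (reserved documentation blocks), NOT Meta's real ranges.
# Replace them with the ranges obtained from Meta's official sources.
EXAMPLE_CRAWLER_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip, cidr_ranges):
    """Return True if ip falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr)
        # Compare only addresses and networks of the same IP version.
        if address.version == network.version and address in network:
            return True
    return False


def looks_like_meta_crawler(user_agent, remote_ip, cidr_ranges=EXAMPLE_CRAWLER_RANGES):
    """Require both a matching User-Agent and a source IP in a trusted range."""
    claims_to_be_bot = "facebookexternalhit" in user_agent.lower()
    return claims_to_be_bot and ip_in_ranges(remote_ip, cidr_ranges)


ua = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
print(looks_like_meta_crawler(ua, "192.0.2.10"))    # inside a placeholder range
print(looks_like_meta_crawler(ua, "198.51.100.7"))  # outside every placeholder range
```

Requiring both conditions means a request that merely copies the user agent string from an unrelated address is not treated as the real crawler.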

Detection Patterns

Multiple ways to detect facebookexternalhit in your application:

Basic Pattern

/facebookexternalhit/i

Strict Pattern

/^facebookexternalhit\/1\.1 \(\+http:\/\/www\.facebook\.com\/externalhit_uatext\.php\)$/

Flexible Pattern

/facebookexternalhit[\s\/]?[\d.]*/i

Vendor Match

/facebookexternalhit.*facebook\.com/i
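The following sketch shows how loose and strict matching behave in Python against the documented user agent string. The strict matcher is built with `re.escape` from the full string rather than hand-escaped, and the version-capturing pattern and the `classify` helper are illustrative assumptions, not part of the patterns listed above.

```python
import re

DOCUMENTED_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"

# Loose match: the bot name anywhere in the header, case-insensitive.
BASIC = re.compile(r"facebookexternalhit", re.IGNORECASE)
# Strict match: the exact documented string, escaped automatically.
STRICT = re.compile(re.escape(DOCUMENTED_UA))
# Flexible match: the bot name plus an optional version number (assumed format).
FLEXIBLE = re.compile(r"facebookexternalhit(?:[\s/](\d+(?:\.\d+)*))?", re.IGNORECASE)


def classify(user_agent):
    """Report which patterns match and which version, if any, was found."""
    flexible = FLEXIBLE.search(user_agent)
    return {
        "basic": BASIC.search(user_agent) is not None,
        "strict": STRICT.fullmatch(user_agent) is not None,
        "version": flexible.group(1) if flexible else None,
    }


print(classify(DOCUMENTED_UA))
print(classify("Mozilla/5.0 (compatible; FacebookExternalHit/2.0)"))
print(classify("Mozilla/5.0 (X11; Linux x86_64)"))
```

The loose pattern keeps working if the version or URL in the string changes, while the strict one is only suited to confirming the exact string shown on this page.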

Implementation Examples

// PHP detection for facebookexternalhit
function detect_facebookexternalhit() {
    $user_agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
    $pattern = '/facebookexternalhit/i';

    if (preg_match($pattern, $user_agent)) {
        // Log the detection
        error_log('facebookexternalhit detected from IP: ' . $_SERVER['REMOTE_ADDR']);

        // Set cache headers
        header('Cache-Control: public, max-age=3600');
        header('X-Robots-Tag: noarchive');

        // Optional: serve a cached version if one exists
        $cache_file = 'cache/' . md5($_SERVER['REQUEST_URI']) . '.html';
        if (file_exists($cache_file)) {
            readfile($cache_file);
            exit;
        }
        return true;
    }
    return false;
}

# Python/Flask detection for facebookexternalhit
import re
from flask import Flask, request

app = Flask(__name__)
BOT_PATTERN = re.compile(r'facebookexternalhit', re.IGNORECASE)

def detect_facebookexternalhit():
    user_agent = request.headers.get('User-Agent', '')
    return bool(BOT_PATTERN.search(user_agent))

@app.after_request
def add_bot_headers(response):
    # Attach caching headers to responses served to the bot
    if detect_facebookexternalhit():
        response.headers['Cache-Control'] = 'public, max-age=3600'
        response.headers['X-Robots-Tag'] = 'noarchive'
    return response

# Django middleware
class facebookexternalhitMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def detect_bot(self, request):
        user_agent = request.META.get('HTTP_USER_AGENT', '')
        return bool(BOT_PATTERN.search(user_agent))

    def __call__(self, request):
        # Flag bot traffic so views can handle it
        request.is_facebookexternalhit = self.detect_bot(request)
        return self.get_response(request)

// JavaScript/Node.js detection for facebookexternalhit
const express = require('express');
const app = express();

// Middleware to detect facebookexternalhit
function detectfacebookexternalhit(req, res, next) {
  const userAgent = req.headers['user-agent'] || '';
  const pattern = /facebookexternalhit/i;

  if (pattern.test(userAgent)) {
    // Log bot detection
    console.log('facebookexternalhit detected from IP:', req.ip);

    // Set cache headers
    res.set({
      'Cache-Control': 'public, max-age=3600',
      'X-Robots-Tag': 'noarchive'
    });

    // Mark request as bot
    req.isBot = true;
    req.botName = 'facebookexternalhit';
  }
  next();
}

app.use(detectfacebookexternalhit);

# Apache .htaccess rules for facebookexternalhit

# Block completely
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC]
RewriteRule .* - [F,L]

# Or redirect to a static version
RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC]
RewriteCond %{REQUEST_URI} !^/static/
RewriteRule ^(.*)$ /static/$1 [L]

# Or set environment variable for PHP
SetEnvIfNoCase User-Agent "facebookexternalhit" is_bot=1

# Add cache headers for this bot
<If "%{HTTP_USER_AGENT} =~ /facebookexternalhit/i">
    Header set Cache-Control "public, max-age=3600"
    Header set X-Robots-Tag "noarchive"
</If>

# Nginx configuration for facebookexternalhit

# Map user agent to variable (place in the http context)
map $http_user_agent $is_facebookexternalhit {
    default 0;
    ~*facebookexternalhit 1;
}

server {
    # Option 1: block the bot completely (remove this block to use option 2)
    if ($is_facebookexternalhit) {
        return 403;
    }

    # Option 2: serve cached content.
    # try_files is not allowed inside "if", so hand the bot to a named location.
    location / {
        error_page 418 = @bot_cache;
        if ($is_facebookexternalhit) {
            return 418;
        }
        try_files $uri @backend;
    }

    location @bot_cache {
        root /var/www/cached;
        try_files $uri $uri.html $uri/index.html @backend;
    }

    # Add headers for bot requests
    location @backend {
        if ($is_facebookexternalhit) {
            add_header Cache-Control "public, max-age=3600";
            add_header X-Robots-Tag "noarchive";
        }
        proxy_pass http://backend;
    }
}

Should You Block This Bot?

Recommendations based on your website type:

Site Type        | Recommendation | Reasoning
E-commerce       | Optional       | Evaluate based on bandwidth usage vs. benefits
Blog/News        | Allow          | Increases content reach and discoverability
SaaS Application | Block          | No benefit for application interfaces; preserve resources
Documentation    | Selective      | Allow for public docs, block for internal docs
Corporate Site   | Limit          | Allow for public pages, block sensitive areas like intranets

Advanced robots.txt Configurations

E-commerce Site Configuration

User-agent: facebookexternalhit
Crawl-delay: 5
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /api/
Disallow: /*?sort=
Disallow: /*?filter=
Disallow: /*&page=
Allow: /products/
Allow: /categories/
Sitemap: https://example.com/sitemap.xml
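The wildcard rules in this configuration are easy to misread, so the sketch below translates a robots.txt path pattern into a regular expression using the matching semantics of RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end). The helpers `robots_pattern_to_regex` and `rule_matches` are illustrative assumptions; whether facebookexternalhit itself honors wildcards is not stated here.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" where "*" appeared.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def rule_matches(pattern, path_and_query):
    """Return True if the pattern matches from the start of the path."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


# Hypothetical URLs used only to exercise the rules.
CHECKS = [
    ("/*?sort=", "/products/shoes?sort=price"),
    ("/*?sort=", "/products/shoes?color=red&sort=price"),
    ("/*&page=", "/categories/hats?color=red&page=2"),
    ("/cart/", "/cart/items"),
]

for pattern, url in CHECKS:
    print(pattern, url, rule_matches(pattern, url))
```

The second check does not match: `/*?sort=` only catches URLs where `sort` is the first query parameter, and `/*&page=` only catches `page` when it is not the first. Covering both positions takes a pair of rules per parameter.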

Publishing/Blog Configuration

User-agent: facebookexternalhit
Crawl-delay: 10
Disallow: /wp-admin/
Disallow: /drafts/
Disallow: /preview/
Disallow: /*?replytocom=
Allow: /

SaaS/Application Configuration

User-agent: facebookexternalhit
Disallow: /app/
Disallow: /api/
Disallow: /dashboard/
Disallow: /settings/
Allow: /
Allow: /pricing/
Allow: /features/
Allow: /docs/

Quick Reference

User Agent Match: facebookexternalhit
Robots.txt Name: facebookexternalhit
Category: social
Respects robots.txt: Yes