facebookexternalhit

Name: facebookexternalhit
Author: Meta

Meta • Since 2010

Respects robots.txt

#social #facebook #preview #opengraph

Quick Actions

Official Docs

What is facebookexternalhit?

Facebookexternalhit is Meta's crawler that generates link previews when URLs are shared on Facebook, Instagram, and other Meta platforms. The bot reads Open Graph meta tags to create rich previews with titles, descriptions, and images. It's essential for content publishers as these previews significantly impact click-through rates on social media. The crawler also refreshes previews periodically and when explicitly requested through Facebook's Sharing Debugger tool. For websites relying on social media traffic, ensuring facebookexternalhit can properly access and parse their content is crucial for effective social media presence.

User Agent String

facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

How to Control facebookexternalhit

Block Completely

To prevent facebookexternalhit from accessing your entire website, add this to your robots.txt file:

# Block facebookexternalhit
User-agent: facebookexternalhit
Disallow: /

Block Specific Directories

To restrict access to certain parts of your site while allowing others:

User-agent: facebookexternalhit
Disallow: /admin/
Disallow: /private/
Disallow: /wp-admin/
Allow: /public/

Set Crawl Delay

To slow down the crawl rate (note: not all bots respect this directive):

User-agent: facebookexternalhit
Crawl-delay: 10

How to Verify facebookexternalhit

Verification Method:
Check for Facebook IP ranges or use Facebook's Sharing Debugger

Learn more in the official documentation.

Detection Patterns

Multiple ways to detect facebookexternalhit in your application:

Basic Pattern

/facebookexternalhit/i

Strict Pattern

/^facebookexternalhit/1\.1 $\+http\://www\.facebook\.com/externalhit_uatext\.php$$/

Flexible Pattern

/facebookexternalhit[\s\/]?[\d\.]*?/i

Vendor Match

/.*Meta.*facebookexternalhit/i

Implementation Examples

// PHP Detection for facebookexternalhit function detect_facebookexternalhit() { $user_agent = $_SERVER['HTTP_USER_AGENT'] ?? ''; $pattern = '/facebookexternalhit/i'; if (preg_match($pattern, $user_agent)) { // Log the detection error_log('facebookexternalhit detected from IP: ' . $_SERVER['REMOTE_ADDR']); // Set cache headers header('Cache-Control: public, max-age=3600'); header('X-Robots-Tag: noarchive'); // Optional: Serve cached version if (file_exists('cache/' . md5($_SERVER['REQUEST_URI']) . '.html')) { readfile('cache/' . md5($_SERVER['REQUEST_URI']) . '.html'); exit; } return true; } return false; }

# Python/Flask Detection for facebookexternalhit import re from flask import request, make_responsedef detect_facebookexternalhit(): user_agent = request.headers.get('User-Agent', '') pattern = r'facebookexternalhit' if re.search(pattern, user_agent, re.IGNORECASE): # Create response with caching response = make_response() response.headers['Cache-Control'] = 'public, max-age=3600' response.headers['X-Robots-Tag'] = 'noarchive' return True return False# Django Middleware class facebookexternalhitMiddleware: def __init__(self, get_response): self.get_response = get_response def __call__(self, request): if self.detect_bot(request): # Handle bot traffic pass return self.get_response(request)

// JavaScript/Node.js Detection for facebookexternalhit const express = require('express'); const app = express();// Middleware to detect facebookexternalhit function detectfacebookexternalhit(req, res, next) { const userAgent = req.headers['user-agent'] || ''; const pattern = /facebookexternalhit/i; if (pattern.test(userAgent)) { // Log bot detection console.log('facebookexternalhit detected from IP:', req.ip); // Set cache headers res.set({ 'Cache-Control': 'public, max-age=3600', 'X-Robots-Tag': 'noarchive' }); // Mark request as bot req.isBot = true; req.botName = 'facebookexternalhit'; } next(); }app.use(detectfacebookexternalhit);

# Apache .htaccess rules for facebookexternalhit# Block completely RewriteEngine On RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC] RewriteRule .* - [F,L]# Or redirect to a static version RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC] RewriteCond %{REQUEST_URI} !^/static/ RewriteRule ^(.*)$ /static/$1 [L]# Or set environment variable for PHP SetEnvIfNoCase User-Agent "facebookexternalhit" is_bot=1# Add cache headers for this bot <If "%{HTTP_USER_AGENT} =~ /facebookexternalhit/i"> Header set Cache-Control "public, max-age=3600" Header set X-Robots-Tag "noarchive" </If>

# Nginx configuration for facebookexternalhit# Map user agent to variable map $http_user_agent $is_facebookexternalhit { default 0; ~*facebookexternalhit 1; }server { # Block the bot completely if ($is_facebookexternalhit) { return 403; } # Or serve cached content location / { if ($is_facebookexternalhit) { root /var/www/cached; try_files $uri $uri.html $uri/index.html @backend; } try_files $uri @backend; } # Add headers for bot requests location @backend { if ($is_facebookexternalhit) { add_header Cache-Control "public, max-age=3600"; add_header X-Robots-Tag "noarchive"; } proxy_pass http://backend; } }

Should You Block This Bot?

Recommendations based on your website type:

Site Type	Recommendation	Reasoning
E-commerce	Optional	Evaluate based on bandwidth usage vs. benefits
Blog/News	Allow	Increases content reach and discoverability
SaaS Application	Block	No benefit for application interfaces; preserve resources
Documentation	Selective	Allow for public docs, block for internal docs
Corporate Site	Limit	Allow for public pages, block sensitive areas like intranets

Advanced robots.txt Configurations

E-commerce Site Configuration

User-agent: facebookexternalhit Crawl-delay: 5 Disallow: /cart/ Disallow: /checkout/ Disallow: /my-account/ Disallow: /api/ Disallow: /*?sort= Disallow: /*?filter= Disallow: /*&page= Allow: /products/ Allow: /categories/ Sitemap: https://example.com/sitemap.xml

Publishing/Blog Configuration

User-agent: facebookexternalhit Crawl-delay: 10 Disallow: /wp-admin/ Disallow: /drafts/ Disallow: /preview/ Disallow: /*?replytocom= Allow: /

SaaS/Application Configuration

User-agent: facebookexternalhit Disallow: /app/ Disallow: /api/ Disallow: /dashboard/ Disallow: /settings/ Allow: / Allow: /pricing/ Allow: /features/ Allow: /docs/

Quick Reference

User Agent Match

facebookexternalhit

Robots.txt Name

facebookexternalhit

Respects robots.txt

Yes

← Back to User Agent Directory

Copied to clipboard!

facebookexternalhit

What is facebookexternalhit?

User Agent String

How to Control facebookexternalhit

Block Completely

Block Specific Directories

Set Crawl Delay

How to Verify facebookexternalhit

Detection Patterns

Basic Pattern

Strict Pattern

Flexible Pattern

Vendor Match

Implementation Examples

Should You Block This Bot?

Advanced robots.txt Configurations

E-commerce Site Configuration

Publishing/Blog Configuration

SaaS/Application Configuration

Quick Reference

User Agent Match

Robots.txt Name

Category

Respects robots.txt

Technical Details

Quick Patterns

Tags

Related User Agents