Cloaking is a black-hat technique in SEO and digital marketing in which a website presents different content or URLs to search engine crawlers than it shows to human users. In other words, the page that Google’s bot sees is not the same page a normal visitor sees. The purpose of cloaking is usually to manipulate search rankings.
This deceptive tactic is essentially a way to cheat the system by hiding the true content behind a figurative cloak, showing one thing to search engines and another to people.
Cloaking can take many forms. A classic example is a website that serves a keyword-stuffed HTML page to Googlebot but shows a slick graphical page (with little text) to users. The search engine indexes the first version (helping the site rank for those keywords), whereas real visitors never see that SEO content. By doing this, a site might rank high on Google for certain queries but then deliver completely different information or even spam to the user, leading to a poor or misleading user experience.
Why is Cloaking Not Recommended by Google?
Google explicitly forbids cloaking because it undermines the integrity of search results. It is considered a serious violation of Google’s Webmaster Guidelines because it gives users something different from what they expected when they clicked the result.
Cloaking is a high-risk game. Search engines actively look for it. If they catch it, penalties are usually swift and harsh to discourage the practice.
When a user clicks on a search result, they expect to see the content that was described by the search engine – not a bait-and-switch. By showing one thing to search engines and another to users, cloaking misleads users and search engines alike. This erodes trust in the search engine’s ability to deliver relevant, accurate results.
Because of this, the consequences of cloaking can be severe. Google’s stance is to penalize or even ban sites that engage in cloaking. In fact, Google warns that if they detect cloaking on your site, you may be removed entirely from the Google index.
In practical terms, a manual action (penalty) for cloaking can lead to your pages dropping out of search results or your entire site being deindexed. Other search engines (Bing, etc.) have similar policies disallowing cloaking. In short, the risk far outweighs any temporary benefit – using cloaking can destroy your SEO rather than help it.
How Does Cloaking Work?
Cloaking works by delivering content conditionally based on who (or what) is requesting the page. Technically, most cloaking methods involve two steps:
- Identify the visitor: When a page is requested, the server (or sometimes client-side script) determines whether the visitor is a search engine’s crawler or a normal user. This is done by checking factors like the visitor’s IP address or the User-Agent string in the HTTP request, which usually identifies the browser or bot. For example, Google’s crawler identifies itself as Googlebot in the User-Agent, and it comes from specific IP ranges that belong to Google. A cloaking script has a list of known crawler user-agents or IPs to watch for.
- Serve different content accordingly: If the script detects a search engine crawler, it will serve up the SEO version of the content; if it’s a regular user, it serves the normal content. Essentially, the web server maintains two versions of the page and chooses one based on the identification step. In server-side cloaking, this content-switching logic is implemented on the server (for instance, in a PHP script or Apache/.htaccess rules). In some cases, cloaking can also be done on the client side – for example, the server sends the same page to everyone, but includes a piece of JavaScript that hides or replaces content when run in a user’s browser. Since search engine bots often don’t execute JavaScript like a browser would, they might index the original content, while human users running the script see something different.
Server-side cloaking: This is the most common approach. The web server examines the incoming request’s headers (like IP, User-Agent, etc.) before deciding what to output. For instance, a PHP script might do:
// PHP sketch of the branch: check the User-Agent header and serve one of two files (real cloaking scripts often check IPs as well)
if (strpos($_SERVER['HTTP_USER_AGENT'] ?? '', 'Googlebot') !== false) { readfile('google_page.html'); } else { readfile('normal_page.html'); }
Server-side cloaking is powerful because it’s seamless – the user only ever gets one version of the HTML, and they have no easy way to know a different version existed.
Client-side cloaking: Less common, but it exists. This could involve serving content that is visible in the raw HTML (which the crawler sees) and then using front-end code (JavaScript or CSS) to remove or alter that content for users. For example, a page could have a block of text that is only there for SEO. A script might remove that block as soon as the page loads, so users never see it. The crawler, which might not run the script, would still see the block of keyword-rich text and rank the page accordingly. (This borders on techniques like hidden text, which is also against guidelines.) Another client-side trick is an immediate redirect for users (via JavaScript or a meta refresh) that sends them to a different page, while search bots that don’t execute the script stay on the original page. However, modern crawlers do execute JavaScript to some extent, so this method is less effective and still very risky.
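To make the client-side pattern concrete, here is a minimal sketch in browser JavaScript. It is purely illustrative: the element id and the crude user-agent test are hypothetical stand-ins, and as noted above this approach is both risky and increasingly detectable.

// Illustrative client-side cloaking sketch. The element id "seo-text" and the
// user-agent test are hypothetical, not a real detection method.
document.addEventListener('DOMContentLoaded', function () {
  var looksLikeRealBrowser = !/Googlebot|Bingbot/i.test(navigator.userAgent);
  if (looksLikeRealBrowser) {
    var block = document.getElementById('seo-text'); // keyword-stuffed block in the raw HTML
    if (block) {
      block.remove(); // human visitors never see it; a non-rendering crawler still indexes it
    }
  }
});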
Types of Cloaking in SEO
There are several types of cloaking, classified by what signal is used to differentiate the user versus the crawler. Below are the main types of cloaking in SEO, each with an explanation and a real-world example.
1. IP-based Cloaking
Definition: IP-based cloaking means the website checks the visitor’s IP address to determine if it belongs to a search engine. Based on that, it serves different content. Search engines have known IP ranges (for example, Googlebot often comes from IPs that reverse-resolve to *.googlebot.com). In IP cloaking, the server maintains a list of these bot IPs. If an incoming request comes from one of those, it assumes it’s a crawler and serves the special content. If the IP is from any other source (an ordinary user’s ISP), it serves the regular site.
Example
Imagine a website has two versions of its homepage. Version A is a plain HTML page stuffed with keywords and paragraphs about "buy cheap medications online" – this is meant for crawlers. Version B is a glossy page with product images and a sign-up form – this is meant for real visitors. When Googlebot (from a Google IP) visits, the site detects the IP and shows Version A (the text-heavy page). When a user visits from their home internet, the site shows Version B. In effect, Google indexes a page full of text and thinks the site is very relevant to cheap medications, boosting its ranking. But when users click that Google result, they get a page that might be just a sign-up or advertisement with nowhere near the promised content. This is IP cloaking in action.
A real scenario of IP cloaking could be a site identifying all search engine bots by IP and giving them an optimized page. For instance, when an IP address associated with a search engine bot (e.g., Googlebot) visits the site, the server delivers a webpage with more keywords. However, if the visitor’s IP is not a known search engine, the server delivers a completely different version.
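The detection step often relies on reverse DNS, since Google documents that genuine Googlebot IPs resolve to hostnames ending in googlebot.com or google.com. The sketch below (Node.js) shows that check in isolation; a cloaking script would simply branch on its result, and the same check is also how you legitimately verify Googlebot traffic.

// Sketch: classify a request IP via reverse DNS (Node.js).
// A full verification would also resolve the returned hostname back to the IP.
const { reverse } = require('dns').promises;

async function isGooglebotIp(ip) {
  try {
    const hostnames = await reverse(ip); // e.g. ["crawl-66-249-66-1.googlebot.com"]
    return hostnames.some(h => /\.(googlebot|google)\.com$/i.test(h));
  } catch (err) {
    return false; // no PTR record: treat as a normal visitor
  }
}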
2. User-Agent Cloaking
Definition: User-Agent cloaking checks the User-Agent string that browsers and bots send in the HTTP request headers. This string identifies who is visiting (e.g., Chrome browser, or Googlebot, or Bingbot). In this form of cloaking, the server looks for substrings like Googlebot or Bingbot in the User-Agent. If it matches a known crawler’s name, the server serves the special content; if it’s a normal browser user-agent (like Mozilla/5.0 (Windows NT 10.0; Win64; x64)…Chrome/117.0.…), it serves the regular content.
For instance, a news site might cloak by User-Agent. To Googlebot it serves an article page full of crawlable text. But to regular users, it serves a heavy multimedia page or even a paywall that blocks the text. Because Googlebot identifies itself via its User-Agent, it never hits the paywall – it sees the full text. This is cloaking (and also violates Google’s policy on paywalls unless you use approved methods). Another scenario: User-agent cloaking can identify the visitor’s user-agent, and if it’s a search engine, serve an optimized page made purely for ranking, while serving a completely different (often lower-quality) page to users.
In other words, the site might have a dummy page for bots and a normal page for users, switching based on the User-Agent string alone.
User-Agent cloaking is very similar to IP cloaking in outcome; often both are used together (checking IP and UA) to be extra sure the visitor is a search bot. But even just UA-based detection is enough to cloak. It’s also sometimes called browser cloaking if used to target specific browsers, which we’ll discuss below.
Example
A website might have a script that reads the User-Agent. Say a request comes in with User-Agent: Googlebot/2.1 (+http://www.google.com/bot.html). The site detects Googlebot in that string and then, instead of the normal page, it serves an alternate page tailored to SEO (perhaps with extra text, or a page that is otherwise hidden). Conversely, if the User-Agent is something like Mozilla/5.0 (Windows NT 10.0; Win64; x64)…Safari/605 (which indicates a human using a regular desktop browser), the site serves the usual user-facing page.
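Spelled out, the branch looks much like the PHP sketch earlier, just keyed on a list of crawler substrings. A rough Node.js/Express version, with hypothetical file paths and bot list:

// User-Agent cloaking sketch (Node.js/Express). Paths and the bot list are hypothetical.
const express = require('express');
const app = express();

const BOT_PATTERN = /Googlebot|Bingbot|DuckDuckBot/i;

app.get('/', (req, res) => {
  const ua = req.get('User-Agent') || '';
  if (BOT_PATTERN.test(ua)) {
    res.sendFile('/var/www/seo_page.html');    // keyword-heavy version served to crawlers
  } else {
    res.sendFile('/var/www/normal_page.html'); // what human visitors actually get
  }
});

app.listen(3000);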
3. HTTP Header Cloaking
Definition: This is a broader category where any HTTP header differences can be used to differentiate users from bots. The User-Agent (discussed above) is one HTTP header, but there are others like Accept-Language, Accept, or even the presence/absence of cookies. In HTTP header cloaking, the server examines the incoming request headers for patterns typical of bots versus humans. For example, a normal browser might send a bunch of headers (language preference, accepted content types, cookies, etc.), whereas a simple bot might have a more minimal header set. A cloaking script might exploit these differences.
Geo-location cloaking (GeoIP) can also be considered a subset: using the client IP (which comes from the TCP connection rather than a header) to infer location and then altering content – we cover that next. The key idea is that HTTP header cloaking uses technical request details beyond just the user-agent to decide what content to show. One popular misuse mentioned in industry is cloaking via the Accept-Language header to rank in regions the site doesn’t actually serve: for example, showing French content to Googlebot when it crawls from France to rank in French SERPs, even if users from France actually get redirected or see different content. This can manipulate local search rankings.
In summary, HTTP header cloaking is a grab-bag of tricks: any trait in the request that distinguishes a bot from a human can be leveraged (headers, IP, etc.) to show different output. It often overlaps with IP and User-Agent cloaking but extends to other header fields as well.
Example
One form of header cloaking involves the Accept-Language header. A search engine crawler might not always set an Accept-Language (or if it does, it might default to en-US), whereas a real user’s browser usually sends an Accept-Language reflecting their OS/browser settings (like en-US,en;q=0.9 meaning English). A site could detect this: if no language header (or a specific one) is present, it assumes it’s the crawler and serves a certain version of the page. If a normal browser language is present, serve the regular page. As an illustration, some content farms tried to target multiple locales – e.g., show Google’s crawler a page in English full of keywords to rank globally, but show actual users from a non-English region a page in another language or a message. This can be considered cloaking if the content differs in a deceptive way for ranking purposes.
Another example: checking for the presence of cookies or certain HTTP headers like DNT (Do Not Track). Bots typically don’t carry login or session cookies. A cloaked site might serve the ordinary page to anyone with a session cookie (assuming those are real users) but serve the SEO version to requests without one (assuming those are fresh crawlers).
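As a rough illustration of header-profile cloaking, the sketch below (Node.js/Express) scores a request by headers other than the User-Agent. The specific heuristics are hypothetical examples, not a reliable bot test.

// HTTP header cloaking sketch: branch on the overall header profile, not the User-Agent.
const express = require('express');
const app = express();

function looksLikeCrawler(req) {
  const hasLanguage = Boolean(req.get('Accept-Language')); // most real browsers send this
  const hasCookies = Boolean(req.get('Cookie'));           // bots rarely carry session cookies
  const acceptsHtml = (req.get('Accept') || '').includes('text/html');
  // A sparse header set is treated as "probably a crawler" in this sketch.
  return !hasLanguage && !hasCookies && acceptsHtml;
}

app.get('/', (req, res) => {
  res.send(looksLikeCrawler(req) ? 'SEO version of the page' : 'normal version of the page');
});

app.listen(3000);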
4. GeoIP Cloaking
Definition: GeoIP cloaking serves different content based on the visitor’s geographic location (determined by their IP address lookup). It’s similar to IP cloaking, but the decision is based on location rather than simply "is this a known bot IP?". Typically, this is used to show search engine crawlers content intended for a specific region. If done improperly, it can become cloaking – especially if the crawler is treated differently than a normal user from the same location would be.
It’s important to note: serving different content by geography in itself isn’t automatically cloaking. Websites legitimately do geo-targeting (e.g., showing US visitors prices in USD and UK visitors prices in GBP). That’s fine if every user in the US sees the USD content and Googlebot crawling from the US also sees USD content. It becomes cloaking if you, for instance, detect Googlebot (which usually comes from a US-based IP) and show it content for US that regular US visitors don’t actually see, or if you hide content from Googlebot that other users in that region get. Google’s guidance is that you should treat Googlebot as a typical user from whatever location it’s coming from – not give it special treatment.
While geo-targeting for users (like language or currency changes) is legitimate, you must ensure that search bots get the same regional content a user from that region would get. If you deliver something only to the bot or hide something from the bot that a local user sees, that’s cloaking.
Example
Suppose you operate in two countries, but you really want to rank in Country A’s Google results even though your content is mainly for Country B. You might implement GeoIP cloaking such that any request from a Country A IP (which would include Googlebot if it crawls from a data center in Country A) gets a webpage with tailored content (perhaps translated or stuffed with local keywords), whereas users from Country B (your real audience) get the normal content. This means Google sees a version aiming at Country A’s market, possibly boosting your ranking there, but local users might never see that exact version. This is cloaking.
Another scenario: Some affiliate or scam sites used to show safe content to US visitors (because that’s where a lot of manual reviewers or Google crawlers originate) but show more aggressive content or ads to visitors from other countries. If Googlebot is usually coming from the US, the site might cloak by not showing the spammy content to US IPs (thus Google doesn’t index the spam). However, a user from Europe might access the same URL and get redirected to a completely different, spam-laden page. That’s GeoIP cloaking used as a spam tactic.
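The line between geo-targeting and GeoIP cloaking comes down to what the decision is keyed on. A minimal sketch, where lookupCountry() is a hypothetical stand-in for any GeoIP database lookup:

// Geo decision sketch. lookupCountry() is a hypothetical placeholder for a GeoIP lookup.
function lookupCountry(ip) {
  return 'US'; // stand-in: a real implementation would query a GeoIP database
}

function chooseVersion(ip, userAgent) {
  const country = lookupCountry(ip);

  // Legitimate geo-targeting: the choice depends only on location, so Googlebot
  // crawling from country X gets exactly what users in country X get.
  const regionalPage = country === 'US' ? 'us_page.html' : 'intl_page.html';

  // Adding a branch like the one below would turn this into cloaking, because the
  // crawler would receive something no real visitor from that country sees:
  // if (/Googlebot/i.test(userAgent)) return 'seo_only_page.html';

  return regionalPage;
}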
5. Browser Cloaking
Definition: Browser cloaking is a variant where the site serves different content to visitors based on the specific web browser (or device type) they are using, including treating crawlers as a special browser. It’s essentially an extension of user-agent cloaking: instead of just a binary bot vs user, it might serve a unique version for Chrome, another for Firefox, another for Safari, and another for Googlebot, etc.
In legitimate development, sites do adapt to different browsers or devices for compatibility (e.g., showing a simpler layout on IE11 versus Chrome). But that usually shouldn’t affect the actual content. Browser cloaking crosses into cloaking territory when the content delivered to Google’s browser (crawler) is enriched or different purely to game rankings.
Some years ago, sites would detect if a visitor was using the text-only Lynx browser (an indicator it might be a bot or a very old system) and then output a plain HTML page. Search bots, not running a real browser, could be lumped into this and thus always got the plain HTML (with all the text for SEO). Regular users on Chrome got a fancy page that might even fetch content dynamically via JS (which the bot wouldn’t get). This means the bot indexed content that human Chrome users might not directly get – a form of cloaking via user-agent detection.
Browser cloaking is less common today but refers to serving different versions for different browser agents, with the intent to also serve a special version to search engine bots. It’s similar to user-agent cloaking; you can think of it as user-agent cloaking where multiple user-agents (browsers) each get tailored content. Unless those differences are purely cosmetic or oriented towards genuine user experience per device (and not favoring bots with extra info), it’s not allowed.
Example
A website could decide to show a lightweight, text-only version of a page to anyone using, say, an older browser or a text-only browser – and they might classify Googlebot under that umbrella (since Googlebot isn’t a typical browser). Meanwhile, users on modern browsers get the full interactive version. If those versions differ in substantive content, it’s cloaking. For instance, the site might serve Browser A (and Googlebot) a page with lots of crawlable text, but serve Browser B (e.g., Chrome) a page that is mostly images and maybe requires login to see details. This way, Googlebot (treated like Browser A) indexes content that Browser B users can’t actually see without logging in or it’s not present.
6. CNAME Cloaking (or DNS Cloaking)
Definition: CNAME cloaking is a more unusual technique that involves the Domain Name System (DNS) rather than the HTTP content directly. A CNAME record is a DNS alias – it can make one domain name point to another. In CNAME cloaking, a third-party domain is masked under a first-party domain via DNS configuration. This is often used not so much to show different content to Google vs users, but to disguise the true source of content or tracking.
One common usage is by analytics or advertising networks: they might use a CNAME record so that a resource from tracker.example.com (a CNAME pointing to the third-party tracker’s servers) appears to come from the first-party site. To the user (and sometimes to basic crawlers), it looks like content or data exchange is happening with the first-party site, while in reality data is being sent to a third-party. In the context of cloaking, this is more related to hiding who is serving the content rather than swapping content per se.
CNAME cloaking is less about dynamic switching on the fly and more about masking identities. Essentially, it’s a privacy-invasive tactic and can violate guidelines or even browser policies (Safari’s ITP specifically cracked down on CNAME cloaking by limiting cookies for such scenarios).
CNAME/DNS cloaking isn’t the typical "show Google this, show users that" scenario. It’s more about cloaking who is delivering the content or gathering data. It’s mentioned in SEO contexts usually when discussing cloaking as a broad concept, but it’s arguably more relevant to analytics/tracking and sneaky redirects. Be aware that using DNS tricks to hide content sources or to pretend content is from your domain (when it’s not) can be seen as a form of cloaking as well.
Example
A marketing company might provide a script that a website owner installs. That script calls analytics.mysite.com/script.js, but analytics.mysite.com is actually a CNAME that resolves to the marketing company’s server. Because of this, any tracking pixels or even content fetched from that subdomain bypass third-party blockers and appear as first-party. For SEO, this could be used to inject content or links that appear to be part of the site but are actually controlled elsewhere.
Another scenario could be using DNS to cloak a redirect or different site: e.g., secret.mysite.com CNAMEs to spamnetwork.com. To a casual observer, it’s a subdomain of mysite (perhaps not obviously tied to the main site), and it might host content that search engines index as if it’s part of mysite.com. But users are subtly redirected or content is loaded from spamnetwork.com under that alias, without their knowledge. This is quite advanced and not commonly encountered by beginners, but it has been used by some to avoid detection.
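From the outside, one way to spot this setup is to check whether a first-party-looking subdomain is actually a CNAME alias for someone else’s host, for example with dig (dig analytics.mysite.com CNAME) or a short script. A Node.js sketch, using the hypothetical subdomain from the example above:

// Check whether a subdomain is a CNAME alias for a third-party host (Node.js).
const { resolveCname } = require('dns').promises;

async function checkAlias(hostname) {
  try {
    const targets = await resolveCname(hostname); // e.g. ["collector.third-party-tracker.net"]
    console.log(hostname + ' is a CNAME for: ' + targets.join(', '));
  } catch (err) {
    console.log(hostname + ' has no CNAME record (or the lookup failed)');
  }
}

checkAlias('analytics.mysite.com'); // hypothetical subdomain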
7. Referrer Cloaking
Definition: Referrer cloaking alters what is shown based on the referring source of the visitor. The referrer (HTTP Referer header) tells a site what link you clicked to get there. In referrer cloaking, the site checks where the visitor came from – e.g., a Google search, a specific website, an ad campaign, etc. – and then serves different content accordingly.
Referrer cloaking thus is about where the traffic comes from. It’s widely used (often unethically) in the marketing world. However, there is a legitimate side-note: link cloaking for affiliate links, where you mask an affiliate URL behind your own domain (like yoursite.com/product-recommend redirects to the affiliate link), is sometimes also called cloaking. But that practice, if it’s a true redirect that the user expects, is generally fine (it’s more for aesthetics and tracking). The key difference is that ethical link cloaking (e.g., for affiliate links) doesn’t show different content; it just hides the ugly URL. The user still ends up on the real target page they expected (just via a nicer-looking link). In contrast, referrer cloaking in a black-hat sense means deceiving either the traffic source or the user about where they’re going.
Example
This technique is often associated with affiliate marketing and ad networks. For instance, an affiliate marketer might not want the affiliate program (or ad platform) to see the final landing page they’re sending users to (especially if it violates some rules). So, if the user’s referrer is an ad network (like Facebook Ads or Google Ads) or a search engine, they show a harmless page (sometimes called a "clean" or "bridge" page). But if the user arrives with no referrer (for example, after that initial page they are redirected with an empty referrer) or from a specific controlled page, then the site redirects or shows the real sales page. This is often called cloaking in the affiliate world – showing the ad network one thing while funneling users to something else.
In SEO terms, referrer cloaking could work like this: If a visitor comes from Google’s search results (referrer contains www.google.com), the site assumes it’s a Google user (possibly also that Googlebot would not have a referrer or might simulate one in some cases) and shows content A. If the visitor comes from somewhere else or directly (no referrer), it shows content B. A dishonest use-case: a website might show a normal article to anyone coming from Google Search (to appear legitimate when Google’s reviewing or a user clicks through from SERP), but if someone visits that page directly (say a user bookmarks it or a Google reviewer tries to manually inspect by copy-pasting URL), the site might redirect them to spam or a sign-up form. This way, the referrer = Google condition acts as a trigger to show the good content only when coming from a Google search link.
Another scenario: Some hacking/cloaking kits do this – they deliver spammy pages to search engine bots and users coming from search results, but if you try to access the URL without that search referrer, they might show a normal page or nothing. This makes it harder for site owners or investigators to reproduce what a Google search user saw (unless they simulate the referrer).
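Mechanically, referrer cloaking is just a branch on the Referer request header. A minimal Node.js/Express sketch of the pattern described above, with hypothetical page names and a hypothetical target URL:

// Referrer cloaking sketch: what is served depends on where the visitor came from.
const express = require('express');
const app = express();

app.get('/offer', (req, res) => {
  const referrer = req.get('Referer') || ''; // the header is historically spelled "Referer"
  if (/\.google\./i.test(referrer)) {
    // Visitors (and reviewers) arriving from a Google SERP get the clean page.
    res.sendFile('/var/www/clean_article.html');
  } else {
    // Everyone else is funneled to the real destination (hypothetical URL).
    res.redirect(302, 'https://example.net/aggressive-sales-page');
  }
});

app.listen(3000);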
8. JavaScript Cloaking
JavaScript cloaking involves using scripts on the page to dynamically alter, add, or hide content after the page is loaded, in order to show a different version to users than to crawlers. It overlaps with what we discussed under client-side cloaking. Typically, the server serves a baseline HTML that is crawler-friendly. Then JavaScript (which crawlers may not fully execute or might handle differently) changes the page for the user’s eyes.
Some malicious cases: a script might check navigator.userAgent in the browser, and if it does not find a certain substring (meaning it assumes it’s a search bot in a non-browser environment), it might generate or keep some content; if it does find it (meaning a real browser), it could strip that content out. This way, the decision is happening on the client side, after the crawler has already received the content.
It’s worth noting that Google has become much better at processing JavaScript. Googlebot uses a headless Chrome rendering engine nowadays, which means simple JavaScript cloaking can be detected (Google might still see the final state after scripts). However, there might be timing differences or complex scripts that Google doesn’t execute fully, which cloakers try to exploit.
Legitimate use vs. cloaking: Many modern websites are JavaScript-heavy (like single-page applications). To ensure Google can index them, they might use prerendering or server-side rendering to deliver an SEO-friendly version to Googlebot. As long as that version is the same content that a user would get by actually using the app, it’s not cloaking. Google even encourages techniques like dynamic rendering for JS-heavy sites – where you serve Googlebot a pre-rendered static HTML version. The critical point is that the content shouldn’t be misleading or different in intent. If you start altering what content shows up (like adding extra SEO text only in the pre-render that users never see), then it becomes cloaking. For example, if you have a React site, you might generate static HTML for crawlers. That’s fine as long as that static HTML doesn’t say something entirely different from what your live site shows.
Example
Consider a webpage that contains a large block of keyword-rich text in the HTML. If a user loads this page, a piece of JavaScript might immediately execute to remove that block from the DOM or hide it via CSS (maybe setting display:none or overlaying it with something). The human user never sees the text block, but Google’s crawler saw it in the HTML source and indexed all those keywords. This is JavaScript-based cloaking – using script to present a cleaner or different page to users while the crawler indexed something else. Another example is using JS to fetch content: perhaps the HTML served to everyone contains a placeholder or some default content that’s SEO-friendly. If a real browser is detected (maybe through user-agent in JS or simply the act of running the script), the script might replace that section with user-specific or less SEO-y content.
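The second variant, where the served HTML carries SEO-friendly default text that a script then swaps out for real browsers, might look like the sketch below. The element id and the endpoint are hypothetical.

// JavaScript cloaking sketch: every client receives keyword-rich default text,
// but a real browser replaces it after load. A crawler that never runs this
// script keeps indexing the default content.
document.addEventListener('DOMContentLoaded', async function () {
  var section = document.getElementById('article-body'); // shipped with SEO default text
  if (!section) return;
  try {
    var resp = await fetch('/api/user-content'); // hypothetical endpoint with the "real" content
    section.innerHTML = await resp.text();
  } catch (e) {
    // if the fetch fails, the default (SEO) content simply stays in place
  }
});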
How Search Engines Detect Cloaking
Given that cloaking is a known spam technique, search engines have multiple methods to catch it. Here are some ways Google and other search engines detect cloaking:
- Comparing fetches with different identities: Search engines can visit your page using their regular crawler (identifying as Googlebot) and then revisit the same page using a different persona (for example, as a generic browser or from a different IP that isn’t on your bot list). If the content differs significantly, that’s a red flag. In fact, it’s known that Google will look at your site with user-agents other than Googlebot and compare the results. They might use a dummy crawler or even manual tools to see if what a user gets is the same as what Googlebot got. If you serve Googlebot an all-text page but any normal user sees a video page with no text, Google can algorithmically flag that discrepancy.
- Automated content comparisons: Google may hash or take snapshots of the content it fetched as Googlebot and periodically check if the cached version matches what a user sees. Tools like Google’s Inspect URL in Search Console show what Googlebot retrieved. If Google’s systems detect that the content in its index or cache is not present when a real browser accesses the page (for instance, users quickly bounce or report content not on page), that could signal cloaking. They can also compare the HTML served to Googlebot vs a headless Chrome (which simulates a user) and see differences in the DOM.
- Multiple crawler locations: For geo-based cloaking, Google has crawlers from different locations. They can verify if your site shows consistent content to, say, a US-based Googlebot and a European-based Googlebot where appropriate. Also, they might use a VPN or proxy to crawl as if they were a normal user from various places and see if those align with what their official bot saw.
- Manual review and spam reports: Google’s webspam team and algorithms work together. If the algorithms suspect cloaking, they might flag it for a human reviewer. Additionally, competitors or users might report a site for cloaking. A human at Google could then check the page both as Googlebot and as a user. Unlike normal users, Google’s engineers can directly retrieve pages masquerading as different agents. If they confirm cloaking, a manual action may be applied.
- Known cloaking signatures: Some cloaking scripts or services have identifiable patterns. Googlebot might detect if a page is doing something like instant meta refresh or serving a vastly different byte count to Googlebot vs others. They have 20+ years of cloaking examples to draw on, so many common cloaking setups are detectable via fingerprints.
Google can play spy. Google has admitted that they sometimes do this kind of cloak-detection by crawling with different credentials. It’s largely algorithmic – they’re not manually checking every site, but they have systems to sniff this out. Also, with the rise of machine learning, search engines can sometimes infer that a site might be cloaking if user behavior is odd (e.g., the snippet in search results contains text the user can’t find on the page).
Bing and other search engines similarly have guidelines against cloaking and likely use comparable techniques. They may not be as advanced as Google in some aspects, but it’s safe to assume any major crawler can detect basic cloaking by comparison methods.
How Can You Detect Cloaking?
If you suspect a site (maybe your own, if you inherited SEO from someone, or a competitor’s site) is using cloaking, there are ways you can detect it. It’s also good for webmasters to verify their site isn’t accidentally appearing to cloak (for instance, if you have A/B tests or dynamic content, you want to ensure Google sees what users see).
Here are some practical methods to detect cloaking on a webpage:
- Compare the Google snippet/cache to the live page: Do a Google search for the site or page in question. Look at the snippet (the description text Google shows). Does that text exist on the page when you visit it as a user? If Google shows a summary or keywords that you cannot find on the page, that’s a strong sign the crawler saw something different. You can also use the cache:URL operator in Google to see the cached version of the page (often in text-only form). Compare that cached text to the current live page. If they differ significantly, cloaking might be happening.
- Use a Fetch as Google tool or user-agent switcher: There are tools like Google Search Console’s URL Inspection (which shows what Googlebot fetched) or third-party SEO tools that let you see a page as Googlebot. You can also manually switch your browser’s user-agent to Googlebot. In Chrome, for instance, open Developer Tools > Network conditions, uncheck "Select automatically" and enter a Googlebot user-agent, then refresh the page. See if the content or redirects differ when you masquerade as Googlebot. If, for example, as Googlebot you suddenly see a bunch of text or you don’t get redirected where a normal user would, that indicates cloaking.
- Check for redirects or different content by referrer: Try visiting the page directly versus coming from a Google search result. One way: search for the page on Google, click the result (this sets the referrer as Google). Then try copying the URL and visiting it directly in a new browser or incognito (no referrer). If the page behaves differently – e.g., one path shows content, another path instantly redirects elsewhere – there may be referrer-based cloaking. Using Chrome DevTools with Preserve log can help catch rapid redirects that might be missed by eye.
- Use command-line tools (cURL) for fine control: Using cURL or similar, you can craft requests to a server with specific headers. For example, run a cURL request with a Googlebot user-agent and one with a normal user-agent, and diff the responses. Likewise, you can test with and without certain headers, or from different IPs (if you have a VPN or proxy). If the HTML output differs, you’ve uncovered cloaking. Some example commands (from iPullRank) were as follows; a scripted version of the same comparison appears after this list:
- curl -A "Mozilla/5.0 (compatible; Googlebot/2.1; +https://www.google.com/bot.html)" https://www.example.com/ (fetch as Googlebot; the quotes keep the user-agent string intact)
- curl -I https://www.example.com/ (fetch just the headers to see clues like differing status codes)
- You can also add -e "https://www.google.com/search" to simulate coming from Google as a referrer.
- Look at the HTML code for hidden elements: View the page source or use dev tools to see if there’s content that is hidden via CSS (display:none, extremely small font, same color as background) or via JavaScript after load. Hidden text or links could indicate an attempt to cloak content for crawlers. However, sometimes this might be just hidden until a user clicks something (not malicious). Use this in combination with other clues.
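For convenience, the same comparison the cURL commands perform can be scripted. The sketch below (Node.js 18+, which has a built-in fetch) requests a URL once as Googlebot and once as a normal browser and prints the differences worth inspecting. The URL is a placeholder, and an IP-based cloaker may still fool a user-agent-only test.

// Fetch a page as Googlebot and as a regular browser, then compare the responses.
const GOOGLEBOT_UA = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
const BROWSER_UA = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36';

async function fetchAs(url, userAgent) {
  const res = await fetch(url, { headers: { 'User-Agent': userAgent }, redirect: 'follow' });
  return { status: res.status, finalUrl: res.url, body: await res.text() };
}

async function compare(url) {
  const asBot = await fetchAs(url, GOOGLEBOT_UA);
  const asUser = await fetchAs(url, BROWSER_UA);
  console.log('Status:     ', asBot.status, 'vs', asUser.status);
  console.log('Final URL:  ', asBot.finalUrl, 'vs', asUser.finalUrl);
  console.log('Body length:', asBot.body.length, 'vs', asUser.body.length);
  // Large gaps in any of these are a cue to diff the two HTML bodies manually.
}

compare('https://www.example.com/'); // placeholder URL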
For your own site, it’s wise to do periodic checks, especially if you run tests or use dynamic content. If you ever hire an SEO agency, and your rankings spike suspiciously, it’s not paranoid to check that they didn’t secretly implement cloaking. Regularly monitoring Google’s cache vs your live site can catch this early.
What is the Difference Between Ghosting and Cloaking?
In an SEO context, ghosting and cloaking are sometimes mentioned together, but they refer to different deceptive tactics:
- Cloaking (as we’ve discussed) is showing different content to different audiences at the same time based on who they are (bot vs user). The cloaked content is continuously served to the search engine until detected.
- Ghosting refers to a tactic where content is made available for a short time and then removed – essentially making it ghost content. The idea is to get search engines to index the content or page, and then later wipe or hide that content, often to avoid users seeing it or to avoid detection after gaining the ranking. The search engine saw a ghost that the users can no longer see.
Ghosting is a bit less talked-about in modern SEO, but it’s good to know the term. It basically describes a form of bait content that disappears – like a ghost haunting the index. With ghosting, everyone (including search engines and any users who happen to visit) might see the content initially, but after a certain time, that content disappears or the page changes. With cloaking, the difference is in who is looking, not when.
Both are deceptive, and Google considers both black-hat. Ghosting violates search engine guidelines and can be penalized just like cloaking. If Google revisits a ghosted page and finds the content is gone or changed to something irrelevant, it will likely drop that page from rankings fast. It might also algorithmically learn the site is doing this and trust it less.
In SEO terms, ghosting might not always be intentional – sometimes spammers auto-generate pages that later 404, causing a kind of accidental ghosting – but usually it’s intentional to briefly exploit search engines.
Example
A website might publish an article stuffed with lucrative keywords (or even plagiarized content that ranks well) just long enough to get crawled and indexed by Google. Once Google has indexed that page (which could be within hours or days), the site then alters or removes the content – perhaps replacing it with something else or nothing at all. For a while, Google’s results might still show the page for those keywords (since in Google’s index the content existed), but users clicking the result find an empty page or a different content. Essentially, the site ghosted the content: it was there, then it vanished. This is a bait-and-switch: lure the crawler with content, then later serve the user something else (or blank). If done cleverly, the site might hope to reap the search traffic at least for some time before Google catches on that the content is gone.
Can Split-Tests Be Considered as Cloaking?
A common concern among webmasters is whether running A/B tests or multivariate tests on your website (where different users see different versions of a page) could be interpreted by Google as cloaking. After all, if one user sees version A and another sees version B, isn’t that different content to different users?
The good news: When done properly, A/B or split testing is not considered cloaking by Google. Major testing platforms and Google’s own guidelines make it clear that testing for UX improvement is acceptable, provided you’re not showing one version specifically to search engines and a completely different one to all users. The goal is to treat Googlebot as just another user in the test – not to always show it the winning version or a special version.
Some additional tips:
- Do not target the bot: Don’t create a test variant that is served only to Googlebot. For example, don’t have your testing tool set up an audience that says if user-agent contains Googlebot, show variant X. That would be cloaking. As long as Googlebot has the same chance to see any of the test variants as a normal visitor would, you’re fine. In other words, Google doesn’t care if its bot sees version A or version B, it just cares that the bot’s experience is equivalent to a real user’s experience. If, say, 90% of users see version A and 10% see version B, Googlebot should be treated as one of those users – maybe it’ll see A, maybe B, randomly.
- Use rel=canonical when using separate URLs: If your split test involves different URLs (e.g., you’re testing two different landing page URLs), you should use a canonical tag pointing to the main URL for the page. This tells Google that both versions are essentially the same page, preventing duplicate content issues or the wrong version from ranking. Often A/B tests are done with the same URL by swapping content, in which case this might not apply. But if, for example, you’re bucket-testing two different page URLs in search, canonicalize the test variant to the original.
- Use 302 (temporary) redirects for experiments, not 301: If your test directs a portion of users from the main URL to a variant URL, use a 302 (temporary) redirect. A 302 signals to search engines that this redirect is not permanent – it’s just for testing – and that they should keep the original URL indexed. A 301 could make Google index the variant URL instead, which you don’t want. Most A/B testing tools that redirect traffic will use 302s by default for this reason. (A minimal sketch of this setup appears after this list.)
- Keep tests short and conclude them: Don’t run an A/B test for so long that Google might think you have two completely different sites. Google recommends running experiments only as long as necessary to get results, and then either pick a winner or revert changes. Prolonged split content could confuse crawlers. Once your test is done, everybody including Googlebot should see the one final version, which eliminates any potential cloaking concerns.
- Quality of variants: Ensure that both (or all) variants are legitimate content you’re okay with Google seeing. If one variant is a significantly dumber page solely intended to see how users react (but you’d never want Google to index it), you might noindex that variant or better yet, not serve that to Google specifically. But ideally, any variant in an A/B test should be a plausible permanent page. If you wouldn’t want one variant indexed at all, maybe you should avoid showing it to a large segment including Google. That said, if you follow the above rules (random distribution, canonical, etc.), even a less optimized variant being seen by Google shouldn’t hurt in the short term.
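Putting the rules above together, a compliant redirect-based split test might look like the following Node.js/Express sketch. Paths, the split ratio, and the variant URL are hypothetical; the point is that Googlebot is bucketed like any other visitor, the variant is reached via a 302, and the variant page points a canonical tag back at the original URL.

// A/B test sketch that avoids cloaking: no user-agent special-casing,
// 302 to the variant, canonical back to the original.
const express = require('express');
const app = express();

app.get('/landing', (req, res) => {
  // No bot check here: Googlebot gets the same random bucketing as everyone else.
  if (Math.random() < 0.5) {
    res.sendFile('/var/www/landing_original.html');
  } else {
    res.redirect(302, '/landing-variant-b'); // temporary, so the original URL stays indexed
  }
});

app.get('/landing-variant-b', (req, res) => {
  // The variant page's <head> should include:
  // <link rel="canonical" href="https://www.example.com/landing">
  res.sendFile('/var/www/landing_variant_b.html');
});

app.listen(3000);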
In essence, intent matters. A/B testing is intended to improve user experience by finding which version users prefer – not to deceive search engines. Google is fine with it: Google’s crawler might see one version or the other; as long as it’s not intentionally excluded or given a special version, it’s not cloaking.
For example, if you’re testing two different headlines on a page to see which gets more clicks, some users get Headline A, others Headline B. Googlebot comes, and it might get Headline A (or B). That’s okay – users also get those. There’s no search-engine-only content.
One thing to avoid: Don’t show a completely keyword-stuffed headline to Google and a normal headline to users under the guise of A/B test. That would be cloaking. All test variants should be designed for users, not just for SEO.
Most A/B testing platforms (Optimize, Optimizely, VWO, etc.) have documentation on being SEO-friendly. They typically do things like cloaking avoidance automatically. For instance, Google Optimize and Optimize 360 follow these principles so that experiments don’t trigger cloaking flags.