What is hyphenation typosquatting?

Hyphenation typosquatting inserts or removes hyphens in a domain name to produce variants that resemble legitimate subservice or product names. This guide explains the DNS rules that govern hyphens, why hyphenated variants are unusually convincing, and how the technique overlaps with combosquatting.

What it is

Hyphenation typosquatting creates domain variants by inserting hyphens into a brand name or removing hyphens from a domain that legitimately contains them. Common examples include:

| Legitimate domain | Variant | Change |
| --- | --- | --- |
| facebook.com | face-book.com | Hyphen inserted at word boundary |
| youtube.com | you-tube.com | Hyphen inserted at word boundary |
| linkedin.com | linked-in.com | Hyphen inserted at word boundary |
| t-mobile.com | tmobile.com | Hyphen removed |

Both insertion and removal changes are small enough to pass casual inspection, yet each yields a fully distinct domain under DNS resolution.

The technique is effective because hyphens are common in legitimate domain names. Brands such as Coca-Cola (coca-cola.com), Rolls-Royce (rolls-royce.com), and T-Mobile (t-mobile.com) all use hyphens in their primary domains. This normalizes the presence of hyphens for users and makes hyphenated variants of single-word brands appear intentional rather than erroneous.

DNS rules for hyphens

Under the long-standing letters-digits-hyphen (LDH) hostname rule, the hyphen-minus character (-) is the only character permitted in DNS labels besides ASCII letters and digits. However, several placement constraints apply:

  • A label cannot begin or end with a hyphen.
  • Hyphens in both the third and fourth positions are reserved for special prefixes. The xn-- prefix signals an internationalized domain name label encoded in Punycode. Registries generally reject labels with hyphens in both of these positions unless the label is a valid Punycode-encoded string.
  • Consecutive hyphens elsewhere in a label are syntactically valid but uncommon, so face--book.com would be registrable in most zones yet visually suspicious.

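These rules narrow the space of plausible hyphenation variants only slightly. A 10-character label has 9 possible insertion points (one between each adjacent character pair). Because an insertion point always sits between two characters, a single insertion can never produce a leading or trailing hyphen. An output violates the placement rules only when the label already contains a hyphen and the new hyphen lands next to it in positions three and four (ab-cdefghi becoming ab--cdefghi, for example). For a hyphen-free label, all 9 single-insertion variants are valid. For domains that already contain hyphens, removing each hyphen yields one variant per existing hyphen.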
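
The placement rules above can be expressed as a small validator. The following Python sketch is illustrative only: the function name is_valid_label is hypothetical, the 63-character limit is the standard DNS label length limit, and the check deliberately rejects every label with hyphens in positions three and four, including genuine xn-- labels, which a real tool would need to decode and validate separately.

```python
import re

# LDH rule: ASCII letters, digits, and hyphen; 1 to 63 characters per label.
_LDH = re.compile(r"[a-z0-9-]{1,63}")


def is_valid_label(label: str) -> bool:
    """Return True if `label` passes the hyphen placement rules (sketch)."""
    label = label.lower()
    if not _LDH.fullmatch(label):
        return False
    # A label cannot begin or end with a hyphen.
    if label.startswith("-") or label.endswith("-"):
        return False
    # Hyphens in positions three and four are reserved (for example xn--).
    # Assumption: this sketch rejects them all rather than decoding Punycode.
    if label[2:4] == "--":
        return False
    return True
```

Under these assumptions, face-book and face--book both pass, while -facebook, facebook-, and fa--cebook are rejected.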

Why hyphenation variants are convincing

Hyphenated domains exploit several aspects of how users perceive URLs, and the result is a permutation class that trades on plausibility rather than deception through error.

Word-boundary plausibility. Many brand names are compound words. Inserting a hyphen at the natural word boundary (face-book, you-tube, linked-in) produces a string that looks like a deliberate naming choice rather than a typo. The variant reads more naturally than character-substitution or transposition errors, where the mistake is often visible on inspection.

Subservice naming conventions. Organizations routinely use hyphens to name subservices and internal products: cloud-storage.example.com, api-gateway.example.com. Years of exposure to this pattern train users to accept hyphens as normal elements of a URL, lowering suspicion when a hyphenated brand variant appears in an email or message.

Visual similarity at small sizes. In many typefaces and at mobile-screen font sizes, the difference between facebook.com and face-book.com is a single short stroke. In email bodies, chat messages, and shortened link previews, the hyphen can blend into surrounding characters or disappear entirely. This effect is compounded in homoglyph attacks where the hyphen is replaced with a visually similar Unicode character.

Reduced "typo" signal. Unlike an omission (facebok.com) or vowel swap (facebouk.com), a hyphenated variant does not look misspelled. Spelling-aware users who would catch a missing letter may not question the presence of a hyphen, since it does not alter the constituent words.

Bidirectional exploitation. Hyphenation works in both directions. Attackers can insert hyphens into single-word brands (facebook → face-book) or remove them from hyphenated brands (t-mobile → tmobile). The removal case is particularly effective because the resulting domain is shorter and arguably simpler, characteristics often associated with legitimacy.
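
The visual-similarity point above notes that a hyphen can be swapped for a lookalike Unicode character. One way to check displayed text (an email body or chat message, say) for such characters is sketched below in Python. It uses the standard unicodedata module. The function name hyphen_lookalikes and the decision to flag the minus sign (U+2212) alongside the dash-punctuation category are assumptions made for illustration.

```python
import unicodedata

ASCII_HYPHEN = "\u002d"
# Assumption: the minus sign is not dash punctuation but renders like a hyphen.
EXTRA_LOOKALIKES = {"\u2212"}


def hyphen_lookalikes(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each non-ASCII hyphen-like character."""
    hits = []
    for index, ch in enumerate(text):
        if ch == ASCII_HYPHEN:
            continue  # the ordinary hyphen-minus is expected in domain names
        # Category "Pd" is Unicode dash punctuation (hyphen, en dash, and so on).
        if unicodedata.category(ch) == "Pd" or ch in EXTRA_LOOKALIKES:
            hits.append((index, unicodedata.name(ch, "UNKNOWN")))
    return hits
```

A string containing only the ASCII hyphen returns an empty list, while a string that uses U+2010 in place of the hyphen is flagged at that position.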

Overlap with combosquatting

Hyphenation and combosquatting frequently co-occur. Analysis of global DNS traffic shows that the most common combosquatting keywords are "support", "com", "login", "help", and "secure". Many combosquat domains use hyphens to join the brand and keyword: paypal-login.com, amazon-support-center.com, safebank-security.com. In these cases the hyphen both separates the keyword visually (making the domain more readable) and signals that the domain belongs to a service or subdepartment of the brand.

A single domain can fall into multiple permutation categories simultaneously. brand-login.com is a hyphenation variant (relative to brandlogin.com), a combosquat (appending "login" to the brand), and potentially a case of keyword squatting if "login" is chosen specifically to match search queries. Detection systems that classify permutations should account for this overlap rather than treating each category as mutually exclusive.
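
A classifier that allows overlapping categories might return a set of tags rather than a single label. The Python sketch below is a simplified illustration: the function name classify is hypothetical, the brand is assumed to contain no hyphens, the keyword list is the one quoted above, and keyword squatting is not modeled because it depends on search-query data.

```python
# Keywords quoted in the text above as the most common combosquatting terms.
KEYWORDS = ("support", "com", "login", "help", "secure")


def classify(domain: str, brand: str) -> set[str]:
    """Return every permutation category `domain` matches for `brand` (sketch)."""
    label = domain.lower().split(".")[0]
    brand = brand.lower()
    squashed = label.replace("-", "")
    has_hyphen = "-" in label
    tags: set[str] = set()

    # Pure hyphenation: the label differs from the brand only by hyphens.
    if has_hyphen and squashed == brand:
        tags.add("hyphenation")

    # Combosquatting: the brand plus extra text containing a known keyword.
    extra = squashed.replace(brand, "", 1) if brand in squashed else ""
    if extra and any(keyword in extra for keyword in KEYWORDS):
        tags.add("combosquatting")
        if has_hyphen:
            # Also a hyphenation variant of the unhyphenated combosquat.
            tags.add("hyphenation")
    return tags
```

With these assumptions, paypal-login.com is tagged as both a combosquat and a hyphenation variant, while paypallogin.com is tagged only as a combosquat.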

Scale of the problem

Large-scale DNS measurements have found over 2.3 million potential typosquatting names registered and resolving to IP addresses across common permutation techniques, with hyphenation a consistent contributor. Analysis of hundreds of prominent brands shows that defensive coverage remains inconsistent even among well-resourced organizations.

Hyphenation is a routine component of large-scale squatting campaigns, not a niche technique. When combined with keyword squatting or TLD squatting, the variant count for a single brand can grow into the hundreds. Abusive hyphenated domains also tend to stay active: research tracking combosquatting domains over multi-year periods found that close to 60% persist for more than 1,000 days once registered.

Real-world patterns

Hyphenation variants appear across several recurring attack scenarios:

  • Phishing landing pages. Domains like paypal-verify.com or microsoft-account-update.com host credential-harvesting pages. The hyphenated structure mimics subservice naming, making the domain plausible in a phishing email that claims the recipient needs to "verify" or "update" something. Research on phishing infrastructure shows that attackers frequently obtain free TLS certificates (often from Let's Encrypt) for these domains, adding a padlock icon that reinforces perceived legitimacy.
  • Malware distribution. Hyphenated variants of software-download domains can serve trojanized installers. Because the domain reads like a legitimate product site, users may not hesitate to download the file.
  • Brand impersonation at scale. Automated tools can generate hyphen-insertion variants alongside other permutation classes. An attacker targeting a 10-character brand can register up to 9 hyphen-insertion variants across dozens of TLDs, creating a broad net for inbound traffic.
  • Reverse squatting on hyphenated brands. Companies that use hyphens in their primary domain face the additional risk of attackers registering the unhyphenated form. If my-company.com is the official site, mycompany.com may look equally legitimate to users unfamiliar with the brand's exact naming convention.

Detection and monitoring

Hyphenation variants are straightforward to enumerate algorithmically: for each adjacent character pair in the label, insert a hyphen and validate the result against DNS label rules. The reverse operation (removing each existing hyphen) is equally simple. This makes hyphenation one of the most deterministic permutation classes to generate and monitor, comparable in predictability to bitsquatting.
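
The enumeration described above fits in a few lines of Python. This is a minimal sketch, not the generator used by any particular tool: the function names are hypothetical, the domain is assumed to be a simple two-part name such as facebook.com, and validation covers only the hyphen placement rules.

```python
def _placement_ok(label: str) -> bool:
    """No leading or trailing hyphen, and no hyphens in positions three and four."""
    return not (label.startswith("-") or label.endswith("-") or label[2:4] == "--")


def hyphen_insertions(label: str) -> list[str]:
    """Insert one hyphen at each gap between adjacent characters."""
    candidates = (label[:i] + "-" + label[i:] for i in range(1, len(label)))
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(c for c in candidates if _placement_ok(c)))


def hyphen_removals(label: str) -> list[str]:
    """Remove each existing hyphen in turn: one variant per hyphen."""
    return [label[:i] + label[i + 1:] for i, ch in enumerate(label) if ch == "-"]


def variants(domain: str) -> list[str]:
    """Return hyphenation variants of a two-part domain (assumed input shape)."""
    label, _, suffix = domain.partition(".")
    return [f"{v}.{suffix}" for v in hyphen_insertions(label) + hyphen_removals(label)]
```

For facebook.com this yields seven insertion variants, including face-book.com. For t-mobile.com the output includes the removal variant tmobile.com.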

When triaging alerts, several heuristics help separate high-risk variants from noise:

  • Hyphenated variants of well-known single-word brands (face-book.com, you-tube.com) are almost certainly not legitimate.
  • Variants that split a brand at a natural word boundary are more deceptive than arbitrary splits (fac-ebook.com is less plausible than face-book.com).
  • Domains combining hyphenation with high-risk keywords (brand-secure-login.com) should be elevated in priority, as the combination of techniques indicates deliberate phishing domain construction.
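
These heuristics can be combined into a rough priority score. The sketch below is one possible arrangement in Python. The weights are arbitrary, the keyword set is an illustrative assumption, and the word boundary is supplied by the analyst through the hypothetical brand_words parameter rather than detected automatically.

```python
# Assumption: an illustrative keyword set, not an authoritative list.
HIGH_RISK_KEYWORDS = {"login", "secure", "verify", "support", "help", "update", "account"}


def triage_score(label: str, brand: str, brand_words: tuple[str, ...]) -> int:
    """Score a hyphenated label against a brand; higher means review sooner."""
    tokens = label.lower().split("-")
    squashed = "".join(tokens)
    score = 0

    # Hyphenated variant of a single-word brand: almost certainly not legitimate.
    if len(tokens) > 1 and squashed == brand:
        score += 2
        # A split at the natural word boundary is more deceptive.
        if tokens == list(brand_words):
            score += 1

    # Brand joined to high-risk keywords suggests deliberate phishing construction.
    keywords = [token for token in tokens if token in HIGH_RISK_KEYWORDS]
    if brand in tokens and keywords:
        score += 2 + len(keywords)
    return score
```

With these weights, brand-secure-login scores 4, face-book scores 3, and fac-ebook scores 2, matching the ordering the heuristics suggest.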

Monitoring signals such as WHOIS and RDAP registration data, Certificate Transparency logs, and passive DNS records can surface newly registered hyphenation variants before they are used in active campaigns. Defensive registration of high-risk variants is practical given the bounded permutation count, though the overlap with combosquatting means the total variant space can expand quickly when keywords are factored in. For the full defensive playbook, see typosquatting protection.

Have I Been Squatted generates hyphenation permutations alongside omission, transposition, bitsquatting, and other lookalike domain categories for every monitored domain. Variants are checked against registration data and TLS certificate issuance automatically, surfacing newly registered hyphenation squats for investigation through domain monitoring.
