What is hyphenation typosquatting?
Hyphenation typosquatting inserts or removes hyphens in a domain name to produce variants that resemble legitimate subservice or product names. This guide explains the DNS rules that govern hyphens, why hyphenated variants are unusually convincing, and how the technique overlaps with combosquatting.
What it is
Hyphenation typosquatting creates domain variants by inserting hyphens into a brand name or removing hyphens from a domain that legitimately contains them. Common examples include:
| Legitimate domain | Variant | Change |
|---|---|---|
| facebook.com | face-book.com | Hyphen inserted at word boundary |
| youtube.com | you-tube.com | Hyphen inserted at word boundary |
| linkedin.com | linked-in.com | Hyphen inserted at word boundary |
| t-mobile.com | tmobile.com | Hyphen removed |
Both insertion and removal changes are small enough to pass casual inspection, yet each yields a fully distinct domain under DNS resolution.
The technique is effective because hyphens are common in legitimate domain names. Brands such as Coca-Cola (coca-cola.com), Rolls-Royce (rolls-royce.com), and T-Mobile (t-mobile.com) all use hyphens in their primary domains. This normalizes the presence of hyphens for users and makes hyphenated variants of single-word brands appear intentional rather than erroneous.
DNS rules for hyphens
The hyphen-minus character (-) is the only character permitted in DNS labels besides ASCII letters and digits under the long-standing hostname rules (often called the letters-digits-hyphen, or LDH, rule). However, several placement constraints apply:
- A label cannot begin or end with a hyphen.
- Hyphens in the third and fourth positions are reserved for special prefixes. The `xn--` prefix signals an internationalized domain name encoded in Punycode. Registries generally reject labels with hyphens in positions three and four unless the label is a valid Punycode string.
- Consecutive hyphens elsewhere in a label are syntactically valid but uncommon, so `face--book.com` would be registrable in most zones yet visually suspicious.
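These rules narrow the space of plausible hyphenation variants only slightly. An attacker generating permutations for a 10-character label has 9 possible insertion points (one between each adjacent character pair). Because every insertion point is interior, a single insertion never produces a leading or trailing hyphen. An output breaks the placement rules only when the label already contains a hyphen and the new one lands next to it, creating a double hyphen that is reserved if it falls in positions three and four. For domains that already contain hyphens, removing each hyphen yields one variant per existing hyphen.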
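The placement rules above can be checked mechanically. The following is a minimal sketch, not a registry-grade validator: the function name `label_problems` and the violation strings are hypothetical, the 63-character limit is the standard DNS label maximum, and the check only looks for the `xn--` prefix rather than verifying that the label decodes as valid Punycode.

```python
import re

# Letters, digits, and hyphens only (the LDH rule); a label is 1 to 63 characters.
_LDH = re.compile(r"[a-z0-9-]{1,63}")


def label_problems(label: str) -> list[str]:
    """Return the hyphen-related rule violations for a single DNS label."""
    label = label.lower()
    problems = []
    if not _LDH.fullmatch(label):
        problems.append("not-ldh")
    if label.startswith("-"):
        problems.append("leading-hyphen")
    if label.endswith("-"):
        problems.append("trailing-hyphen")
    # Hyphens in positions three and four are reserved (for example, xn--).
    # Assumption: any xn-- label is accepted here without decoding it.
    if label[2:4] == "--" and not label.startswith("xn--"):
        problems.append("reserved-positions-3-4")
    return problems
```

An empty list means the label passes these checks. `label_problems("face--book")` is empty because the double hyphen sits outside positions three and four, while `label_problems("fa--cebook")` reports the reserved-position violation.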
Why hyphenation variants are convincing
Hyphenated domains exploit several aspects of how users perceive URLs, and the result is a permutation class that trades on plausibility rather than deception through error.
Word-boundary plausibility. Many brand names are compound words. Inserting a hyphen at the natural word boundary (face-book, you-tube, linked-in) produces a string that looks like a deliberate naming choice rather than a typo. The variant reads more naturally than character-substitution or transposition errors, where the mistake is often visible on inspection.
Subservice naming conventions. Organizations routinely use hyphens to name subservices and internal products: cloud-storage.example.com, api-gateway.example.com. Years of exposure to this pattern train users to accept hyphens as normal elements of a URL, lowering suspicion when a hyphenated brand variant appears in an email or message.
Visual similarity at small sizes. In many typefaces and at mobile-screen font sizes, the difference between facebook.com and face-book.com is a single short stroke. In email bodies, chat messages, and shortened link previews, the hyphen can blend into surrounding characters or disappear entirely. This effect is compounded in homoglyph attacks where the hyphen is replaced with a visually similar Unicode character.
Reduced "typo" signal. Unlike an omission (facebok.com) or vowel swap (facebouk.com), a hyphenated variant does not look misspelled. Spelling-aware users who would catch a missing letter may not question the presence of a hyphen, since it does not alter the constituent words.
Bidirectional exploitation. Hyphenation works in both directions. Attackers can insert hyphens into single-word brands (facebook → face-book) or remove them from hyphenated brands (t-mobile → tmobile). The removal case is particularly effective because the resulting domain is shorter and arguably simpler, characteristics often associated with legitimacy.
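The homoglyph case mentioned under visual similarity can be screened for with the standard library. This is a rough sketch under stated assumptions: `find_lookalike_hyphens` and `EXTRA_LOOKALIKES` are hypothetical names, the Unicode "Pd" (dash punctuation) category is used as a proxy for "looks like a hyphen", and the extra set is a small hand-picked list rather than a complete confusables table.

```python
import unicodedata

# Assumption: hand-picked lookalikes that fall outside the "Pd" category.
EXTRA_LOOKALIKES = {"\u2212", "\u00ad"}  # MINUS SIGN, SOFT HYPHEN


def find_lookalike_hyphens(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for dash-like characters other than ASCII '-'."""
    hits = []
    for i, ch in enumerate(text):
        if ch == "-":
            continue  # the ordinary hyphen-minus is allowed
        if unicodedata.category(ch) == "Pd" or ch in EXTRA_LOOKALIKES:
            hits.append((i, unicodedata.name(ch, "UNKNOWN")))
    return hits
```

Running it over displayed link text such as `"face\u2010book.com"` flags the character at index 4, while the plain ASCII `"face-book.com"` produces no hits.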
Overlap with combosquatting
Hyphenation and combosquatting frequently co-occur. Analysis of global DNS traffic shows that the most common combosquatting keywords are "support", "com", "login", "help", and "secure". Many combosquat domains use hyphens to join the brand and keyword: paypal-login.com, amazon-support-center.com, safebank-security.com. In these cases the hyphen both separates the keyword visually (making the domain more readable) and signals that the domain belongs to a service or subdepartment of the brand.
A single domain can fall into multiple permutation categories simultaneously. brand-login.com is a hyphenation variant (relative to brandlogin.com), a combosquat (appending "login" to the brand), and potentially a case of keyword squatting if "login" is chosen specifically to match search queries. Detection systems that classify permutations should account for this overlap rather than treating each category as mutually exclusive.
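One way to avoid mutually exclusive categories is to have the classifier return a set of labels. The sketch below is illustrative only: `classify` and `KEYWORDS` are hypothetical, the keyword list is drawn from the examples in this section, matching is plain substring logic that will produce false positives on short brands, and keyword squatting is left out because it depends on search-query data.

```python
# Assumption: an illustrative keyword list based on the examples in this section.
KEYWORDS = {"support", "login", "help", "secure", "security", "verify"}


def classify(label: str, brand: str, keywords: set[str] = KEYWORDS) -> set[str]:
    """Return every permutation category that a label falls into for a brand."""
    label, brand = label.lower(), brand.lower()
    flat_label = label.replace("-", "")
    flat_brand = brand.replace("-", "")
    categories: set[str] = set()
    if label == brand or flat_brand not in flat_label:
        return categories  # the brand itself, or unrelated to it

    if brand in label:
        # Brand spelled as-is: hyphenation applies if a hyphen joins the extra text.
        hyphen_changed = "-" in label.replace(brand, "", 1)
    else:
        # Brand letters are present but its hyphens were added, moved, or dropped.
        hyphen_changed = True
    if hyphen_changed:
        categories.add("hyphenation")

    # Combosquatting: extra text beyond the brand contains a known keyword.
    extra = flat_label.replace(flat_brand, "", 1)
    if any(keyword in extra for keyword in keywords):
        categories.add("combosquatting")
    return categories
```

With this shape, `classify("paypal-login", "paypal")` carries both labels at once, whereas `classify("paypallogin", "paypal")` is a combosquat only.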
Scale of the problem
Large-scale DNS measurements have found over 2.3 million potential typosquatting names registered and resolving to IP addresses across common permutation techniques, with hyphenation a consistent contributor. Analysis of hundreds of prominent brands shows that defensive coverage remains inconsistent even among well-resourced organizations.
Hyphenation is a routine component of large-scale squatting campaigns, not a niche technique. When combined with keyword squatting or TLD squatting, the variant count for a single brand can grow into the hundreds. Abusive hyphenated domains also tend to stay active: research tracking combosquatting domains over multi-year periods found that close to 60% persist for more than 1,000 days once registered.
Real-world patterns
Hyphenation variants appear across several recurring attack scenarios:
- Phishing landing pages. Domains like `paypal-verify.com` or `microsoft-account-update.com` host credential-harvesting pages. The hyphenated structure mimics subservice naming, making the domain plausible in a phishing email that claims the recipient needs to "verify" or "update" something. Research on phishing infrastructure shows that attackers frequently obtain free TLS certificates (often from Let's Encrypt) for these domains, adding a padlock icon that reinforces perceived legitimacy.
- Malware distribution. Hyphenated variants of software-download domains can serve trojanized installers. Because the domain reads as a legitimate product subdomain, users may not hesitate to download the file.
- Brand impersonation at scale. Automated tools can generate hyphen-insertion variants alongside other permutation classes. An attacker targeting a 10-character brand can register up to 9 hyphen-insertion variants in each of dozens of TLDs, creating a broad net for inbound traffic.
- Reverse squatting on hyphenated brands. Companies that use hyphens in their primary domain face the additional risk of attackers registering the unhyphenated form. If `my-company.com` is the official site, `mycompany.com` may look equally legitimate to users unfamiliar with the brand's exact naming convention.
Detection and monitoring
Hyphenation variants are straightforward to enumerate algorithmically: for each adjacent character pair in the label, insert a hyphen and validate the result against DNS label rules. The reverse operation (removing each existing hyphen) is equally simple. This makes hyphenation one of the most deterministic permutation classes to generate and monitor, comparable in predictability to bitsquatting.
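The enumeration described above can be sketched in a few lines. This is a minimal illustration, not a production generator: `hyphenation_variants` is a hypothetical name, the code assumes the brand is the first label of the input (so `www.example.com` would need its registrable label extracted first), and it keeps double-hyphen outputs that are valid but unusual.

```python
def hyphenation_variants(domain: str) -> set[str]:
    """Generate hyphen-insertion and hyphen-removal variants of a domain's first label."""
    label, _, rest = domain.lower().partition(".")
    candidates = set()

    # Insertion: one hyphen between each adjacent character pair.
    for i in range(1, len(label)):
        candidates.add(label[:i] + "-" + label[i:])

    # Removal: drop each existing hyphen, one at a time.
    for i, ch in enumerate(label):
        if ch == "-":
            candidates.add(label[:i] + label[i + 1:])

    # Validate against the placement rules for hyphens in DNS labels.
    valid = {
        c for c in candidates
        if c
        and not c.startswith("-")
        and not c.endswith("-")
        and c[2:4] != "--"  # positions three and four are reserved
    }
    suffix = "." + rest if rest else ""
    return {v + suffix for v in valid}
```

A 10-character label yields 9 insertion variants, and a hyphenated input such as `t-mobile.com` additionally yields `tmobile.com`. The resulting set can be fed directly into registration or certificate lookups.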
When triaging alerts, several heuristics help separate high-risk variants from noise:
- Hyphenated variants of well-known single-word brands (`face-book.com`, `you-tube.com`) are almost certainly not legitimate.
- Variants that split a brand at a natural word boundary are more deceptive than arbitrary splits (`fac-ebook.com` is less plausible than `face-book.com`).
- Domains combining hyphenation with high-risk keywords (`brand-secure-login.com`) should be elevated in priority, as the combination of techniques indicates deliberate phishing domain construction.
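These heuristics can be turned into a rough priority score. The sketch below is only one possible encoding: `triage_score` and `HIGH_RISK_KEYWORDS` are hypothetical, the weights are arbitrary, and the brand's word boundaries are assumed to be supplied by the analyst as a list of words.

```python
# Assumption: an illustrative keyword list; tune it to the brand being monitored.
HIGH_RISK_KEYWORDS = {"login", "secure", "verify", "support", "account", "update"}


def triage_score(label: str, brand_words: list[str]) -> int:
    """Score a candidate label; a higher score means review sooner."""
    label = label.lower()
    words = [w.lower() for w in brand_words]
    brand = "".join(words)
    parts = label.split("-")
    if brand not in "".join(parts):
        return 0  # not a variant of this brand

    score = 0
    if len(words) > 1 and "-".join(words) in label:
        score += 2  # split at a natural word boundary: most deceptive
    elif brand not in label:
        score += 1  # brand split at an arbitrary position
    # Each high-risk keyword joined by hyphens raises the priority.
    score += 2 * sum(1 for part in parts if part in HIGH_RISK_KEYWORDS)
    return score
```

Under these weights, `face-book` outranks `fac-ebook` for the brand words `["face", "book"]`, and `brand-secure-login` scores 4 because it carries two high-risk keywords.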
Monitoring signals such as WHOIS and RDAP registration data, Certificate Transparency logs, and passive DNS records can surface newly registered hyphenation variants before they are used in active campaigns. Defensive registration of high-risk variants is practical given the bounded permutation count, though the overlap with combosquatting means the total variant space can expand quickly when keywords are factored in. For the full defensive playbook, see typosquatting protection.
Have I Been Squatted generates hyphenation permutations alongside omission, transposition, bitsquatting, and other lookalike domain categories for every monitored domain. Variants are checked against registration data and TLS certificate issuance automatically, surfacing newly registered hyphenation squats for investigation through domain monitoring.