What is vowel-swap typosquatting?
Vowel-swap typosquatting replaces vowels in a domain name with other vowels, producing variants that exploit spelling uncertainty, phonetic similarity, and the brain's tendency to prioritize consonants over vowels during reading. This guide covers the cognitive science behind the technique, its overlap with sound-squatting, and detection strategies.
What it is
Vowel-swap typosquatting is a permutation technique that replaces one or more vowels in a domain name with different vowels. The five English vowels (a, e, i, o, u) are substituted for each other, producing variants like:
| Domain | Vowel-swap variants |
|---|---|
| google.com | goagle.com, gougle.com, giigle.com |
| amazon.com | amezon.com, amizon.com, amuzon.com |
| facebook.com | facabook.com, fecebook.com |
Unlike adjacent-key substitutions or transpositions, vowel swaps rarely result from mechanical keyboard errors. They exploit a different failure mode: spelling uncertainty. A user who is unsure whether the correct spelling is amazen or amazon may type a plausible but incorrect vowel and land on an attacker-controlled domain.
Why the brain overlooks vowel changes
Vowel-swap domains are effective for reasons rooted in how the visual system processes written language.
Consonants dominate word identification. Experimental research on reading has shown that consonants contribute more to the initial stages of visual word recognition than vowels do. Delaying the appearance of a consonant by 30 milliseconds increases gaze duration on a word more than delaying a vowel by the same interval. The brain appears to weight consonant positions more heavily when recognizing words, which means a vowel change is less likely to trigger a recognition failure during rapid reading.
Word shape is preserved. Vowels occupy similar vertical space in most typefaces. Swapping one vowel for another rarely alters ascenders, descenders, or overall word contour. In contrast, consonant substitutions (changing d to k, for instance) can visibly distort the word shape that skilled readers use for fast recognition. A domain like goagle.com retains the silhouette of google.com in ways that goqgle.com does not.
Spelling uncertainty is high for vowels. English vowels are notoriously inconsistent. The letter a alone maps to different sounds in "cat", "cake", "care", and "call". Research on spelling acquisition shows that vowel spellings are harder to learn and more error-prone than consonant spellings, because the mapping between sound and letter is less predictable for vowels. Spellers rely on surrounding consonant context to disambiguate vowel spellings, a strategy unavailable when reading a standalone domain label. This spelling-level ambiguity makes vowel-swapped domains feel plausible even to attentive readers.
Phonetic similarity and sound-squatting
Vowel-swap domains often sound identical or nearly identical to the original when spoken aloud. The pair amizon.com and amazon.com differ by a single unstressed vowel that many English speakers would reduce to a schwa in casual speech, making the two effectively homophonous.
This property makes vowel swaps particularly dangerous in voice-based social engineering. In vishing calls, a caller who directs a victim to "go to amizon dot com" exploits the fact that the vowel difference is imperceptible in spoken language. The overlap between vowel-swap typosquatting and sound-squatting (registering domains that are homophones of legitimate ones) is substantial. Approximately 15% of identified sound-squatting domains in the wild have associated TLS certificates, a higher rate than other squatting types, suggesting active use in phishing campaigns.
The rise of voice assistants and podcast advertising has expanded the attack surface. When a URL is communicated audibly rather than displayed visually, vowel distinctions collapse entirely, and even careful listeners cannot distinguish between certain vowel-swap variants.
Difference from adjacent-key errors
The standard model of typosquatting assumes a user's finger slips to a neighboring key. Vowel swaps do not fit this pattern cleanly. On a QWERTY keyboard, a sits far from e, i, o, and u, and e is not adjacent to any other vowel; only u, i, and o sit side by side on the top row. A domain like amezon.com is unlikely to arise from a slip of the finger; it arises from a slip of the mind.
This distinction matters for threat modeling. Adjacent-key, omission, and addition errors correlate with typing speed and are more common on mobile keyboards where touch targets are small. Vowel-swap errors, by contrast, correlate with linguistic uncertainty and are more common when a user attempts to spell a brand name from memory rather than clicking a link or using a bookmark. Permutation tools generate both error classes, but the two originate from different cognitive mechanisms.
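A rough way to see which vowel pairs could also arise from a finger slip is to model the keyboard as a grid. The sketch below is illustrative only: the row offsets are an assumed approximation of a staggered QWERTY layout, and `are_adjacent` is a hypothetical helper, not part of any permutation tool.

```python
from itertools import combinations

# Approximate QWERTY geometry: each row is shifted right by a fraction of a key.
# The offsets are assumptions about a typical staggered layout.
ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]

# Map each key to (row index, horizontal position).
KEY_POS = {
    key: (row_index, col + offset)
    for row_index, (keys, offset) in enumerate(ROWS)
    for col, key in enumerate(keys)
}

def are_adjacent(a: str, b: str) -> bool:
    """True if two distinct keys sit in the same or a neighbouring row,
    at most one key width apart horizontally."""
    (row_a, x_a), (row_b, x_b) = KEY_POS[a], KEY_POS[b]
    return a != b and abs(row_a - row_b) <= 1 and abs(x_a - x_b) <= 1.0

adjacent_vowel_pairs = [p for p in combinations("aeiou", 2) if are_adjacent(*p)]
print(adjacent_vowel_pairs)  # [('i', 'o'), ('i', 'u')]
```

Under this model only two of the ten vowel pairs are neighbours, so most vowel swaps cannot be explained by a mechanical slip.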
Real-world activity
Vowel-swap domains appear routinely in domain abuse campaigns. Threat intelligence teams observe tens of thousands of squatting domains registered per month, with vowel substitutions and combosquatting among the most common generation techniques. Many of these domains serve credential-harvesting pages with valid HTTPS certificates and pixel-perfect replicas of login screens, with lifespans measured in hours rather than days.
Vowel swaps targeting brand names with irregular or foreign-origin spellings are especially effective. Brands like anthropic, rakuten, or lyft present ambiguous vowel positions that even literate English speakers second-guess. The closer a brand name sits to a common English word pattern, the more plausible a vowel-swap variant appears. A domain like anthrapic.com exploits the fact that the vowel in the second syllable of "anthropic" is not obvious from pronunciation alone.
Permutation count
For a domain label with v vowel positions, each position can be replaced by 4 other vowels, producing v × 4 single-swap variants. A domain like facebook has 4 vowel positions (a, e, o, o), generating 16 single-vowel-swap variants. Multi-vowel swaps (changing two or more positions simultaneously) increase the count exponentially but produce increasingly implausible strings unlikely to receive traffic.
Have I Been Squatted's twistrs library includes vowel swap as a standard permutation algorithm alongside homoglyph, hyphenation, and TLD squatting generators. Most engines generate only single-vowel swaps, keeping the candidate list manageable and focused on the most plausible variants.
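The arithmetic can be checked with a short sketch. It assumes only the five vowels named earlier (y is not counted), and the function names are illustrative rather than taken from twistrs or any other library. Allowing any combination of positions to change gives 5^v − 1 variants, since each vowel position has five choices and the unchanged original is excluded.

```python
VOWELS = "aeiou"

def vowel_positions(label: str) -> list[int]:
    """Indexes of the characters in the label that are vowels."""
    return [i for i, ch in enumerate(label) if ch in VOWELS]

def single_swap_count(label: str) -> int:
    """Each vowel position can take 4 other vowels: v * 4."""
    return 4 * len(vowel_positions(label))

def all_swap_count(label: str) -> int:
    """Any subset of positions may change: 5**v minus the original."""
    return 5 ** len(vowel_positions(label)) - 1

for label in ("facebook", "google", "amazon"):
    print(label, single_swap_count(label), all_swap_count(label))
# facebook 16 624
# google 12 124
# amazon 12 124
```

The gap between 16 and 624 for facebook illustrates why restricting generation to single swaps keeps the candidate list small.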
Vowel swaps in non-Latin scripts
Vowel-swap techniques extend beyond ASCII. In internationalized domain names (IDNs), languages with larger vowel inventories present additional substitution opportunities. German has umlauted vowels (ä, ö, ü), Spanish distinguishes accented from unaccented vowels, and Arabic script uses diacritics for short vowels that are frequently omitted. Each additional vowel character expands the permutation space in ways that ASCII-only tools may miss. IDN homograph attacks can compound with vowel swaps when visually similar Unicode characters are available for vowel positions.
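As a minimal sketch of how umlaut variants could be added to a permutation set, the example below swaps each a, o, or u for its umlauted counterpart and converts the result to its ASCII (Punycode) form. The `UMLAUTS` mapping and helper names are assumptions made for illustration. The conversion relies on Python's built-in `idna` codec, which implements the older IDNA 2003 rules; a production tool may need a different IDNA implementation.

```python
# Illustrative mapping from ASCII vowels to German umlauted vowels.
UMLAUTS = {"a": "ä", "o": "ö", "u": "ü"}

def umlaut_variants(label: str) -> list[str]:
    """Replace one vowel at a time with its umlauted counterpart."""
    variants = []
    for i, ch in enumerate(label):
        if ch in UMLAUTS:
            variants.append(label[:i] + UMLAUTS[ch] + label[i + 1:])
    return variants

def to_ascii(label: str) -> str:
    """Encode a single label to the xn-- form that appears in DNS."""
    return label.encode("idna").decode("ascii")

for variant in umlaut_variants("google"):
    print(variant, to_ascii(variant))
```

Monitoring that matches only on the Unicode form or only on the `xn--` form will miss the other, so it is worth storing both.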
Detection and monitoring
Vowel-swap domains can be enumerated systematically: extract the vowel positions from a domain label, substitute each with the four alternative vowels, and filter the results against DNS zone data to find registered variants. The permutation set is modest, typically fewer than 30 candidates for a standard domain, making comprehensive monitoring practical.
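The three steps can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation used by any particular tool: the `registered` set is a hypothetical stand-in for zone-file or passive DNS data, and the function swaps vowels only in the first label of the domain.

```python
VOWELS = "aeiou"

def single_vowel_swaps(domain: str) -> list[str]:
    """Generate every variant that changes exactly one vowel in the first label."""
    label, sep, rest = domain.partition(".")
    variants = []
    for i, ch in enumerate(label):
        if ch in VOWELS:
            for vowel in VOWELS:
                if vowel != ch:
                    variants.append(label[:i] + vowel + label[i + 1:] + sep + rest)
    return variants

# Hypothetical registration data; a real pipeline would query zone files,
# passive DNS, or RDAP instead.
registered = {"amezon.com", "amazan.com", "example.net"}

candidates = single_vowel_swaps("amazon.com")
hits = sorted(set(candidates) & registered)
print(len(candidates), hits)  # 12 ['amazan.com', 'amezon.com']
```

Because each variant differs from the original at exactly one position, the generator never produces duplicates, so the candidate count always equals four times the number of vowel positions.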
Key signals for identifying active threats include:
- WHOIS and RDAP registration data. New registrations matching vowel-swap permutations indicate speculative or malicious intent.
- Certificate Transparency logs. A vowel-swap domain that obtains a TLS certificate is likely preparing to serve content or intercept connections.
- Levenshtein distance filtering. Vowel swaps produce an edit distance of 1 per substitution, placing them in the highest-risk tier for lookalike domain scoring.
- Phonetic matching. Algorithms like Soundex and Metaphone can flag domains that sound like a monitored brand, catching vowel swaps that double as homophones.
Since vowel-swap domains often double as sound-squatting candidates, monitoring programs that address only visual similarity may miss domains that are phonetically identical to the target. Combining visual and phonetic detection provides broader coverage.
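To illustrate how the visual and phonetic signals complement each other, the sketch below pairs a textbook Levenshtein distance with a simplified American Soundex. Both functions are written from scratch for illustration and are assumptions about how such checks might look; they are not the scoring used by any specific monitoring product.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

# Consonant groups used by American Soundex; vowels, h, w, and y have no code.
SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}

def soundex(word: str) -> str:
    """Simplified American Soundex: first letter plus up to three digits."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    encoded = word[0].upper()
    last_code = SOUNDEX_CODES.get(word[0], "")
    for ch in word[1:]:
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != last_code:
            encoded += code
        if ch not in "hw":  # h and w do not separate repeated codes
            last_code = code
    return (encoded + "000")[:4]

brand = "amazon"
for candidate in ("amizon", "amaton", "emazon"):
    print(candidate, levenshtein(brand, candidate), soundex(candidate) == soundex(brand))
# amizon 1 True
# amaton 1 False
# emazon 1 False
```

All three candidates sit at edit distance 1, but only the interior vowel swap shares the brand's Soundex code. The last row also shows a limitation: Soundex keeps the first letter verbatim, so a swap of the leading vowel escapes it and needs the edit-distance check or a different phonetic algorithm.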
Defensive strategies
Defensive registration of high-risk vowel-swap variants is practical given the small permutation count. For a domain with three vowel positions, only 12 single-swap variants exist, and registering the most plausible candidates costs less than investigating a single brand impersonation incident.
For variants that cannot be preemptively registered, continuous domain monitoring combined with phishing domain detection provides early warning. Automated scanning of Certificate Transparency logs and passive DNS data can surface newly registered vowel-swap domains within hours of activation.
Have I Been Squatted includes vowel-swap permutations in its monitoring set alongside omission, transposition, bitsquatting, and other lookalike domain categories. The platform generates both visual and phonetic variants for monitored domains, checking each against registration data, certificate logs, and DNS resolution to surface threats before they reach end users.