Unicode domain phishing attacks fool people into visiting copycat versions of well-known websites in order to obtain sensitive personal information. The attack works by registering a domain name made of Unicode characters that look like regular ASCII characters (the characters most people in the West are used to seeing). The address shown in the address bar therefore resembles that of the website the victim wants to visit, but it actually leads to the copycat site.

For example, although the “a” of the Latin alphabet (U+0061, which is part of ASCII) and the “а” of the Cyrillic alphabet (U+0430) look the same in most fonts, browsers treat them differently because they are not the same character. Many other characters likewise look very similar and can be substituted for one another. Since Unicode contains over 136,000 characters from 139 different scripts, lookalikes are easy to find.
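As a small illustration of the point above, the following Python sketch uses the standard unicodedata module to show that the two letters are distinct characters even though they look alike. The names latin_a, cyrillic_a and describe are chosen here for illustration only.

```python
import unicodedata

latin_a = "\u0061"     # the ASCII letter "a"
cyrillic_a = "\u0430"  # the Cyrillic lookalike


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(latin_a))
print(describe(cyrillic_a))
print("Same character?", latin_a == cyrillic_a)
```

Because the comparison is made on code points rather than on appearance, software sees two different strings where a person sees one.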

Although browsers are set up to make it clear when non-Latin characters are used in a URL, this safeguard failed when every character of the domain name was replaced by a lookalike non-Latin character. For example, the page at https://www.xn--80ak6aa92e.com/ is a proof-of-concept website demonstrating how someone could set up a site whose address appears to be apple.com but is not. None of the five letters of the word “apple” in the address is a Latin character, despite appearing to be one.
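The xn-- prefix in that address marks a label encoded with Punycode, the ASCII form in which internationalized domain names are stored. As a rough sketch, the label can be decoded with Python's built-in punycode codec to reveal the characters behind it. The helpers decode_label and scripts_in are hypothetical names introduced here, and taking the first word of a character's Unicode name is only an approximation of its script, not how browsers classify characters.

```python
import unicodedata

ACE_PREFIX = "xn--"  # marks a Punycode-encoded label


def decode_label(label: str) -> str:
    """Return the Unicode form of a single domain label (illustrative helper)."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def scripts_in(text: str) -> set:
    """Approximate the scripts used, taken from the first word of each character name."""
    return {unicodedata.name(ch).split()[0] for ch in text}


decoded = decode_label("xn--80ak6aa92e")
for ch in decoded:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
print(scripts_in(decoded))
```

Every decoded character belongs to the Cyrillic block, so a check that only looks for a mixture of scripts within one label finds nothing unusual.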

IDN Safe is a browser add-on that checks whether a URL's domain name uses such alternative characters and, if so, blocks the page. However, it may also block many legitimate websites whose addresses use non-Latin characters.
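The trade-off can be seen in a minimal sketch of this kind of check. This is not IDN Safe's actual code, which is not described here. It simply assumes a blocker that flags any hostname containing a Punycode label or a non-ASCII character, and the function name uses_idn and the example URLs other than the proof-of-concept address are hypothetical.

```python
from urllib.parse import urlsplit


def uses_idn(url: str) -> bool:
    """Return True if any label of the URL's hostname is internationalized."""
    host = urlsplit(url).hostname or ""
    return any(
        label.startswith("xn--") or not label.isascii()
        for label in host.split(".")
    )


print(uses_idn("https://www.apple.com/"))           # plain ASCII hostname
print(uses_idn("https://www.xn--80ak6aa92e.com/"))  # the proof-of-concept address
print(uses_idn("https://пример.example/"))          # a harmless non-Latin name
```

A rule this simple cannot tell a deceptive name from a legitimate one written in a non-Latin script, so both are flagged.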

Additional Information

  • The attack described above, in which every character of the domain name is replaced, was first demonstrated by Xudong Zheng in April 2017.


Last updated: 1 June 2018