
IDN homograph attack



The internationalized domain name (IDN) homograph attack is a means by which a malicious party may seek to deceive computer users about what remote system they are communicating with, by exploiting the fact that many different characters may have nearly (or wholly) indistinguishable glyphs.

Homographs

In multilingual computer systems, different logical characters may have identical or very similar appearances. For example, Unicode character U+0430, Cyrillic small letter a ("а"), can look identical to Unicode character U+0061, Latin small letter a ("a"), which is the lowercase "a" used in English. Characters that look alike in this way are loosely known as homographs (strictly, homoglyphs). Spoofing attacks based on these similarities are known as homograph spoofing attacks.

The problem arises from the difference between how the characters are treated in the user's mind and how they are treated by the computer's programming. From the viewpoint of the user, a Cyrillic "а" within a Latin string is a Latin "a"; in most fonts there is literally no difference between the glyphs for these characters. However, the computer treats them as distinct characters when processing the string as an identifier. Thus, the user's assumption of a one-to-one correspondence between the visual appearance of a name and the named entity breaks down.
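The gap between what the user sees and what the computer compares can be illustrated with a minimal Python sketch. The variable names are illustrative only; the code relies on nothing beyond the standard `unicodedata` module.

```python
import unicodedata

latin_a = "\u0061"     # Latin small letter a
cyrillic_a = "\u0430"  # Cyrillic small letter a

# The two characters render alike but carry different code points and names.
for ch in (latin_a, cyrillic_a):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

genuine = "paypal.com"
spoofed = "p" + cyrillic_a + "ypal.com"

# Same length, similar appearance, yet the strings are not equal.
print(len(genuine) == len(spoofed))  # True
print(genuine == spoofed)            # False

# The underlying bytes differ as well.
print(genuine.encode("utf-8"))
print(spoofed.encode("utf-8"))
```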

In a typical example of a hypothetical attack, someone could register a domain name that appears identical to an existing domain but leads somewhere else. For example, the spoofed domain "pаypal.com" contains a Cyrillic a, not a Latin a. In many ways, this is not new. Even within the old character set of A-Z, 0-9 and hyphen, G00GLE.COM looks much like GOOGLE.COM in some fonts; using a mix of uppercase and lowercase characters, googIe.com (capital I, not small ell) looks much like google.com in some fonts; and, in lowercase alone, rnicrosoft.com ("RNICROSOFT.COM") looks very much like microsoft.com in many fonts. What is new is that the internationalized domain name system expanded the character repertoire from a few dozen characters in a single alphabet to many thousands of characters in many scripts, greatly increasing the scope for homograph attacks.

Homographs in internationalized domain names

The limitation of domain names to ASCII characters, however, is very unlikely to last forever. Why should a Russian newspaper's website have to live at gazeta.ru rather than газета.ру? The mechanism known as Internationalizing Domain Names in Applications (IDNA) provides a backward-compatible way for domain names to use the full Unicode character set, and it is already widely supported.
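The backward compatibility comes from converting each non-ASCII label into an ASCII-only form that existing DNS software can carry. The following is a small sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules; current browsers may apply newer rules, so this should be read as an illustration rather than a reference implementation.

```python
# The example name from the text, written in Cyrillic.
unicode_name = "газета.ру"

# Each label is converted separately to an ASCII-compatible form.
ascii_name = unicode_name.encode("idna")
print(ascii_name)

# The conversion is reversible, so applications can show the Unicode form.
print(ascii_name.decode("idna"))
```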

But now look again at that Russian domain name. The Cyrillic letters а, е, р and у are visually indistinguishable from the Latin letters a, e, p and y. Some of these pairs (such as а and a) are related etymologically, while others look alike by sheer coincidence. For instance, Cyrillic р is pronounced like English r, yet its glyph is identical to that of Latin p.

This opens a rich vein of opportunities for phishing and other varieties of fraud. An attacker registers a domain name that looks just like that of a major bank, but in which some of the letters have been replaced by homographs from the Cyrillic or Greek alphabet; sends out e-mail messages purporting to come from personnel at the bank, directing people to the bogus site; and steals their account details while passing traffic through to the real bank's site. The victims may not notice the difference until the money disappears from their accounts.

Defending against the attack

The simplest defence is for web browsers not to support IDNA or other similar mechanisms, or for users to turn off whatever support their browsers have. That could mean either blocking access to IDNA sites, or permitting access but displaying URLs in Punycode. Either way, this amounts to abandoning non-ASCII domain names.
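Displaying the Punycode form makes a spoofed name visibly different from the genuine one. A minimal sketch follows, assuming a hypothetical helper `display_host` and again using Python's built-in `idna` codec (IDNA 2003); it is not taken from any particular browser.

```python
def display_host(host: str) -> str:
    """Return the ASCII (Punycode) form of a host name for display.

    Hypothetical helper for illustration only.
    """
    return host.encode("idna").decode("ascii")

genuine = "paypal.com"
spoofed = "p\u0430ypal.com"  # second character is Cyrillic U+0430

# A pure ASCII name is shown unchanged; the spoofed name gains an "xn--" label.
print(display_host(genuine))
print(display_host(spoofed))
```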

The Opera web browser has adopted a compromise: specific sites can be "whitelisted", allowing their domain names to be shown in internationalized form, and Latin-1 characters are allowed unconditionally.
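A compromise of this kind can be sketched as a display policy. The function name `shown_form`, the whitelist contents and the example host names below are hypothetical; this is an assumption-laden illustration of the rule as described above, not Opera's actual code.

```python
def shown_form(host: str, whitelist: frozenset) -> str:
    """Decide how a host name is displayed (illustrative policy only)."""
    # Whitelisted sites are shown in internationalized form.
    if host in whitelist:
        return host
    # Names made only of Latin-1 characters (up to U+00FF) are allowed.
    if all(ord(ch) <= 0xFF for ch in host):
        return host
    # Everything else falls back to the ASCII (Punycode) form.
    return host.encode("idna").decode("ascii")

trusted = frozenset({"газета.ру"})  # hypothetical whitelist entry

print(shown_form("caf\u00e9.example", trusted))  # Latin-1 only: shown as typed
print(shown_form("газета.ру", trusted))          # whitelisted: shown as typed
print(shown_form("p\u0430ypal.com", trusted))    # neither: shown as Punycode
```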

Another possible defence would be for web browsers to display non-ASCII characters in URLs distinctively, perhaps by changing their colour or that of their background. However, this would not protect against spoofing in which one non-ASCII character is replaced by another similar-looking one.
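The idea of distinctive display can be approximated in plain text by bracketing every non-ASCII character, standing in for a change of colour. The helper name `mark_non_ascii` is hypothetical, and this is only a sketch of the principle.

```python
def mark_non_ascii(host: str) -> str:
    """Wrap each non-ASCII character in brackets (stand-in for colouring)."""
    return "".join(ch if ord(ch) < 128 else f"[{ch}]" for ch in host)

print(mark_non_ascii("paypal.com"))       # -> paypal.com
print(mark_non_ascii("p\u0430ypal.com"))  # -> p[а]ypal.com
```

In a name written entirely in a non-Latin script every letter is marked, so swapping one such letter for a lookalike leaves the marking unchanged.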

The highlighting approach was adopted, as of July 9, 2005, by the Quero Toolbar plug-in for Internet Explorer. Besides IDN highlighting, Quero has implemented several other techniques to mitigate IDN spoofing attacks, such as mixed-script and missing-glyph detection, IDN/digit indication and "core domain" highlighting.
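Mixed-script detection can be sketched as follows. Python's standard library does not expose the Unicode Script property, so this example assumes the first word of each character's Unicode name (for example "LATIN" or "CYRILLIC") is an acceptable rough proxy. The function names are hypothetical, and the sketch makes no claim to reflect how Quero implements its check.

```python
import unicodedata

def scripts_in_label(label: str) -> set:
    """Collect a rough script tag for every letter in one domain label."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # First word of the character name, e.g. "LATIN" or "CYRILLIC".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(host: str) -> bool:
    """Report whether any single label mixes letters from several scripts."""
    return any(len(scripts_in_label(label)) > 1 for label in host.split("."))

print(is_mixed_script("paypal.com"))       # False
print(is_mixed_script("p\u0430ypal.com"))  # True
print(is_mixed_script("газета.ру"))        # False
```

A name written wholly in one script passes this check, so it catches only substitutions that mix scripts within a label.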

There is not yet (as of March 2005) a clear consensus as to the best way to balance the needs of the international community with protection against domain-name spoofing.



From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License. See Wikipedia Copyrights for details.
