ToASCII and ToUnicode

The conversions between ASCII and non-ASCII forms of a domain name are accomplished by algorithms called ToASCII and ToUnicode. These algorithms are not applied to the domain name as a whole, but rather to individual labels. For example, if the domain name is www.example.com, then the labels are www, example, and com. ToASCII or ToUnicode is applied to each of these three labels separately.
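As a minimal sketch of this per-label processing, the Python below splits a name on the ASCII full stop and applies a conversion function to each label in turn. The helper names `convert_domain` and `to_ascii_label` are hypothetical, and the sketch assumes Python's standard `encodings.idna` module as the label converter. It ignores the other dot characters that RFC 3490 also treats as separators and does not handle a trailing dot.

```python
from encodings import idna


def convert_domain(domain, convert_label):
    """Apply convert_label to each dot-separated label (hypothetical helper)."""
    labels = domain.split(".")  # "www.example.com" -> ["www", "example", "com"]
    return ".".join(convert_label(label) for label in labels)


def to_ascii_label(label):
    # idna.ToASCII returns bytes; decode so the labels can be joined as text.
    return idna.ToASCII(label).decode("ascii")


print("www.example.com".split("."))
print(convert_domain("www.bücher.example", to_ascii_label))
```

Only the label that contains a non-ASCII character changes; the other labels pass through untouched.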

The details of these two algorithms are complex, and are specified in RFC 3490. The following gives an overview of their function.

ToASCII leaves unchanged any ASCII label, but will fail if the label is unsuitable for the Domain Name System. If given a label containing at least one non-ASCII character, ToASCII will apply the Nameprep algorithm, which converts the label to lowercase and performs other normalization, and will then translate the result to ASCII using Punycode before prepending the four-character string "xn--". This four-character string is called the ASCII Compatible Encoding (ACE) prefix, and is used to distinguish Punycode-encoded labels from ordinary ASCII labels. The ToASCII algorithm can fail in several ways; for example, the final string could exceed the 63-character limit of a DNS label. A label for which ToASCII fails cannot be used in an internationalized domain name.
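The behaviour described above can be tried out with the `ToASCII` function in Python's standard `encodings.idna` module. This is only an illustration under that assumption: the wrapper `try_to_ascii` is a hypothetical helper, and the standard library's implementation may not enforce every suitability check that a DNS registry would apply.

```python
from encodings import idna

ACE_PREFIX = "xn--"


def try_to_ascii(label):
    """Return the ASCII form of label, or None if ToASCII fails (hypothetical helper)."""
    try:
        return idna.ToASCII(label).decode("ascii")
    except UnicodeError:
        # Raised, for example, when the result would exceed 63 characters.
        return None


print(try_to_ascii("example"))   # ASCII label: returned unchanged
print(try_to_ascii("Bücher"))    # Nameprep lowercases, then Punycode plus ACE prefix
print(try_to_ascii("ü" * 100))   # encoded form is too long, so conversion fails
```

Returning `None` on failure is a design choice of this sketch; a caller could equally let the exception propagate and reject the whole domain name.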

The function ToUnicode reverses the action of ToASCII, stripping off the ACE prefix and applying the Punycode decode algorithm. It does not reverse the Nameprep processing, since that is merely a normalization and is by nature irreversible. Unlike ToASCII, ToUnicode always succeeds, because it simply returns the original string if decoding fails. In particular, this means that ToUnicode has no effect on a string that does not begin with the ACE prefix.
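The "always succeeds" behaviour can be sketched in Python as follows. Note the assumption: the `ToUnicode` function in Python's standard `encodings.idna` module raises `UnicodeError` when decoding fails rather than returning its input, so the hypothetical wrapper `to_unicode_lenient` adds the fallback described above.

```python
from encodings import idna


def to_unicode_lenient(label):
    """Decode an ACE label, returning the input unchanged on failure (hypothetical helper)."""
    try:
        return idna.ToUnicode(label)
    except UnicodeError:
        # Fall back to the original string, as the text describes.
        return label


print(to_unicode_lenient("xn--bcher-kva"))  # valid ACE label is decoded
print(to_unicode_lenient("example"))        # no ACE prefix: unchanged
print(to_unicode_lenient("xn--!!!"))        # invalid Punycode: unchanged

# Nameprep is not reversed: the capital letter is lost in the round trip.
ace = idna.ToASCII("Bücher").decode("ascii")
print(to_unicode_lenient(ace))
```

The last call shows why the normalization is irreversible: nothing in the encoded label records that the original began with a capital letter.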





© 2010 | Privacy Policy | Powered By Noomle.com | SiteMap