Internet Draft                                                 M. Duerst
<draft-duerst-dns-i18n-02.txt>                           Keio University
Expires in six months                                          July 1998


                  Internationalization of Domain Names

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months. Internet-Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use
Internet-Drafts as reference material or to cite them other than as a
"working draft" or "work in progress".

To learn the current status of any Internet-Draft, please check the
1id-abstracts.txt listing contained in the Internet-Drafts Shadow
Directories on ftp.ietf.org (US East Coast), nic.nordu.net
(Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific
Rim).

Distribution of this document is unlimited. Please send comments to
the author at <mduerst@w3.org>.

Abstract

Internet domain names are currently limited to a very restricted
character set. This document proposes the introduction of a new
"zero-level" domain (ZLD) to allow the use of arbitrary characters
from the Universal Character Set (ISO 10646/Unicode) in domain names.
The proposal is fully backwards compatible and does not need any
changes to DNS. Version 02 is reissued without changes just to
keep this draft available.

0. Change History ................................................. 2
  0.8 Changes Made from Version 01 to Version 02 .................. 2
  0.9 Changes Made from Version 00 to Version 01 .................. 2
1. Introduction ................................................... 3
  1.1 Motivation .................................................. 3

Expires End of January 1998                                     [Page 1]

Internet Draft    Internationalization of Domain Names         July 1997

  1.2 Notational Conventions ...................................... 4
2. The Hidden Zero Level Domain ................................... 4
3. Encoding International Characters .............................. 5
  3.1 Encoding Requirements ....................................... 5
  3.2 Encoding Definition ......................................... 5
  3.3 Encoding Example ............................................ 7
  3.4 Length Considerations ....................................... 8
4. Usage Considerations ........................................... 8
  4.1 General Usage ............................................... 8
  4.2 Usage Restrictions .......................................... 9
  4.3 Domain Name Creation ....................................... 10
  4.4 Usage in URLs .............................................. 12
5. Alternate Proposals ........................................... 13
  5.1 The Dillon Proposal ........................................ 13
  5.2 Using a Separate Lookup Service ............................ 13
6. Generic Considerations ........................................ 14
  6.1 Security Considerations .................................... 14
  6.2 Internationalization Considerations ........................ 14
Acknowledgements ................................................. 14
Bibliography ..................................................... 15
Author's Address .................................................

0. Change History

0.8 Changes Made from Version 01 to Version 02

No significant changes; reissued to make it available officially.
Changed author's address.

Changes deferred to future versions (if ever):
- Decide on ZLD name (.i or .i18n.int or something else)
- Decide on casing solution
- Decide on exact syntax
- Proposals for experimental setup

0.9 Changes Made from Version 00 to Version 01

- Minor rewrites and clarifications

- Added the following references: [RFC1530], [Kle96], [ISO3166],

- Slightly expanded discussion about casing

- Added some variant proposals for syntax

- Added some explanations about different kinds of name parallelism

- Added some explanation about independent addition of
  internationalized names in subdomains without bothering higher-level
  domains

- Added some explanations about tools needed for support, and the

- Change to RFC1123 (numbers allowed at beginning of labels)

1. Introduction

1.1 Motivation

The lower layers of the Internet do not discriminate against any
language or script. On the application level, however, the historical
dominance of the US and the ASCII character set [ASCII] as a lowest
common denominator has led to limitations. The process of removing
these limitations is called internationalization (abbreviated i18n).
One example of the abovementioned limitations is domain names
[RFC1034, RFC1035], where only the letters of the basic Latin alphabet
(case-insensitive), the decimal digits, and the hyphen are allowed.

While such restrictions are convenient if a domain name is intended
to be used by arbitrary people around the globe, there may be very
good reasons for using aliases that are easier to remember or type
in a local context. This is similar to traditional mail addresses,
where both local scripts and conventions and the Latin script can be
used.

There are many good reasons for domain name i18n, and some arguments
that are brought forward against such an extension. This document,
however, does not discuss the pros and cons of domain name i18n. It
proposes and discusses a solution and therefore eliminates one of the
most often heard arguments against it, namely "it cannot be done".

The solution proposed in this document consists of the introduction
of a new "zero-level" domain building the root of a new domain
branch, and an encoding of the Universal Character Set (UCS)
[ISO10646] into the limited character set of domain names.

1.2 Notational Conventions

In the domain name examples in this document, characters of the basic
Latin alphabet (expressible in ASCII) are denoted with lower case
letters. Upper case letters are used to represent characters outside
ASCII, such as accented characters of the Latin alphabet, characters
of other alphabets and syllabaries, ideographic characters, and vari-

2. The Hidden Zero Level Domain

The domain name system uses the domain "in-addr.arpa" to convert
internet addresses back to domain names. One way to view this is to
say that in-addr.arpa forms the root of a separate hierarchy. This
hierarchy has been made part of the main domain name hierarchy just
for implementation convenience. While syntactically, in-addr.arpa is
a second level domain (SLD), functionally it is a zero level domain
(ZLD) in the same way as "." is a ZLD. A similar example of a ZLD is
the domain tpc.int, which provides a hierarchy of the global phone
numbering system [RFC1530] for services such as paging and printing

For domain name i18n to work inside the tight restrictions of domain
name syntax, one has to define an encoding that maps strings of UCS
characters to strings of characters allowable in domain names, and a
means to distinguish domain names that are the result of such an
encoding from ordinary domain names.

This document proposes to create a new ZLD to distinguish encoded
i18n domain names from traditional domain names. This domain would
be hidden from the user in the same way as a user does not see
in-addr.arpa. This domain could be called "i18n.arpa" (although the use
of arpa in this context is definitely not appropriate), simply
"i18n", or even just "i". Below, we are using "i" for shortness,
while we leave the decision on the actual name to further
discussion.

3. Encoding International Characters

3.1 Encoding Requirements

Until quite recently, the thought of going beyond ASCII for something
such as domain names failed because of the lack of a single
encompassing character set for the scripts and languages of the world.
Tagging techniques such as those used in MIME headers [RFC1522] would
be much too clumsy for domain names.

The definition of ISO 10646 [ISO10646], codepoint by codepoint
identical with Unicode [Unicode], provides a single Universal Character
Set (UCS). A recent report [RFCIAB] clearly recommends basing the
i18n of the Internet on these standards.

An encoding for i18n domain names therefore has to take the
characters of ISO 10646/Unicode as a starting point. The full four-byte
(31 bit) form of UCS, called UCS4, should be used. A limitation to
the two-byte form (UCS2), which allows only for the encoding of the
Base Multilingual Plane, is too restrictive.

For the mapping between UCS4 and the strongly limited character set
of domain names, the following constraints have to be considered:

- The structure of domain names, and therefore the "dot", has to be
  preserved. Encoding is done for individual labels.

- Individual labels in domain names allow the basic Latin alphabet
  (monocase, 26 letters), decimal digits, and the "-" inside the
  label. The capacity per octet is therefore limited to somewhat
  more than 5 bits.

- There is neither a need nor a possibility to preserve any
  characters.

- Frequent characters (i.e. ASCII, alphabetic, UCS2, in that order)
  should be encoded relatively compactly. A variable-length encoding
  (similar to UTF-8) seems desirable.

3.2 Encoding Definition

Several encodings for UCS, so-called UCS Transform Formats, exist
already, namely UTF-8 [RFC2044], UTF-7 [RFC1642], and UTF-16
[Unicode]. Unfortunately, none of them is suitable for our purposes. We
therefore use the following encoding:

- To accommodate the slanted probability distribution of characters
  in UCS4, a variable-length encoding is used.

- Each target letter encodes 5 bits of information. Four bits of
  information encode character data, the fifth bit is used to
  indicate continuation of the variable-length encoding.

- Continuation is indicated by distinguishing the initial letter
  from the subsequent letter.

- Leading four-bit groups of binary value 0000 of UCS4 characters
  are discarded, except for the last TWO groups (i.e. the last
  octet). This means that ASCII and Latin-1 characters need two
  target letters, the main alphabets up to and including Tibetan
  need three target letters, the rest of the characters in the BMP
  need four target letters, all except the last (private) plane in
  the UTF-16/Surrogates area [Unicode] need five target letters, and

- The letters representing the various bit groups in the various
  positions are chosen according to the following table:
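
    Nibble Value      Initial      Subsequent
    -----------------------------------------
     0 (0000)            G             0
     1 (0001)            H             1
     2 (0010)            I             2
     3 (0011)            J             3
     4 (0100)            K             4
     5 (0101)            L             5
     6 (0110)            M             6
     7 (0111)            N             7
     8 (1000)            O             8
     9 (1001)            P             9
     A (1010)            Q             A
     B (1011)            R             B
     C (1100)            S             C
     D (1101)            T             D
     E (1110)            U             E
     F (1111)            V             F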

[Should we try to eliminate "I" and "O" from initial? "I" might be
eliminated because then an algorithm can more easily detect ".i". "O"
could lead to some confusion with "0". What other protocols are
there that might be able to use a similar solution, but that might
have other restrictions for the initial letters? Proposal to run
initial range from H to X. Extracting the initial bits then becomes ^
'H'. Proposal to have a special convention for all-ASCII labels
(start label with one of the letters not used above).]

Please note that this solution has the following interesting
properties:

- For subsequent positions, there is an equivalence between the
  hexadecimal value of the character code and the target letter used.
  This ensures easy conversion and checking.

- The absence of digits from the "initial" column, and the fact that
  the hyphen is not used, ensure that the resulting string conforms
  to domain name syntax.

- Raw sorting of encoded and unencoded domain names is equivalent.

- The boundaries of characters can always be detected easily.
  (While this is important for representations that are used
  internally for text editing, it is actually not very important here,
  because tools for editing can be assumed to use a more
  straightforward representation internally.)

- Unless control characters are allowed, the target string will
  never actually contain a G.
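
The encoding rules of this section can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not
part of the proposal: the function names encode_char and encode_label
are hypothetical, and the letter assignment (initial letters G to V
for nibble values 0 to 15, subsequent letters equal to the hexadecimal
digits) is inferred from the examples in this document, where U+60C5
becomes M0C5 and "fr" becomes M6N2.

```python
# Assumed letter table: initial letters G..V stand for the nibble
# values 0..15. Subsequent letters are the hexadecimal digits.
INITIAL = "GHIJKLMNOPQRSTUV"


def encode_char(ch: str) -> str:
    """Encode one UCS character into its target letters."""
    # Hexadecimal digits with leading zero nibbles discarded, but
    # never fewer than two digits (the last octet is always kept).
    nibbles = format(ord(ch), "X").rjust(2, "0")
    # The first nibble is taken from the "initial" column; the
    # remaining nibbles already equal the "subsequent" column.
    return INITIAL[int(nibbles[0], 16)] + nibbles[1:]


def encode_label(label: str) -> str:
    """Encode one label (the text between two dots)."""
    return "".join(encode_char(ch) for ch in label)


if __name__ == "__main__":
    print(encode_label("\u60c5\u5831"))  # M0C5L831
    print(encode_label("fr"))            # M6N2
```

Because the subsequent letters coincide with the hexadecimal digits,
only the first nibble of each character needs a table lookup.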

3.3 Encoding Example

As an example, the current domain

with the components standing for information science, science, the
University of Tokyo, academic, and Japan, might in future be
represented by

    JOUHOU.RI.TOUDAI.GAKU.NIHON

(a transliteration of the kanji that would probably be chosen to
represent the same domain). Writing each character in U+HHHH notation
as in [Unicode], this results in the following (given for reference
only, not the actual encoding or something being typed in by the
user):

    U+60c5U+5831.U+7406.U+6771U+5927.U+5b66.U+65e5U+672c

The software handling internationalized domain names will translate
this, according to the above specifications, before submitting it to
the DNS resolver, to:

    M0C5L831.N406.M771L927.LB66.M5E5M72C.i
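
The reverse direction, which a user interface or a zone file
"decompiler" would need, can be sketched as follows. The names
decode_label and decode_name are hypothetical, and the letter table is
again an assumption inferred from the example above. The sketch relies
on the fact that initial letters (G to V) and subsequent letters (0 to
9, A to F) are disjoint, so every initial letter marks a character
boundary.

```python
INITIAL = "GHIJKLMNOPQRSTUV"      # assumed: nibble values 0..15
HEX_DIGITS = "0123456789ABCDEF"   # subsequent letters


def _to_char(digits: str) -> str:
    # Every character is encoded with at least two target letters.
    if len(digits) < 2:
        raise ValueError("truncated character: " + digits)
    return chr(int(digits, 16))


def decode_label(encoded: str) -> str:
    """Decode one encoded label back into UCS characters."""
    chars = []
    digits = ""
    for letter in encoded.upper():    # DNS names are case-insensitive
        if letter in INITIAL:
            # An initial letter starts a new character.
            if digits:
                chars.append(_to_char(digits))
            digits = HEX_DIGITS[INITIAL.index(letter)]
        elif letter in HEX_DIGITS and digits:
            digits += letter
        else:
            raise ValueError("not an encoded label: " + encoded)
    if digits:
        chars.append(_to_char(digits))
    return "".join(chars)


def decode_name(name: str) -> str:
    """Decode a full name that ends in the .i zero-level domain."""
    labels = name.split(".")
    if labels[-1].lower() != "i":
        raise ValueError("name is not below the .i ZLD: " + name)
    return ".".join(decode_label(label) for label in labels[:-1])


if __name__ == "__main__":
    decoded = decode_name("M0C5L831.N406.M771L927.LB66.M5E5M72C.i")
    print(" ".join("U+%04X" % ord(c) for c in decoded if c != "."))
```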

3.4 Length Considerations

DNS allows for a maximum of 63 positions in each part, and for 255
positions for the overall domain name including dots. This allows up
to 15 ideographs, or up to 21 letters e.g. from the Hebrew or Arabic
alphabet, in a label. While this does not allow for the same margin
as in the case of ASCII domain names, it should still be quite
sufficient. [Problems could only surface for languages that use very
long words or terms and don't know any kind of abbreviations or
similar shortening devices. Do these exist? An Icelandic expert
asserted that Icelandic is not a problem.] DNS contains a compression
scheme that avoids sending the same trailing portion of a domain name
twice in the same transmission. Long domain names are therefore not
that much
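
The figures of 15 ideographs and 21 Hebrew or Arabic letters per label
follow from the number of target letters each character needs. The
following sketch reproduces the arithmetic; the helper names are
hypothetical and serve only this illustration.

```python
MAX_LABEL_LENGTH = 63  # DNS limit on positions per label


def letters_needed(code_point: int) -> int:
    """Target letters for one character: one per hexadecimal digit
    after dropping leading zeros, but never fewer than two."""
    return max(2, len(format(code_point, "X")))


def max_chars_per_label(code_point: int) -> int:
    """How many characters like this one fit into a single label."""
    return MAX_LABEL_LENGTH // letters_needed(code_point)


if __name__ == "__main__":
    print(max_chars_per_label(0x60C5))  # ideograph, 4 letters: 15
    print(max_chars_per_label(0x05D0))  # Hebrew alef, 3 letters: 21
    print(max_chars_per_label(0x0061))  # ASCII "a", 2 letters: 31
```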

4. Usage Considerations

4.1 General Usage

To implement this proposal, neither DNS servers nor resolvers need
changes. These programs will only deal with the encoded form of the
domain name with the .i suffix. Software that wants to offer an
internationalized user interface (for example a web browser) is
responsible for the necessary conversions. It will analyze the domain
name, call the resolver directly if the domain name conforms to the
domain name syntax restrictions, and otherwise encode the name
according to the specifications of Section 3.2 and append the .i
suffix before calling the resolver. New implementations of resolvers
will of course offer a companion function to gethostbyname accepting
an ISO 10646/Unicode string as input.
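
The decision described above (pass conforming names through, encode
everything else and append .i) might look as follows in application
software. This is a sketch only: to_resolver_name and
conforms_to_dns_syntax are hypothetical names, the syntax check is a
simplified reading of the label rules (letters, digits, inner
hyphens, at most 63 positions), and the letter table is an assumption
inferred from the examples in Section 3.3. The result would then be
handed to an ordinary resolver call such as gethostbyname.

```python
import re

INITIAL = "GHIJKLMNOPQRSTUV"  # assumed: nibble values 0..15

# One traditional label: letters and digits, hyphens only inside,
# at most 63 positions.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def encode_label(label: str) -> str:
    out = []
    for ch in label:
        nibbles = format(ord(ch), "X").rjust(2, "0")
        out.append(INITIAL[int(nibbles[0], 16)] + nibbles[1:])
    return "".join(out)


def conforms_to_dns_syntax(name: str) -> bool:
    return all(_LABEL.fullmatch(label) for label in name.split("."))


def to_resolver_name(name: str) -> str:
    """Return the name to submit to an unchanged DNS resolver."""
    if conforms_to_dns_syntax(name):
        return name  # traditional name: call the resolver directly
    # Otherwise encode every label and append the zero-level domain.
    return ".".join(encode_label(l) for l in name.split(".")) + ".i"


if __name__ == "__main__":
    print(to_resolver_name("www.w3.org"))
    print(to_resolver_name("caf\u00e9.fr"))  # M3M1M6U9.M6N2.i
```

Note that all labels of a non-conforming name are encoded, including
ASCII ones such as "fr", which is why a syntactically parallel M6N2.i
domain is discussed in Section 4.3.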

For domain name administrators, the main tool that will be needed is
a program to compile files configuring zones from a UTF-8 notation
(or any other suitable encoding) to the encoding described in Section
3.2. Utility tools will include a corresponding decompiler, checkers
for various kinds of internationalization-related errors, and tools
for managing syntactic parallelism (see Section 4.3).

4.2 Usage Restrictions

While this proposal in theory allows control characters such
as BEL or NUL or symbols such as arrows and smileys in domain names,
such characters should clearly be excluded from domain names. Whether
this has to be explicitly specified or whether the difficulty of
typing these characters on any keyboard of the world will limit their
use has to be discussed. One approach is to start with a very
restricted subset and gradually relax it; the other is to allow almost
anything and to rely on common sense. In any case, such specifications
should go into a separate document to allow easy updates.

A related point is the question of equivalence. For historical
reasons, ISO 10646/Unicode contains a considerable number of
compatibility characters and allows more than one representation for
characters with diacritics. To guarantee smooth interoperability in
these and related cases, additional restrictions or the definition of
some form of normalization seem necessary. However, this is a general
problem affecting all areas where ISO 10646/Unicode is used in
identifiers, and should therefore be addressed in a generic way. See
[iNORM] for details.
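
The equivalence problem can be made concrete with a small sketch. It
is an illustration only: this document does not select a
normalization, so the Unicode Normalization Form C offered by Python's
unicodedata module is used purely as a stand-in, and the letter table
is an assumption inferred from the examples in Section 3.3.

```python
import unicodedata

INITIAL = "GHIJKLMNOPQRSTUV"  # assumed: nibble values 0..15


def encode_label(label: str) -> str:
    out = []
    for ch in label:
        nibbles = format(ord(ch), "X").rjust(2, "0")
        out.append(INITIAL[int(nibbles[0], 16)] + nibbles[1:])
    return "".join(out)


# Two representations of the same visible label: a precomposed
# e-acute (U+00E9) versus "e" followed by a combining acute (U+0301).
composed = "caf\u00e9"
decomposed = "cafe\u0301"

if __name__ == "__main__":
    # Without normalization the two spellings give different names.
    print(encode_label(composed))    # M3M1M6U9
    print(encode_label(decomposed))  # M3M1M6M5J01
    # Normalizing first (NFC is an assumption here) makes them agree.
    same = unicodedata.normalize("NFC", decomposed) == composed
    print(same)                      # True
```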

Equally related is the problem of case equivalence. Users can very
well distinguish between upper case and lower case. Also, casing in
an i18n context is not as straightforward as for ASCII, so that case
equivalence is best avoided. Problems therefore result not from the
fact that case is distinguished for i18n domain names, but from the
fact that existing domain names do not distinguish case. Where it is
impossible to distinguish between next.com and NeXT.com, the same two
subdomains would easily be distinguishable if subordinate to an i18n
domain. There are several possible solutions. One is to try to
gradually migrate from a case-insensitive solution to a case-sensitive
solution even for ASCII. Another is to allow case-sensitivity only
beyond ASCII. Another is to restrict anything beyond ASCII to
lowercase only (lowercase distinguishes better than uppercase, and is
also generally used for ASCII domain names).

A problem that also has to be discussed and solved is
bidirectionality. Arabic and Hebrew characters are written
right-to-left, and the mixture with other characters results in a
divergence between logical and graphical sequence. See [HTML-I18N] for
more explanations. The proposal of [Yer96] for dealing with
bidirectionality in URLs could probably be applied to domain names. In
any case, there should be a general solution for identifiers, not a
DNS-specific solution.

4.3 Domain Name Creation

The ".i" ZLD should be created as such to allow the
internationalization of domain names. Rules for creating subdomains
inside ".i" should follow the established rules for the creation of
functionally equivalent domains in the existing domain hierarchy, and
should

For the actual domain hierarchy, the amount of parallelism between
the current ASCII-oriented hierarchy and some internationalized
hierarchy depends on various factors. In some cases, two fully parallel
hierarchies may emerge. In other cases, if more than one script or
language is used locally, more than two parallel hierarchies may
emerge. Some nodes, e.g. in intranets, may only appear in an i18n
hierarchy, whereas others may only appear in the current hierarchy.
In some cases, the peculiarities of scripts, languages, cultures, and
the local marketplace may lead to completely different hierarchies.

Also, one has to be aware that there may be several kinds of
parallelism. The first one is called syntactic parallelism. If there is
a domain XXXX.yy.zz and a domain vvvv.yy.zz, then the domain yy.zz
will have to exist both in the traditional DNS hierarchy and
within the hierarchy starting at the .i ZLD, with appropriate
encoding.

The second type of parallelism is called transcription parallelism.
It results from transcribing or transliterating between ASCII
domain names and domain names in other scripts.

The third type of parallelism is called semantic parallelism. It
results from translating elements of a domain name from one language
to another, possibly also changing the script or set of used
characters.

On the host level, parallelism means that there are two names for the
same host. Conventions should exist to decide whether the parallel
names should have separate IP addresses or not (A record or CNAME
record). With separate IP addresses, address to name lookup is easy,
otherwise it needs special precautions to be able to find all names
corresponding to a given host address. Another detail entering this
consideration is that MX records only work for hostnames/domains,
not for CNAME aliases. This at least has the consequence that alias
resolution for internationalized mail addresses has to occur before

When discussing and applying the rules for creating domain names,
some peculiarities of i18n domain names should be carefully
considered:

- Depending on the script, reasonable lengths for domain name parts
  may differ greatly. For ideographic scripts, a part may often be
  only a one-letter code. Established rules for lengths may need
  adaptation. For example, a rule for country TLDs could read: one
  ideographic character or two other characters.

- If the number of generic TLDs (.com, .edu, .org, .net) is kept
  low, then it may be feasible to restrict i18n TLDs to country

- There are no ISO 3166 [ISO3166] two-letter codes in scripts other
  than Latin. I18n domain names for countries will have to be
  designed from scratch.

- The names of some countries or regions may pose greater political
  problems when expressed in the native script than when expressed
  in 2-letter ISO 3166 codes.

- I18n country domain names should in principle only be created in
  those scripts that are used locally. There is probably little use
  in creating an Arabic domain name for China, for example.

- In those cases where domain names are open to a wide range of
  applicants, a special procedure for accepting applications should
  be used so that a reasonable-quality fit between ASCII domain
  names and i18n domain names results where desired. This would
  probably be done by establishing a period of about a month for
  applications inside an i18n domain newly created as a parallel for
  an existing domain, and resolving the detected conflicts. For
  syntactically parallel domain names, the owners should always be
  the same. Administration may be split in some cases to account for
  the necessary linguistic knowledge. For domain names with
  transcription parallelism and semantic parallelism, the question of
  owner identity should depend on the real-life situation (trade-

- It will be desirable to have internationalized subdomains in
  non-internationalized TLDs. As an example, many companies in France
  may want to register an accented version of their company name,
  while remaining under the .fr TLD. For this, .fr would have to be
  reregistered as .M6N2.i. Accented and other internationalized
  subdomains would go below .M6N2.i, whereas unaccented ones would go
  below .fr in its plain form.

- To generalize the above case, one may need to create a requirement
  that any domain name registry would have to register and manage
  syntactically parallel domain names below the .i ZLD upon request
  to allow registration of i18n domain names in arbitrary
  subdomains. An alternative to this is to organize domain name search
  so that e.g. in a search for XXXXXX.fr, if M6N2.i is not found in
  .i, the name server for .fr is queried for XXXXXX.M6N2.i (with
  XXXXXX appropriately encoded). This convention would allow
  lower-level domains to introduce internationalized subdomains without
  depending on higher-level domains.

4.4 Usage in URLs

According to current definitions, URLs encode sequences of octets
into a sequence of characters from a character set that is almost as
limited as the character set of domain names [RFC1738]. This is
clearly not satisfactory for i18n.

Internationalizing URLs, i.e. assigning character semantics to the
encoded octets, can either be done separately for each part and/or
scheme, or in a uniform way. Doing it separately has the serious
disadvantage that software providing user interfaces for URLs in
general would have to know about all the different i18n solutions of
the different parts and schemes. Many of these solutions may not even be

It is therefore definitely more advantageous to decide on a single
and consistent solution for URL internationalization. The most
valuable candidate [Yer96], for many reasons, is UTF-8 [RFC2044], an
ASCII-compatible encoding of UCS4.

Therefore, a URL containing the domain name of the example of
Section 3.3 should not be written as:

    ftp://M0C5L831.N406.M771L927.LB66.M5E5M72C.i

(although this will also work) but rather

    ftp://%e6%83%85%e5%a0%b1.%e7%90%86.%e6%9d%b1%e5%a4%a7.
    %e5%ad%a6.%e6%97%a5%e6%9c%ac

In this canonical form, the trailing .i is absent, and the octets can
be reconstructed from the %HH-encoding and interpreted as UTF-8 by
generic URL software. The software part dealing with domain names
will carry out the conversion to the .i form.
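
The conversion from the canonical URL form to the .i form can be
sketched with Python's standard urllib.parse module. The function name
url_host_to_resolver_name is hypothetical, the letter table is an
assumption inferred from the example in Section 3.3, and hosts that
are pure ASCII after decoding are simply passed through unchanged.

```python
from urllib.parse import unquote, urlsplit

INITIAL = "GHIJKLMNOPQRSTUV"  # assumed: nibble values 0..15


def encode_label(label: str) -> str:
    out = []
    for ch in label:
        nibbles = format(ord(ch), "X").rjust(2, "0")
        out.append(INITIAL[int(nibbles[0], 16)] + nibbles[1:])
    return "".join(out)


def url_host_to_resolver_name(url: str) -> str:
    """Extract the host of a URL and return its DNS-level form."""
    host = urlsplit(url).hostname or ""
    # Generic URL step: rebuild the octets from %HH and read them
    # as UTF-8.
    decoded = unquote(host, encoding="utf-8", errors="strict")
    if decoded.isascii():
        return decoded  # traditional host name, nothing to encode
    # Domain-name step: encode each label and append the ZLD.
    labels = decoded.split(".")
    return ".".join(encode_label(label) for label in labels) + ".i"


if __name__ == "__main__":
    url = ("ftp://%e6%83%85%e5%a0%b1.%e7%90%86.%e6%9d%b1%e5%a4%a7."
           "%e5%ad%a6.%e6%97%a5%e6%9c%ac")
    print(url_host_to_resolver_name(url))
    # M0C5L831.N406.M771L927.LB66.M5E5M72C.i
```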

5. Alternate Proposals

5.1 The Dillon Proposal

The proposal of Michael Dillon [Dillon96] is also based on encoding
Unicode into the limited character set of domain names. The distinction
is made for each part, using the hyphen in initial position. Because
this does not fully conform to the syntax of existing domain names,
it is questionable whether it is backwards-compatible. On the other
hand, this has the advantage that local i18n domain names can be
installed easily without cooperation by the manager of the
superdomain.

A variable-length scheme with base 36 is used that can encode up to
1610 characters, which is absolutely insufficient for Chinese or
Japanese. Characters assumed not to be used in i18n domain names are
excluded, i.e. only one case is allowed for basic Latin characters.
This means that large tables have to be worked out carefully to
convert between ISO 10646/Unicode and the actual number that is
encoded with base 36.

5.2 Using a Separate Lookup Service

Instead of using a special encoding and burdening DNS with i18n, one
could build and use a separate lookup service for i18n domain names.
Instead of converting to UCS4 and encoding according to Section 3.2,
and then calling the DNS resolver, a program would contact this new
service when seeing a domain name with characters outside the allowed

Such solutions have various problems. There are many directory
services and proposals for how to use them in a way similar to DNS. For
an overview and a specific proposal, see [Kle96]. However, while
there are many proposals, a real service containing the necessary
data and providing the wide installed base and distributed updating
as in DNS does not exist.

Most directory service proposals also do not offer uniqueness.
Defining unique names again for a separate service will duplicate
much of the work done for DNS. If uniqueness is not guaranteed, the
user is burdened with additional selection steps.

Using a separate lookup service for the internationalization of
domain names also results in more complex implementations than the
proposal made in this draft. Contrary to what some people might
expect, the use of a separate lookup service also does not solve a
capacity problem with DNS, because there is no such problem, nor will
one be created with the introduction of i18n domain names.

6. Generic Considerations

6.1 Security Considerations

This proposal is believed not to raise any other security
considerations than the current use of the domain name system.

6.2 Internationalization Considerations

This proposal addresses internationalization as such. The main
additional consideration with respect to internationalization may be
the indication of language. However, for concise identifiers such as
domain names, language tagging would be too much of a burden and
would create complex dependencies with semantics.

    NOTE -- This section is introduced based on a recommendation
    in [RFCIAB]. A similar section addressing internationalization
    should be included in all application level
    internet drafts and RFCs.

Acknowledgements

I am grateful in particular to the following persons for their advice
or criticism: Bert Bos, Lori Brownell, Michael Dillon, Donald E.
Eastlake 3rd, David Goldsmith, Larry Masinter, Ryan Moats, Keith
Moore, Thorvardur Kari Olafson, Erik van der Poel, Jurgen Schwertl,
Paul A. Vixie, Francois Yergeau, and others.

Bibliography

[ASCII]     Coded Character Set -- 7-Bit American Standard Code
            for Information Interchange, ANSI X3.4-1986.

[Dillon96]  M. Dillon, "Multilingual Domain Names", Memra Software
            Inc., November 1996 (circulated Dec. 6, 1996 on iahc-

[HTML-I18N] F. Yergeau, G. Nicol, G. Adams, and M. Duerst,
            "Internationalization of the Hypertext Markup Language",
            Work in progress (draft-ietf-html-i18n-05.txt), August

[iNORM]     M. Duerst, "Normalization of Internationalized
            Identifiers", draft-duerst-i18n-norm-00.txt, July 1997.

[ISO3166]   ISO 3166, "Code for the representation of names of
            countries", ISO 3166:1993.

[ISO10646]  ISO/IEC 10646-1:1993. International standard --
            Information technology -- Universal multiple-octet coded
            character Set (UCS) -- Part 1: Architecture and basic
            multilingual plane.

[Kle96]     J. Klensin and T. Wolf, Jr., "Domain Names and Company
            Name Retrieval", Work in progress (draft-klensin-tld-
            whois-01.txt), November 1996.

[RFC1034]   P. Mockapetris, "Domain Names - Concepts and
            Facilities", ISI, Nov. 1987.

[RFC1035]   P. Mockapetris, "Domain Names - Implementation and
            Specification", ISI, Nov. 1987.

[RFC1522]   K. Moore, "MIME (Multipurpose Internet Mail
            Extensions) Part Two: Message Header Extensions for
            Non-ASCII Text", University of Tennessee, September 1993.

[RFC1530]   C. Malamud and M. Rose, "Principles of Operation for
            the TPC.INT Subdomain: General Principles and Policy",
            Internet Multicasting Service, October 1993.

[RFC1642]   D. Goldsmith, M. Davis, "UTF-7: A Mail-safe
            Transformation Format of Unicode", Taligent Inc., July 1994.

[RFC1738]   T. Berners-Lee, L. Masinter, and M. McCahill,
            "Uniform Resource Locators (URL)", CERN, Dec. 1994.

[RFC2044]   F. Yergeau, "UTF-8, A Transformation Format of Unicode
            and ISO 10646", Alis Technologies, October 1996.

[RFCIAB]    C. Weider, C. Preston, K. Simonsen, H. Alvestrand, R.
            Atkinson, M. Crispin, P. Svanberg, "Report from the
            IAB Character Set Workshop", October 1996 (currently
            available as draft-weider-iab-char-wrkshop-00.txt).

[Unicode]   The Unicode Consortium, "The Unicode Standard, Version
            2.0", Addison-Wesley, Reading, MA, 1996.

[Yer96]     F. Yergeau, "Internationalization of URLs", Alis
            Technologies,
            <http://www.alis.com:8085/~yergeau/url-00.html>.

Author's Address

World Wide Web Consortium
Keio Research Institute at SFC

Tel: +81 466 49 11 70
E-mail: mduerst@w3.org

NOTE -- Please write the author's name with u-Umlaut wherever
possible, e.g. in HTML as D&uuml;rst.