Character set policy BOF, 39th IETF
Tuesday 0900-1000
Written by Roland Hedberg

Summary:

Harald presented the prospective IETF policy for the handling of character sets, which briefly consists of the following: all information in the form of strings for human consumption that is transported using IETF protocols must have its character set and language declared. The default character set should be ISO 10646 (Unicode), with UTF-8 as the transport encoding. Language tags according to RFC 1766 should be used.

A short discussion followed, which made it quite obvious that there was rough consensus among the people in the room that this was a good thing.

Minutes:

Chair: Harald Tveit Alvestrand, Harald.T.Alvestrand@uninett.no

After a run-through of the reasoning behind the IETF adopting a character set policy, some of the points were discussed.

It was concluded that the policy should not deal with glyphs, since that is the business of the application client; neither should the IETF deal with problems inside ISO 10646.

Undefined issues that should be dealt with within the IETF include character set registration and how to define comparison between strings.

It was concluded that normalization is a very hard thing to do; it is really a research topic. So is ordering, since it is language dependent. Therefore we should initially deal only with comparison between strings.

Further, a proposal was made that protocol element names should be in ASCII as long as we do not have rules for name comparison.

Regarding language tags, we do not know which language tags we will need, but we do need one tag with the meaning "the language is unknown". It was also discussed whether we should look to either ISO or the Unicode Consortium for maintenance of the language tags.

A straw poll among the people present in the room showed that there was rough consensus about the four bullets in Harald's proposal.
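
As an illustration only, the labelling rule summarized above (human-readable strings carry a declared character set and language, with UTF-8 as the default encoding) might be sketched as follows. This is not part of the proposal presented at the BOF: the names LabeledText, label and read are hypothetical, and the "x-unknown" tag is merely a placeholder built on the private-use "x-" prefix of RFC 1766, since the minutes record only that some tag meaning "the language is unknown" is needed, not which one.

```python
from dataclasses import dataclass

# Hypothetical placeholder for "the language is unknown"; the minutes do not
# say which tag should carry this meaning.
UNKNOWN_LANGUAGE = "x-unknown"


@dataclass(frozen=True)
class LabeledText:
    """Octets of a human-readable string plus its declared charset and language."""
    octets: bytes
    charset: str = "UTF-8"              # default transport encoding in the policy
    language: str = UNKNOWN_LANGUAGE    # RFC 1766 style language tag


def label(text: str, language: str = UNKNOWN_LANGUAGE) -> LabeledText:
    """Encode text as UTF-8 and attach the charset and language declarations."""
    return LabeledText(octets=text.encode("utf-8"), language=language)


def read(item: LabeledText) -> str:
    """Decode the octets using the charset the sender declared."""
    return item.octets.decode(item.charset)


# "no" is the ISO 639 two-letter code for Norwegian, usable as an RFC 1766 tag.
greeting = label("Blåbærsyltetøy", language="no")
print(greeting.charset, greeting.language, read(greeting))
```

A receiver that finds no language declared by the sender would, in this sketch, see the placeholder tag rather than having to guess the language.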