l10n.lang.value.rfc.compliant — Make value of lang attribute RFC compliant?


<xsl:param name="l10n.lang.value.rfc.compliant" select="1"></xsl:param>


If non-zero, ensure that the values for all lang attributes in HTML output are RFC compliant[2] by taking any underscore characters in any lang values found in source documents and replacing them with hyphen characters in output HTML files. For example, zh_CN in a source document becomes zh-CN in the HTML output from that source.
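The replacement described above can be illustrated with the XPath 1.0 translate() function. The following is a minimal standalone sketch, not the code the DocBook stylesheets themselves use: the template name example.lang.value is hypothetical, and the stylesheet simply copies its input while rewriting lang attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Same name and default as the parameter documented on this page -->
  <xsl:param name="l10n.lang.value.rfc.compliant" select="1"/>

  <!-- Hypothetical helper: returns the lang value, with each underscore
       replaced by a hyphen when the parameter is non-zero -->
  <xsl:template name="example.lang.value">
    <xsl:param name="lang"/>
    <xsl:choose>
      <xsl:when test="$l10n.lang.value.rfc.compliant != 0">
        <xsl:value-of select="translate($lang, '_', '-')"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$lang"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Rewrite every lang attribute through the helper -->
  <xsl:template match="@lang">
    <xsl:attribute name="lang">
      <xsl:call-template name="example.lang.value">
        <xsl:with-param name="lang" select="."/>
      </xsl:call-template>
    </xsl:attribute>
  </xsl:template>

  <!-- Identity transform for everything else -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to an input element carrying lang="zh_CN", this sketch writes the attribute as lang="zh-CN". Because translate() only maps the underscore character, the case of the value is left as it was in the source.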


This parameter does not cause any case change in lang values, because RFC 1766 explicitly states that all "language tags" (as it calls them) "are to be treated as case insensitive".
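To leave lang values exactly as they appear in the source, the parameter can be set to zero. A minimal customization-layer sketch follows; the import URL is an assumption about where the stock HTML stylesheet lives and should be adjusted to match the local installation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Assumed location of the stock DocBook HTML stylesheet -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>

  <!-- 0 turns off the underscore-to-hyphen replacement in lang values -->
  <xsl:param name="l10n.lang.value.rfc.compliant" select="0"/>

</xsl:stylesheet>
```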

[2] Section 8.1.1, Language Codes, in the HTML 4.0 Recommendation states that:

[RFC1766] defines and explains the language codes that must be used in HTML documents.

Briefly, language codes consist of a primary code and a possibly empty series of subcodes:

language-code = primary-code ( "-" subcode )*

And in RFC 1766, Tags for the Identification of Languages, the EBNF for "language tag" is given as:

Language-Tag = Primary-tag *( "-" Subtag )
Primary-tag = 1*8ALPHA
Subtag = 1*8ALPHA