.\" $NetBSD: nls.7,v 1.3 2003/04/14 06:47:12 gmcgarry Exp $
|
|
.\"
|
|
.\" Copyright (c) 2003 The NetBSD Foundation, Inc.
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" This code is derived from software contributed to The NetBSD Foundation
|
|
.\" by Gregory McGarry.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\" 3. All advertising materials mentioning features or use of this software
|
|
.\" must display the following acknowledgement:
|
|
.\" This product includes software developed by the NetBSD
|
|
.\" Foundation, Inc. and its contributors.
|
|
.\" 4. Neither the name of The NetBSD Foundation nor the names of its
|
|
.\" contributors may be used to endorse or promote products derived
|
|
.\" from this software without specific prior written permission.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
|
|
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
|
|
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
|
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
|
|
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
|
|
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
|
|
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
|
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
|
|
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
|
|
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
|
|
.\" POSSIBILITY OF SUCH DAMAGE.
|
|
.\"
|
|
.Dd February 12, 2003
.Dt NLS 7
.Os
.Sh NAME
.Nm NLS
.Nd National Language Support Overview
.Sh DESCRIPTION
National Language Support (NLS) provides commands for a single
worldwide operating system base.
An internationalized system has no built-in assumptions or dependencies
on language-specific or culture-specific conventions such as:
.Pp
.Bl -bullet -offset indent -compact
.It
Character classifications
.It
Character comparison rules
.It
Character collation order
.It
Numeric and monetary formatting
.It
Date and time formatting
.It
Message-text language
.It
Code sets
.El
.Pp
All information pertaining to cultural conventions and language is
obtained at program run time.
.Pp
.Dq Internationalization
(often abbreviated
.Dq i18n )
refers to the process by which system software is developed to support
multiple culture-specific and language-specific conventions.
This is a generalization process by which the system is freed from
relying only on English strings or other English-specific conventions.
.Dq Localization
(often abbreviated
.Dq l10n )
refers to the operations by which the user environment is customized to
handle its input and output in a manner appropriate for specific language
and cultural conventions.
This is a specialization process, by which generic methods already
implemented in an internationalized system are used in specific ways.
The formal description of cultural conventions for some country, together
with all associated translations targeted to the native language, is
called a
.Dq locale .
.Pp
.Nx
provides extensive support to programmers and system developers to
enable internationalized software to be developed.
.Nx
also supplies a large variety of locales for system localization.
.Ss Localization of Information
All locale information is accessible to programs at run time so that
data is processed and displayed correctly for specific cultural
conventions and language.
.Pp
A locale is divided into categories.
A category is a group of language-specific and culture-specific conventions
as outlined in the list above.
ISO C and POSIX together specify the following six standard categories
supported by
.Nx :
.Pp
.Bl -tag -compact -width LC_MONETARYXX
.It LC_COLLATE
string-collation order information
.It LC_CTYPE
character classification, case conversion, and other character attributes
.It LC_MESSAGES
the format for affirmative and negative responses
.It LC_MONETARY
rules and symbols for formatting monetary numeric information
.It LC_NUMERIC
rules and symbols for formatting nonmonetary numeric information
.It LC_TIME
rules and symbols for formatting time and date information
.El
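.Pp
As a rough illustration of how a program might inspect these categories,
the following C sketch maps each category name to its constant from
<locale.h> and asks
.Xr setlocale 3
which locale is currently in effect for it.
The table and the helper name nls_current_locale are hypothetical and
exist only for this example.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* One row per locale category listed above (hypothetical helper table). */
struct nls_category {
	const char *name;	/* category name, as also used in the environment */
	int value;		/* matching constant from <locale.h> */
};

static const struct nls_category nls_categories[] = {
	{ "LC_COLLATE",  LC_COLLATE  },
	{ "LC_CTYPE",    LC_CTYPE    },
	{ "LC_MESSAGES", LC_MESSAGES },
	{ "LC_MONETARY", LC_MONETARY },
	{ "LC_NUMERIC",  LC_NUMERIC  },
	{ "LC_TIME",     LC_TIME     },
};

#define NLS_NCATEGORIES (sizeof(nls_categories) / sizeof(nls_categories[0]))

/*
 * Return the name of the locale currently in effect for the named
 * category, or NULL if the category name is not in the table.
 * Passing a NULL locale argument to setlocale() only queries the
 * current setting; it does not change it.
 */
const char *
nls_current_locale(const char *category_name)
{
	for (size_t i = 0; i < NLS_NCATEGORIES; i++) {
		if (strcmp(nls_categories[i].name, category_name) == 0)
			return setlocale(nls_categories[i].value, NULL);
	}
	return NULL;
}
```

A program that has not yet called
.Xr setlocale 3
to change its locale runs in the C locale, so each category would be
expected to report that locale until the program selects another one.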
.Pp
Localization of the system is achieved by setting appropriate values
in environment variables to identify which locale should be used.
The environment variables have the same names as their respective
locale categories.
Additionally, the
.Ev LANG ,
.Ev LC_ALL ,
and
.Ev NLSPATH
environment variables are used.
The
.Ev NLSPATH
environment variable specifies a colon-separated list of directory names
where the message catalog files of the NLS database are located.
The
.Ev LC_ALL
and
.Ev LANG
environment variables also determine the current locale.
.Pp
The value of each locale-selecting environment variable is a string of
the form:
.Pp
.Bd -literal
language[_territory][.codeset][@modifier]
.Ed
.Pp
For example, the locale for the Danish language spoken in Denmark
using the ISO8859-1 code set is da_DK.ISO8859-1.
The da stands for the Danish language and the DK stands for Denmark.
The short form of da_DK is sufficient to indicate this locale.
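.Pp
The locale name format can be made concrete with a small parser.
The following C sketch splits a name such as da_DK.ISO8859-1 into its
four parts.
The structure, the function name parse_locale_name, and the fixed field
size are assumptions made for this example; the system's own locale code
is not claimed to work this way.

```c
#include <string.h>

#define NLS_PART_MAX 32	/* arbitrary field size chosen for this sketch */

struct locale_name {
	char language[NLS_PART_MAX];
	char territory[NLS_PART_MAX];
	char codeset[NLS_PART_MAX];
	char modifier[NLS_PART_MAX];
};

/* Copy len bytes of src into one field; fail if it does not fit. */
static int
copy_part(char *dst, const char *src, size_t len)
{
	if (len >= NLS_PART_MAX)
		return -1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return 0;
}

/*
 * Split "language[_territory][.codeset][@modifier]" into its parts.
 * Optional parts that are absent are left as empty strings.
 * Returns 0 on success, -1 if the language is missing or a part is
 * too long for its field.
 */
int
parse_locale_name(const char *name, struct locale_name *out)
{
	size_t len;

	memset(out, 0, sizeof(*out));

	/* language: everything up to the first '_', '.' or '@' */
	len = strcspn(name, "_.@");
	if (len == 0 || copy_part(out->language, name, len) != 0)
		return -1;
	name += len;

	if (*name == '_') {		/* optional territory */
		name++;
		len = strcspn(name, ".@");
		if (copy_part(out->territory, name, len) != 0)
			return -1;
		name += len;
	}
	if (*name == '.') {		/* optional code set */
		name++;
		len = strcspn(name, "@");
		if (copy_part(out->codeset, name, len) != 0)
			return -1;
		name += len;
	}
	if (*name == '@') {		/* optional modifier: rest of the string */
		name++;
		if (copy_part(out->modifier, name, strlen(name)) != 0)
			return -1;
	}
	return 0;
}
```

Given da_DK.ISO8859-1, the sketch fills in da as the language, DK as the
territory, and ISO8859-1 as the code set, and leaves the modifier empty.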
.Pp
The environment variable settings are queried by their priority level
in the following manner:
.Pp
.Bl -bullet
.It
If the
.Ev LC_ALL
environment variable is set, all six categories use the locale it
specifies.
.It
If the
.Ev LC_ALL
environment variable is not set, each individual category uses the
locale specified by its corresponding environment variable.
.It
If the
.Ev LC_ALL
environment variable is not set, and a value for a particular
.Ev LC_*
environment variable is not set, the value of the
.Ev LANG
environment variable specifies the default locale for all categories.
Only the
.Ev LANG
environment variable should be set in
.Pa /etc/profile ,
since this makes it easiest for the user to override the system default
using the individual
.Ev LC_*
variables.
.It
If the
.Ev LC_ALL
environment variable is not set, a value for a particular
.Ev LC_*
environment variable is not set, and the value of the
.Ev LANG
environment variable is not set, the locale for that specific
category defaults to the C locale.
The C or POSIX locale assumes the 7-bit ASCII character set and defines
information for the six categories.
.El
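.Pp
The lookup order described in the list above can be sketched in C as
follows.
The function name nls_env_locale is hypothetical.
Treating a variable with an empty value like an unset one is an
assumption taken from POSIX; the list above speaks only of variables
being set or not set.

```c
#include <stdlib.h>

/*
 * Return the locale name that the environment selects for one
 * category, given the category's variable name (for example "LC_TIME").
 * Priority: LC_ALL, then the category's own variable, then LANG,
 * and finally the C locale.
 * Assumption: an empty value counts as "not set".
 */
const char *
nls_env_locale(const char *category_name)
{
	const char *v;

	if ((v = getenv("LC_ALL")) != NULL && *v != '\0')
		return v;
	if ((v = getenv(category_name)) != NULL && *v != '\0')
		return v;
	if ((v = getenv("LANG")) != NULL && *v != '\0')
		return v;
	return "C";
}
```

Programs normally do not need such a helper, because calling
.Xr setlocale 3
with an empty locale string makes the library consult the environment
itself.
The sketch only makes the priority order visible.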
.Ss Code Sets
A character is any symbol used for the organization, control, or
representation of data.
A group of such symbols used to describe a
particular language makes up a character set.
A code set contains the encoding values (conversion from bits to
displayed characters) for a character set.
It is the encoding values in a code set that provide
the interface between the system and its input and output devices.
.Pp
The following code sets are supported in
.Nx :
.Bl -tag -width ISO8859_family
.It ISO8859 family
Industry-standard code sets are provided by means of the ISO8859
family of code sets, which provide a range of single-byte code set
support that includes Latin-1, Latin-2, Arabic, Cyrillic, Hebrew,
Greek, and Turkish.
.It eucJP
The eucJP code set is the industry-standard code set used to support
the Japanese locale.
.It Unicode
A Unicode environment based on the UTF-8 code set is supported for all
supported languages and territories.
UTF-8 provides character support for most of the major languages of the
world and can be used in environments where multiple languages must be
processed simultaneously.
.El
.Ss Internationalization for Programmers
To facilitate translations of messages into various languages and to
make the translated messages available to the program based on a
user's locale, it is necessary to keep messages separate from the
programs and provide them in the form of message catalogs that a
program can access at run time.
.Pp
Access to locale information is provided through the
.Xr setlocale 3
and
.Xr nl_langinfo 3
interfaces.
See their respective man pages for further information.
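.Pp
A minimal sketch of how these two interfaces are commonly combined is
shown below.
It adopts the locale selected by the environment and then asks for the
name of the code set in use.
The function name nls_init_codeset is hypothetical, and the explicit
fallback to the C locale is a choice made for this example.

```c
#include <langinfo.h>
#include <locale.h>

/*
 * Adopt the locale selected by the environment and return the name
 * of the code set it uses.
 * setlocale() returns NULL when the requested locale cannot be
 * honored; in that case this sketch selects the C locale explicitly.
 */
const char *
nls_init_codeset(void)
{
	if (setlocale(LC_ALL, "") == NULL)
		(void)setlocale(LC_ALL, "C");

	/* CODESET is one of the items defined in <langinfo.h>. */
	return nl_langinfo(CODESET);
}
```

The returned string may be overwritten by later calls to
.Xr nl_langinfo 3
or
.Xr setlocale 3 ,
so a caller that needs it for longer should copy it.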
.Pp
Message source files containing application messages are created by
the programmer and converted to message catalogs.
These catalogs are used by the application to retrieve and display
messages, as needed.
.Pp
.Nx
supports two message catalog interfaces: the X/Open
.Xr catgets 3
interface and the Uniforum
.Xr gettext 3
interface.
The
.Xr catgets 3
interface has the advantage that it belongs to a standard which is
well supported.
Unfortunately, the interface is complicated to use and
maintenance of the catalogs is difficult.
The implementation also does not support different code sets.
The
.Xr gettext 3
interface has not been standardized yet; however, it is being supported
by an increasing number of systems.
It also provides many additional tools which make programming and
catalog maintenance much easier.
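.Pp
To give a feel for the X/Open interface, the following C sketch wraps
catopen, catgets, and catclose from <nl_types.h> so that a missing
catalog or message falls back to a built-in string.
The wrapper structure and the nls_catalog_* names are hypothetical, and
the catalog name, set number, and message number passed by a caller
would be application specific.

```c
#include <nl_types.h>

/* Hypothetical wrapper that remembers whether the catalog opened. */
struct nls_catalog {
	nl_catd catd;
	int is_open;
};

/* Try to open a message catalog; failure is recorded, not fatal. */
void
nls_catalog_open(struct nls_catalog *c, const char *name)
{
	c->catd = catopen(name, NL_CAT_LOCALE);
	c->is_open = (c->catd != (nl_catd)-1);
}

/*
 * Return message msg_id from set set_id, or the fallback string if
 * the catalog is not open or the message is not found.
 * catgets() itself returns its last argument when the lookup fails.
 */
const char *
nls_catalog_get(const struct nls_catalog *c, int set_id, int msg_id,
    const char *fallback)
{
	if (!c->is_open)
		return fallback;
	return catgets(c->catd, set_id, msg_id, fallback);
}

/* Close the catalog if it was opened. */
void
nls_catalog_close(struct nls_catalog *c)
{
	if (c->is_open) {
		(void)catclose(c->catd);
		c->is_open = 0;
	}
}
```

Strings returned from an open catalog should be treated as valid only
until the catalog is closed, which is why the sketch keeps the catalog
open between lookups instead of opening and closing it for each message.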
.Sh SEE ALSO
.Xr gencat 1 ,
.Xr catgets 3 ,
.Xr gettext 3 ,
.Xr nl_langinfo 3 ,
.Xr setlocale 3
.Sh BUGS
This man page is incomplete.