This is Info file gettext.info, produced by Makeinfo version 1.68 from
the input file gettext.texi.

INFO-DIR-SECTION GNU Gettext Utilities
START-INFO-DIR-ENTRY
* Gettext: (gettext).                          GNU gettext utilities.
* gettextize: (gettext)gettextize Invocation.  Prepare a package for gettext.
* msgfmt: (gettext)msgfmt Invocation.          Make MO files out of PO files.
* msgmerge: (gettext)msgmerge Invocation.      Update two PO files into one.
* xgettext: (gettext)xgettext Invocation.      Extract strings into a PO file.
END-INFO-DIR-ENTRY

   This file provides documentation for GNU `gettext' utilities. It
also serves as a reference for the free Translation Project.

   Copyright (C) 1995, 1996, 1997 Free Software Foundation, Inc.

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of a
permission notice identical to this one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.
|
||
|
||
|
||
File: gettext.info, Node: Top, Next: Introduction, Prev: (dir), Up: (dir)

GNU `gettext' utilities
***********************

* Menu:

* Introduction::            Introduction
* Basics::                  PO Files and PO Mode Basics
* Sources::                 Preparing Program Sources
* Initial::                 Making the Initial PO File
* Updating::                Updating Existing PO Files
* Binaries::                Producing Binary MO Files
* Users::                   The User's View
* Programmers::             The Programmer's View
* Translators::             The Translator's View
* Maintainers::             The Maintainer's View
* Conclusion::              Concluding Remarks

* Country Codes::           ISO 639 language codes

 -- The Detailed Node Listing --

Introduction

* Why::                     The Purpose of GNU `gettext'
* Concepts::                I18n, L10n, and Such
* Aspects::                 Aspects in Native Language Support
* Files::                   Files Conveying Translations
* Overview::                Overview of GNU `gettext'

PO Files and PO Mode Basics

* Installation::            Completing GNU `gettext' Installation
* PO Files::                The Format of PO Files
* Main PO Commands::        Main Commands
* Entry Positioning::       Entry Positioning
* Normalizing::             Normalizing Strings in Entries

Preparing Program Sources

* Triggering::              Triggering `gettext' Operations
* Mark Keywords::           How Marks Appear in Sources
* Marking::                 Marking Translatable Strings
* c-format::                Telling something about the following string
* Special cases::           Special Cases of Translatable Strings

Making the Initial PO File

* xgettext Invocation::     Invoking the `xgettext' Program
* C Sources Context::       C Sources Context
* Compendium::              Using Translation Compendiums

Updating Existing PO Files

* msgmerge Invocation::     Invoking the `msgmerge' Program
* Translated Entries::
* Fuzzy Entries::           Fuzzy translated Entries
* Untranslated Entries::    Untranslated Entries
* Obsolete Entries::        Obsolete Entries
* Modifying Translations::  Modifying Translations
* Modifying Comments::      Modifying Comments
* Auxiliary::               Consulting Auxiliary PO Files

Producing Binary MO Files

* msgfmt Invocation::       Invoking the `msgfmt' Program
* MO Files::                The Format of GNU MO Files

The User's View

* Matrix::                  The Current `ABOUT-NLS' Matrix
* Installers::              Magic for Installers
* End Users::               Magic for End Users

The Programmer's View

* catgets::                 About `catgets'
* gettext::                 About `gettext'
* Comparison::              Comparing the two interfaces
* Using libintl.a::         Using libintl.a in own programs
* gettext grok::            Being a `gettext' grok
* Temp Programmers::        Temporary Notes for the Programmers Chapter

About `catgets'

* Interface to catgets::    The interface
* Problems with catgets::   Problems with the `catgets' interface?!

About `gettext'

* Interface to gettext::    The interface
* Ambiguities::             Solving ambiguities
* Locating Catalogs::       Locating message catalog files
* Optimized gettext::       Optimization of the *gettext functions

Temporary Notes for the Programmers Chapter

* Temp Implementations::    Temporary - Two Possible Implementations
* Temp catgets::            Temporary - About `catgets'
* Temp WSI::                Temporary - Why a single implementation
* Temp Notes::              Temporary - Notes

The Translator's View

* Trans Intro 0::           Introduction 0
* Trans Intro 1::           Introduction 1
* Discussions::             Discussions
* Organization::            Organization
* Information Flow::        Information Flow

Organization

* Central Coordination::    Central Coordination
* National Teams::          National Teams
* Mailing Lists::           Mailing Lists

National Teams

* Sub-Cultures::            Sub-Cultures
* Organizational Ideas::    Organizational Ideas

The Maintainer's View

* Flat and Non-Flat::       Flat or Non-Flat Directory Structures
* Prerequisites::           Prerequisite Works
* gettextize Invocation::   Invoking the `gettextize' Program
* Adjusting Files::         Files You Must Create or Alter

Files You Must Create or Alter

* po/POTFILES.in::          `POTFILES.in' in `po/'
* configure.in::            `configure.in' at top level
* aclocal::                 `aclocal.m4' at top level
* acconfig::                `acconfig.h' at top level
* Makefile::                `Makefile.in' at top level
* src/Makefile::            `Makefile.in' in `src/'

Concluding Remarks

* History::                 History of GNU `gettext'
* References::              Related Readings
|
||
|
||
|
||
File: gettext.info, Node: Introduction, Next: Basics, Prev: Top, Up: Top

Introduction
************

   This manual is still in *DRAFT* state. Some sections are still
empty, or almost. We keep merging material from other sources
(essentially e-mail folders) while the proper integration of this
material is delayed.

   In this manual, we use *he* when speaking of the programmer or
maintainer, *she* when speaking of the translator, and *they* when
speaking of the installers or end users of the translated program.
This is only a convenience for clarifying the documentation. It is
*absolutely* not meant to imply that some roles are more appropriate to
males or females. Besides, as you might guess, GNU `gettext' is meant
to be useful for people using computers, whatever their sex, race,
religion or nationality!

   This chapter explains the goals sought in the creation of GNU
`gettext' and the free Translation Project. Then, it explains a few
broad concepts around Native Language Support, and positions message
translation with regard to other aspects of national and cultural
variance, as they apply to programs. It also surveys those files
used to convey the translations. It explains how the various tools
interact in the initial generation of these files, and later, how the
maintenance cycle should usually operate.

   Please send suggestions and corrections to:

     Internet address:
         bug-gnu-utils@prep.ai.mit.edu

   Please include the manual's edition number and update date in your
messages.

* Menu:

* Why::                     The Purpose of GNU `gettext'
* Concepts::                I18n, L10n, and Such
* Aspects::                 Aspects in Native Language Support
* Files::                   Files Conveying Translations
* Overview::                Overview of GNU `gettext'
|
||
|
||
|
||
File: gettext.info, Node: Why, Next: Concepts, Prev: Introduction, Up: Introduction

The Purpose of GNU `gettext'
============================

   Usually, programs are written and documented in English, and use
English at execution time to interact with users. This is true not
only of GNU software, but also of a great deal of commercial and free
software. Using a common language is quite handy for communication
between developers, maintainers and users from all countries. On the
other hand, most people are less comfortable with English than with
their own native language, and would prefer to use their mother tongue
for day-to-day work, as far as possible. Many would simply *love* to
see their computer screen showing a lot less English, and far more
of their own language.

   However, to many people, this dream might appear so far-fetched that
they may believe it is not even worth spending time thinking about it.
They have no confidence at all that the dream might ever come true.
Yet some have not lost hope, and have organized themselves. The
Translation Project is a formalization of this hope into a workable
structure, which has a good chance to get all of us nearer the
achievement of a truly multi-lingual set of programs.

   GNU `gettext' is an important step for the Translation Project, as
it is an asset on which we may build many other steps. This package
offers programmers, translators and even users a well-integrated
set of tools and documentation. Specifically, the GNU `gettext'
utilities are a set of tools that provides a framework within which
other free packages may produce multi-lingual messages. These tools
include a set of conventions about how programs should be written to
support message catalogs, a directory and file naming organization for
the message catalogs themselves, a runtime library supporting the
retrieval of translated messages, and a few stand-alone programs to
massage in various ways the sets of translatable strings, or already
translated strings. A special mode for GNU Emacs also helps ease
interested parties into preparing these sets, or bringing them up to
date.

   GNU `gettext' is designed to minimize the impact of
internationalization on program sources, keeping this impact as small
and hardly noticeable as possible. Internationalization has better
chances of succeeding if it is very lightweight, or at least appears
to be so, when looking at program sources.

   The Translation Project also uses the GNU `gettext' distribution as
a vehicle for documenting its structure and methods. This goes beyond
the strict technicalities of documenting GNU `gettext' proper. By
so doing, translators will find in a single place, as far as possible,
all they need to know for properly doing their translating work. This
supplemental documentation might also help programmers, and even
curious users, in understanding how GNU `gettext' is related to the
remainder of the Translation Project, and consequently, have a glimpse
at the *big picture*.
|
||
|
||
|
||
File: gettext.info, Node: Concepts, Next: Aspects, Prev: Why, Up: Introduction

I18n, L10n, and Such
====================

   Two long words appear all the time when we discuss support of native
language in programs, and these words have a precise meaning, worth
being explained here, once and for all in this document. The words are
*internationalization* and *localization*. Many people, tired of
writing these long words over and over again, got into the habit of
writing "i18n" and "l10n" instead, quoting the first and last letter of
each word, and replacing the run of intermediate letters by a number
merely telling how many such letters there are. But in this manual, for
the sake of clarity, we will patiently write the names in full, each
time...
|
||
|
||
   By "internationalization", one refers to the operation by which a
program, or a set of programs turned into a package, is made aware of
and able to support multiple languages. This is a generalization
process, by which the programs are untied from calling only English
strings or other English specific habits, and connected to generic ways
of doing the same, instead. Program developers may use various
techniques to internationalize their programs. Some of these have been
standardized. GNU `gettext' offers one of these standards. *Note
Programmers::.

   By "localization", one means the operation by which, in a set of
programs already internationalized, one gives the program all needed
information so that it can adapt itself to handle its input and output
in a fashion which is correct for some native language and cultural
habits. This is a particularization process, by which generic methods
already implemented in an internationalized program are used in
specific ways. The programming environment puts several functions at
the programmer's disposal which allow this runtime configuration. The
formal description of a specific set of cultural habits for some
country, together with all associated translations targeted to the same
native language, is called the "locale" for this language or country.
Users achieve localization of programs by setting proper values for
special environment variables, prior to executing those programs,
identifying which locale should be used.
|
||
|
||
   In fact, locale message support is only one component of the cultural
data that makes up a particular locale. There is a whole host of
routines and functions provided to aid programmers in developing
internationalized software and which allow them to access the data
stored in a particular locale. When someone presently refers to a
particular locale, they are obviously referring to the data stored
within that particular locale. Similarly, if a programmer is referring
to "accessing the locale routines", they are referring to the complete
suite of routines that access all of the locale's information.

   One uses the expression "Native Language Support", or merely NLS,
for speaking of the overall activity or feature encompassing both
internationalization and localization, allowing for multi-lingual
interactions in a program. In a nutshell, one could say that
internationalization is the operation by which further localizations
are made possible.

   Also, very roughly said, when it comes to multi-lingual messages,
internationalization is usually taken care of by programmers, and
localization is usually taken care of by translators.
|
||
|
||
|
||
File: gettext.info, Node: Aspects, Next: Files, Prev: Concepts, Up: Introduction

Aspects in Native Language Support
==================================

   For a totally multi-lingual distribution, there are many things to
translate beyond output messages.

   * As of today, GNU `gettext' offers a complete toolset for
     translating messages output by C programs. Perl scripts and shell
     scripts will also need to be translated. Even if there are today
     some hooks by which this can be done, these hooks are not
     integrated as well as they should be.

   * Some programs, like `autoconf' or `bison', are able to produce
     other programs (or scripts). Even if the generating programs
     themselves are internationalized, the generated programs they
     produce may need internationalization on their own, and this
     indirect internationalization could be automated right from the
     generating program. In fact, quite usually, generating and
     generated programs could be internationalized independently, as
     the effort needed is fairly orthogonal.

   * A few programs include textual tables which might need translation
     themselves, independently of the strings contained in the program
     itself. For example, RFC 1345 gives an English description for
     each character which GNU `recode' is able to reconstruct at
     execution. Since these descriptions are extracted from the RFC by
     mechanical means, translating them properly would require a prior
     translation of the RFC itself.

   * Almost all programs accept options, which are often worded so as
     to be descriptive for English readers; one might want to
     consider offering translated versions for program options as well.

   * Many programs read, interpret, compile, or are somewhat driven by
     input files which are texts containing keywords, identifiers, or
     replies which are inherently translatable. For example, one may
     want `gcc' to allow diacriticized characters in identifiers or use
     translated keywords; `rm -i' might accept something other than `y'
     or `n' for replies, etc. Even if the program will eventually make
     most of its output in the foreign languages, one has to decide
     whether the input syntax, option values, etc., are to be localized
     or not.

   * The manual accompanying a package, as well as all documentation
     files in the distribution, could surely be translated, too.
     Translating a manual, with the intent of later keeping up with
     updates, is a major undertaking in itself, generally.

   As we already stressed, translation is only one aspect of locales.
Other internationalization aspects are not currently handled by GNU
`gettext', but perhaps may be handled in future versions. There are
many attributes that are needed to define a country's cultural
conventions. These attributes include, besides the country's native
language, the formatting of the date and time, the representation of
numbers, the symbols for currency, etc. These local "rules" are termed
the country's locale. The locale represents the knowledge needed to
support the country's native attributes.

   There are a few major areas which may vary between countries and
hence, define what a locale must describe. The following list helps
put multi-lingual messages into the proper context of other tasks
related to locales, and also presents some other areas which GNU
`gettext' might eventually tackle, maybe, one of these days.
|
||
|
||
*Characters and Codesets*
     The codeset most commonly used throughout the USA and most English
     speaking parts of the world is the ASCII codeset. However, there
     are many characters needed by various locales that are not found
     within this codeset. The 8-bit ISO 8859-1 codeset has most of
     the special characters needed to handle the major European
     languages. However, in many cases, the ISO 8859-1 codeset is not
     adequate. Hence each locale will need to specify which codeset
     it needs to use and will need to have the appropriate character
     handling routines to cope with the codeset.

*Currency*
     The symbols used vary from country to country, as does the
     position of the symbol. Software needs to be able to transparently
     display currency figures in the native mode for each locale.

*Dates*
     The format of dates varies between locales. For example, Christmas
     day in 1994 is written as 12/25/94 in the USA and as 25/12/94 in
     Australia. Other countries might use ISO 8601 dates, etc.

     Time of the day may be noted as HH:MM, HH.MM, or otherwise. Some
     locales require time to be specified in 24-hour mode rather than
     as AM or PM. Further, the nature and yearly extent of the
     Daylight Saving correction vary widely between countries.

*Numbers*
     Numbers can be represented differently in different locales. For
     example, the following numbers are all written correctly for their
     respective locales:

          12,345.67       English
          12.345,67       German
          1,2345.67       Asia

     Some programs could go further and use different unit systems, like
     English units or Metric units, or even take into account variants
     about how numbers are spelled in full.

*Messages*
     The most obvious area is the language support within a locale.
     This is where GNU `gettext' provides the means for developers and
     users to easily change the language that the software uses to
     communicate with the user.
|
||
|
||
   In the near future we see no chance that components of the locale
outside of message handling will be made available for use in other
packages. The reason for this is that most modern systems provide
more or less reasonable support for at least some of the missing
components. Another point is that the GNU `libc' and Linux will get a
new and complete implementation of the whole locale functionality which
could be adopted by systems lacking reasonable locale support.
|
||
|
||
|
||
File: gettext.info, Node: Files, Next: Overview, Prev: Aspects, Up: Introduction

Files Conveying Translations
============================

   The letters PO in `.po' files mean Portable Object, to distinguish
them from `.mo' files, where MO stands for Machine Object. This
paradigm, as well as the PO file format, is inspired by the NLS
standard developed by Uniforum, and implemented by Sun in their Solaris
system.

   PO files are meant to be read and edited by humans, and associate
each original, translatable string of a given package with its
translation in a particular target language. A single PO file is
dedicated to a single target language. If a package supports many
languages, there is one such PO file per language supported, and each
package has its own set of PO files. These PO files are best created by
the `xgettext' program, and later updated or refreshed through the
`msgmerge' program. Program `xgettext' extracts all marked messages
from a set of C files and initializes a PO file with empty
translations. Program `msgmerge' takes care of adjusting PO files
between releases of the corresponding sources, commenting obsolete
entries, initializing new ones, and updating all source line
references. Files ending with `.pot' are a kind of base translation
file found in distributions, in PO file format, and `.pox' files are
often temporary PO files.

   MO files are meant to be read by programs, and are binary in nature.
A few systems already offer tools for creating and handling MO files as
part of the Native Language Support coming with the system, but the
format of these MO files is often different from system to system, and
non-portable. They do not necessarily use `.mo' for file extensions,
but since system libraries are also used for accessing these files, it
works as long as the system is self-consistent about it. If GNU
`gettext' is able to interface with the tools already provided with
systems, it will consequently let these provided tools take care of
generating the MO files. Or else, if such tools are not found or do
not seem usable, GNU `gettext' will use its own ways and its own format
for MO files. Files ending with `.gmo' are really MO files, when it is
known that these files use the GNU format.
|
||
|
||
|
||
File: gettext.info, Node: Overview, Prev: Files, Up: Introduction

Overview of GNU `gettext'
=========================

   The following diagram summarizes the relation between the files
handled by GNU `gettext' and the tools acting on these files. It is
followed by a somewhat detailed explanation, which you should read
while keeping an eye on the diagram. Having a clear understanding of
these interrelations would surely help programmers, translators and
maintainers.
|
||
|
||
     Original C Sources ---> PO mode ---> Marked C Sources ---.
                                                              |
                   .---------<--- GNU gettext Library         |
     .--- make <---+                                          |
     |             `---------<--------------------+-----------'
     |                                            |
     |   .-----<--- PACKAGE.pot <--- xgettext <---'   .---<--- PO Compendium
     |   |                                            |             ^
     |   |                                            `---.         |
     |   `---.                                            +---> PO mode ---.
     |       +----> msgmerge ------> LANG.pox --->--------'                |
     |   .---'                                                             |
     |   |                                                                 |
     |   `-------------<---------------.                                   |
     |                                 +--- LANG.po <--- New LANG.pox <----'
     |   .--- LANG.gmo <--- msgfmt <---'
     |   |
     |   `---> install ---> /.../LANG/PACKAGE.mo ---.
     |                                              +---> "Hello world!"
     `-------> install ---> /.../bin/PROGRAM -------'
|
||
|
||
   The indication `PO mode' appears in two places in this picture, and
you may safely read it as merely meaning "hand editing", using any
editor of your choice, really. However, for those of you who are
lucky users of GNU Emacs, PO mode has been specifically created for
providing a cozy environment for editing or modifying PO files. While
editing a PO file, PO mode allows for the easy browsing of auxiliary
and compendium PO files, as well as for following references into the
set of C program sources from which PO files have been derived. It has
a few special features, among which are the interactive marking of
program strings as translatable, and the validation of PO files with
easy repositioning to PO file lines showing errors.

   As a programmer, the first step to bringing GNU `gettext' into your
package is identifying, right in the C sources, those strings which are
meant to be translatable, and those which are untranslatable. This
tedious job can be done a little more comfortably using Emacs PO mode,
but you can use any means familiar to you for modifying your C sources.
Besides this, some other simple, standard changes are needed to
properly initialize the translation library. *Note Sources::, for more
information about all this.
|
||
|
||
   For newly written software, the strings can and of course should be
marked while writing it. The `gettext' approach makes this very
easy. Simply put the following lines at the beginning of each file or
in a central header file:

     #define _(String) (String)
     #define N_(String) (String)
     #define textdomain(Domain)
     #define bindtextdomain(Package, Directory)

Doing this allows you to prepare the sources for internationalization.
Later, when you feel ready to switch to the `gettext' library,
simply remove these definitions, include `libintl.h' and link against
`libintl.a'. That is all you have to change.
|
||
|
||
   Once the C sources have been modified, the `xgettext' program is
used to find and extract all translatable strings, and create an
initial PO file out of all these. This `PACKAGE.pot' file contains all
original program strings. It has sets of pointers to exactly where in
the C sources each string is used. All translations are set to empty.
The letter `t' in `.pot' marks this as a Template PO file, not yet
oriented towards any particular language. *Note xgettext Invocation::,
for more details about how one calls the `xgettext' program. If you are
*really* lazy, you might be interested in working a lot more right
away, and preparing the whole distribution setup (*note
Maintainers::.). By doing so, you spare yourself typing the `xgettext'
command, as `make' should now generate the proper things automatically
for you!
|
||
|
||
   The first time through, there is no `LANG.po' yet, so the `msgmerge'
step may be skipped and replaced by a mere copy of `PACKAGE.pot' to
`LANG.pox', where LANG represents the target language.

   Then comes the initial translation of messages. Translation in
itself is a whole matter, still exclusively meant for humans, and whose
complexity far exceeds the scope of this manual. Nevertheless, a
few hints are given in some other chapter of this manual (*note
Translators::.). You will also find there indications about how to
contact translating teams, or become part of them, for sharing your
translating concerns with others who target the same native language.

   While adding the translated messages into the `LANG.pox' PO file, if
you do not have GNU Emacs handy, you are on your own for ensuring that
your efforts fully respect the PO file format, and quoting conventions
(*note PO Files::.). This is surely not an impossible task, as this is
the way many people have handled PO files already for Uniforum or
Solaris. On the other hand, by using PO mode in GNU Emacs, most details
of PO file format are taken care of for you, but you have to acquire
some familiarity with PO mode itself. Besides main PO mode commands
(*note Main PO Commands::.), you should know how to move between entries
(*note Entry Positioning::.), and how to handle untranslated entries
(*note Untranslated Entries::.).

   If some common translations have already been saved into a compendium
PO file, translators may use PO mode for initializing untranslated
entries from the compendium, and also save selected translations into
the compendium, updating it (*note Compendium::.). Compendium files
are meant to be exchanged between members of a given translation team.
|
||
|
||
   Programs, or packages of programs, are dynamic in nature: users write
bug reports and suggestions for improvements, maintainers react by
modifying programs in various ways. The fact that a package has
already been internationalized should not make maintainers shy of
adding new strings, or modifying strings already translated. They just
do their job the best they can. For the Translation Project to work
smoothly, it is important that maintainers do not carry translation
concerns on their already loaded shoulders, and that translators be
kept as free as possible of programmatic concerns.

   The only concern maintainers should have is carefully marking new
strings as translatable, when they should be, and not otherwise
worrying about them being translated, as this will come in proper time.
Consequently, when programs and their strings are adjusted in various
ways by maintainers, and for matters usually unrelated to translation,
`xgettext' constructs `PACKAGE.pot' files which evolve over
time, so the translations carried by `LANG.po' slowly fall out of
date.

   It is important for translators (and even maintainers) to understand
that package translation is a continuous process in the lifetime of a
package, and not something which is done once and for all at the start.
After an initial burst of translation activity for a given package,
interventions are needed once in a while, because here and there,
translated entries become obsolete, and new untranslated entries
appear, needing translation.

   The `msgmerge' program has the purpose of refreshing an already
existing `LANG.po' file, by comparing it with a newer `PACKAGE.pot'
template file, extracted by `xgettext' out of recent C sources. The
refreshing operation adjusts all references to C source locations for
strings, since these strings move as programs are modified. Also,
`msgmerge' comments out as obsolete, in `LANG.pox', those already
translated entries which are no longer used in the program sources
(*note Obsolete Entries::.). It finally discovers new strings and
inserts them in the resulting PO file as untranslated entries (*note
Untranslated Entries::.). *Note msgmerge Invocation::, for more
information about what `msgmerge' really does.
|
||
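The refreshing operation described above can be pictured with a small Python sketch. This is only a simplified model under stated assumptions, not what `msgmerge' actually executes: catalogs are reduced to plain dictionaries, source references and fuzzy matching are ignored, and the name `merge_catalog' and the sample strings are hypothetical.

```python
def merge_catalog(old_po, new_template):
    """Simplified model of refreshing LANG.po against PACKAGE.pot.

    old_po:       dict mapping msgid -> msgstr (existing translations)
    new_template: list of msgids extracted from the current sources
    Returns (active, obsolete).  `active' maps every current msgid to
    its translation ("" when untranslated); `obsolete' keeps translated
    entries whose msgid no longer appears in the sources.
    """
    # Carry over known translations; new strings start out untranslated.
    active = {msgid: old_po.get(msgid, "") for msgid in new_template}
    # Translated entries that vanished from the sources become obsolete.
    obsolete = {
        msgid: msgstr
        for msgid, msgstr in old_po.items()
        if msgid not in active and msgstr
    }
    return active, obsolete


# Hypothetical sample data, for illustration only.
old = {"File not found": "Fichier introuvable", "Quit": "Quitter"}
template = ["File not found", "Save changes?"]
active, obsolete = merge_catalog(old, template)
print(active)
print(obsolete)
```

In this model, "Save changes?" shows up as an untranslated entry and "Quit" is set aside as obsolete, which mirrors the two outcomes the paragraph above describes.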
|
||
Whatever route or means taken, the goal is to obtain an updated
|
||
`LANG.pox' file offering translations for all strings. When this is
|
||
properly achieved, this file `LANG.pox' may take the place of the
|
||
previous official `LANG.po' file.
|
||
|
||
The temporal mobility, or fluidity of PO files, is an integral part
|
||
of the translation game, and should be well understood, and accepted.
|
||
People resisting it will have a hard time participating in the
|
||
Translation Project, or will give a hard time to other participants! In
|
||
particular, maintainers should relax and include all available official
|
||
PO files in their distributions, even if these have not recently been
|
||
updated, without banging or otherwise trying to exert pressure on the
|
||
translator teams to get the job done. The pressure should rather come
|
||
from the community of users speaking a particular language, and
|
||
maintainers should consider themselves fairly relieved of any concern
|
||
about the adequacy of translation files. On the other hand, translators
|
||
should reasonably try updating the PO files they are responsible for,
|
||
while the package is undergoing pretest, prior to an official
|
||
distribution.
|
||
|
||
Once the PO file is complete and dependable, the `msgfmt' program is
|
||
used for turning the PO file into a machine-oriented format, which may
|
||
yield efficient retrieval of translations by the programs of the
|
||
package, whenever needed at runtime (*note MO Files::.). *Note msgfmt
|
||
Invocation::, for more information about all modalities of execution
|
||
for the `msgfmt' program.
|
||
|
||
Finally, the modified and marked C sources are compiled and linked
|
||
with the GNU `gettext' library, usually through the operation of
|
||
`make', given a suitable `Makefile' exists for the project, and the
|
||
resulting executable is installed somewhere users will find it. The MO
|
||
files themselves should also be properly installed. Given the
|
||
appropriate environment variables are set (*note End Users::.), the
|
||
program should localize itself automatically, whenever it executes.
|
||
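As a rough illustration of a program localizing itself at run time, here is a sketch that uses Python's standard `gettext' module as a stand-in for the C library discussed in this manual. The domain name "hello" and the directory "locale" are placeholders, not names taken from any real package.

```python
import gettext

# "hello" (the message domain) and "locale" (the catalog directory) are
# placeholder names for this sketch.  With fallback=True, a missing MO
# file is not an error: the original strings are returned unchanged.
translations = gettext.translation("hello", localedir="locale", fallback=True)
_ = translations.gettext


def greeting():
    # Strings wrapped in _() are looked up in the catalog at run time.
    return _("Hello, world!")


print(greeting())
```

When no language is passed explicitly, the module consults the environment variables LANGUAGE, LC_ALL, LC_MESSAGES and LANG to pick a catalog, so the same program can show different translations depending on the user's settings.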
|
||
The remainder of this manual has the purpose of explaining in depth
|
||
the various steps outlined above.
|
||
|
||
|
||
File: gettext.info, Node: Basics, Next: Sources, Prev: Introduction, Up: Top
|
||
|
||
PO Files and PO Mode Basics
|
||
***************************
|
||
|
||
The GNU `gettext' toolset helps programmers and translators in
|
||
producing, updating and using translation files, mainly those PO files
|
||
which are textual, editable files. This chapter stresses the format of
|
||
PO files, and contains a PO mode starter. PO mode description is
|
||
spread throughout this manual instead of being concentrated in one
|
||
place. Here we present only the basics of PO mode.
|
||
|
||
* Menu:
|
||
|
||
* Installation:: Completing GNU `gettext' Installation
|
||
* PO Files:: The Format of PO Files
|
||
* Main PO Commands:: Main Commands
|
||
* Entry Positioning:: Entry Positioning
|
||
* Normalizing:: Normalizing Strings in Entries
|
||
|
||
|
||
File: gettext.info, Node: Installation, Next: PO Files, Prev: Basics, Up: Basics
|
||
|
||
Completing GNU `gettext' Installation
|
||
=====================================
|
||
|
||
Once you have received, unpacked, configured and compiled the GNU
|
||
`gettext' distribution, the `make install' command puts in place the
|
||
programs `xgettext', `msgfmt', `gettext', and `msgmerge', as well as
|
||
their available message catalogs. To top off a comfortable
|
||
installation, you might also want to make the PO mode available to your
|
||
GNU Emacs users.
|
||
|
||
During the installation of the PO mode, you might want to modify your
|
||
file `.emacs', once and for all, so it contains a few lines looking
|
||
like:
|
||
|
||
     (setq auto-mode-alist
           (cons '("\\.po[tx]?\\'\\|\\.po\\." . po-mode) auto-mode-alist))
     (autoload 'po-mode "po-mode")
|
||
|
||
Later, whenever you edit some `.po', `.pot' or `.pox' file, or any
|
||
file having the string `.po.' within its name, Emacs loads
|
||
`po-mode.elc' (or `po-mode.el') as needed, and automatically activates
|
||
PO mode commands for the associated buffer. The string *PO* appears in
|
||
the mode line for any buffer for which PO mode is active. Many PO
|
||
files may be active at once in a single Emacs session.
|
||
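To see which file names the pattern in `auto-mode-alist' selects, the following sketch restates it as a Python regular expression. The translation is an assumption of this example: Emacs writes end-of-string as \' where Python uses \Z, and the helper name `activates_po_mode' is hypothetical.

```python
import re

# Python rendering of the Emacs pattern "\\.po[tx]?\\'\\|\\.po\\."
# (\' in Emacs means end of string; \Z is the Python counterpart).
PO_FILE_PATTERN = re.compile(r"\.po[tx]?\Z|\.po\.")


def activates_po_mode(filename):
    # Emacs searches for the pattern anywhere in the file name,
    # so search() is used here rather than match().
    return PO_FILE_PATTERN.search(filename) is not None


for name in ["fr.po", "hello.pot", "de.pox", "fr.po.orig", "report.txt"]:
    print(name, activates_po_mode(name))
```

The first alternative covers names ending in `.po', `.pot' or `.pox'; the second covers names containing `.po.' somewhere inside, such as backup copies.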
|
||
If you are using Emacs version 20 or better, and have already
|
||
installed the appropriate international fonts on your system, you may
|
||
also arrange for these fonts to be automatically loaded and used for
|
||
displaying the translations on your Emacs screen, whenever necessary.
|
||
For this to happen, you might want to add the lines:
|
||
|
||
     (autoload 'po-find-file-coding-system "po-mode")
     (modify-coding-system-alist 'file "\\.po[tx]?\\'\\|\\.po\\."
                                 'po-find-file-coding-system)
|
||
|
||
to your `.emacs' file.
|
||
|
||
|
||
File: gettext.info, Node: PO Files, Next: Main PO Commands, Prev: Installation, Up: Basics
|
||
|
||
The Format of PO Files
|
||
======================
|
||
|
||
A PO file is made up of many entries, each entry holding the relation
|
||
between an original untranslated string and its corresponding
|
||
translation. All entries in a given PO file usually pertain to a
|
||
single project, and all translations are expressed in a single target
|
||
language. One PO file "entry" has the following schematic structure:
|
||
|
||
     WHITE-SPACE
     #  TRANSLATOR-COMMENTS
     #. AUTOMATIC-COMMENTS
     #: REFERENCE...
     #, FLAG...
     msgid UNTRANSLATED-STRING
     msgstr TRANSLATED-STRING
|
||
|
||
The general structure of a PO file should be well understood by the
|
||
translator. When using PO mode, very little has to be known about the
|
||
format details, as PO mode takes care of them for her.
|
||
|
||
Entries begin with some optional white space. Usually, when
|
||
generated through GNU `gettext' tools, there is exactly one blank line
|
||
between entries. Then comments follow, on lines all starting with the
|
||
character `#'. There are two kinds of comments: those which have some
|
||
white space immediately following the `#', which comments are created
|
||
and maintained exclusively by the translator, and those which have some
|
||
non-white character just after the `#', which comments are created and
|
||
maintained automatically by GNU `gettext' tools. All comments, of
|
||
either kind, are optional.
|
||
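The distinction between the kinds of lines in an entry can be sketched in Python as follows. This is an illustrative classifier only, not code taken from any GNU `gettext' tool; the function name, the category labels and the sample entry are all hypothetical.

```python
def classify_line(line):
    """Return the role of one line of a PO entry."""
    if not line.strip():
        return "white-space"
    if line.startswith("#"):
        marker = line[1:2]
        # "#" followed by white space (or nothing): translator comment.
        if marker == "" or marker.isspace():
            return "translator-comment"
        # "#" followed by a non-white character: maintained by the tools.
        kinds = {".": "automatic-comment", ":": "reference", ",": "flags"}
        return kinds.get(marker, "other-automatic-comment")
    if line.startswith("msgid"):
        return "msgid"
    if line.startswith("msgstr"):
        return "msgstr"
    if line.startswith('"'):
        return "string-continuation"
    return "unknown"


# Hypothetical entry, for illustration only.
ENTRY = '''
# Checked with the author.
#. Shown when the input file is missing.
#: src/main.c:42
#, c-format
msgid "cannot open %s"
msgstr "impossible d'ouvrir %s"
'''

for line in ENTRY.splitlines():
    print(classify_line(line), "|", line)
```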
|
||
After white space and comments, entries show two strings, giving
|
||
first the untranslated string as it appears in the original program
|
||
sources, and then, the translation of this string. The original string
|
||
is introduced by the keyword `msgid', and the translation, by `msgstr'.
|
||
The two strings, untranslated and translated, are quoted in various
|
||
ways in the PO file, using `"' delimiters and `\' escapes, but the
|
||
translator does not really have to pay attention to the precise quoting
|
||
format, as PO mode fully intends to take care of quoting for her.
|
||
|
||
The `msgid' strings, as well as automatic comments, are produced and
|
||
managed by other GNU `gettext' tools, and PO mode does not provide
|
||
means for the translator to alter these. The most she can do is merely
|
||
delete them, and only by deleting the whole entry. On the other
|
||
hand, the `msgstr' string, as well as translator comments, are really
|
||
meant for the translator, and PO mode gives her the full control she
|
||
needs.
|
||
|
||
The comment lines beginning with `#,' are special because they are
|
||
not completely ignored by the programs as comments generally are. The
|
||
comma separated list of FLAGs is used by the `msgfmt' program to give
|
||
the user some better diagnostic messages. Currently there are two
|
||
forms of flags defined:
|
||
|
||
`fuzzy'
|
||
This flag can be generated by the `msgmerge' program or it can be
|
||
inserted by the translator herself. It shows that the `msgstr'
|
||
string might not be a correct translation (anymore). Only the
|
||
translator can judge if the translation requires further
|
||
modification, or is acceptable as is. Once satisfied with the
|
||
translation, she then removes this `fuzzy' attribute. The
|
||
`msgmerge' program inserts this when it combined the `msgid' and
|
||
`msgstr' entries after fuzzy search only. *Note Fuzzy Entries::.
|
||
|
||
`c-format'
|
||
`no-c-format'
|
||
These flags should not be added by a human. Instead only the
|
||
`xgettext' program adds them. In an automated PO file processing
|
||
system as proposed here the user changes would be thrown away
|
||
again as soon as the `xgettext' program generates a new template
|
||
file.
|
||
|
||
In case the `c-format' flag is given for a string, `msgfmt'
|
||
does some more tests to check the validity of the translation.
|
||
*Note msgfmt Invocation::.
|
||
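The kind of test that the `c-format' flag enables can be sketched as follows. This is an assumption-laden simplification, not the check `msgfmt' really performs: it only compares the sequence of conversion characters in the two strings, and the names `conversions' and `format_matches' are hypothetical.

```python
import re

# Matches printf-style directives such as %s, %d, %5.2f or %ld.
# "%%" is a literal percent sign; it matches but captures nothing.
DIRECTIVE = re.compile(
    r"%(?:%|[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|L)?([diouxXeEfgGcsp]))"
)


def conversions(text):
    """List the conversion characters used in TEXT, in order."""
    return [m.group(1) for m in DIRECTIVE.finditer(text) if m.group(1)]


def format_matches(msgid, msgstr):
    """True when both strings consume the same arguments in order."""
    return conversions(msgid) == conversions(msgstr)


print(format_matches("%d files in %s", "%d fichiers dans %s"))
print(format_matches("%d files", "%s fichiers"))
```

A translation that swaps `%d' for `%s' would make the program pass an integer where a string pointer is expected, which is why such a mismatch deserves a diagnostic.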
|
||
It happens that some lines, usually whitespace or comments, follow
|
||
the very last entry of a PO file. Such lines are not part of any entry,
|
||
and PO mode is unable to take action on those lines. By using the PO
|
||
mode function `M-x po-normalize', the translator may get rid of those
|
||
spurious lines. *Note Normalizing::.
|
||
|
||
The remainder of this section may be safely skipped by those using
|
||
PO mode, yet it may be interesting for everybody to have a better idea
|
||
of the precise format of a PO file. On the other hand, those not
|
||
having GNU Emacs handy should carefully continue reading on.
|
||
|
||
Each of UNTRANSLATED-STRING and TRANSLATED-STRING respects the C
|
||
syntax for a character string, including the surrounding quotes and
|
||
embedded backslashed escape sequences. When the time comes to write
|
||
multi-line strings, one should not use escaped newlines. Instead, a
|
||
closing quote should follow the last character on the line to be
|
||
continued, and an opening quote should resume the string at the
|
||
beginning of the following PO file line. For example:
|
||
|
||
     msgid ""
     "Here is an example of how one might continue a very long string\n"
     "for the common case the string represents multi-line output.\n"
|
||
|
||
In this example, the empty string is used on the first line, to allow
|
||
better alignment of the `H' from the word `Here' over the `f' from the
|
||
word `for'. In this example, the `msgid' keyword is followed by three
|
||
strings, which are meant to be concatenated. Concatenating the empty
|
||
string does not change the resulting overall string, but it is a way
|
||
for us to comply with the necessity of `msgid' to be followed by a
|
||
string on the same line, while keeping the multi-line presentation
|
||
left-justified, as we find this to be a cleaner disposition. The empty
|
||
string could have been omitted, but only if the string starting with
|
||
`Here' was promoted on the first line, right after `msgid'.(1) It was
|
||
not really necessary either to switch between the two last quoted
|
||
strings immediately after the newline `\n', the switch could have
|
||
occurred after *any* other character, we just did it this way because
|
||
it is neater.
|
||
|
||
One should carefully distinguish between end of lines marked as `\n'
|
||
*inside* quotes, which are part of the represented string, and end of
|
||
lines in the PO file itself, outside string quotes, which have no
|
||
effect on the represented string.
|
||
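The concatenation rule, and the difference between a `\n' inside quotes and a line break in the PO file, can be made concrete with a short Python sketch. It is a minimal reader under stated assumptions: only the escapes \n, \t, \" and \\ are decoded, and the name `decode_po_string' is hypothetical.

```python
import re

# A quoted piece: anything between double quotes, honoring backslashes.
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')
# Only a few common escapes are handled in this sketch.
ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def decode_po_string(lines):
    """Concatenate the quoted pieces found on PO lines, decoding escapes."""
    pieces = []
    for line in lines:
        for match in QUOTED.finditer(line):
            body = match.group(1)
            pieces.append(
                re.sub(r"\\(.)",
                       lambda m: ESCAPES.get(m.group(1), m.group(1)),
                       body)
            )
    # Line breaks of the PO file itself contribute nothing here.
    return "".join(pieces)


lines = [
    r'msgid ""',
    r'"Here is an example of how one might continue a very long string\n"',
    r'"for the common case the string represents multi-line output.\n"',
]
print(decode_po_string(lines), end="")
```

The three quoted pieces, the first of them empty, yield one string containing exactly two newlines, both coming from the `\n' escapes.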
|
||
Outside strings, white lines and comments may be used freely.
|
||
Comments start at the beginning of a line with `#' and extend until the
|
||
end of the PO file line. Comments written by translators should have
|
||
the initial `#' immediately followed by some white space. If the `#'
|
||
is not immediately followed by white space, this comment is most likely
|
||
generated and managed by specialized GNU tools, and might disappear or
|
||
be replaced unexpectedly when the PO file is given to `msgmerge'.
|
||
|
||
---------- Footnotes ----------
|
||
|
||
(1) This limitation is not imposed by GNU `gettext', but comes from
|
||
the `msgfmt' implementation on Solaris.
|
||
|
||
|
||
File: gettext.info, Node: Main PO Commands, Next: Entry Positioning, Prev: PO Files, Up: Basics
|
||
|
||
Main PO mode Commands
|
||
=====================
|
||
|
||
After setting up Emacs with something similar to the lines in *Note
|
||
Installation::, PO mode is activated for a window when Emacs finds a PO
|
||
file in that window. This puts the window read-only and establishes a
|
||
po-mode-map, making PO mode a genuine Emacs mode that is not derived
|
||
from text mode in any way. Functions found on `po-mode-hook', if any,
|
||
will be executed.
|
||
|
||
When PO mode is active in a window, the letters `PO' appear in the
|
||
mode line for that window. The mode line also displays how many
|
||
entries of each kind are held in the PO file. For example, the string
|
||
`132t+3f+10u+2o' would tell the translator that the PO file contains
|
||
132 translated entries (*note Translated Entries::.), 3 fuzzy entries
|
||
(*note Fuzzy Entries::.), 10 untranslated entries (*note Untranslated
|
||
Entries::.) and 2 obsolete entries (*note Obsolete Entries::.).
|
||
Zero-coefficients items are not shown. So, in this example, if the
|
||
fuzzy entries were unfuzzied, the untranslated entries were translated
|
||
and the obsolete entries were deleted, the mode line would merely
|
||
display `145t' for the counters.
|
||
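The way the counters string is assembled can be sketched in a few lines of Python. This only illustrates the display rule described above; it is not code from PO mode, and the function name `mode_line_counters' is hypothetical.

```python
def mode_line_counters(translated, fuzzy, untranslated, obsolete):
    """Build a counters string such as `132t+3f+10u+2o'."""
    parts = [
        (translated, "t"),
        (fuzzy, "f"),
        (untranslated, "u"),
        (obsolete, "o"),
    ]
    # Items with a zero count are left out of the display.
    return "+".join(f"{count}{tag}" for count, tag in parts if count)


print(mode_line_counters(132, 3, 10, 2))           # 132t+3f+10u+2o
# Once fuzzy and untranslated entries are translated and the obsolete
# ones deleted, everything counts as translated: 132 + 3 + 10 = 145.
print(mode_line_counters(132 + 3 + 10, 0, 0, 0))   # 145t
```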
|
||
The main PO commands are those which do not fit into the other
|
||
categories of subsequent sections. These allow for quitting PO mode or
|
||
for managing windows in special ways.
|
||
|
||
`U'
|
||
Undo last modification to the PO file.
|
||
|
||
`Q'
|
||
Quit processing and save the PO file.
|
||
|
||
`q'
|
||
Quit processing, possibly after confirmation.
|
||
|
||
`O'
|
||
Temporarily leave the PO file window.
|
||
|
||
`?'
|
||
`h'
|
||
Show help about PO mode.
|
||
|
||
`='
|
||
Give some PO file statistics.
|
||
|
||
`V'
|
||
Batch validate the format of the whole PO file.
|
||
|
||
The command `U' (`po-undo') interfaces to the GNU Emacs *undo*
|
||
facility. *Note Undoing Changes: (emacs)Undo. Each time `U' is typed,
|
||
modifications which the translator did to the PO file are undone a
|
||
little more. For the purpose of undoing, each PO mode command is
|
||
atomic. This is especially true for the `<RET>' command: the whole
|
||
edit made by a single use of this command is undone at once,
|
||
even if the edit itself implied several actions. However, while in
|
||
the editing window, one can undo the editing work in much finer steps.
|
||
|
||
The commands `Q' (`po-quit') and `q' (`po-confirm-and-quit') are
|
||
used when the translator is done with the PO file. The former is a bit
|
||
less verbose than the latter. If the file has been modified, it is
|
||
saved to disk first. In both cases, and prior to all this, the
|
||
commands check if some untranslated message remains in the PO file and,
|
||
if yes, the translator is asked if she really wants to leave off
|
||
working with this PO file. This is the preferred way of getting rid of
|
||
an Emacs PO file buffer. Merely killing it through the usual command
|
||
`C-x k' (`kill-buffer') is not the tidiest way to proceed.
|
||
|
||
The command `O' (`po-other-window') is another, softer way, to leave
|
||
PO mode, temporarily. It just moves the cursor to some other Emacs
|
||
window, and pops one if necessary. For example, if the translator just
|
||
got PO mode to show some source context in some other window, she might
|
||
discover some apparent bug in the program source that needs correction.
|
||
This command allows the translator to change sex, become a programmer,
|
||
and have the cursor right into the window containing the program she
|
||
(or rather *he*) wants to modify. By later getting the cursor back in
|
||
the PO file window, or by asking Emacs to edit this file once again, PO
|
||
mode is then recovered.
|
||
|
||
The command `h' (`po-help') displays a summary of all available PO
|
||
mode commands. The translator should then type any character to resume
|
||
normal PO mode operations. The command `?' has the same effect as `h'.
|
||
|
||
The command `=' (`po-statistics') computes the total number of
|
||
entries in the PO file, the ordinal of the current entry (counted from
|
||
1), the number of untranslated entries, the number of obsolete entries,
|
||
and displays all these numbers.
|
||
|
||
The command `V' (`po-validate') launches `msgfmt' in verbose mode
|
||
over the current PO file. This command first offers to save the
|
||
current PO file on disk. The `msgfmt' tool, from GNU `gettext', has
|
||
the purpose of creating a MO file out of a PO file, and PO mode uses
|
||
the features of this program for checking the overall format of a PO
|
||
file, as well as all individual entries.
|
||
|
||
The program `msgfmt' runs asynchronously with Emacs, so the
|
||
translator regains control immediately while her PO file is being
|
||
studied. Error output is collected in the GNU Emacs `*compilation*'
|
||
buffer, displayed in another window. The regular GNU Emacs command
|
||
`C-x`' (`next-error'), as well as other usual compile commands, allow
|
||
the translator to reposition quickly to the offending parts of the PO
|
||
file. Once the cursor is on the line in error, the translator may
|
||
decide on any PO mode action which would help correct the error.
|
||
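Outside Emacs, a comparable validation run can be approximated with a small Python wrapper around `msgfmt'. This is a sketch under assumptions: the exact options PO mode passes to `msgfmt' are not stated here, so `-v' (verbose) and `-o' (output file) are chosen for illustration, the helper names are hypothetical, and the call is synchronous rather than asynchronous.

```python
import os
import shutil
import subprocess


def validation_command(po_path):
    # -v asks for verbose output; -o sends the MO output to the null
    # device, since only the diagnostics are of interest here.
    return ["msgfmt", "-v", "-o", os.devnull, po_path]


def validate(po_path):
    """Run msgfmt over PO_PATH; return (exit status, diagnostics)."""
    if shutil.which("msgfmt") is None:
        return None  # msgfmt is not installed on this system
    result = subprocess.run(
        validation_command(po_path), capture_output=True, text=True
    )
    # msgfmt reports problems and statistics on standard error.
    return result.returncode, result.stderr


print(validation_command("fr.po"))
```

A nonzero exit status signals that `msgfmt' found errors, and the collected diagnostics name the offending lines of the PO file.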
|